
AzureML Environment for Inference: can't add pip packages to dependencies

I can’t find the proper way to add dependencies to my Azure Container Instance for ML Inference.

I basically started by following this tutorial: Train and deploy an image classification model with an example Jupyter Notebook.

It works fine.

Now I want to deploy my trained TensorFlow model for inference. I have tried many ways, but I was never able to add Python dependencies to the Environment.

From the TensorFlow curated environment

Using AzureML-tensorflow-2.4-ubuntu18.04-py37-cpu-inference:

from azureml.core import Workspace


# connect to your workspace
ws = Workspace.from_config()

# names
experiment_name = "my-experiment"
model_name = "my-model"
env_version="1"
env_name="my-env-"+env_version
service_name = str.lower(model_name + "-service-" + env_version)


# create environment for the deploy
from azureml.core.environment import Environment, DEFAULT_CPU_IMAGE
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.webservice import AciWebservice

# get a curated environment
env = Environment.get(
    workspace=ws, 
    name="AzureML-tensorflow-2.4-ubuntu18.04-py37-cpu-inference",
)
custom_env = env.clone(env_name)
custom_env.inferencing_stack_version='latest'

# add packages
conda_dep = CondaDependencies()
python_packages = ['joblib', 'numpy', 'os', 'json', 'tensorflow']
for package in python_packages:
    conda_dep.add_pip_package(package)
    conda_dep.add_conda_package(package)

# Adds dependencies to PythonSection of env
custom_env.python.user_managed_dependencies=True
custom_env.python.conda_dependencies=conda_dep

custom_env.register(workspace=ws)

# create deployment config i.e. compute resources
aciconfig = AciWebservice.deploy_configuration(
    cpu_cores=1,
    memory_gb=1,
    tags={"experiment": experiment_name, "model": model_name},
)

from azureml.core.model import InferenceConfig
from azureml.core.model import Model

# get the registered model
model = Model(ws, model_name)

# create an inference config i.e. the scoring script and environment
inference_config = InferenceConfig(entry_script="score.py", environment=custom_env)

# deploy the service
service = Model.deploy(
    workspace=ws,
    name=service_name,
    models=[model],
    inference_config=inference_config,
    deployment_config=aciconfig,
)

service.wait_for_deployment(show_output=True)

I get the following log:


AzureML image information: tensorflow-2.4-ubuntu18.04-py37-cpu-inference:20220110.v1


PATH environment variable: /opt/miniconda/envs/amlenv/bin:/opt/miniconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PYTHONPATH environment variable: 

Pip Dependencies
---------------
EdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...
2022-01-24T10:21:09,855130300+00:00 - iot-server/finish 1 0
2022-01-24T10:21:09,856870100+00:00 - Exit code 1 is normal. Not restarting iot-server.
absl-py==0.15.0
applicationinsights==0.11.10
astunparse==1.6.3
azureml-inference-server-http==0.4.2
cachetools==4.2.4
certifi==2021.10.8
charset-normalizer==2.0.10
click==8.0.3
Flask==1.0.3
flatbuffers==1.12
gast==0.3.3
google-auth==2.3.3
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
grpcio==1.32.0
gunicorn==20.1.0
h5py==2.10.0
idna==3.3
importlib-metadata==4.10.0
inference-schema==1.3.0
itsdangerous==2.0.1
Jinja2==3.0.3
Keras-Preprocessing==1.1.2
Markdown==3.3.6
MarkupSafe==2.0.1
numpy==1.19.5
oauthlib==3.1.1
opt-einsum==3.3.0
pandas==1.1.5
protobuf==3.19.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
python-dateutil==2.8.2
pytz==2021.3
requests==2.27.1
requests-oauthlib==1.3.0
rsa==4.8
six==1.15.0
tensorboard==2.7.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorflow==2.4.0
tensorflow-estimator==2.4.0
termcolor==1.1.0
typing-extensions==3.7.4.3
urllib3==1.26.8
Werkzeug==2.0.2
wrapt==1.12.1
zipp==3.7.0


Entry script directory: /var/azureml-app/.

Dynamic Python package installation is disabled.
Starting AzureML Inference Server HTTP.

Azure ML Inferencing HTTP server v0.4.2


Server Settings
---------------
Entry Script Name: score.py
Model Directory: /var/azureml-app/azureml-models/my-model/1
Worker Count: 1
Worker Timeout (seconds): 300
Server Port: 31311
Application Insights Enabled: false
Application Insights Key: None


Server Routes
---------------
Liveness Probe: GET   127.0.0.1:31311/
Score:          POST  127.0.0.1:31311/score

Starting gunicorn 20.1.0
Listening at: http://0.0.0.0:31311 (69)
Using worker: sync
Booting worker with pid: 100
Exception in worker process
Traceback (most recent call last):
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/arbiter.py", line 589, in spawn_worker
    worker.init_process()
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/workers/base.py", line 134, in init_process
    self.load_wsgi()
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/workers/base.py", line 146, in load_wsgi
    self.wsgi = self.app.wsgi()
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 58, in load
    return self.load_wsgiapp()
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
    return util.import_app(self.app_uri)
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/gunicorn/util.py", line 359, in import_app
    mod = importlib.import_module(module)
  File "/opt/miniconda/envs/amlenv/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/azureml_inference_server_http/server/entry.py", line 1, in <module>
    import create_app
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/azureml_inference_server_http/server/create_app.py", line 4, in <module>
    from routes_common import main
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/azureml_inference_server_http/server/routes_common.py", line 32, in <module>
    from aml_blueprint import AMLBlueprint
  File "/opt/miniconda/envs/amlenv/lib/python3.7/site-packages/azureml_inference_server_http/server/aml_blueprint.py", line 28, in <module>
    main_module_spec.loader.exec_module(main)
  File "/var/azureml-app/score.py", line 4, in <module>
    import joblib
ModuleNotFoundError: No module named 'joblib'
Worker exiting (pid: 100)
Shutting down: Master
Reason: Worker failed to boot.
2022-01-24T10:21:13,851467800+00:00 - gunicorn/finish 3 0
2022-01-24T10:21:13,853259700+00:00 - Exit code 3 is not normal. Killing image.

From a Conda specification

Same as before, but with a fresh environment built from a Conda specification, and a new env_version number:

# ...


env_version="2"

# ...

custom_env = Environment.from_conda_specification(name=env_name, file_path="my-env.yml")
custom_env.docker.base_image = DEFAULT_CPU_IMAGE

# ...


with my-env.yml:

name: my-env
dependencies:
- python
- pip:
  - azureml-defaults
  - azureml-sdk
  - sklearn
  - numpy
  - matplotlib
  - joblib
  - uuid
  - requests
  - tensorflow


I get this log:

2022-01-24T11:06:54,887886931+00:00 - iot-server/run 
2022-01-24T11:06:54,891839877+00:00 - rsyslog/run 
2022-01-24T11:06:54,893640998+00:00 - gunicorn/run 
2022-01-24T11:06:54,912032812+00:00 - nginx/run 
EdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...
2022-01-24T11:06:55,398420960+00:00 - iot-server/finish 1 0
2022-01-24T11:06:55,414425146+00:00 - Exit code 1 is normal. Not restarting iot-server.

PATH environment variable: /opt/miniconda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PYTHONPATH environment variable: 

Pip Dependencies
---------------
brotlipy==0.7.0
certifi==2020.6.20
cffi @ file:///tmp/build/80754af9/cffi_1605538037615/work
chardet @ file:///tmp/build/80754af9/chardet_1605303159953/work
conda==4.9.2
conda-package-handling @ file:///tmp/build/80754af9/conda-package-handling_1603018138503/work
cryptography @ file:///tmp/build/80754af9/cryptography_1605544449973/work
idna @ file:///tmp/build/80754af9/idna_1593446292537/work
pycosat==0.6.3
pycparser @ file:///tmp/build/80754af9/pycparser_1594388511720/work
pyOpenSSL @ file:///tmp/build/80754af9/pyopenssl_1605545627475/work
PySocks @ file:///tmp/build/80754af9/pysocks_1594394576006/work
requests @ file:///tmp/build/80754af9/requests_1592841827918/work
ruamel-yaml==0.15.87
six @ file:///tmp/build/80754af9/six_1605205313296/work
tqdm @ file:///tmp/build/80754af9/tqdm_1605303662894/work
urllib3 @ file:///tmp/build/80754af9/urllib3_1603305693037/work

Starting HTTP server
2022-01-24T11:06:59,701365128+00:00 - gunicorn/finish 127 0
./run: line 127: exec: gunicorn: not found
2022-01-24T11:06:59,706177784+00:00 - Exit code 127 is not normal. Killing image.
    

I really don't know what I'm missing, and I've already been searching for too long (Azure docs, SO, …).

Thanks for your help!

Edit: Non-exhaustive list of solutions I tried:


Answer

OK, I got it working: I started over from scratch and it worked.

I have no idea what was wrong in all my previous attempts, which is frustrating.

Multiple problems and how I (think I) solved them:

  • joblib: I actually didn't need it to load my Keras model. But the problem was not with this specific library; rather, I couldn't add any dependency at all to the inference environment.
  • Environment: in the end, I was only able to make things work with a custom environment built with Environment.from_conda_specification(name=version, file_path="conda_dependencies.yml"). I was never able to add my libraries (or pin a specific package version) to a curated environment, and I don't know why; a sketch of how that is supposed to work follows this list.
  • TensorFlow: the last problem was that I trained and registered my model in the AzureML Notebook azureml_py38_PT_TF kernel (tensorflow==2.7.0), then tried to load it in the inference Docker image (tensorflow==2.4.0). So I had to pin the TensorFlow version I wanted in the inference image (which required the previous point to be solved).
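For reference, here is a minimal sketch of how adding pip packages to a cloned curated environment is supposed to work with the azureml-core v1 SDK. I never got this path to work myself, so treat it as an untested assumption; one plausible culprit in my first attempt is user_managed_dependencies=True, which tells AzureML not to install the listed packages, so the sketch leaves it at False:

from azureml.core import Workspace
from azureml.core.environment import Environment
from azureml.core.conda_dependencies import CondaDependencies

ws = Workspace.from_config()

# clone the curated environment so it becomes editable
curated = Environment.get(
    workspace=ws,
    name="AzureML-tensorflow-2.4-ubuntu18.04-py37-cpu-inference",
)
custom_env = curated.clone("my-env-clone")

# attach the pip packages (joblib is just an example here)
conda_dep = CondaDependencies.create(pip_packages=["joblib", "tensorflow==2.4.0"])
custom_env.python.conda_dependencies = conda_dep

# keep user_managed_dependencies at False so AzureML installs the packages
custom_env.python.user_managed_dependencies = False

custom_env.register(workspace=ws)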

What finally worked:

  • notebook.ipynb
import uuid
from azureml.core import Workspace, Environment, Model
from azureml.core.webservice import AciWebservice
from azureml.core.model import InferenceConfig


version = "test-"+str(uuid.uuid4())[:8]

env = Environment.from_conda_specification(name=version, file_path="conda_dependencies.yml")
inference_config = InferenceConfig(entry_script="score.py", environment=env)

ws = Workspace.from_config()
model_name = "my-model"  # the name the model was registered under
model = Model(ws, model_name)

aci_config = AciWebservice.deploy_configuration(
    cpu_cores=1,
    memory_gb=1,
)

service = Model.deploy(
    workspace=ws,
    name=version,
    models=[model],
    inference_config=inference_config,
    deployment_config=aci_config,
    overwrite=True,
)

service.wait_for_deployment(show_output=True)

  • conda_dependencies.yml
channels:
- conda-forge
dependencies:
- python=3.8
- pip:
  - azureml-defaults
  - azureml-sdk
  - numpy
  - tensorflow==2.7.0


  • score.py
import os
import json
import numpy as np
import tensorflow as tf


def init():
    global model

    # AZUREML_MODEL_DIR points to the root of the registered model's files
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "model/data/model")
    model = tf.keras.models.load_model(model_path)


def run(raw_data):
    # expects a JSON payload of the form {"data": [...]}
    data = np.array(json.loads(raw_data)["data"])
    y_hat = model.predict(data)

    return y_hat.tolist()
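Once the deployment succeeds, the service can be smoke-tested from the notebook. A minimal sketch; the payload shape and sample values below are hypothetical and depend on what the model expects:

import json

print(service.state)        # should read "Healthy"
print(service.scoring_uri)  # REST endpoint of the ACI service

# call the service through the SDK; replace the sample with real input data
sample = json.dumps({"data": [[0.0] * 784]})  # hypothetical input shape
print(service.run(input_data=sample))

# if the deployment misbehaves, the container logs usually say why:
# print(service.get_logs())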

