Description
Describe the bug
When deploying a multi-model endpoint (MME) using the SageMaker MXNet container (which uses this library), I receive the following error when I try to invoke the endpoint: "AttributeError: 'NoneType' object has no attribute 'transform'".
To reproduce
Train a simple MXNet model in SageMaker using the example here.
At the end of the notebook, you verify that you can receive real-time predictions from a SageMaker endpoint using the model/code.
Next, add the following code to the notebook to deploy the same model as a multi-model endpoint:
mme = MultiDataModel(name='mxnet-mme-final', model_data_prefix='<enter your own bucket + prefix>',
model=model, sagemaker_session=sagemaker_session)
mme.deploy(initial_instance_count=1, instance_type='ml.t2.medium')
Copy the model artifact to the model_data_prefix so it's accessible by MME.
Once it finishes deploying, try to invoke it using sagemaker_runtime in boto3; you will get the error described above.
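For reference, the invocation in the last step can be sketched roughly as follows. This is a minimal sketch, not the exact code from the notebook: the helper names `build_invoke_request` and `invoke`, the JSON content type, and the payload shape are assumptions for illustration. `TargetModel` is the parameter that `invoke_endpoint` uses to select an artifact under the multi-model prefix.

```python
import json


def build_invoke_request(endpoint_name, target_model, payload):
    """Assemble keyword arguments for the SageMaker Runtime invoke_endpoint call.

    target_model is the artifact path relative to model_data_prefix
    (for example "model.tar.gz"; the actual name is an assumption here).
    """
    return {
        "EndpointName": endpoint_name,
        "TargetModel": target_model,
        "ContentType": "application/json",  # assumed; depends on the inference script
        "Body": json.dumps(payload),
    }


def invoke(endpoint_name, target_model, payload):
    """Send the request to the endpoint and return the raw response body."""
    import boto3  # imported lazily so the helper above has no AWS dependency

    runtime = boto3.client("sagemaker-runtime")
    request = build_invoke_request(endpoint_name, target_model, payload)
    response = runtime.invoke_endpoint(**request)
    return response["Body"].read()
```

Calling `invoke("<your endpoint name>", "<your artifact name>", <input data>)` against the deployed MME is the step that fails with the 503 shown in the logs below.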
Expected behavior
I'd expect to be able to deploy the same model to an MME just as I can to a single-model endpoint using the SageMaker MXNet container. The SageMaker documentation says that the MXNet container supports MME.
CloudWatch Endpoint Logs at time of failure
2021-02-11 19:25:28,443 [INFO ] pool-1-thread-1 ACCESS_LOG - /10.32.0.2:49282 "GET /ping HTTP/1.1" 200 1
2021-02-11 19:25:29,687 [WARN ] W-9000-29028cdaefc80ac4662997191 com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-9000-29028cdaefc80ac4662997191
2021-02-11 19:25:29,807 [INFO ] W-9000-29028cdaefc80ac4662997191-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - model_service_worker started with args: --sock-type unix --sock-name /home/model-server/tmp/.mms.sock.9000 --handler sagemaker_mxnet_serving_container.handler_service --model-path /opt/ml/models/29028cdaefc80ac4662997191ec53602/model --model-name 29028cdaefc80ac4662997191ec53602 --preload-model false --tmp-dir /home/model-server/tmp
2021-02-11 19:25:29,809 [INFO ] W-9000-29028cdaefc80ac4662997191-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.mms.sock.9000
2021-02-11 19:25:29,809 [INFO ] W-9000-29028cdaefc80ac4662997191-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - [PID] 69
2021-02-11 19:25:29,809 [INFO ] W-9000-29028cdaefc80ac4662997191-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - MMS worker started.
2021-02-11 19:25:29,809 [INFO ] W-9000-29028cdaefc80ac4662997191-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Python runtime: 3.6.10
2021-02-11 19:25:29,809 [INFO ] epollEventLoopGroup-3-1 com.amazonaws.ml.mms.wlm.ModelManager - Model 29028cdaefc80ac4662997191ec53602 loaded.
2021-02-11 19:25:29,818 [INFO ] W-9000-29028cdaefc80ac4662997191 com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
2021-02-11 19:25:29,833 [INFO ] W-9000-29028cdaefc80ac4662997191-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
2021-02-11 19:25:31,055 [INFO ] W-9000-29028cdaefc80ac4662997191-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model 29028cdaefc80ac4662997191ec53602 loaded io_fd=86fe85fffe5a7fee-00000026-00000002-7cbbb181288e375a-efae222a
2021-02-11 19:25:31,057 [INFO ] W-9000-29028cdaefc80ac4662997191 com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 1195
2021-02-11 19:25:31,058 [INFO ] W-9000-29028cdaefc80ac4662997191 ACCESS_LOG - /10.32.0.2:49282 "POST /models HTTP/1.1" 200 1400
2021-02-11 19:25:31,061 [WARN ] W-9000-29028cdaefc80ac4662997191 com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-29028cdaefc80ac4662997191-1
2021-02-11 19:25:31,106 [INFO ] W-29028cdaefc80ac4662997191-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Invoking custom service failed.
2021-02-11 19:25:31,106 [INFO ] W-9000-29028cdaefc80ac4662997191 com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2
2021-02-11 19:25:31,106 [INFO ] W-29028cdaefc80ac4662997191-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Traceback (most recent call last):
2021-02-11 19:25:31,106 [INFO ] W-29028cdaefc80ac4662997191-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/usr/local/lib/python3.6/site-packages/mms/service.py", line 108, in predict
2021-02-11 19:25:31,106 [INFO ] W-29028cdaefc80ac4662997191-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - ret = self._entry_point(input_batch, self.context)
2021-02-11 19:25:31,106 [INFO ] W-29028cdaefc80ac4662997191-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/usr/local/lib/python3.6/site-packages/sagemaker_inference/default_handler_service.py", line 50, in handle
2021-02-11 19:25:31,106 [INFO ] W-29028cdaefc80ac4662997191-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - return self._service.transform(data, context)
2021-02-11 19:25:31,107 [INFO ] W-29028cdaefc80ac4662997191-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - AttributeError: 'NoneType' object has no attribute 'transform'
2021-02-11 19:25:31,109 [INFO ] W-9000-29028cdaefc80ac4662997191 ACCESS_LOG - /10.32.0.2:49282 "POST /models/29028cdaefc80ac4662997191ec53602/invoke HTTP/1.1" 503 14
System information
A description of your system. Please provide:
- Toolkit version: Unknown (presumably the latest); whichever version is installed in the SageMaker container.
- Framework version: 1.6.0
- Python version: 3.6
- CPU or GPU: CPU
- Custom Docker image (Y/N): N
Additional context
I think I may know the cause of the bug.
Looking into the handler_service.py code, the initialize() method retrieves the appropriate Transformer object and assigns it to self._service. It then calls the parent class's initialize(), but the transformer object is never passed to the parent class. The parent constructor accepts a transformer as a parameter (see here), but it defaults the value of self._service to None, so when handle() is called by MMS, it uses the self._service set by the parent class, resulting in the error shown above.
If you look at the SageMaker containers that successfully deploy to MME, they pass the Transformer() into the default handler constructor.
sagemaker-mxnet-inference-toolkit/src/sagemaker_mxnet_serving_container/handler_service.py
Lines 64 to 79 in d180ab6
    def initialize(self, context):
        """Calls the Transformer method that validates the user module against
        the SageMaker inference contract.
        """
        properties = context.system_properties
        model_dir = properties.get("model_dir")

        # add model_dir/code to python path
        code_dir_path = "{}:".format(model_dir + '/code')
        if PYTHON_PATH_ENV in os.environ:
            os.environ[PYTHON_PATH_ENV] = code_dir_path + os.environ[PYTHON_PATH_ENV]
        else:
            os.environ[PYTHON_PATH_ENV] = code_dir_path

        self._service = self._user_module_transformer(model_dir)
        super(HandlerService, self).initialize(context)
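
To make the suspected failure mode concrete, here is a simplified, self-contained sketch. It does not reproduce the actual toolkit code and does not confirm the root cause: all class names below are hypothetical stand-ins. It only shows how a handler whose `_service` is left at the constructor default of `None` fails in `handle()`, while a handler that passes a transformer to the parent constructor works.

```python
class DefaultHandler:
    """Hypothetical stand-in for the parent handler class."""

    def __init__(self, transformer=None):
        # Without an explicit transformer, _service stays None.
        self._service = transformer

    def handle(self, data, context):
        return self._service.transform(data, context)


class EchoTransformer:
    """Hypothetical transformer that returns its input unchanged."""

    def transform(self, data, context):
        return data


class HandlerWithoutTransformer(DefaultHandler):
    def __init__(self):
        super().__init__()  # transformer defaults to None


class HandlerWithTransformer(DefaultHandler):
    def __init__(self):
        # Pattern described above for the containers that work with MME.
        super().__init__(transformer=EchoTransformer())


def try_handle(handler):
    """Call handle() and return either the result or the error message."""
    try:
        return handler.handle(["payload"], context=None)
    except AttributeError as err:
        return str(err)


print(try_handle(HandlerWithoutTransformer()))
print(try_handle(HandlerWithTransformer()))
```

The first call reports the same kind of AttributeError seen in the endpoint logs; the second returns the payload.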