58 commits
aeaa267
added pytorch DLCs
philschmid Apr 29, 2021
29a7346
added TF container and cleaned PT, moved build_artifacts
philschmid Apr 29, 2021
b4e67ca
moved build artifacts to separate folder
philschmid Apr 29, 2021
5a48fde
removed todos
philschmid Apr 29, 2021
69c36f8
added inference buildspec
philschmid Apr 29, 2021
392d693
added build spec for tf
philschmid Apr 29, 2021
d503b84
[test][huggingface_pytorch] Updated number of tests in smmp test to 5…
mansimane Apr 29, 2021
ec861ef
[tensorflow][build][test] update TF2.3 for pillow to 8.2.0 (#1072)
jeet4320 Apr 29, 2021
faa3383
[tensorflow, pytorch][build][sagemaker] Updated smdataparallel binary…
karan6181 Apr 29, 2021
6022443
update pillow (#1079)
jeet4320 Apr 29, 2021
b5931e9
[tensorflow] Release TF2.3 cuda110 training cpu and gpu (#1078)
jeet4320 Apr 29, 2021
b986d13
[test] Fix smclarify test flakiness (#1082)
saimidu Apr 30, 2021
80c6494
Fix to execute efa tests on mainline (#1083)
jeet4320 Apr 30, 2021
ba5df80
[sagemaker] Fix typo in sagemaker test code (#1085)
jeet4320 May 1, 2021
1d92ebd
Add back efa configs to SM tests (#1086)
jeet4320 May 2, 2021
caa2d9b
Skip temporarily to revert it (#1087)
jeet4320 May 2, 2021
e8b9565
[benchmark][efa][sagemaker] Fix TF2 SM benchmark and add EFA configs …
jeet4320 May 4, 2021
d6f0e97
[pytorch][tensorflow][build][test] TF2.4.1 PT1.8.1 Set RDMAV_FORK_SAF…
jeet4320 May 6, 2021
e2e8994
[test] Add remote override for tests (#1093)
saimidu May 7, 2021
33037e9
[pytorch][tensorflow][build][test] Build OpenMPI without libfabric su…
indhub May 7, 2021
25ad625
[tensorflow][pytorch][release][training] update release for TF2.4.1 a…
jeet4320 May 7, 2021
2cf5475
[release] release tf2.3.2 dlc images (#1100)
junpuf May 11, 2021
c1404e6
[release] tf 2.3.2 release wave 2 (#1102)
junpuf May 12, 2021
8469a38
[test][benchmark][sagemaker][tensorflow,mxnet] Fix log file names (#1…
saimidu May 12, 2021
7041cf1
[tensorflow, pytorch][build] Update TF 2.4 and PT 1.8 DLC to use smde…
ndodda-amazon May 12, 2021
1b38f43
[test][sagemaker][huggingface] Deriving version for transformers SMDP …
mansimane May 13, 2021
52ad9fa
[test][pytorch][ec2] Fix flakiness in NCCL Version test (#1107)
saimidu May 14, 2021
10c4632
[tensorflow, pytorch][build] Update TF 2.3 and PT 1.7 DLC to use smde…
ndodda-amazon May 14, 2021
15e3fc8
[build] patch openssl (#1106)
junpuf May 17, 2021
f9ee2fa
[build][pytorch][neuron] Upgrade Pillow version in PT Neuron DLC (#1094)
saimidu May 18, 2021
88dab73
[test] upgrade dependency check version (#1112)
tejaschumbalkar May 21, 2021
fe4864d
[huggingface_tensorflow, huggingface_pytorch] update for Transformers…
philschmid May 24, 2021
bb7976d
[pytorch][build] Unpin Pillow and upgrade to Pillow 8.x on PT 1.7 doc…
saimidu May 25, 2021
3b5a624
[release] Change transformer versions in HF TF2.4.1 and PT1.7.1 (#1121)
jeet4320 May 26, 2021
6dda696
[mxnet18][neuron] - Add neuron mxnet18 based dlc image (#1105)
aws-vrnatham May 28, 2021
13dd1e5
[pytorch][neuron] support pt1.7 and also use u18 (#1101)
aws-vrnatham May 28, 2021
45a7c29
update TS to 0.4.0 for inference PT1.8.1 (#1124)
lxning Jun 1, 2021
3355daf
[release] Release PT 1.7.1 for Neuron UL18 (#1131)
jeet4320 Jun 1, 2021
22afaf1
added beta version of toolkit
philschmid Jun 2, 2021
4b34bdf
[release] Add PT 1.6 GPU cu110 and PT 1.8 (#1136)
saimidu Jun 3, 2021
0afc92c
[test][sagemaker][pytorch] Run smddp_smdmp test on correct region (#1…
saimidu Jun 3, 2021
a0c30df
[test][sagemaker][pytorch] Add us-east-2 to no_p3 regions (#1140)
saimidu Jun 3, 2021
da71db7
[huggingface_pytorch] Safety check on PT 1.6 (#1133)
saimidu Jun 4, 2021
9d40d45
[test][ec2][pytorch] Run NCCL version test only on PT 1.7 or above DL…
saimidu Jun 4, 2021
cd17e8b
[test][sagemaker][huggingface] Add placeholder tests for inference (#…
saimidu Jun 4, 2021
7097ce8
[test][sagemaker] Add ap-northeast-2 to NO_PR_REGIONS (#1142)
saimidu Jun 7, 2021
48c0e36
[release] Add PT 1.6 HF 4.6.1 to release images (#1143)
saimidu Jun 7, 2021
9da07a4
[test][sagemaker] Add ap-northeast-1 to NO_P3_REGIONS (#1144)
saimidu Jun 8, 2021
90dfb6f
added pytorch DLCs
philschmid Apr 29, 2021
71a1d46
added TF container and cleaned PT, moved build_artifacts
philschmid Apr 29, 2021
a5cd5dd
moved build artifacts to separate folder
philschmid Apr 29, 2021
9bde308
removed todos
philschmid Apr 29, 2021
c1bfe80
added inference buildspec
philschmid Apr 29, 2021
ec6c250
added build spec for tf
philschmid Apr 29, 2021
03e2dc7
added beta version of toolkit
philschmid Jun 2, 2021
bbdb15d
rebasing
philschmid Jun 8, 2021
a7e8eee
local_serving test pytorch
philschmid Jun 8, 2021
dcd0e21
cpu and gpu tests
philschmid Jun 8, 2021
3 changes: 3 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,3 @@
{
"python.pythonPath": "/Users/philipp/anaconda3/envs/sm/bin/python"
}
5 changes: 5 additions & 0 deletions huggingface/build_artifacts/inference/config.properties
@@ -0,0 +1,5 @@
vmargs=-XX:+UseContainerSupport -XX:InitialRAMPercentage=8.0 -XX:MaxRAMPercentage=10.0 -XX:-UseLargePages -XX:+UseG1GC -XX:+ExitOnOutOfMemoryError
Contributor:
-XX:-UseContainerSupport? Did we test this configuration?

Contributor:
Could we address this comment?

Contributor Author:
Yes, I have done all the current testing with this configuration so far.

Contributor Author:
Can you add the suggestion?

Contributor:
Replace the option -XX:+UseContainerSupport above with -XX:-UseContainerSupport (the + changes to -).
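
Applied to the vmargs line above, the suggested change would read as follows (a sketch of the proposal only; the merged file may differ):

vmargs=-XX:-UseContainerSupport -XX:InitialRAMPercentage=8.0 -XX:MaxRAMPercentage=10.0 -XX:-UseLargePages -XX:+UseG1GC -XX:+ExitOnOutOfMemoryError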

model_store=/opt/ml/model
load_models=ALL
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
28 changes: 28 additions & 0 deletions huggingface/build_artifacts/inference/mms-entrypoint.py
@@ -0,0 +1,28 @@
# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.
from __future__ import absolute_import

import shlex
import subprocess
import sys


if sys.argv[1] == "serve":
from sagemaker_huggingface_inference_toolkit import serving

serving.main()
else:
subprocess.check_call(shlex.split(" ".join(sys.argv[1:])))

# prevent docker exit
subprocess.call(["tail", "-f", "/dev/null"])
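
A sketch of how this entrypoint behaves at runtime; the image tag is hypothetical, and the ports follow the inference_address and management_address values in config.properties above:

# "serve" hands control to the Hugging Face inference toolkit, which starts multi-model-server on ports 8080/8081
docker run -p 8080:8080 -p 8081:8081 hf-pytorch-inference:cpu serve

# any other argument is executed as a shell command, after which tail -f /dev/null keeps the container alive
docker run hf-pytorch-inference:cpu python --version

# liveness check against the inference address
curl http://localhost:8080/ping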
4 changes: 2 additions & 2 deletions huggingface/pytorch/buildspec-1-6.yml
@@ -23,8 +23,8 @@ images:
tag_python_version: &TAG_PYTHON_VERSION py36
cuda_version: &CUDA_VERSION cu110
os_version: &OS_VERSION ubuntu18.04
transformers_version: &TRANSFORMERS_VERSION 4.5.0
datasets_version: &DATASETS_VERSION 1.5.0
transformers_version: &TRANSFORMERS_VERSION 4.6.1
datasets_version: &DATASETS_VERSION 1.6.2
tag: !join [ *VERSION, '-', 'transformers', *TRANSFORMERS_VERSION, '-', *DEVICE_TYPE, '-', *TAG_PYTHON_VERSION, '-',
*CUDA_VERSION, '-', *OS_VERSION ]
docker_file: !join [ docker/, *SHORT_VERSION, /, *DOCKER_PYTHON_VERSION, /,
52 changes: 50 additions & 2 deletions huggingface/pytorch/buildspec.yml
@@ -12,6 +12,21 @@ repository_info:
repository_name: &REPOSITORY_NAME !join ["pr", "-", "huggingface", "-", *BASE_FRAMEWORK, "-", *TRAINING_IMAGE_TYPE]
repository: &REPOSITORY !join [ *ACCOUNT_ID, .dkr.ecr., *REGION, .amazonaws.com/,
*REPOSITORY_NAME ]
inference_repository: &INFERENCE_REPOSITORY
image_type: &INFERENCE_IMAGE_TYPE inference
root: !join [ "huggingface/", *BASE_FRAMEWORK, "/", *INFERENCE_IMAGE_TYPE ]
repository_name: &REPOSITORY_NAME !join ["pr", "-", "huggingface", "-", *BASE_FRAMEWORK, "-", *INFERENCE_IMAGE_TYPE]
repository: &REPOSITORY !join [ *ACCOUNT_ID, .dkr.ecr., *REGION, .amazonaws.com/,
*REPOSITORY_NAME ]

context:
inference_context: &INFERENCE_CONTEXT
mms-entrypoint:
source: ../build_artifacts/inference/mms-entrypoint.py
target: mms-entrypoint.py
config:
source: ../build_artifacts/inference/config.properties
target: config.properties

images:
BuildHuggingFacePytorchGpuPy37Cu110TrainingDockerImage:
@@ -23,9 +38,42 @@ images:
tag_python_version: &TAG_PYTHON_VERSION py36
cuda_version: &CUDA_VERSION cu110
os_version: &OS_VERSION ubuntu18.04
transformers_version: &TRANSFORMERS_VERSION 4.5.0
datasets_version: &DATASETS_VERSION 1.5.0
transformers_version: &TRANSFORMERS_VERSION 4.6.1
datasets_version: &DATASETS_VERSION 1.6.2
tag: !join [ *VERSION, '-', 'transformers', *TRANSFORMERS_VERSION, '-', *DEVICE_TYPE, '-', *TAG_PYTHON_VERSION, '-',
*CUDA_VERSION, '-', *OS_VERSION ]
docker_file: !join [ docker/, *SHORT_VERSION, /, *DOCKER_PYTHON_VERSION, /,
*CUDA_VERSION, /Dockerfile., *DEVICE_TYPE ]
BuildHuggingFacePytorchCPUPTInferencePy3DockerImage:
<<: *INFERENCE_REPOSITORY
build: &HUGGINGFACE_PYTORCH_CPU_INFERENCE_PY3 false
image_size_baseline: 4899
device_type: &DEVICE_TYPE cpu
python_version: &DOCKER_PYTHON_VERSION py3
tag_python_version: &TAG_PYTHON_VERSION py36
os_version: &OS_VERSION ubuntu18.04
transformers_version: &TRANSFORMERS_VERSION 4.6.0
inference_toolkit_version: &INFERENCE_TOOLKIT_VERSION 0.0.1.dev0
tag: !join [ *VERSION, '-', 'transformers', *TRANSFORMERS_VERSION, '-', *DEVICE_TYPE, '-', *TAG_PYTHON_VERSION, '-',
*CUDA_VERSION, '-', *OS_VERSION ]
docker_file: !join [ docker/, *SHORT_VERSION, /, *DOCKER_PYTHON_VERSION, /,
*CUDA_VERSION, /Dockerfile., *DEVICE_TYPE ]
context:
<<: *INFERENCE_CONTEXT
BuildHuggingFacePytorchGpuPy37Cu110InferenceDockerImage:
<<: *INFERENCE_REPOSITORY
build: &HUGGINGFACE_PYTORCH_GPU_INFERENCE_PY3 false
image_size_baseline: 14000
device_type: &DEVICE_TYPE gpu
python_version: &DOCKER_PYTHON_VERSION py3
tag_python_version: &TAG_PYTHON_VERSION py36
cuda_version: &CUDA_VERSION cu111
os_version: &OS_VERSION ubuntu18.04
transformers_version: &TRANSFORMERS_VERSION 4.6.0
inference_toolkit_version: &INFERENCE_TOOLKIT_VERSION 0.0.1.dev0
tag: !join [ *VERSION, '-', 'transformers', *TRANSFORMERS_VERSION, '-', *DEVICE_TYPE, '-', *TAG_PYTHON_VERSION, '-',
*CUDA_VERSION, '-', *OS_VERSION ]
docker_file: !join [ docker/, *SHORT_VERSION, /, *DOCKER_PYTHON_VERSION, /,
*CUDA_VERSION, /Dockerfile., *DEVICE_TYPE ]
context:
<<: *INFERENCE_CONTEXT
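
The buildspec leans on YAML anchors plus a custom !join tag that concatenates the items of a sequence into a single string, which is how the image tags and Dockerfile paths above are assembled. A minimal PyYAML sketch of such a constructor, assuming the repository's actual loader behaves equivalently:

import yaml

def join_constructor(loader, node):
    # Concatenate every item of the !join sequence into one string
    seq = loader.construct_sequence(node)
    return "".join(str(item) for item in seq)

yaml.SafeLoader.add_constructor("!join", join_constructor)

doc = """
version: &VERSION 1.7.1
device_type: &DEVICE_TYPE gpu
tag: !join [ *VERSION, '-', 'transformers', '-', *DEVICE_TYPE ]
"""
print(yaml.safe_load(doc)["tag"])  # -> 1.7.1-transformers-gpu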
134 changes: 134 additions & 0 deletions huggingface/pytorch/inference/docker/1.7/py3/Dockerfile.cpu
@@ -0,0 +1,134 @@
FROM ubuntu:18.04

LABEL maintainer="Amazon AI"
LABEL dlc_major_version="1"

# Specify accept-bind-to-port LABEL for inference pipelines to use SAGEMAKER_BIND_TO_PORT
# https://docs.aws.amazon.com/sagemaker/latest/dg/inference-pipeline-real-time.html
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
# Specify multi-models LABEL to indicate container is capable of loading and serving multiple models concurrently
# https://docs.aws.amazon.com/sagemaker/latest/dg/build-multi-model-build-container.html
LABEL com.amazonaws.sagemaker.capabilities.multi-models=true

ARG MMS_VERSION=1.1.2
ARG PYTHON=python3
ARG PYTHON_VERSION=3.6.13
ARG OPEN_MPI_VERSION=4.0.1
# HF ARGS
ARG PT_INFERENCE_URL=https://aws-pytorch-binaries.s3-us-west-2.amazonaws.com/r1.7.1_inference/20210112-183245/c1130f2829b03c0997b9813211a7c0f600fc569a/cpu/torch-1.7.1-cp36-cp36m-manylinux1_x86_64.whl
ARG TRANSFORMERS_VERSION
ARG HF_INFERENCE_TOOLKIT_VERSION


ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
LD_LIBRARY_PATH="/opt/conda/lib/:${LD_LIBRARY_PATH}:/usr/local/lib" \
PYTHONIOENCODING=UTF-8 \
LANG=C.UTF-8 \
LC_ALL=C.UTF-8 \
TEMP=/home/model-server/tmp \
DEBIAN_FRONTEND=noninteractive

ENV PATH /opt/conda/bin:$PATH

RUN apt-get update \
&& apt-get install -y --no-install-recommends \
ca-certificates \
build-essential \
openssl \
openjdk-8-jdk-headless \
vim \
wget \
curl \
unzip \
git \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*

RUN curl -L -o ~/miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh \
&& chmod +x ~/miniconda.sh \
&& ~/miniconda.sh -b -p /opt/conda \
&& rm ~/miniconda.sh \
&& /opt/conda/bin/conda update conda \
&& /opt/conda/bin/conda install -c conda-forge \
python=$PYTHON_VERSION \
&& /opt/conda/bin/conda install -y \
# conda 4.10.0 requires ruamel_yaml to be installed. Currently pinned at latest.
ruamel_yaml==0.15.100 \
cython==0.29.12 \
mkl-include==2019.4 \
mkl==2019.4 \
botocore \
&& /opt/conda/bin/conda clean -ya

RUN pip install --upgrade pip --trusted-host pypi.org --trusted-host files.pythonhosted.org \
&& ln -s /opt/conda/bin/pip /usr/local/bin/pip3 \
&& pip install packaging==20.4 \
enum-compat==0.0.3 \
"cryptography>3.2"

RUN wget https://www.open-mpi.org/software/ompi/v4.0/downloads/openmpi-$OPEN_MPI_VERSION.tar.gz \
&& gunzip -c openmpi-$OPEN_MPI_VERSION.tar.gz | tar xf - \
&& cd openmpi-$OPEN_MPI_VERSION \
&& ./configure --prefix=/home/.openmpi \
&& make all install \
&& cd .. \
&& rm openmpi-$OPEN_MPI_VERSION.tar.gz \
&& rm -rf openmpi-$OPEN_MPI_VERSION

# The ENV variables declared below are changed in the previous section
# Grouping these ENV variables in the first section causes
# ompi_info to fail. This is only observed in CPU containers
ENV PATH="$PATH:/home/.openmpi/bin"
ENV LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/home/.openmpi/lib/"
RUN ompi_info --parsable --all | grep mpi_built_with_cuda_support:value

WORKDIR /

RUN pip install --no-cache-dir \
multi-model-server==$MMS_VERSION \
sagemaker-inference

RUN useradd -m model-server \
&& mkdir -p /home/model-server/tmp \
&& chown -R model-server /home/model-server

COPY mms-entrypoint.py /usr/local/bin/dockerd-entrypoint.py
COPY config.properties /etc/sagemaker-mms.properties

RUN chmod +x /usr/local/bin/dockerd-entrypoint.py

ADD https://raw.githubusercontent.com/aws/deep-learning-containers/master/src/deep_learning_container.py /usr/local/bin/deep_learning_container.py

RUN chmod +x /usr/local/bin/deep_learning_container.py

RUN HOME_DIR=/root \
&& curl -o ${HOME_DIR}/oss_compliance.zip https://aws-dlinfra-utilities.s3.amazonaws.com/oss_compliance.zip \
&& unzip ${HOME_DIR}/oss_compliance.zip -d ${HOME_DIR}/ \
&& cp ${HOME_DIR}/oss_compliance/test/testOSSCompliance /usr/local/bin/testOSSCompliance \
&& chmod +x /usr/local/bin/testOSSCompliance \
&& chmod +x ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh \
&& ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh ${HOME_DIR} ${PYTHON} \
&& rm -rf ${HOME_DIR}/oss_compliance*


#################################
# Hugging Face specific section #
#################################


RUN curl https://aws-dlc-licenses.s3.amazonaws.com/pytorch-1.7/license.txt -o /license.txt

# Uninstall the default torch and install the AWS-built PyTorch inference wheel from PT_INFERENCE_URL
RUN pip uninstall -y torch \
&& pip install --no-cache-dir -U $PT_INFERENCE_URL

# Install the Hugging Face libraries and their dependencies
RUN pip install --no-cache-dir \
transformers[sentencepiece]==${TRANSFORMERS_VERSION} \
protobuf==3.12.0 \
sagemaker-huggingface-inference-toolkit==${HF_INFERENCE_TOOLKIT_VERSION}

EXPOSE 8080 8081
ENTRYPOINT ["python", "/usr/local/bin/dockerd-entrypoint.py"]
CMD ["serve"]
128 changes: 128 additions & 0 deletions huggingface/pytorch/inference/docker/1.7/py3/cu110/Dockerfile.gpu
@@ -0,0 +1,128 @@
FROM nvidia/cuda:11.0-cudnn8-runtime-ubuntu18.04

LABEL maintainer="Amazon AI"
LABEL dlc_major_version="1"

# Specify accept-bind-to-port LABEL for inference pipelines to use SAGEMAKER_BIND_TO_PORT
# https://docs.aws.amazon.com/sagemaker/latest/dg/inference-pipeline-real-time.html
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
# Specify multi-models LABEL to indicate container is capable of loading and serving multiple models concurrently
# https://docs.aws.amazon.com/sagemaker/latest/dg/build-multi-model-build-container.html
LABEL com.amazonaws.sagemaker.capabilities.multi-models=true

ARG MMS_VERSION=1.1.2
ARG PYTHON=python3
ARG PYTHON_VERSION=3.6.13
ARG OPEN_MPI_VERSION=4.0.1
# HF ARGS
ARG PT_INFERENCE_URL=https://aws-pytorch-binaries.s3-us-west-2.amazonaws.com/r1.7.1_inference/20210112-183245/c1130f2829b03c0997b9813211a7c0f600fc569a/gpu/torch-1.7.1-cp36-cp36m-manylinux1_x86_64.whl
ARG TRANSFORMERS_VERSION
ARG HF_INFERENCE_TOOLKIT_VERSION

ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
LD_LIBRARY_PATH="/opt/conda/lib/:${LD_LIBRARY_PATH}:/usr/local/lib" \
PYTHONIOENCODING=UTF-8 \
LANG=C.UTF-8 \
LC_ALL=C.UTF-8 \
TEMP=/home/model-server/tmp \
DEBIAN_FRONTEND=noninteractive

ENV PATH /opt/conda/bin:$PATH

RUN apt-get update \
&& apt-get install -y --no-install-recommends \
ca-certificates \
build-essential \
openssl \
openjdk-8-jdk-headless \
vim \
wget \
curl \
unzip \
git \
libnuma1 \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*

RUN curl -L -o ~/miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh \
&& chmod +x ~/miniconda.sh \
&& ~/miniconda.sh -b -p /opt/conda \
&& rm ~/miniconda.sh \
&& /opt/conda/bin/conda update conda \
&& /opt/conda/bin/conda install -c conda-forge \
python=$PYTHON_VERSION \
&& /opt/conda/bin/conda install -y \
# conda 4.10.0 requires ruamel_yaml to be installed. Currently pinned at latest.
ruamel_yaml==0.15.100 \
cython==0.29.12 \
botocore \
mkl-include==2019.4 \
mkl==2019.4 \
&& /opt/conda/bin/conda clean -ya

RUN pip install --upgrade pip --trusted-host pypi.org --trusted-host files.pythonhosted.org \
&& ln -s /opt/conda/bin/pip /usr/local/bin/pip3 \
&& pip install packaging==20.4 \
enum-compat==0.0.3 \
"cryptography>3.2"

RUN wget https://www.open-mpi.org/software/ompi/v4.0/downloads/openmpi-$OPEN_MPI_VERSION.tar.gz \
&& gunzip -c openmpi-$OPEN_MPI_VERSION.tar.gz | tar xf - \
&& cd openmpi-$OPEN_MPI_VERSION \
&& ./configure --prefix=/home/.openmpi \
&& make all install \
&& cd .. \
&& rm openmpi-$OPEN_MPI_VERSION.tar.gz \
&& rm -rf openmpi-$OPEN_MPI_VERSION

ENV PATH="$PATH:/home/.openmpi/bin"
ENV LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/home/.openmpi/lib/"

WORKDIR /

RUN pip install --no-cache-dir \
multi-model-server==$MMS_VERSION \
sagemaker-inference

RUN useradd -m model-server \
&& mkdir -p /home/model-server/tmp \
&& chown -R model-server /home/model-server

COPY mms-entrypoint.py /usr/local/bin/dockerd-entrypoint.py
COPY config.properties /etc/sagemaker-mms.properties

RUN chmod +x /usr/local/bin/dockerd-entrypoint.py

ADD https://raw.githubusercontent.com/aws/deep-learning-containers/master/src/deep_learning_container.py /usr/local/bin/deep_learning_container.py

RUN chmod +x /usr/local/bin/deep_learning_container.py

RUN HOME_DIR=/root \
&& curl -o ${HOME_DIR}/oss_compliance.zip https://aws-dlinfra-utilities.s3.amazonaws.com/oss_compliance.zip \
&& unzip ${HOME_DIR}/oss_compliance.zip -d ${HOME_DIR}/ \
&& cp ${HOME_DIR}/oss_compliance/test/testOSSCompliance /usr/local/bin/testOSSCompliance \
&& chmod +x /usr/local/bin/testOSSCompliance \
&& chmod +x ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh \
&& ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh ${HOME_DIR} ${PYTHON} \
&& rm -rf ${HOME_DIR}/oss_compliance*

#################################
# Hugging Face specific section #
#################################

RUN curl https://aws-dlc-licenses.s3.amazonaws.com/pytorch-1.7/license.txt -o /license.txt

# Uninstall the default torch and install the AWS-built PyTorch inference wheel from PT_INFERENCE_URL
RUN pip uninstall -y torch \
&& pip install --no-cache-dir -U $PT_INFERENCE_URL

# Install the Hugging Face libraries and their dependencies
RUN pip install --no-cache-dir \
transformers[sentencepiece]==${TRANSFORMERS_VERSION} \
protobuf==3.12.0 \
sagemaker-huggingface-inference-toolkit==${HF_INFERENCE_TOOLKIT_VERSION}

EXPOSE 8080 8081
ENTRYPOINT ["python", "/usr/local/bin/dockerd-entrypoint.py"]
CMD ["serve"]