
Conversation

Contributor

@vdantu vdantu commented Apr 16, 2021

Issue #, if available:

PR Checklist

  • I've prepended PR tag with frameworks/job this applies to : [mxnet, tensorflow, pytorch] | [ei/neuron] | [build] | [test] | [benchmark] | [ec2, ecs, eks, sagemaker]
  • (If applicable) I've documented below the DLC image/dockerfile this relates to
  • (If applicable) I've documented below the tests I've run on the DLC image
  • (If applicable) I've reviewed the licenses of updated and new binaries and their dependencies to make sure all licenses are on the Apache Software Foundation Third Party License Policy Category A or Category B license list. See https://www.apache.org/legal/resolved.html.
  • (If applicable) I've scanned the updated and new binaries to make sure they do not have vulnerabilities associated with them.

Pytest Marker Checklist

  • (If applicable) I have added the marker @pytest.mark.model("<model-type>") to the new tests which I have added, to specify the Deep Learning model that is used in the test (use "N/A" if the test doesn't use a model)
  • (If applicable) I have added the marker @pytest.mark.integration("<feature-being-tested>") to the new tests which I have added, to specify the feature that will be tested
  • (If applicable) I have added the marker @pytest.mark.multinode(<integer-num-nodes>) to the new tests which I have added, to specify the number of nodes used on a multi-node test
  • (If applicable) I have added the marker @pytest.mark.processor(<"cpu"/"gpu"/"eia"/"neuron">) to the new tests which I have added, if a test is specifically applicable to only one processor type (an illustrative sketch of these markers follows this checklist)
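For illustration, a hypothetical test annotated with these markers could look like the sketch below (the test name, model, node count, and processor are placeholders, not taken from this PR):

import pytest

@pytest.mark.model("bert-base-uncased")            # Deep Learning model used by the test, or "N/A"
@pytest.mark.integration("huggingface_inference")  # feature being tested
@pytest.mark.multinode(2)                          # number of nodes for a multi-node test
@pytest.mark.processor("cpu")                      # only if the test targets a single processor type
def test_hf_cpu_inference():
    # Placeholder body; a real test would exercise the DLC image under test.
    assert True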

EIA/NEURON Checklist

  • When creating a PR:
  • I've modified src/config/build_config.py in my PR branch by setting ENABLE_EI_MODE = True or ENABLE_NEURON_MODE = True
  • When PR is reviewed and ready to be merged:
  • I've reverted the code change on the config file mentioned above

Benchmark Checklist

  • When creating a PR:
  • I've modified src/config/test_config.py in my PR branch by setting ENABLE_BENCHMARK_DEV_MODE = True
  • When PR is reviewed and ready to be merged:
  • I've reverted the code change on the config file mentioned above

Reviewer Checklist

  • For reviewer, before merging, please cross-check:
  • I've verified the code change on the config file mentioned above has already been reverted

Description:

Tests run:

DLC image/dockerfile:

Additional context:

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@vdantu vdantu force-pushed the hugging_face branch 3 times, most recently from 34286af to 30bfb8a on April 19, 2021 05:20
Contributor

@philschmid philschmid left a comment

I added a few comments. I think we can optimize the base container even further, but I have already started testing with it for PyTorch and TensorFlow.

ARG PYTHON=python3
ARG PYTHON_VERSION=3.7.10
ARG PT_INFERENCE_URL=https://aws-pytorch-binaries.s3-us-west-2.amazonaws.com/r1.7.1_inference/20210112-183245/c1130f2829b03c0997b9813211a7c0f600fc569a/cpu/torch-1.7.1-cp36-cp36m-manylinux1_x86_64.whl
ARG PT_TORCHVISION_URL=https://torchvision-build.s3-us-west-2.amazonaws.com/1.7.1/cpu/torchvision-0.8.2-cp36-cp36m-linux_x86_64.whl
Contributor

The same goes for torchvision, which we don't need at all.

Contributor Author

@vdantu vdantu Apr 20, 2021

Nope, we don't need this; we will have the HF container bring in the PT/TF frameworks.

python=$PYTHON_VERSION \
&& /opt/conda/bin/conda install -y \
# conda 4.10.0 requires ruamel_yaml to be installed. Currently pinned at latest.
ruamel_yaml==0.15.100 \
Contributor

Do we need ruamel_yaml for the inference container?

python=$PYTHON_VERSION \
&& /opt/conda/bin/conda install -y \
# conda 4.10.0 requires ruamel_yaml to be installed. Currently pinned at latest.
ruamel_yaml==0.15.100 \
Contributor

Do we need ruamel_yaml for the inference container?

@@ -0,0 +1,5 @@
vmargs=-XX:+UseContainerSupport -XX:InitialRAMPercentage=8.0 -XX:MaxRAMPercentage=10.0 -XX:-UseLargePages -XX:+UseG1GC -XX:+ExitOnOutOfMemoryError
Contributor

I would place this file in build_artifacts instead of artifacts to keep it consistent across all frameworks.
https://github.com/vdantu/deep-learning-containers/tree/hugging_face/pytorch/inference/docker/build_artifacts

Also, can you remove the __init__.py from artifacts?

Contributor Author

Sure, we can name it either artifacts or build_artifacts; the two are used interchangeably.

@@ -0,0 +1,25 @@
# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
Contributor

I would place this file in build_artifacts instead of artifacts to keep it consistent across all frameworks.
https://github.com/vdantu/deep-learning-containers/tree/hugging_face/pytorch/inference/docker/build_artifacts

Contributor Author

There are framework repositories with both "artifacts" and "build-artifacts". We could move it to either; I don't have a preference.

&& mkdir -p /home/model-server/tmp \
&& chown -R model-server /home/model-server

COPY mms-entrypoint.py /usr/local/bin/dockerd-entrypoint.py
Contributor

This should probably be adjusted to the right directory for the build context. Not sure what the docker build context is, but I doubt it is artifacts/.

&& chown -R model-server /home/model-server

COPY mms-entrypoint.py /usr/local/bin/dockerd-entrypoint.py
COPY config.properties /home/model-server
Contributor

This should probably be adjusted to the right directory for the build context. Not sure what the docker build context is, but I doubt it is artifacts/.

Comment on lines 69 to 62
RUN /opt/conda/bin/conda install -c \
conda-forge \
&& /opt/conda/bin/conda install -y \
requests==2.22.0 \
&& /opt/conda/bin/conda clean -ya \
&& pip install --upgrade pip --trusted-host pypi.org --trusted-host files.pythonhosted.org \
&& ln -s /opt/conda/bin/pip /usr/local/bin/pip3 \
&& pip install packaging==20.4 \
enum-compat==0.0.3 \
"cryptography>3.2"
Contributor

Suggested change
RUN /opt/conda/bin/conda install -c \
conda-forge \
&& /opt/conda/bin/conda install -y \
requests==2.22.0 \
&& /opt/conda/bin/conda clean -ya \
&& pip install --upgrade pip --trusted-host pypi.org --trusted-host files.pythonhosted.org \
&& ln -s /opt/conda/bin/pip /usr/local/bin/pip3 \
&& pip install packaging==20.4 \
enum-compat==0.0.3 \
"cryptography>3.2"
RUN pip install --upgrade pip --trusted-host pypi.org --trusted-host files.pythonhosted.org \
&& ln -s /opt/conda/bin/pip /usr/local/bin/pip3 \
&& pip install packaging==20.4 \
enum-compat==0.0.3 \
"cryptography>3.2"


ARG MMS_VERSION=1.1.2
ARG PYTHON=python3
ARG PYTHON_VERSION=3.7.10
Contributor

@philschmid philschmid Apr 19, 2021

Why have you chosen PYTHON_VERSION=3.7.10 when the other PyTorch inference DLCs use 3.6.13 and the AWS-optimized PyTorch only supports py36?

Suggested change
ARG PYTHON_VERSION=3.7.10
ARG PYTHON_VERSION=3.6.13

Contributor Author

Yup. I need to revert this.


ARG MMS_VERSION=1.1.2
ARG PYTHON=python3
ARG PYTHON_VERSION=3.7.10
Contributor

Why have you chosen PYTHON_VERSION=3.7.10 when the other PyTorch inference DLCs use 3.6.13 and the AWS-optimized PyTorch only supports py36?

Suggested change
ARG PYTHON_VERSION=3.7.10
ARG PYTHON_VERSION=3.6.13

openjdk-8-jdk \
openjdk-8-jre \
vim \
wget \
Contributor

Suggested change
wget \
wget \
git \

openjdk-8-jdk \
openjdk-8-jre \
vim \
wget \
Contributor

Suggested change
wget \
wget \
git \

Contributor

philschmid commented Apr 19, 2021

Also, when I compared the dockerfile with the old PyTorch 1.4.0 one, I noticed that you removed, among other packages, mkl-include and mkl, which are still included in the latest PyTorch and TensorFlow versions. What is the reason behind this removal?


EXPOSE 8080 8081
ENTRYPOINT ["python", "/usr/local/bin/dockerd-entrypoint.py"]
CMD ["multi-model-server", "--start", "--mms-config", "/home/model-server/config.properties"]
Contributor

Suggested change
CMD ["multi-model-server", "--start", "--mms-config", "/home/model-server/config.properties"]
CMD ["serve"]

Not sure why you start the model-server like that and not use the dockerd-entrypoint.py with the inference toolkit.
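For context, a minimal sketch of how such a dockerd-entrypoint.py dispatch could look (this is an assumed illustration, not the actual inference toolkit entrypoint; the serving command and config path mirror the Dockerfile quoted above):

import shlex
import subprocess
import sys

if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "serve":
        # Start MMS with the config shipped in the image.
        subprocess.check_call(
            ["multi-model-server", "--start",
             "--mms-config", "/home/model-server/config.properties"]
        )
    else:
        # Otherwise run whatever command was passed to the container.
        subprocess.check_call(shlex.split(" ".join(sys.argv[1:])))

    # Keep the container alive; MMS continues running in the background after --start.
    subprocess.call(["tail", "-f", "/dev/null"])

With an entrypoint along these lines, CMD ["serve"] would be sufficient and SageMaker could still override the command.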


EXPOSE 8080 8081
ENTRYPOINT ["python", "/usr/local/bin/dockerd-entrypoint.py"]
CMD ["multi-model-server", "--start", "--mms-config", "/home/model-server/config.properties"]
Contributor

Suggested change
CMD ["multi-model-server", "--start", "--mms-config", "/home/model-server/config.properties"]
CMD ["serve"]

Not sure why you start the model-server like that and not use the dockerd-entrypoint.py with the inference toolkit.

&& ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh ${HOME_DIR} ${PYTHON} \
&& rm -rf ${HOME_DIR}/oss_compliance*

RUN curl https://aws-dlc-licenses.s3.amazonaws.com/pytorch-1.7/license.txt -o /license.txt
Contributor

Not sure why we should include the license in the base container. I would rather add it to the PyTorch-specific one, and do the same for TensorFlow, since the base container does not include the framework the license is for.

&& ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh ${HOME_DIR} ${PYTHON} \
&& rm -rf ${HOME_DIR}/oss_compliance*

RUN curl https://aws-dlc-licenses.s3.amazonaws.com/pytorch-1.7/license.txt -o /license.txt
Contributor

Not sure why we should include the license in the base container. I would rather add it to the PyTorch-specific one, and do the same for TensorFlow, since the base container does not include the framework the license is for.

@philschmid
Contributor

While testing the DLC with PyTorch, more specifically with the
https://aws-pytorch-binaries.s3-us-west-2.amazonaws.com/r1.7.1_inference/20210112-183245/c1130f2829b03c0997b9813211a7c0f600fc569a/cpu/torch-1.7.1-cp36-cp36m-manylinux1_x86_64.whl build of PyTorch, I get the following error.

  File "/opt/conda/lib/python3.6/site-packages/huggingface_hub/hub_mixin.py", line 15, in <module>
    import torch
  File "/opt/conda/lib/python3.6/site-packages/torch/__init__.py", line 189, in <module>
    _load_global_deps()
  File "/opt/conda/lib/python3.6/site-packages/torch/__init__.py", line 142, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/opt/conda/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libmkl_intel_lp64.so: cannot open shared object file: No such file or directory

When I use the base torch==1.7.1+cpu it works. Can you please look into this?
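One way to narrow this down (an assumed debugging step, not something included in this PR) is to compare how the two wheels were built; with a wheel that imports successfully, torch's build configuration shows which BLAS backend it links against:

import torch

print(torch.__version__)
# Prints compile-time options, including whether MKL / MKL-DNN were enabled;
# a wheel built against MKL needs libmkl_intel_lp64.so available at runtime.
print(torch.__config__.show())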

@vdantu vdantu force-pushed the hugging_face branch 10 times, most recently from 994bc2e to a842576 on April 21, 2021 00:24
@vdantu vdantu force-pushed the hugging_face branch 3 times, most recently from eaa1b29 to 4f02ada on April 21, 2021 01:03
Contributor

@philschmid philschmid left a comment

Thanks for starting the refactoring. Sadly it seems like all the comments got erased and not all of them were resolved. I went through each dockerfile again, line by line, and added comments and thoughts/questions.
Furthermore, I saw your commit message Added py36 and py37 docker files to readily use aws TF and PT wheels.
If we need py37 for TensorFlow and py36 for PyTorch, we could either make PYTHON_VERSION a build arg, so we only need one dockerfile if that is the only difference, or we could build without a base container and add the framework-specific commands to the framework dockerfiles. It makes no sense to have 4 base container files and 4 framework-specific files when the only difference is the Python version; I would rather have only the 4 framework-specific containers.

I also saw that you used force-push. I am not sure whether that is why the comments got erased, but I think it should be avoided.

Comment on lines +64 to +71
RUN wget https://www.open-mpi.org/software/ompi/v4.0/downloads/openmpi-$OPEN_MPI_VERSION.tar.gz \
&& gunzip -c openmpi-$OPEN_MPI_VERSION.tar.gz | tar xf - \
&& cd openmpi-$OPEN_MPI_VERSION \
&& ./configure --prefix=/home/.openmpi \
&& make all install \
&& cd .. \
&& rm openmpi-$OPEN_MPI_VERSION.tar.gz \
&& rm -rf openmpi-$OPEN_MPI_VERSION
Contributor

I am not sure why we should need OPEN_MPI in an inference CPU container. Is it used by some sort of SageMaker or MMS feature?

Contributor Author

Currently the aws-fx wheel depends on this. We can revisit this once (if?) we remove that dependency in the future.

ARG MMS_VERSION=1.1.2
ARG PYTHON=python3
ARG PYTHON_VERSION=3.6.13
ARG OPEN_MPI_VERSION=4.0.1
Contributor

Why did you add OPEN_MPI_VERSION compared to the initial one?

Contributor Author

This is to get our aws-pytorch wheel working.

openssl \
openjdk-8-jdk-headless \
vim \
wget \
Contributor

Suggested change
wget \
wget \
git \

Contributor Author

Do we need git for building/installing the HF inference toolkit? If yes, we should add it on top of the base container spec. We could remove it later once we publish the toolkits.

vim \
wget \
curl \
unzip \
Contributor

Compared to the initial dockerfile you have removed several packages.

    zlib1g-dev \
    libreadline-gplv2-dev \
    libncursesw5-dev \
    libssl-dev \
    libsqlite3-dev \
    libgdbm-dev \
    libc6-dev \
    libbz2-dev \
    tk-dev \
    libffi-dev \

Are they not needed?

Contributor Author

Yes. Functionally these don't seem to be required. Will revisit some (like libffi) later. We technically shouldn't need any dev packages in prod containers.

python=$PYTHON_VERSION \
&& /opt/conda/bin/conda install -y \
# conda 4.10.0 requires ruamel_yaml to be installed. Currently pinned at latest.
ruamel_yaml==0.15.100 \
Contributor

Do we need ruamel_yaml for the inference container?

Contributor Author

The comment suggested that we need this for conda. Considering we are on 4.10.1, we might not need it anymore, but we need to verify that before removing it.

# conda 4.10.0 requires ruamel_yaml to be installed. Currently pinned at latest.
ruamel_yaml==0.15.100 \
cython==0.29.12 \
botocore \
Contributor

First of all, is botocore required? If yes, I would suggest pinning the version. It is also new compared to the initial dockerfile you provided.

Comment on lines +65 to +72
RUN wget https://www.open-mpi.org/software/ompi/v4.0/downloads/openmpi-$OPEN_MPI_VERSION.tar.gz \
&& gunzip -c openmpi-$OPEN_MPI_VERSION.tar.gz | tar xf - \
&& cd openmpi-$OPEN_MPI_VERSION \
&& ./configure --prefix=/home/.openmpi \
&& make all install \
&& cd .. \
&& rm openmpi-$OPEN_MPI_VERSION.tar.gz \
&& rm -rf openmpi-$OPEN_MPI_VERSION
Contributor

I am not sure why we should need OPEN_MPI in an inference CPU container. Is it used by some sort of SageMaker or MMS feature?

Comment on lines +74 to +75
ENV PATH="$PATH:/home/.openmpi/bin"
ENV LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/home/.openmpi/lib/"
Contributor

Related to the OPEN_MPI comment above.

Comment on lines +87 to +88
COPY mms-entrypoint.py /usr/local/bin/dockerd-entrypoint.py
COPY config.properties /home/model-server
Contributor

This should probably be adjusted to the right directory for the build context. Not sure what the docker build context is, but I doubt it is artifacts/.


EXPOSE 8080 8081
ENTRYPOINT ["python", "/usr/local/bin/dockerd-entrypoint.py"]
CMD ["multi-model-server", "--start", "--mms-config", "/home/model-server/config.properties"]
Contributor

Not sure why you start the model-server like that and not use the dockerd-entrypoint.py with the inference toolkit.

Suggested change
CMD ["multi-model-server", "--start", "--mms-config", "/home/model-server/config.properties"]
CMD ["serve"]

Contributor Author

@vdantu vdantu left a comment

I see some duplicate comments; was that intentional?

@@ -0,0 +1,5 @@
vmargs=-XX:+UseContainerSupport -XX:InitialRAMPercentage=8.0 -XX:MaxRAMPercentage=10.0 -XX:-UseLargePages -XX:+UseG1GC -XX:+ExitOnOutOfMemoryError
Contributor Author

Sure, we can name it either artifacts or build_artifacts; the two are used interchangeably.

ARG MMS_VERSION=1.1.2
ARG PYTHON=python3
ARG PYTHON_VERSION=3.6.13
ARG OPEN_MPI_VERSION=4.0.1
Contributor Author

This is to get our aws-pytorch wheel working.

openssl \
openjdk-8-jdk-headless \
vim \
wget \
Contributor Author

Do we need git for building/installing the HF inference toolkit? If yes, we should add it on top of the base container spec. We could remove it later once we publish the toolkits.

vim \
wget \
curl \
unzip \
Contributor Author

Yes. Functionally these don't seem to be required. Will revisit some (like libffi) later. We technically shouldn't need any dev packages in prod containers.

wget \
curl \
unzip \
&& apt-get clean \
Contributor Author

Yup. This was required to get the GPU version of the aws-pytorch wheel running on GPU containers. Will revisit whether we can remove it.

Comment on lines +64 to +71
RUN wget https://www.open-mpi.org/software/ompi/v4.0/downloads/openmpi-$OPEN_MPI_VERSION.tar.gz \
&& gunzip -c openmpi-$OPEN_MPI_VERSION.tar.gz | tar xf - \
&& cd openmpi-$OPEN_MPI_VERSION \
&& ./configure --prefix=/home/.openmpi \
&& make all install \
&& cd .. \
&& rm openmpi-$OPEN_MPI_VERSION.tar.gz \
&& rm -rf openmpi-$OPEN_MPI_VERSION
Contributor Author

Currently the aws-fx wheel depends on this. We can revisit this once (if?) we remove that dependency in the future.

Comment on lines +90 to +91
COPY mms-entrypoint.py /usr/local/bin/dockerd-entrypoint.py
COPY config.properties /home/model-server
Contributor Author

To build the inference containers, we need to add logic to buildspec.yml. That will tell the container building pipelines how to pull in the required artifacts and Dockerfiles and build the container.


EXPOSE 8080 8081
ENTRYPOINT ["python", "/usr/local/bin/dockerd-entrypoint.py"]
CMD ["multi-model-server", "--start", "--mms-config", "/home/model-server/config.properties"]
Contributor Author

We will have an ENTRYPOINT in the Dockerfile which will be called with serve. Why can't we follow something like this?

# conda 4.10.0 requires ruamel_yaml to be installed. Currently pinned at latest.
ruamel_yaml==0.15.100 \
cython==0.29.12 \
botocore \
Contributor Author

Good point; we can pin this to the current latest version.

@@ -0,0 +1,107 @@
FROM nvidia/cuda:11.0-cudnn8-runtime-ubuntu18.04
Contributor Author

I am currently testing this. The CUDA interoperability is more nuanced than I thought 😄. Will update this if endpoint creation fails.

@vdantu vdantu changed the title from [WIP][HuggingFace Inference] Add new base HuggingFace inference containers to [HuggingFace Inference] Add new base HuggingFace inference containers on Apr 22, 2021
@philschmid
Contributor

Can be closed due to #1077.

@@ -0,0 +1,110 @@
FROM ubuntu:18.04


We should use public ECR; Docker Hub will throttle pull requests.


EXPOSE 8080 8081
ENTRYPOINT ["python", "/usr/local/bin/dockerd-entrypoint.py"]
CMD ["multi-model-server", "--start", "--mms-config", "/home/model-server/config.properties"]


I think we don't need the CMD since these images are for SageMaker only (we don't need to support the DLC use case). SageMaker overrides it anyway: https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html#your-algorithms-inference-code-run-image

@@ -0,0 +1,107 @@
FROM nvidia/cuda:11.0-cudnn8-runtime-ubuntu18.04


Same here, use public ECR.

# conda 4.10.0 requires ruamel_yaml to be installed. Currently pinned at latest.
ruamel_yaml==0.15.100 \
cython==0.29.12 \
botocore \


Why do we need boto?

Comment on lines +65 to +75
RUN wget https://www.open-mpi.org/software/ompi/v4.0/downloads/openmpi-$OPEN_MPI_VERSION.tar.gz \
&& gunzip -c openmpi-$OPEN_MPI_VERSION.tar.gz | tar xf - \
&& cd openmpi-$OPEN_MPI_VERSION \
&& ./configure --prefix=/home/.openmpi \
&& make all install \
&& cd .. \
&& rm openmpi-$OPEN_MPI_VERSION.tar.gz \
&& rm -rf openmpi-$OPEN_MPI_VERSION

ENV PATH="$PATH:/home/.openmpi/bin"
ENV LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/home/.openmpi/lib/"


I think it might be required for the DLC image use case; SageMaker probably doesn't need this.

Contributor

saimidu commented Jun 28, 2021

Closing this PR now. Let's create a new PR if we want to consider this approach again.

@saimidu saimidu closed this Jun 28, 2021