Can't build vLLM from Docker due to AWQ's minimum architecture requirements - TORCH_CUDA_ARCH_LIST does not help #1070

@TheBloke

@casper-hansen @WoosukKwon

I'm trying to build a test vLLM Docker container with the latest vLLM commit.

My Dockerfile contains this:

ARG CUDA_VERSION="11.8.0"
ARG CUDNN_VERSION="8"
ARG UBUNTU_VERSION="22.04"

# Base NVidia CUDA Ubuntu image
FROM nvidia/cuda:$CUDA_VERSION-cudnn$CUDNN_VERSION-devel-ubuntu$UBUNTU_VERSION AS base

ENV PATH="/usr/local/cuda/bin:${PATH}"

ENV TORCH_CUDA_ARCH_LIST="8.0;8.6+PTX;8.9;9.0"
RUN pip3 install torch --index-url https://download.pytorch.org/whl/cu118

RUN git clone https://github.com/vllm-project/vllm && \
    cd vllm && \
    git checkout ff36139ffc66294c19b503c1e52dc42c2cd265f6 && \
    pip3 install -r requirements.txt && \
    pip3 install -e . && \
    pip3 install --no-cache-dir huggingface-hub hf_transfer && \
    pip3 cache purge

This structure worked fine to build vLLM before, and to build other servers/apps that use Torch.

The line ENV TORCH_CUDA_ARCH_LIST="8.0;8.6+PTX;8.9;9.0" is usually enough to ensure that an app can be built in a Docker container even though Docker cannot see any GPU at build time.
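For context, here is a minimal sketch of how an arch list like this is conventionally expanded into nvcc `-gencode` flags, one `sm_XX` target per entry plus an extra PTX target for entries marked `+PTX`. The helper name `arch_list_to_gencode` is hypothetical, and this is an illustration of the convention rather than PyTorch's or vLLM's actual build code; whether a given `setup.py` honors the variable depends on the project.

```python
from typing import List


def arch_list_to_gencode(arch_list: str) -> List[str]:
    """Expand a TORCH_CUDA_ARCH_LIST-style string into nvcc -gencode flags.

    Hypothetical helper for illustration only.
    """
    flags: List[str] = []
    # Entries are separated by ';' (spaces are also commonly accepted).
    for entry in arch_list.replace(" ", ";").split(";"):
        entry = entry.strip()
        if not entry:
            continue
        want_ptx = entry.endswith("+PTX")
        version = entry[: -len("+PTX")] if want_ptx else entry
        num = version.replace(".", "")  # "8.6" -> "86"
        # Native machine code (SASS) for this architecture.
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if want_ptx:
            # Also embed PTX so newer GPUs can JIT-compile the kernels.
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags


flags = arch_list_to_gencode("8.0;8.6+PTX;8.9;9.0")
for flag in flags:
    print(flag)
```

None of the flags produced from this list target `compute_70`, so a build that fully respected the variable would not be expected to compile kernels for sm_70.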

But trying to build vLLM fails with this:

#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 892; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 892; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 900; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 900; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 908; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 908; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 916; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 916; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 924; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 924; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 928; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 932; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 936; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 940; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 944; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 948; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 952; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 956; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 964; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 964; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 972; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 972; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 980; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 980; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 988; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 988; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 996; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 996; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1000; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1004; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1008; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1012; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1016; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1020; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1024; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1028; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1834; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1834; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1842; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1842; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1850; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1850; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1854; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1858; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1862; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1866; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1874; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1874; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1882; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1882; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1890; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1890; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1894; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1898; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1902; error   : Feature '.m16n8k16' requires .target sm_80 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 1906; error   : Feature '.m16n8k16' requires .target sm_80 or higher
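To condense the log above, here is a small sketch that extracts which compute target each failing PTX file was generated for and the minimum `sm_` version each feature requires. The function name `summarize_ptxas_errors` and the regular expression are my own assumptions based on the message format shown here; only three representative lines from the log are used as input.

```python
import re
from typing import Dict, Set, Tuple

# Matches lines such as:
#   ...gemm_kernels.compute_70.ptx, line 892; error   : Feature 'ldmatrix'
#   requires .target sm_75 or higher
PTXAS_ERROR = re.compile(
    r"\.compute_(\d+)\.ptx, line \d+; error\s*: "
    r"(?:Feature|Modifier) '([^']+)' requires \.target sm_(\d+) or higher"
)


def summarize_ptxas_errors(log: str) -> Tuple[Set[int], Dict[str, int]]:
    """Return (targets the PTX was built for, minimum sm per feature)."""
    built_for: Set[int] = set()
    required: Dict[str, int] = {}
    for line in log.splitlines():
        match = PTXAS_ERROR.search(line)
        if match is None:
            continue  # not a ptxas feature/modifier error
        target, feature, minimum = match.groups()
        built_for.add(int(target))
        required[feature] = max(required.get(feature, 0), int(minimum))
    return built_for, required


SAMPLE_LOG = """\
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 892; error   : Feature 'ldmatrix' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 892; error   : Modifier '.m8n8' requires .target sm_75 or higher
#11 509.5       ptxas /tmp/tmpxft_00000226_00000000-11_gemm_kernels.compute_70.ptx, line 928; error   : Feature '.m16n8k16' requires .target sm_80 or higher
"""

built_for, required = summarize_ptxas_errors(SAMPLE_LOG)
print("PTX generated for compute targets:", sorted(built_for))
print("Minimum sm per feature:", required)
print("Highest minimum:", max(required.values()))
```

On these lines, every failure comes from `gemm_kernels` being compiled for `compute_70`, while the features it uses need sm_75 or sm_80.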

Any ideas how I can avoid this error in the Docker build? No GPU is available at build time, but one will be available at runtime. I guess I need some flag, like TORCH_CUDA_ARCH_LIST, that tells the build to target only sm_80 and higher?

Thanks
