25 changes: 25 additions & 0 deletions .buildkite/release-pipeline.yaml
@@ -15,6 +15,21 @@ steps:
    env:
      DOCKER_BUILDKIT: "1"

  - label: "Build arm64 wheel - CUDA 13.0"
    depends_on: ~
    id: build-wheel-arm64-cuda-13-0
    agents:
      queue: arm64_cpu_queue_postmerge
    commands:
      # #NOTE: torch_cuda_arch_list is derived from upstream PyTorch build files here:
      # https://github.com/pytorch/pytorch/blob/main/.ci/aarch64_linux/aarch64_ci_build.sh#L7
      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg VLLM_MAIN_CUDA_VERSION=13.0 --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu22.04 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."
Contributor

critical

Setting CUDA_VERSION=13.0.1 for this arm64 build will likely cause it to fail. The docker/Dockerfile has a hardcoded PyTorch version for CUDA 12.8 (torch==2.8.0.dev20250318+cu128) for arm64 platforms (see docker/Dockerfile lines 344-352). The build process will attempt to find this cu128 package in the cu130 PyTorch index, which will not work. To fix this, the hardcoded PyTorch version in docker/Dockerfile needs to be updated or made dynamic to support CUDA 13.0.
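One way to make the pin dynamic is to derive the PyTorch index suffix from `CUDA_VERSION` instead of hardcoding `cu128`. The following is a minimal sketch, not the actual contents of docker/Dockerfile: the helper name `cuda_index_suffix` is hypothetical, and the nightly index URL pattern is an assumption that should be checked against the index the Dockerfile really uses.

```shell
# Hypothetical helper (not taken from docker/Dockerfile): map a CUDA version
# string to a PyTorch wheel-index suffix, e.g. 13.0.1 -> cu130.
cuda_index_suffix() {
  # Keep only major.minor (13.0.1 -> 13.0), then drop the dot (13.0 -> 130).
  printf 'cu%s\n' "$(printf '%s' "$1" | cut -d. -f1,2 | tr -d '.')"
}

# Default mirrors the --build-arg value used in this pipeline step.
CUDA_VERSION="${CUDA_VERSION:-13.0.1}"
suffix="$(cuda_index_suffix "$CUDA_VERSION")"

# Assumed URL layout; verify it matches the index used for arm64 builds.
echo "--index-url https://download.pytorch.org/whl/nightly/${suffix}"
```

With a suffix computed this way, a `+cu128` torch pin would no longer be requested from a `cu130` index; the torch version itself would still need to be one that is published for CUDA 13.0.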

Contributor

high

The BUILD_BASE_IMAGE is set to nvidia/cuda:13.0.1-devel-ubuntu22.04. This contradicts the project's stated goal of using an older Ubuntu version for builds to maintain broad glibc compatibility, as mentioned in docker/Dockerfile (lines 18-21). Using ubuntu22.04 may limit the portability of the generated wheel. Other arm64 builds in this pipeline use the default ubuntu20.04-based image. If this change is not intentional, consider removing the --build-arg BUILD_BASE_IMAGE to use the default.

      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg VLLM_MAIN_CUDA_VERSION=13.0 --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --tag vllm-ci:build-image --target build --progress plain -f docker/Dockerfile ."
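To make the portability concern concrete, here is a small sketch that maps an Ubuntu-based image tag to the glibc version that release ships (2.31 for 20.04, 2.35 for 22.04). The function name `glibc_for_base_image` is hypothetical and the mapping covers only the two releases discussed in this thread. Binaries built against a newer glibc generally cannot be loaded on hosts with an older one, which is why the base image matters for the wheel.

```shell
# Hypothetical helper: report the glibc version shipped by the Ubuntu release
# named in a base image tag. Only the releases discussed here are covered.
glibc_for_base_image() {
  case "$1" in
    *ubuntu20.04*) echo "2.31" ;;
    *ubuntu22.04*) echo "2.35" ;;
    *) echo "unknown" ;;
  esac
}

# The image set via --build-arg BUILD_BASE_IMAGE in this step.
image="nvidia/cuda:13.0.1-devel-ubuntu22.04"
echo "${image} ships glibc $(glibc_for_base_image "$image")"
```

Whether an ubuntu20.04-based image is available for CUDA 13.0 is not something this sketch checks; that needs to be confirmed before dropping the build arg.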

      - "mkdir artifacts"
      - "docker run --rm -v $(pwd)/artifacts:/artifacts_host vllm-ci:build-image bash -c 'cp -r dist /artifacts_host && chmod -R a+rw /artifacts_host'"
      - "bash .buildkite/scripts/upload-wheels.sh"
    env:
      DOCKER_BUILDKIT: "1"

  # aarch64 build
  - label: "Build arm64 CPU wheel"
    depends_on: ~
@@ -93,6 +108,16 @@ steps:
      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=12.9.1 --build-arg FLASHINFER_AOT_COMPILE=true --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --build-arg INSTALL_KV_CONNECTORS=true --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m) --target vllm-openai --progress plain -f docker/Dockerfile ."
      - "docker push public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)"

  - label: "Build release image (arm64) - CUDA 13.0"
    depends_on: ~
    id: build-release-image-arm64-cuda-13-0
    agents:
      queue: arm64_cpu_queue_postmerge
    commands:
      - "aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws/q9t5s3a7"
      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg BUILD_BASE_IMAGE=nvidia/cuda:13.0.1-devel-ubuntu22.04 --build-arg FLASHINFER_AOT_COMPILE=true --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --build-arg INSTALL_KV_CONNECTORS=true --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cuda13.0 --target vllm-openai --progress plain -f docker/Dockerfile ."
Contributor

critical

As with the wheel build step, setting CUDA_VERSION=13.0.1 for this arm64 build will likely cause a failure. The docker/Dockerfile uses a hardcoded PyTorch version for CUDA 12.8 (torch==2.8.0.dev20250318+cu128) for arm64 platforms (see docker/Dockerfile lines 344-352), which is incompatible with the cu130 index that will be used. The hardcoded version in docker/Dockerfile needs to be updated or made dynamic to support CUDA 13.0.

Contributor

high

The BUILD_BASE_IMAGE is set to an ubuntu22.04-based image, which may reduce the glibc compatibility of the resulting Docker image and the artifacts within. This is inconsistent with the project's documented approach in docker/Dockerfile (lines 18-21) and other arm64 builds in this file. Please consider removing the --build-arg BUILD_BASE_IMAGE argument if using ubuntu22.04 is not a strict requirement for CUDA 13.0.

      - "DOCKER_BUILDKIT=1 docker build --build-arg max_jobs=16 --build-arg USE_SCCACHE=1 --build-arg GIT_REPO_CHECK=1 --build-arg CUDA_VERSION=13.0.1 --build-arg FLASHINFER_AOT_COMPILE=true --build-arg torch_cuda_arch_list='8.7 8.9 9.0 10.0+PTX 12.0' --build-arg INSTALL_KV_CONNECTORS=true --tag public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cuda13.0 --target vllm-openai --progress plain -f docker/Dockerfile ."

      - "docker push public.ecr.aws/q9t5s3a7/vllm-release-repo:$BUILDKITE_COMMIT-$(uname -m)-cuda13.0"

  # Add job to create multi-arch manifest
  - label: "Create multi-arch manifest"
    depends_on: