From d57b48024de89dcd32bb0681728836adea434db0 Mon Sep 17 00:00:00 2001
From: arakowsk-amd <182798202+arakowsk-amd@users.noreply.github.com>
Date: Wed, 5 Feb 2025 20:48:27 -0800
Subject: [PATCH 1/3] Update README.md 20250205_aiter

---
 docs/dev-docker/README.md | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/docs/dev-docker/README.md b/docs/dev-docker/README.md
index 1ce6da2da95d..d12ba2591d9f 100644
--- a/docs/dev-docker/README.md
+++ b/docs/dev-docker/README.md
@@ -12,7 +12,7 @@ The pre-built image includes:
 
 - ROCm™ 6.3.1
 - vLLM 0.6.6
-- PyTorch 2.6dev (nightly)
+- PyTorch 2.7dev (nightly)
 
 ## Pull latest Docker Image
 
@@ -20,6 +20,10 @@ Pull the most recent validated docker image with `docker pull rocm/vllm-dev:main
 
 ## What is New
 
+20250205_aiter:
+- [AITER](https://github.com/ROCm/aiter) support
+- Performance improvement for custom paged attention
+- Reduced memory overhead bug fix
 20250124:
 - Fix accuracy issue with 405B FP8 Triton FA
 - Fixed accuracy issue with TP8
@@ -475,7 +479,7 @@ To reproduce the release docker:
 ```bash
     git clone https://github.com/ROCm/vllm.git
     cd vllm
-    git checkout 8e87b08c2a284c1a20eb3d8e0fbdc84918bf27dc
+    git checkout 9dc3394c9ee4da250be28d7bd08babf098d51081
     docker build -f Dockerfile.rocm -t --build-arg BUILD_HIPBLASLT=1 --build-arg USE_CYTHON=1 .
 ```

From 4da60712376523ba5451bbcfdf7f9a038a0e8f07 Mon Sep 17 00:00:00 2001
From: arakowsk-amd <182798202+arakowsk-amd@users.noreply.github.com>
Date: Wed, 5 Feb 2025 20:52:45 -0800
Subject: [PATCH 2/3] whitespace

---
 docs/dev-docker/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/dev-docker/README.md b/docs/dev-docker/README.md
index d12ba2591d9f..a1a292493b4b 100644
--- a/docs/dev-docker/README.md
+++ b/docs/dev-docker/README.md
@@ -21,7 +21,7 @@ Pull the most recent validated docker image with `docker pull rocm/vllm-dev:main
 ## What is New
 
 20250205_aiter:
-- [AITER](https://github.com/ROCm/aiter) support
+- [AITER](https://github.com/ROCm/aiter) support
 - Performance improvement for custom paged attention
 - Reduced memory overhead bug fix
 20250124:

From 796a79360859fd37963d2b8b75d6cb515dc4fe7d Mon Sep 17 00:00:00 2001
From: arakowsk-amd <182798202+arakowsk-amd@users.noreply.github.com>
Date: Wed, 5 Feb 2025 21:31:18 -0800
Subject: [PATCH 3/3] adding VLLM_USE_AITER=0 advice

---
 docs/dev-docker/README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/dev-docker/README.md b/docs/dev-docker/README.md
index a1a292493b4b..d54efc1d557b 100644
--- a/docs/dev-docker/README.md
+++ b/docs/dev-docker/README.md
@@ -481,6 +481,7 @@ To reproduce the release docker:
     cd vllm
     git checkout 9dc3394c9ee4da250be28d7bd08babf098d51081
     docker build -f Dockerfile.rocm -t --build-arg BUILD_HIPBLASLT=1 --build-arg USE_CYTHON=1 .
+    export VLLM_USE_AITER=0
 ```
 
 ### AITER