@tjtanaa tjtanaa commented Feb 24, 2025

Please direct your PRs to the upstream vllm (https://github.com/vllm-project/vllm.git)

Accepting PRs into the ROCm fork (https://github.com/ROCm/vllm) will require a clear, previously communicated exception.

maleksan85 and others added 30 commits February 5, 2025 03:58
…lling (vllm-project#12713)

Signed-off-by: Aleksandr Malyshev <[email protected]>
Co-authored-by: Aleksandr Malyshev <[email protected]>
Merged via CLI script
WoosukKwon and others added 25 commits February 16, 2025 10:02
Signed-off-by: Tyler Michael Smith <[email protected]>
Co-authored-by: Yu Chin Fabian Lim <[email protected]>
* Enabling ROCm CI on MI250 machines:
- correct build target
- correct queue

Signed-off-by: Alexei V. Ivanov <[email protected]>

---------

Signed-off-by: Alexei V. Ivanov <[email protected]>
* Optimization for quantized gemm skinny sizes

* lint fix

* Add support for bf16/fp16

* code cleanup

* code cleanup

* lint fix2

* cleanup

* Moved the logic into tuned gemm to preserve API compatibility

---------

Co-authored-by: Gregory Shtrasberg <[email protected]>
* Removing gfx940 and gfx941 targets. These have been deprecated in favor of gfx942 for MI300X

Signed-off-by: Gregory Shtrasberg <[email protected]>

* Remove from custom kernels as well

---------

Signed-off-by: Gregory Shtrasberg <[email protected]>
* Advance torch commit to be past pytorch/pytorch#144942 to fix tunable ops

* Make sure to use the submodule commit compatible with the main aiter commit
@tjtanaa tjtanaa marked this pull request as ready for review February 24, 2025 10:04
@hongxiayang hongxiayang merged commit d7fefdf into ROCm:llama_fp8_12062024 Feb 25, 2025
1 of 2 checks passed
@vllmellm vllmellm deleted the merge-main-to-llama-fp8 branch March 12, 2025 04:17