[ROCm] [AITER] [Bugfix] Patch for AITER commit 648764942e552a8bb5fe16026703716a81f05374
#18990
This PR is a bugfix that follows PR #18596. Merge this ONLY after PR #18596 has been merged.
This PR also upgrades the AITER commit used in the Dockerfile.
PR #18596 introduced the use of AITER MHA, which depends on a new AITER commit, `648764942e552a8bb5fe16026703716a81f05374`.
AITER commit ROCm/aiter@a02a93d introduced a new enum value in a breaking manner.
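As a rough illustration of how adding an enum value can be breaking, the sketch below assumes the new member was inserted ahead of existing ones, which shifts their integer values. This is only one possible mechanism and the PR does not state that it is what happened in AITER. `ModeOld`, `ModeNew`, and their members are hypothetical names, not AITER's actual definitions.

```python
from enum import IntEnum


# Hypothetical enum as it might look before the upstream change.
class ModeOld(IntEnum):
    NONE = 0
    PER_TENSOR = 1
    PER_TOKEN = 2


# Hypothetical enum after a new member is inserted in the middle.
# Every member after the insertion point gets a different integer value.
class ModeNew(IntEnum):
    NONE = 0
    PER_BLOCK = 1  # newly added member (assumed position)
    PER_TENSOR = 2
    PER_TOKEN = 3


# A caller that stored the raw integer for PER_TENSOR under the old enum...
stale_value = int(ModeOld.PER_TENSOR)

# ...now silently resolves to a different member under the new enum.
misread = ModeNew(stale_value)

# Looking the member up by name is unaffected by the renumbering.
by_name = ModeNew["PER_TENSOR"]

print(misread.name, by_name.name)
```

Under this assumption, callers that pass raw integers need to be updated together with the dependency pin, while callers that reference members by name keep working.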
lm_eval after fix
Qwen/Qwen3-235B-A22B-FP8
mistralai/Mixtral-8x7B-Instruct-v0.1
mistralai/Mixtral-8x7B-Instruct-v0.1 dynamic fp8 quantization
deepseek-ai/DeepSeek-V3
Note:
Although AITER commit `648764942e552a8bb5fe16026703716a81f05374` introduced a new input argument, `min_seqlen_q`, to `flash_attn_varlen_func`, its default value appears to be `0`, which retains compatibility with how MHA is used in the ROCm MLA v1 class. Refer to https://github.com/ROCm/aiter/blob/2c1a21adad9c5b5e02619c7dd05d63f9afda3642/aiter/ops/mha.py#L1369
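The compatibility argument in the note can be sketched with a stand-in function. `varlen_attention_stub` is hypothetical and does not reproduce the real `flash_attn_varlen_func` signature; only the parameter name `min_seqlen_q` and its default of `0` are taken from the note. The point is that a call site written before the parameter existed still runs and picks up the default.

```python
def varlen_attention_stub(q, k, v, cu_seqlens_q, cu_seqlens_k, min_seqlen_q=0):
    """Hypothetical stand-in, NOT the real AITER function.

    It only records which min_seqlen_q value the callee ends up seeing.
    """
    return {"num_tokens": len(q), "min_seqlen_q": min_seqlen_q}


# Dummy inputs; a real call would pass tensors.
q = k = v = [0.0, 0.0, 0.0]
cu_seqlens = [0, 3]

# Call site written before the new argument existed: still valid.
legacy = varlen_attention_stub(q, k, v, cu_seqlens, cu_seqlens)

# Call site that opts in to the new argument explicitly.
updated = varlen_attention_stub(q, k, v, cu_seqlens, cu_seqlens, min_seqlen_q=1)

print(legacy["min_seqlen_q"], updated["min_seqlen_q"])
```

Because the new parameter has a default, existing callers such as the MLA v1 path do not need to change unless they want a non-default value.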