forked from vllm-project/vllm
[Model] Add GPT-OSS model code and config #625
Status: Open

ashishtanwer wants to merge 881 commits into ROCm:main from ashishtanwer:gpt_oss
Conversation
* fused_moe config for DSv3 on MI300X updated
* Add tuning script and post-processing script
* Add modification to fp8_utils for tuning
* Update tuning script and add the configs
* Slightly better tunings
* benchmark_moe.py is updated to generate more accurate MoE configs, and a specific MoE config for DSv3 is added
* Bug in sgl_moe_align_block_size() is fixed by Greg
* Generate fp8_w8a8 config for MI300XHF
* Tunings that don't give garbage output
* More accurate tunings
* More accurate tunings and reject inaccurate configs
* Add new tunings
* Rename tuning script and add benchmark script to use for optimizing blockwise quant
* Remove whitespace from file names
* Remove some unnecessary changes
* Don't use spaces in file names
* Remove XHF tunings
* Remove OAM from file names
* yapf
* Update config name
* Remove benchmark_moe.py changes
* Remove is_contiguous
* Use more recent fp8_utils.py

Signed-off-by: Randall Smith <[email protected]>
Co-authored-by: qli88 <[email protected]>
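For context on the tuning commits above: vLLM's fused-MoE configs are JSON files that map a token count (M) to a set of Triton kernel launch parameters, and at runtime the kernel picks the tuned entry closest to the actual batch size. The sketch below illustrates that shape and selection logic; the specific tile values are made-up examples, not the tunings added in this PR.

```python
# Illustrative sketch of a fused-MoE tuning config in the shape produced by
# vLLM's benchmark_moe.py flow: keys are token counts (M) as strings, values
# are Triton kernel launch parameters. The numbers here are placeholders.
example_config = {
    "1": {"BLOCK_SIZE_M": 16, "BLOCK_SIZE_N": 32, "BLOCK_SIZE_K": 128,
          "GROUP_SIZE_M": 1, "num_warps": 4, "num_stages": 2},
    "64": {"BLOCK_SIZE_M": 64, "BLOCK_SIZE_N": 64, "BLOCK_SIZE_K": 128,
           "GROUP_SIZE_M": 8, "num_warps": 8, "num_stages": 2},
}

def pick_config(config: dict, m: int) -> dict:
    """Pick the tuned entry whose token count is closest to m,
    mirroring how the fused-MoE kernel selects a config at runtime."""
    best_key = min(config, key=lambda k: abs(int(k) - m))
    return config[best_key]

# A batch of 48 tokens selects the entry tuned for M=64.
tile = pick_config(example_config, 48)
```

Rejecting inaccurate configs (as several commits above mention) then amounts to dropping tuned entries whose kernel output fails a numerical check against a reference implementation before writing the JSON.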
…ed to each following path for their ownership to apply (ROCm#427)
Signed-off-by: isotr0py <[email protected]>
Upstream merge 25 02 17
Enabling ROCm CI on MI250 machines:
* correct build target
* correct queue

Signed-off-by: Alexei V. Ivanov <[email protected]>
* Optimization for quantized GEMM skinny sizes
* Lint fixes
* Add support for bf16/fp16
* Code cleanup
* Moved the logic into tuned gemm to preserve API compatibility

Co-authored-by: Gregory Shtrasberg <[email protected]>
Co-authored-by: Gregory Shtrasberg <[email protected]>
* Removing gfx940 and gfx941 targets; these have been deprecated in favor of gfx942 for MI300X
* Remove from custom kernels as well

Signed-off-by: Gregory Shtrasberg <[email protected]>
Signed-off-by: Divakar Verma <[email protected]>
* Advance torch commit to be past pytorch/pytorch#144942 to fix tunable ops
* Make sure to use the submodule commit compatible with the main aiter commit
Signed-off-by: Sage Moore <[email protected]>
Upstream merge 25 02 24
* Using an aiter branch that can be built into a whl with PREBUILD_KERNELS=1
* Using fail-fast on the aiter build to see compilation errors in the log, since it fails silently
* Check for build success without installing the whl
* Using proposed fix from ROCm/aiter#115
* Build fix
Upstream merge 2025 06 23
Upstream merge 2025 06 25
Upstream merge 2025 06 30
* Updated README.md for June 24 Docker release
* Added additional throughput results
* Fixed some throughput results
* Minor changes to command-line examples
* README changes and added throughput results (still waiting on latency)
* Added latency results
* Update README.md
* Update test-pipeline.yaml: disable the "Tensorizer Test". The test is seen to generate exceptions while still reporting as successful; that needs to be verified before re-enabling the test in the production environment.
* Fix pre-commit complaints.

Signed-off-by: Alexei V. Ivanov <[email protected]>
…symbol exposure (vllm-project#21647)" This reverts commit 9ba1c88. Signed-off-by: Gregory Shtrasberg <[email protected]>
Upstream merge 2025 07 29
ROCm port of the official vLLM commit de98252 by Woosuk Kwon
Commit range: 1d2c43d to eb9d4de