Conversation

@wangxiyuan (Collaborator) commented Jun 5, 2025

Fix the ascend config check logic:

  1. refactor `check_ascend_config` to make the rules clear (a rough sketch follows the list):
    1. torchair graph mode should not work with `enforce_eager=True`
    2. aclgraph should not work with torchair graph mode
  2. add a `refresh` config for the RLHF case
  3. fix a typo in the model runner
  4. change the `expert_tensor_parallel_size` default to 0 to keep the same behaviour as before
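
A rough sketch of the intended checks (names such as `check_ascend_config` and `get_ascend_config` follow the snippets quoted later in this thread; the actual code and error messages in vllm-ascend may differ):

```python
import vllm.envs as envs


def check_ascend_config(vllm_config, enforce_eager, ascend_config):
    # In vllm-ascend the config is fetched via a global accessor
    # (get_ascend_config()); it is passed in here to keep the sketch
    # self-contained.
    if ascend_config.torchair_graph_config.enabled:
        # Rule 1: torchair graph mode must not run with enforce_eager=True.
        if enforce_eager:
            raise RuntimeError(
                "Can't enable torchair graph mode with enforce_eager=True.")
    elif (envs.VLLM_USE_V1 and vllm_config.model_config is not None
          and not enforce_eager):
        # Rule 2: the elif keeps aclgraph and torchair graph mutually
        # exclusive. aclgraph doesn't work with deepseek models and is
        # only well tested with qwen on the V1 engine.
        model_type = vllm_config.model_config.hf_config.model_type
        if "deepseek" in model_type:
            raise NotImplementedError(
                "ACL graph does not support deepseek; "
                "try torchair graph mode instead.")
```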

@wangxiyuan wangxiyuan force-pushed the fix_ascend_config branch from 372e466 to b9a4e95 Compare June 5, 2025 15:35
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Jun 5, 2025
@Yikun (Collaborator) left a comment

LGTM if CI passed

@wangxiyuan wangxiyuan force-pushed the fix_ascend_config branch 3 times, most recently from 324099c to d1439b7 Compare June 5, 2025 15:51

```diff
  # for V1 Engine, aclgraph doesn't work with deepseek model and only qwen model is well tested.
- if envs.VLLM_USE_V1 and vllm_config.model_config is not None and not enforce_eager:
+ if envs.VLLM_USE_V1 and vllm_config.model_config is not None and not enforce_eager and not ascend_config.torchair_graph_config.enabled:
```
@NeverRaR (Contributor) commented Jun 5, 2025
This check should be moved to `platform.py:check_and_update_config`.

```python
model_type = vllm_config.model_config.hf_config.model_type
if "deepseek" not in model_type:
    raise NotImplementedError(
        "Torchair graph mode only works with deepseek model.")
```

should be:

```python
if ascend_config.torchair_graph_config.enabled:
    if envs.VLLM_MLA_DISABLE:
        logger.warning(
            "Torchair graph mode is still experimental and not supported for V1 without mla currently, "
            "it has been disabled automatically.")
        ascend_config.torchair_graph_config.enabled = False
    elif vllm_config.model_config:
        model_type = vllm_config.model_config.hf_config.model_type
        if "deepseek" not in model_type:
            raise NotImplementedError(
                "Torchair graph mode only works with deepseek model.")
```

That is: change `ascend_config.ascend_scheduler_config.enabled = False` to `ascend_config.torchair_graph_config.enabled = False`, and `if vllm_config.model_config:` to `elif vllm_config.model_config:`.
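
For context on the `platform.py` suggestion above: vLLM exposes a per-platform `check_and_update_config` hook that is invoked once while the engine config is finalized. A minimal sketch of where the model-type check could live (the `NPUPlatform` body here is schematic; vllm-ascend's real class carries much more logic):

```python
from vllm.config import VllmConfig
from vllm.platforms.interface import Platform


class NPUPlatform(Platform):
    """Schematic subclass for illustration only."""

    @classmethod
    def check_and_update_config(cls, vllm_config: VllmConfig) -> None:
        # A single, early place for platform-specific validation.
        additional_config = vllm_config.additional_config or {}
        torchair_enabled = additional_config.get(
            "torchair_graph_config", {}).get("enabled", False)
        if torchair_enabled and vllm_config.model_config is not None:
            model_type = vllm_config.model_config.hf_config.model_type
            if "deepseek" not in model_type:
                raise NotImplementedError(
                    "Torchair graph mode only works with deepseek model.")
```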

@wangxiyuan wangxiyuan force-pushed the fix_ascend_config branch from d1439b7 to 3eefdc9 Compare June 6, 2025 01:04
@wangxiyuan (Collaborator, Author) commented

@NeverRaR Thanks for your review. I've updated the check logic to make it clearer. Please take a look again.

| Name | Type | Default | Description |
|------|------|---------|-------------|
| `torchair_graph_config` | dict | `{}` | The config options for torchair graph mode |
| `ascend_scheduler_config` | dict | `{}` | The config options for the ascend scheduler |
| `expert_tensor_parallel_size` | str | `1` | Expert tensor parallel size for the model to use. |
| `refresh` | bool | `false` | Whether to refresh the global ascend config content. This value is usually used in the RLHF case. |
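
These options are passed to vLLM through the `additional_config` engine argument. A hypothetical usage sketch (the model name is a placeholder):

```python
from vllm import LLM

llm = LLM(
    model="some/deepseek-model",  # placeholder, not a real model id
    additional_config={
        "torchair_graph_config": {"enabled": True},
        # Force re-initialization of the global ascend config; see the
        # RLHF discussion below.
        "refresh": True,
    },
)
```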
A Collaborator commented:

QQ: Does vLLM have a similar config?

@wangxiyuan (Collaborator, Author) commented Jun 6, 2025

No, this value is only used for RLHF. For verl or some other framework, the case is:

  1. verl loads and updates the vllm config
  2. verl starts LLM with an additional config in external_executor mode

In the first step, the ascend config has been initialized; then, in the second step, the additional config would be skipped.

To solve the problem, we let verl pass `refresh`, then we can regenerate the config.
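
A minimal sketch of how such a `refresh` flag could gate re-initialization (the real `init_ascend_config` in vllm-ascend may differ; `AscendConfig` here is a schematic stand-in):

```python
class AscendConfig:
    """Schematic stand-in for vllm-ascend's real AscendConfig."""

    def __init__(self, vllm_config):
        self.additional_config = vllm_config.additional_config or {}


_ASCEND_CONFIG = None  # module-level singleton, created on first init


def init_ascend_config(vllm_config):
    global _ASCEND_CONFIG
    additional_config = vllm_config.additional_config or {}
    refresh = additional_config.get("refresh", False)
    # Step 1 (verl loading/updating the vllm config) already created the
    # singleton; without refresh=True, the additional config passed in
    # step 2 would be silently ignored.
    if _ASCEND_CONFIG is None or refresh:
        _ASCEND_CONFIG = AscendConfig(vllm_config)
    return _ASCEND_CONFIG
```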

@wangxiyuan wangxiyuan force-pushed the fix_ascend_config branch 5 times, most recently from 830e333 to 8cf5adf Compare June 6, 2025 08:47
@wangxiyuan wangxiyuan force-pushed the fix_ascend_config branch from 8cf5adf to 755edcf Compare June 6, 2025 09:08
@wangxiyuan wangxiyuan merged commit dab19d5 into vllm-project:main Jun 6, 2025
23 checks passed
@wangxiyuan wangxiyuan deleted the fix_ascend_config branch June 9, 2025 01:25
venus-taibai pushed a commit to venus-taibai/vllm-ascend that referenced this pull request Jun 18, 2025
…llm-project#1092)

Merge branch wengang/cherry-pick-1029-1092 of [email protected]:Theta/vllm-ascend.git into dev-v0.9.0604
https://code.alipay.com/Theta/vllm-ascend/pull_requests/108

Reviewed-by: 子宏 <[email protected]>


* [Misc] Refactor additional_config (vllm-project#1029)
* [BugFix] Fix ascend config check (vllm-project#1092)
* [Misc] Update benchmark scripts
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Oct 16, 2025
Fix the ascend config check logic:
1. refactor check_ascend_config to make it clear:
    1. torchair graph should not work with enforce_eager=True
    2. aclgraph should not work with torchair graph
2. add refresh config for rlhf case
3. fix a typo in model runner
4. change expert_tensor_parallel_size default to 0 to keep the same as before

Signed-off-by: wangxiyuan <[email protected]>
Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Oct 21, 2025
Fix the ascend config check logic:
1. refactor check_ascend_config to make it clear:
    1. torchair graph should not work with enforce_eager=True
    2. aclgraph should not work with torchair graph
2. add refresh config for rlhf case
3. fix a typo in model runner
4. change expert_tensor_parallel_size default to 0 to keep the same as before

Signed-off-by: wangxiyuan <[email protected]>