Conversation

Collaborator

@ProExpertProg ProExpertProg commented Oct 15, 2025

This is just #26738 but with spawn forced so we can try that in CI as well.

Signed-off-by: Huy Do <[email protected]>
huydhn and others added 11 commits October 14, 2025 18:14
Signed-off-by: ProExpertProg <[email protected]>
Signed-off-by: Luka Govedič <[email protected]>
commit a4ee300
Author: angelayi <[email protected]>
Date:   Tue Oct 14 19:19:25 2025 -0700

    test moving import

    Signed-off-by: angelayi <[email protected]>

commit 0ba846b
Author: angelayi <[email protected]>
Date:   Mon Oct 13 13:36:43 2025 -0700

    [BugFix] Patch inductor partitioning logic

    Signed-off-by: angelayi <[email protected]>

Signed-off-by: ProExpertProg <[email protected]>
commit 6b0c3c3
Author: Boyuan Feng <[email protected]>
Date:   Tue Oct 14 21:30:29 2025 -0700

    nit

    Signed-off-by: Boyuan Feng <[email protected]>

commit 1016467
Author: Boyuan Feng <[email protected]>
Date:   Tue Oct 14 21:21:47 2025 -0700

    fix multi-graph test

    Signed-off-by: Boyuan Feng <[email protected]>

Signed-off-by: ProExpertProg <[email protected]>

mergify bot commented Oct 15, 2025

⚠️ The sha of the head commit of this PR conflicts with #26738. Mergify cannot evaluate rules on this PR. ⚠️

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a large set of changes to support PyTorch 2.9: dependency updates, CI/CD configuration changes, and Dockerfile updates. It also enables Inductor graph partitioning by default on PyTorch 2.9+ and switches the default multiprocessing start method for workers to spawn. Notably, it includes well-documented monkeypatches that work around upstream PyTorch 2.9 issues. Overall, the changes move in the right direction for PyTorch 2.9 support; however, I've found some temporary testing code that should be removed before this PR is merged.
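For context, the "spawn" change mentioned above can be illustrated with a minimal, self-contained sketch (this is an illustration only, not vLLM's actual worker code): spawn starts each worker in a fresh interpreter instead of forking, so children do not inherit parent-process state such as an already-initialized CUDA context.

```python
import multiprocessing as mp

def square(x: int) -> int:
    return x * x

if __name__ == "__main__":
    # "spawn" launches a fresh Python interpreter per worker instead of
    # forking, so children do not inherit parent state such as an
    # already-initialized CUDA/Inductor context.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        results = pool.map(square, [1, 2, 3])
    assert results == [1, 4, 9]
```

The trade-off is that spawned workers re-import their modules from scratch, which is slower to start but avoids the fork-after-CUDA-init hazards that motivate forcing spawn in CI.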

Comment on lines +21 to +23
# TESTING, TO BE REMOVED
VLLM_TEST_USE_PRECOMPILED_NIGHTLY_WHEEL=1 VLLM_USE_PRECOMPILED=1 pip3 install -vvv -e . \
--extra-index-url https://download.pytorch.org/whl/test/cu128
Contributor

Severity: high

This change appears to be for temporary testing, as indicated by the comment # TESTING, TO BE REMOVED. Hardcoding the --extra-index-url and leaving such comments can lead to future issues if not cleaned up. Please remove this temporary code before this pull request is merged.

Suggested change (remove the temporary lines, keep the plain install):
- # TESTING, TO BE REMOVED
- VLLM_TEST_USE_PRECOMPILED_NIGHTLY_WHEEL=1 VLLM_USE_PRECOMPILED=1 pip3 install -vvv -e . \
-     --extra-index-url https://download.pytorch.org/whl/test/cu128
+ VLLM_TEST_USE_PRECOMPILED_NIGHTLY_WHEEL=1 VLLM_USE_PRECOMPILED=1 pip3 install -vvv -e .


GhostCCCatHenry commented Oct 15, 2025 via email

@ProExpertProg ProExpertProg added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 15, 2025

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines 184 to +188
pytest.skip("inductor graph partition is only available in PyTorch 2.9+")

model = "nvidia/Llama-4-Scout-17B-16E-Instruct-FP8"
if current_platform.get_device_capability()[0] < 10:
pytest.skip(f"{model} can only be loaded by B200 or above")


P1: Guard device capability before indexing

The new B200 gate in test_inductor_graph_partition_attn_fusion assumes current_platform.get_device_capability() always returns a tuple, but on CPU builds get_device_capability() returns None. The added check current_platform.get_device_capability()[0] < 10 will raise a TypeError before the test can skip, causing the entire test run to crash on environments without CUDA rather than being skipped. Consider storing the capability in a variable and skipping when it is None or has a major version below 10.
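The suggested guard can be sketched as a small helper (the name `should_skip_for_capability` and the `min_major` default are hypothetical; in the real test the tuple would come from `current_platform.get_device_capability()`):

```python
from typing import Optional, Tuple

def should_skip_for_capability(capability: Optional[Tuple[int, int]],
                               min_major: int = 10) -> bool:
    """Skip when there is no CUDA device (capability is None) or the
    device's major compute capability is below the required minimum."""
    return capability is None or capability[0] < min_major

# CPU build: get_device_capability() returns None -> skip cleanly,
# instead of raising TypeError on None[0]
assert should_skip_for_capability(None)
assert should_skip_for_capability((8, 0))       # e.g. A100 -> skip
assert not should_skip_for_capability((10, 0))  # e.g. B200 -> run
```

In the test itself this would replace the direct `[0]` index with a call like `pytest.skip(...)` whenever the helper returns True.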


@ProExpertProg
Collaborator Author

Seems like the spawn issue is resolved; closing for now.


Labels

ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development


4 participants