@zxd1997066 (Contributor) commented Sep 19, 2025

This PR adds more ported distributed test cases to the torch-xpu-ops CI and enables pytest-xdist for the distributed UT.

With 2 work groups, the distributed UT run time will increase to 1h22min.
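
As a rough illustration of what running the distributed UT under pytest-xdist could look like, the sketch below builds a pytest command line with one xdist worker per work group. The `-n` option is pytest-xdist's worker-count flag; the test path and the helper name `build_pytest_cmd` are hypothetical and are not taken from this PR.

```python
import sys


def build_pytest_cmd(test_path, num_workers):
    """Build a pytest command that spreads tests over xdist workers.

    test_path is a hypothetical placeholder; -n is the pytest-xdist
    option that sets the number of worker processes.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    return [sys.executable, "-m", "pytest", "-n", str(num_workers), test_path]


if __name__ == "__main__":
    # Two workers, matching the "2 work groups" mentioned above.
    print(" ".join(build_pytest_cmd("test/xpu/distributed", 2)))
```

The command is returned as a list so it can be passed to `subprocess.run` without shell quoting issues.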

disable_e2e
disable_ut

@zxd1997066 force-pushed the xiangdong/ported_cases branch 13 times, most recently from 0d9b54f to 85fa6f1 Compare September 25, 2025 14:29

@chuanqi129 (Contributor) left a comment

Please split the test scope into a CI scope and a full nightly scope.


inputs:
  ut_name:
    required: true

Suggested change
- required: true
+ required: false

ze = xpu_list[i+1];
} else {
ze = i;
if [ "${{ inputs.ut_name }}" == "xpu_distributed" ];then

Are there any assumptions here? Can we detect the topology directly and dynamically on the test node?
Please consider the scenarios below:

  • No Xelink group: return failure
  • 1 Xelink group: launch 1 worker
  • 2 Xelink groups: launch 2 workers
  • ...
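
The scenarios above could be sketched as a small helper that maps the detected topology to a worker count. This is only an illustration: `workers_for_topology` and its `xelink_groups` input (a list of groups, each a list of connected device indices) are hypothetical, and how the groups are actually detected on the node is left out.

```python
def workers_for_topology(xelink_groups):
    """Return the number of workers to launch, one per Xelink group.

    xelink_groups is a hypothetical input: a list of groups, each a list
    of device indices that are connected to each other. Detecting the
    groups on the node is outside this sketch.
    """
    groups = [g for g in xelink_groups if g]  # ignore empty groups
    if not groups:
        # No Xelink group: fail instead of silently running one worker.
        raise RuntimeError("no Xelink group detected on this node")
    return len(groups)


if __name__ == "__main__":
    print(workers_for_topology([[0, 1, 2, 3]]))                # prints 1
    print(workers_for_topology([[0, 1, 2, 3], [4, 5, 6, 7]]))  # prints 2
```

Raising on an empty topology keeps a misconfigured node from passing the job with zero distributed coverage.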

runner:
runs-on: ${{ inputs.runner }}
name: get-runner
name: get-runner

Why do we have this change?

@zxd1997066 force-pushed the xiangdong/ported_cases branch 11 times, most recently from bd63535 to 61d9eef Compare October 13, 2025 03:26
@zxd1997066 force-pushed the xiangdong/ported_cases branch from 61d9eef to 7d62aaa Compare October 17, 2025 03:29
@zxd1997066 force-pushed the xiangdong/ported_cases branch 6 times, most recently from f8b4450 to 63799f6 Compare October 19, 2025 13:10
@zxd1997066 force-pushed the xiangdong/ported_cases branch from 63799f6 to 1bcbce2 Compare October 19, 2025 15:05
@zxd1997066 requested a review from chuanqi129 October 20, 2025 08:52
@zxd1997066 (Contributor, Author)

> Please split the test scope into a CI scope and a full nightly scope

This PR adds the CI cases first; the nightly tests will be enabled in a separate PR.
