[TPU] Deprecate `xm.mark_step` in favor of `torch_xla.sync`
#25254
Conversation
Signed-off-by: NickLucche <[email protected]>
Code Review
This pull request correctly replaces the deprecated xm.mark_step() with the new torch_xla.sync() API across multiple files. Most changes correctly use torch_xla.sync(wait=False), which is the direct equivalent for the asynchronous xm.mark_step(). However, I've identified one instance where torch_xla.sync() is used without wait=False, which changes the behavior to a synchronous wait. This could introduce a performance regression. My review includes a suggestion to correct this for consistency and to maintain the intended asynchronous behavior.
  ):
      # Make sure we don't accidentally collect outside operations
-     xm.mark_step()
+     torch_xla.sync()
The call to torch_xla.sync() uses the default wait=True, which introduces a synchronous wait. The deprecated xm.mark_step() is equivalent to torch_xla.sync(wait=False). Using wait=True here may unnecessarily block execution and could impact performance. To maintain the original asynchronous behavior and ensure consistency with other changes in this pull request, torch_xla.sync(wait=False) should be used instead.
-     torch_xla.sync()
+     torch_xla.sync(wait=False)
The default is `wait=False` (see https://github.com/pytorch/xla/blob/0c0ae2d22d5eec52b9c69da831185a92bbb37d1c/torch_xla/torch_xla.py#L78), so `torch_xla.sync()` already matches the old asynchronous behavior. You should know better, Gemini >,<
cc @yaochengji
yaochengji left a comment:
LGTM!
…oject#25254) Signed-off-by: NickLucche <[email protected]> Co-authored-by: Ye (Charlotte) Qi <[email protected]>
`xm.mark_step` has been deprecated for some time now. I am replacing its occurrences with the new API, `torch_xla.sync`. So long, `xm.mark_step()`!