[LLVM] Support atomic for GPU backend (NVPTX, ROCm) #7051
Merged
Laurawly
approved these changes
Dec 8, 2020
Contributor
LGTM. One small suggestion: we can add a TODO in the code comment about CPU atomic support.
zhiics
approved these changes
Dec 8, 2020
Member
TusharKanekiDey
pushed a commit
to TusharKanekiDey/tvm
that referenced
this pull request
Jan 20, 2021
* support atomic add on llvm
* make atomic builtin intrin
* test bincount on nvptx
* use builtin::atomic_add
* add atomic llvm codegen test, only works on int8 input somehow
* supports fp32 atomic
* drop support for cpu atomic
* add comment
* add atomic gpu unit test
* reenable other tests
* add doc string
* run black
* fix build with llvm 8 and older
* fix format
* do not run float32 atomic test on ci
* do not run scatter_add 1d with float inputs on CI
* fix typo
* add todo comment for cpu backend
* fix build on ci

Co-authored-by: masa <[email protected]>
trevor-m
pushed a commit
to neo-ai/tvm
that referenced
this pull request
Jan 21, 2021
electriclilies
pushed a commit
to electriclilies/tvm
that referenced
this pull request
Feb 18, 2021
This adds a new TIR builtin, `atomic_add`, and a corresponding lowering rule for the LLVM GPU backends. So far, `atomic_add` has been introduced and used only by the CUDA TOPI code, so the LLVM-based GPU backends cannot compile ops that use it (`nms`, `scatter_add`, `argwhere`).

Unfortunately, I couldn't get `atomic_add` working for the CPU backend. There is a pointer cast issue that the LLVM IR verifier rejects. I think it is related to the implicit cast to `i8*` done by the LLVM CPU backend, but I haven't looked into the details. So for now, only the GPU backends support lowering `atomic_add`.
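For readers less familiar with why ops like `scatter_add` or a bincount need an atomic add at all, here is a minimal host-side sketch in standard C++. It is not TVM code and not the LLVM IR this PR emits; `bincount_atomic` is a hypothetical helper, and CPU threads stand in for GPU threads. Several threads increment shared output slots, and a 32-bit atomic `fetch_add` keeps the read-modify-write from being interleaved.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper (not part of TVM): counts how often each index occurs,
// with several threads updating the same output slots concurrently.
std::vector<int32_t> bincount_atomic(const std::vector<int32_t>& indices,
                                     int num_bins, int num_threads) {
  std::vector<std::atomic<int32_t>> bins(num_bins);
  for (auto& b : bins) b.store(0);

  auto worker = [&](int tid) {
    // Each thread handles a strided slice of the input, like a GPU thread.
    for (std::size_t i = tid; i < indices.size(); i += num_threads) {
      // 32-bit atomic add: concurrent updates to one bin are not lost.
      bins[indices[i]].fetch_add(1, std::memory_order_relaxed);
    }
  };

  std::vector<std::thread> threads;
  for (int t = 0; t < num_threads; ++t) threads.emplace_back(worker, t);
  for (auto& th : threads) th.join();

  // Copy the counters out into a plain vector.
  std::vector<int32_t> out(num_bins);
  for (int i = 0; i < num_bins; ++i) out[i] = bins[i].load();
  return out;
}
```

With a plain `int32_t` and `bins[k] += 1` instead of `fetch_add`, two threads hitting the same bin could both read the old value and one increment would be lost, which is the failure mode the builtin is meant to rule out.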
Another restriction is that I've only supported 32-bit atomics. Supporting int64 atomics would be desirable, but it looks complicated (it would require generating a CAS loop, etc.).
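To make the "CAS loop" idea concrete, the following is a rough sketch in standard C++ of how an atomic add can be emulated with compare-and-swap. It only illustrates the general technique; `atomic_add_cas` is a hypothetical name, and this is host code rather than anything the LLVM codegen in this PR produces.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: emulates an atomic add with a compare-and-swap loop.
// Returns the value held before the addition, like fetch_add does.
int64_t atomic_add_cas(std::atomic<int64_t>& target, int64_t value) {
  int64_t observed = target.load(std::memory_order_relaxed);
  // If another thread changed `target` in the meantime, the exchange fails
  // and writes the current value back into `observed`, so the next attempt
  // recomputes the sum from fresh data.
  while (!target.compare_exchange_weak(observed, observed + value,
                                       std::memory_order_relaxed)) {
  }
  return observed;
}
```

The loop retries until the stored value is unchanged between the read and the write. A codegen-level version would presumably need the same load, compute, compare-exchange, and branch-back structure expressed as IR basic blocks, which is likely where the extra complexity comes from.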
I'm a complete noob when it comes to atomics, so any help would be appreciated.
Please review @tqchen @zhiics @yzhliu @yidawang