Auto TensorCore CodeGen #4106
Conversation
    # choose device 0
    # attr type 4 for CUDA Compute Capability
    cuda_compute_capability = _api_internal._GetDeviceAttr(2, 0, 4)
    from tvm.contrib.nvcc import find_cuda_path, get_cuda_version
It seems unnecessary to put this "from...import" inside the try block, and it may cause another problem: if "tvm.contrib.nvcc" changes, this module would silently set cuda_compute_capability to None instead of reporting the error.
Yeah, we had the same concerns about this part when submitting this PR. It would be better to move the CUDA version and capability checks somewhere inside the TensorCore pass.
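The reviewer's suggestion can be sketched as follows. Note that `_query_device_attr` is a hypothetical stand-in for TVM's internal device query, used only to keep the sketch self-contained; the point is that the import stays at module level so an ImportError surfaces, and only the device query is guarded:

```python
# Sketch, not the PR's actual code: keep imports at module level so a broken
# or renamed tvm.contrib.nvcc raises ImportError immediately, and guard only
# the device query itself.

def _query_device_attr(device_id, attr_type):
    # Hypothetical stand-in for _api_internal._GetDeviceAttr; here it fails
    # the way a missing CUDA device would.
    raise RuntimeError("no CUDA device found")

def get_cuda_compute_capability(device_id=0):
    try:
        # attr type 4 queries the CUDA compute capability
        return _query_device_attr(device_id, 4)
    except RuntimeError:
        # Only a genuine device failure falls back to None; an ImportError
        # from a changed helper module would no longer be swallowed here.
        return None

print(get_cuda_compute_capability())  # → None (no device in this sketch)
```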
    .set_body([](TVMArgs args, TVMRetValue *ret) {
      if (args.size() == 5) {
        *ret = TensorCore(args[0], args[1], args[2], args[3], args[4]);
      }
We should also handle the case where args.size() != 5 and set the *ret value.
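The registration above is C++, but the intent of the review comment can be illustrated with a small Python sketch: dispatch on argument count and fail loudly for arities the pass does not support, rather than leaving the return value unset. `tensor_core` below is a hypothetical stand-in, not the real pass:

```python
def tensor_core(stmt, schedule, capability, version, body):
    # Hypothetical stand-in for the real TensorCore pass.
    return (stmt, schedule, capability, version, body)

def tensor_core_entry(*args):
    # Mirrors the packed-function body: accept exactly five arguments,
    # otherwise raise an explicit error instead of silently returning
    # an unset value.
    if len(args) == 5:
        return tensor_core(*args)
    raise ValueError(f"TensorCore expects 5 arguments, got {len(args)}")

print(tensor_core_entry("stmt", "sched", 7.5, 10.0, "body"))
```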
    Stmt TensorCore(Stmt stmt,
                    Schedule schedule,
                    double cuda_compute_capability,
Can we use an integer here instead of a double?
    Stmt body);
    Stmt body,
    Expr new_expr = Expr(),
    std::string free_function = std::string());
Just curious, why aren't they symmetric?
Opened PR #4234. Closing this one. Thank you all for your reviews and comments.
This is the code for RFC #4105
There's a sample matmul schedule in this PR. The command to run it is:
python tutorials/autotvm/tensor_core_matmul.py $M $N $K $dtype $layout
$dtype is one of {'float16', 'int8'} ('int8' requires CUDA version >= 10.0 and GPU arch >= 7.2)
$layout is one of {'NN', 'NT', 'TN', 'TT'}
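The constraints above can be sketched as a small validation helper. The thresholds come from the text; `check_args` itself is a hypothetical illustration, not part of the PR:

```python
def check_args(dtype, layout, cuda_version, gpu_arch):
    # Validate the command-line choices described above. Per the PR notes,
    # int8 needs CUDA >= 10.0 and GPU arch (compute capability) >= 7.2.
    if dtype not in ("float16", "int8"):
        raise ValueError(f"unsupported dtype: {dtype}")
    if dtype == "int8" and (cuda_version < 10.0 or gpu_arch < 7.2):
        raise ValueError("int8 requires CUDA >= 10.0 and GPU arch >= 7.2")
    if layout not in ("NN", "NT", "TN", "TT"):
        raise ValueError(f"unsupported layout: {layout}")
    return True

print(check_args("float16", "NT", 10.0, 7.0))  # → True
```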