Auto TensorCore CodeGen #4106
Conversation
    # choose device 0
    # attr type 4 for CUDA Compute Capability
    cuda_compute_capability = _api_internal._GetDeviceAttr(2, 0, 4)
    from tvm.contrib.nvcc import find_cuda_path, get_cuda_version
It seems unnecessary to put this "from...import" inside the try block, and it may cause another problem: if "tvm.contrib.nvcc" changes, this module would silently set cuda_compute_capability to None instead of reporting the error.
Yeah, we had the same concerns about this part when submitting this PR. It would be better to move the CUDA version and capability checks somewhere inside the TensorCore pass.
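The reviewer's suggestion can be sketched as follows. Note that `_query_device_attr` is a hypothetical stand-in for TVM's internal device query, used only to keep the sketch self-contained; the point is that the import stays at module level so an ImportError surfaces, and only the device query is guarded:

```python
# Sketch, not the PR's actual code: keep imports at module level so a broken
# or renamed tvm.contrib.nvcc raises ImportError immediately, and guard only
# the device query itself.

def _query_device_attr(device_id, attr_type):
    # Hypothetical stand-in for _api_internal._GetDeviceAttr; here it fails
    # the way a missing CUDA device would.
    raise RuntimeError("no CUDA device found")

def get_cuda_compute_capability(device_id=0):
    try:
        # attr type 4 queries the CUDA compute capability
        return _query_device_attr(device_id, 4)
    except RuntimeError:
        # Only a genuine device failure falls back to None; an ImportError
        # from a changed helper module would no longer be swallowed here.
        return None

print(get_cuda_compute_capability())  # → None (no device in this sketch)
```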
    .set_body([](TVMArgs args, TVMRetValue *ret) {
      if (args.size() == 5) {
        *ret = TensorCore(args[0], args[1], args[2], args[3], args[4]);
      }
We should also handle the case where args.size() != 5 and set the *ret value.
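The registration above is C++, but the intent of the review comment can be illustrated with a small Python sketch: dispatch on argument count and fail loudly for arities the pass does not support, rather than leaving the return value unset. `tensor_core` below is a hypothetical stand-in, not the real pass:

```python
def tensor_core(stmt, schedule, capability, version, body):
    # Hypothetical stand-in for the real TensorCore pass.
    return (stmt, schedule, capability, version, body)

def tensor_core_entry(*args):
    # Mirrors the packed-function body: accept exactly five arguments,
    # otherwise raise an explicit error instead of silently returning
    # an unset value.
    if len(args) == 5:
        return tensor_core(*args)
    raise ValueError(f"TensorCore expects 5 arguments, got {len(args)}")

print(tensor_core_entry("stmt", "sched", 7.5, 10.0, "body"))
```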
    Stmt TensorCore(Stmt stmt,
                    Schedule schedule,
                    double cuda_compute_capability,
Can we use an integer here instead of a double?
    Stmt body);
    Stmt body,
    Expr new_expr = Expr(),
    std::string free_function = std::string());
Just curious, why aren't they symmetric?
Opened PR #4234. Closing this one. Thank you all for your reviews and comments.
This is the code for RFC #4105
There's a sample matmul schedule in this PR. The command to run it is:
python tutorials/autotvm/tensor_core_matmul.py $M $N $K $dtype $layout
$dtype is one of {'float16', 'int8'} ('int8' requires CUDA version >= 10.0 and GPU arch >= 7.2)
$layout is one of {'NN', 'NT', 'TN', 'TT'}
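The constraints above can be sketched as a small validation helper. The thresholds come from the text; `check_args` itself is a hypothetical illustration, not part of the PR:

```python
def check_args(dtype, layout, cuda_version, gpu_arch):
    # Validate the command-line choices described above. Per the PR notes,
    # int8 needs CUDA >= 10.0 and GPU arch (compute capability) >= 7.2.
    if dtype not in ("float16", "int8"):
        raise ValueError(f"unsupported dtype: {dtype}")
    if dtype == "int8" and (cuda_version < 10.0 or gpu_arch < 7.2):
        raise ValueError("int8 requires CUDA >= 10.0 and GPU arch >= 7.2")
    if layout not in ("NN", "NT", "TN", "TT"):
        raise ValueError(f"unsupported layout: {layout}")
    return True

print(check_args("float16", "NT", 10.0, 7.0))  # → True
```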