
Conversation

@soiferj
Contributor

@soiferj soiferj commented Aug 15, 2019

Use cblas for dense and batch_matmul when "cblas" is in the target libraries. This is already supported for cublas and rocblas; it's just not plugged in for cblas. Also added an overridden batch_matmul compute for x86.
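The dispatch this PR describes can be sketched as follows. This is an illustrative stand-in, not TVM's actual code: `batch_matmul_generic` and `batch_matmul_cblas` are hypothetical names, and NumPy stands in for the extern BLAS call. It does follow TVM's batch_matmul convention, where `y` has shape `[batch, N, K]` and the result is `x @ y^T` per batch.

```python
import numpy as np

def batch_matmul_generic(x, y):
    # Naive 3-D batched matmul: x is [batch, M, K], y is [batch, N, K],
    # output is [batch, M, N] (y is multiplied transposed, as in topi).
    batch, M, K = x.shape
    _, N, _ = y.shape
    out = np.zeros((batch, M, N), dtype=x.dtype)
    for b in range(batch):
        out[b] = x[b] @ y[b].T
    return out

def batch_matmul_cblas(x, y):
    # Stand-in for the cblas-backed extern call; NumPy's einsum/matmul
    # itself dispatches to a BLAS gemm under the hood.
    return np.einsum("bmk,bnk->bmn", x, y)

def batch_matmul(x, y, target_libs=()):
    # Mirrors the PR's dispatch: use cblas only when the user
    # explicitly lists "cblas" in the target libraries.
    if "cblas" in target_libs:
        return batch_matmul_cblas(x, y)
    return batch_matmul_generic(x, y)
```

Both paths compute the same result; only the backing implementation differs, which is why gating on the target libraries is behavior-preserving.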

@ajtulloch @yinghai would you be able to review?

@yinghai

yinghai commented Aug 15, 2019

@hlu1, since you have a similar version, could you take a look?

Contributor

@hlu1 hlu1 left a comment


I think perf-wise, it would make more sense to use AutoTVM to choose whether to use cblas based on tuning results, instead of hardcoding it.

cc @icemelon9, @kevinthesun

x : tvm.Tensor
3-D with shape [batch, M, K]
y : tvm.TEnsor
Contributor


typo: y : tvm.Tensor

Please fix the same typo on line 32 as well

Contributor Author


Fixed

x : tvm.Tensor
3-D with shape [batch, M, K]
y : tvm.TEnsor
Contributor


same typo

Contributor Author


Fixed

@soiferj
Contributor Author

soiferj commented Aug 19, 2019

Generally I would agree, but in this case batch_matmul doesn't have any tuning schedules defined right now. Also, this code path is only hit when you explicitly specify "cblas" in the target libs; if the user sets that as their library, using BLAS is the expected behavior.

I think we can do tuning of BLAS vs TVM schedules in another PR.

Also, dense already has this support for cublas and rocblas. It's just missing for cblas.
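For context, "explicitly specify cblas in the libs" refers to building with a target string such as `llvm -libs=cblas`. A minimal sketch of how such an option can be picked out of a target string (an illustrative parser only, not TVM's actual target handling):

```python
def parse_target_libs(target_str):
    # Extract the comma-separated value of the "-libs=" option from a
    # target string, e.g. "llvm -libs=cblas" -> ["cblas"].
    for token in target_str.split():
        if token.startswith("-libs="):
            return token[len("-libs="):].split(",")
    return []
```

With this model, the dense/batch_matmul compute checks whether `"cblas"` appears in the parsed libs and only then calls into the BLAS-backed extern op.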

Contributor

@hlu1 hlu1 left a comment


Looks good. I can send a follow-up PR to fix the autotuning if you don't mind. I basically have it working.

@hlu1
Contributor

hlu1 commented Aug 20, 2019

Please fix the CI issue.

@hlu1 hlu1 merged commit c870261 into apache:master Aug 21, 2019
@soiferj soiferj deleted the soiferj/cblas branch August 22, 2019 16:31
wweic pushed a commit to wweic/tvm that referenced this pull request Sep 16, 2019
…rget libraries (apache#3787)

* Support cblas library in dense

* start to add support for generic batch_matmul compute

* Add x86 override for batch_matmul

* Fix linting

* reset file

* Fix typos

* dummy change to re-trigger CI
wweic pushed a commit to neo-ai/tvm that referenced this pull request Sep 16, 2019
…rget libraries (apache#3787)

* Support cblas library in dense

* start to add support for generic batch_matmul compute

* Add x86 override for batch_matmul

* Fix linting

* reset file

* Fix typos

* dummy change to re-trigger CI