
Conversation

@vinx13 (Member) commented Mar 4, 2019

Writing int32 results to global memory can be much slower than writing int8. This PR makes the following changes (a short sketch of the idea follows below):

  • In add_rewrite, quantize the rhs to int8 so that reads/writes of the rhs can be performed in int8.
  • In UnifyDtypeScale, if the input is simulated_quantize(QInput), cast the input to int8 before casting to int32.

@ZihengJiang
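
Below is a minimal, hypothetical Relay sketch of the idea, not the pass implementation itself; the shapes, scale constant, and variable names are illustrative assumptions. It keeps the rhs in int8 for its global-memory traffic and widens it to int32 only at compute time.

```python
# Hypothetical illustration only: keep the rhs narrow in memory, widen for the add.
from tvm import relay

# int32 accumulator, e.g. the output of a quantized conv2d (assumed shape).
lhs = relay.var("lhs", shape=(1, 64, 56, 56), dtype="int32")
# rhs starts in float32; quantizing it to int8 keeps its global-memory
# reads/writes 8-bit wide instead of 32-bit.
rhs = relay.var("rhs", shape=(1, 64, 56, 56), dtype="float32")
scale = relay.const(16.0)  # illustrative scale, not taken from the pass

rhs_i8 = relay.cast(relay.round(relay.multiply(rhs, scale)), "int8")
# Widen to int32 only at compute time, so the wide value never has to
# round-trip through global memory on the rhs side.
out = relay.add(lhs, relay.cast(rhs_i8, "int32"))

func = relay.Function([lhs, rhs], out)
print(func)  # prints the Relay IR of the sketch
```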

@vinx13 force-pushed the feature/quanti_improve branch from 865b37c to 4996286 on March 4, 2019 08:11
@ZihengJiang self-assigned this Mar 5, 2019
@tqchen previously requested changes Mar 5, 2019
@ZihengJiang (Contributor)

For the comment, I mean explaining the code like here.

@ZihengJiang (Contributor)

Please fix the CI @vinx13

@vinx13 force-pushed the feature/quanti_improve branch from 807b5cd to 59474d2 on March 9, 2019 04:46
@ZihengJiang merged commit 21e8dfa into apache:master Mar 9, 2019
@ZihengJiang (Contributor)

Merged, thanks! @vinx13

wweic pushed a commit to neo-ai/tvm that referenced this pull request Mar 9, 2019

[Relay][Quantization] Speed-aware quantization scheme improvement (apache#2723)

* [Relay][Quantization] Speed-aware quantization scheme improvement
* Add comment
* Add use_stop_fusion to qconfig
* Update comment
wweic pushed a commit to neo-ai/tvm that referenced this pull request Mar 12, 2019

[Relay][Quantization] Speed-aware quantization scheme improvement (apache#2723)

* [Relay][Quantization] Speed-aware quantization scheme improvement
* Add comment
* Add use_stop_fusion to qconfig
* Update comment