Optimize causal mask using torch.where #2715
Conversation
| Codecov Report
 @@            Coverage Diff             @@
##           master    #2715      +/-   ##
==========================================
- Coverage    77.8%   77.79%   -0.02%     
==========================================
  Files         100      100              
  Lines       17051    17052       +1     
==========================================
- Hits        13267    13266       -1     
- Misses       3784     3786       +2
 Continue to review full report at Codecov. 
 | 
| Thanks – what's the PyTorch compatibility on this? | 
| Not sure about that — where can I find more info on compatibility? I think it only relies on torch.where (introduced in or before 1.0.0) and tensors of dtype torch.bool (introduced in 1.2.0). Does the None (newaxis) slicing introduce compatibility issues? If we want to maintain compatibility with 1.0.0, I think we can use torch.uint8 instead of torch.bool. | 
| Hi, I'd recommend making the following changes: 
 As a result, the overall speedup is 10-15% as I measured, and the code should be fully compatible with PyTorch 1.0.0. | 
| Hi @Akababa, thanks for the PR. I think this is a great change. I checked and it does lead to a significant speed-up :-) Could you fix the tests? I think then we can merge (see https://github.com/huggingface/transformers/blob/master/CONTRIBUTING.md) | 
Instead of multiplying by 1.0 float mask, use torch.where with a bool mask for increased performance.
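The change described above can be sketched roughly as follows. This is a hypothetical illustration (the function name and tensor shapes are assumptions, not the PR's actual code): rather than computing `w * mask - 1e4 * (1 - mask)` with a float mask, it selects between the attention scores and a large negative fill value using `torch.where` and a boolean mask.

```python
import torch

def causal_mask_where(w: torch.Tensor) -> torch.Tensor:
    # w: attention scores of shape (batch, heads, seq, seq)
    seq = w.size(-1)
    # Lower-triangular bool mask: True where attention to a position is allowed
    mask = torch.tril(torch.ones(seq, seq, dtype=torch.bool))[None, None]
    # Keep allowed scores; replace disallowed positions with a large
    # negative value so they vanish after softmax
    return torch.where(mask, w, torch.full_like(w, -1e4))
```

Avoiding the float multiply-and-subtract saves an elementwise arithmetic pass over the score tensor, which is where the reported speed-up comes from; for pre-1.2.0 compatibility the mask dtype could be `torch.uint8` instead of `torch.bool`, as discussed above.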
| Great work @Akababa - this looks good to me! @LysandreJik @thomwolf - could you check and merge? | 
LGTM
| Checked slow hardcoded GPT2 tests and it looks all good! | 