
Conversation

aseyboldt (Member)

Compile time for DimShuffle ops is pretty long:

import pytensor
import pytensor.tensor as pt
import numpy as np
import numba

z = pt.dscalar("z")
out = z[None, None]  # a DimShuffle: add two broadcastable dimensions to a scalar
z_val = np.array(0.1)

%%time
func = pytensor.function([z], out, mode="NUMBA")
func(z_val)  # the first call triggers numba compilation


# Before
CPU times: user 1.6 s, sys: 23.5 ms, total: 1.62 s
Wall time: 1.62 s

# After
CPU times: user 661 ms, sys: 36.1 ms, total: 697 ms
Wall time: 697 ms

There is still a way to go, and unfortunately it doesn't seem to have as big an impact on the compile times of larger models as I hoped, but it is a start...

Going forward, I think we should try to find more ops like this, where the compile time of a single op is large, and also try a few other things:

  • Do we really need to ask for O3 optimization all the time? I guess O2 might be enough for most cases.
  • We recreate the numba functions all the time, which means that LLVM will potentially see the same functions multiple times and will also have to optimize them multiple times. Could we move a lot of the njit functions out of the dispatch functions (see the sketch after this list)? I think the current approach also prevents numba caching from working properly.
  • There are still a lot of inline="always" functions around. Maybe we want to get rid of at least some of those?
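To make the second point concrete, here is a minimal sketch of the hoisting idea, assuming a dispatch layout similar to PyTensor's numba backend; the function names are illustrative, not the actual API:

import numba

# Current pattern: a fresh njit helper is created inside every dispatch call,
# so LLVM has to analyze and optimize an identical function over and over,
# and numba's on-disk cache can never be reused between runs.
def numba_funcify_some_op(op, node, **kwargs):
    @numba.njit
    def helper(x):
        return x + 1

    @numba.njit
    def some_op(x):
        return helper(x) * 2

    return some_op

# Hoisted pattern: the helper lives at module level, is compiled once per
# process, and with cache=True can be cached on disk across processes.
@numba.njit(cache=True)
def _helper(x):
    return x + 1

def numba_funcify_some_op_hoisted(op, node, **kwargs):
    @numba.njit
    def some_op(x):
        return _helper(x) * 2

    return some_op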

aseyboldt force-pushed the compile-time-dimshuffle branch from b753226 to deb6535 on December 8, 2022 at 23:29
codecov-commenter commented on Dec 9, 2022

Codecov Report

Merging #95 (5ccb1f2) into main (491f93e) will increase coverage by 0.18%.
The diff coverage is 86.11%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #95      +/-   ##
==========================================
+ Coverage   74.22%   74.41%   +0.18%     
==========================================
  Files         174      179       +5     
  Lines       48734    49249     +515     
  Branches    10367    10422      +55     
==========================================
+ Hits        36175    36649     +474     
- Misses      10272    10295      +23     
- Partials     2287     2305      +18     
Impacted Files Coverage Δ
pytensor/misc/ordered_set.py 79.80% <ø> (ø)
pytensor/tensor/nnet/corr.py 16.81% <0.00%> (ø)
pytensor/link/numba/dispatch/extra_ops.py 92.24% <70.21%> (-5.77%) ⬇️
pytensor/link/numba/dispatch/scalar.py 94.44% <75.00%> (+7.02%) ⬆️
pytensor/link/numba/dispatch/cython_support.py 86.95% <86.95%> (ø)
pytensor/link/numba/dispatch/basic.py 90.06% <95.83%> (-2.62%) ⬇️
pytensor/link/numba/dispatch/elemwise.py 97.04% <97.87%> (-0.09%) ⬇️
pytensor/graph/basic.py 88.10% <100.00%> (+0.43%) ⬆️
pytensor/link/numba/dispatch/nlinalg.py 100.00% <100.00%> (ø)
pytensor/sparse/sandbox/sp.py 73.48% <100.00%> (ø)
... and 25 more

Diff context under review (fragment):

    else:
        new_shape = numba_basic.tuple_setitem(new_shape, i, shuffle_shape[j])
        return j + 1, new_shape

def find_shape(array_shape):
ricardoV94 (Member):

Should we shortcut when the output static shape is known?

pt.random.normal(size=(2, 3)).dimshuffle(1, 0).type.shape  # (3, 2)

aseyboldt (Member, Author):

I guess that might save a tiny bit of runtime and/or compile time, but if the static shape were incorrect for some reason, we would end up writing to arrays out of bounds; there is no error checking after this point.
I think it is safer to always infer the output shape from the inputs...

ricardoV94 (Member):

In that case the graph would be inconsistent and should fail anyway.

aseyboldt (Member, Author):

Yes, it should fail. But if we just assume the shapes are correct, we might silently corrupt memory and not fail in an obvious way. :-)

aseyboldt (Member, Author):

Ah, sorry, I somehow thought this was the shape code for the LLVM elemwise...
Here the reshape should just fail if we provide something incorrect. So yes, I think we can use the extra info we have here; I'll update the PR.
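As a rough illustration of using that extra info (a sketch assuming the output type's static shape is stored as a tuple with None for unknown dims, as in PyTensor; not the actual PR code): if the static shape is fully known, the dispatch could embed it as a constant, and the reshape raises on any inconsistency instead of corrupting memory.

import numpy as np
import numba

def make_static_reshaper(static_shape):
    # static_shape as stored on the output type, e.g. (3, 2); None marks
    # dimensions that are only known at runtime.
    if all(s is not None for s in static_shape):
        shape = tuple(static_shape)

        @numba.njit
        def reshape_static(x):
            # np.reshape raises if x.size does not match the constant shape,
            # so an inconsistent graph fails loudly rather than silently.
            return np.reshape(x, shape)

        return reshape_static
    # Fall back to computing the output shape from the runtime input.
    return None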

ricardoV94 (Member):

Either way it's probably a minor performance difference. I was mostly thinking out loud about the kinds of places where we can benefit from static shape info.

ricardoV94 (Member) left a review:

LGTM

What exactly provides the compilation speedup?

aseyboldt (Member, Author):

I'm still trying to understand numba compile times better, but it seems that in this case the transpose had quite an impact. Most of the time the transpose in DimShuffle is just a no-op (and we know it is), so we can remove it in those cases.
I also changed the code that computes the final shape so that it involves fewer function calls that need to be analyzed and typed.
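For context, a minimal sketch of the no-op check, assuming DimShuffle's new_order convention where "x" marks an inserted broadcastable dimension (the helper name is made up for illustration):

def transpose_is_noop(new_order):
    # Drop the "x" markers that insert broadcastable dimensions; if the
    # surviving input axes are already in increasing order, the transpose
    # would not move any data and can be skipped in the generated function.
    axes = [a for a in new_order if a != "x"]
    return axes == sorted(axes)

# z[None, None] on a scalar is a DimShuffle with new_order == ("x", "x"):
assert transpose_is_noop(("x", "x"))   # reshape only, no transpose
assert transpose_is_noop((0, 1, "x"))  # reshape only, no transpose
assert not transpose_is_noop((1, 0))   # a real transpose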

aseyboldt merged commit d9fe197 into pymc-devs:main on Dec 11, 2022