-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[microNPU] Add a pass to reorder copy and compute nodes #10959
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[microNPU] Add a pass to reorder copy and compute nodes #10959
Conversation
lhutton1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @NicolaLancellotti! Looks good overall, just some small things below
ekalda
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really good, @NicolaLancellotti! I'd be interested in seeing how that pass interacts with graphs that have mixture of operators with and without weights, e.g. it seems to me that when we have a graph that looks like
pooling -> copy -> copy -> conv2d -> copy -> copy -> depthwise2d
it will end up after this pass as
copy -> copy -> pooling -> copy -> copy -> conv2d ...
I suppose that is intentional, that we start copying the conv2d weights in while the MAC engine is crunching the pooling? Maybe it's worth adding a test that exercises that kind of mixture of ops?
63b6752 to
78f05ea
Compare
Yes, it is intentional, and the reordering is just what you said. I have added a test too. |
|
Nice thanks @NicolaLancellotti! It looks like we need a re-trigger due to the linting issues CI had last week :) |
74be4ce to
88c0ea3
Compare
lhutton1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
|
Looks like it now conflicts with #10344 :/ |
88c0ea3 to
a1f8b5b
Compare
a1f8b5b to
e42b246
Compare
|
I have added a parameter to specify the maximum number of movements allowed for a copy. |
manupak
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @NicolaLancellotti for the great work here.
I've left some comments to improve the code readability and a suggestion to allow PassContext to configure the pass.
017c855 to
ce772d6
Compare
manupak
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
manupak
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah just found one thing that needs to be fixed.
Happy to take an immediate follow up. Let me know
ce772d6 to
367cdc9
Compare
12f7c44 to
5da51da
Compare
5da51da to
c82ea02
Compare
|
Thanks @NicolaLancellotti, @manupa-arm, @ekalda! |
This pr adds a pass to reorder Arm(R) Ethos(TM)-U copy and compute nodes in such a way that independent DMA copies, and computes happen in parallel.
cc @lhutton1 @ekalda @manupa-arm