@mbs-octoml mbs-octoml commented Dec 9, 2021

This is in support of #9613 which allows memory scopes to flow
out of already-lowered PrimFuncs into the rest of the Relay
program. This means scope choices made during lowering can
be accounted for in the rest of the program, with device_copies
inserted as required.

Somewhat more speculatively, we also allow memory scopes to flow
into PrimFuncs. This is in preparation for when we can split
lowering into two phases: i) lower "primitive" fused Relay
functions to TensorIR in a schedulable form roughly isomorphic
to TE, and ii) actual scheduling down to traditional TIR. Once
that split is made, it will be possible to flow memory scopes
out of one PrimFunc and into another so as to avoid unnecessary
device_copies caused by arbitrary, independently chosen memory scopes.

I also suspect we'll want to put our focus on layouts rather
than memory scopes, but this at least sets up some of the
machinery.
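
To make the "flow out, then insert device_copies" idea concrete, here is a toy sketch in plain Python. It does not use TVM's API: `PrimFuncSig`, `plan_copies`, and the scope names `"global"` and `"texture"` are all hypothetical and chosen only for illustration. Each lowered function advertises the scope it expects for its input and the scope it chose for its output, and a copy is recorded wherever a producer and consumer disagree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimFuncSig:
    """Hypothetical stand-in for a lowered PrimFunc: only its scope choices."""
    name: str
    input_scope: str
    output_scope: str


def plan_copies(chain, program_input_scope="global"):
    """Walk a linear chain of calls and record where a device_copy is needed.

    The scope of each function's output "flows out" to the next call; a copy
    is inserted only when the next function expects a different scope.
    """
    steps = []
    current_scope = program_input_scope
    for func in chain:
        if func.input_scope != current_scope:
            steps.append(f"device_copy({current_scope} -> {func.input_scope})")
        steps.append(f"call {func.name}")
        current_scope = func.output_scope
    return steps


# Illustrative scope choices, assumed to have been made during lowering.
conv = PrimFuncSig("conv2d", input_scope="global", output_scope="texture")
relu = PrimFuncSig("relu", input_scope="texture", output_scope="texture")
dense = PrimFuncSig("dense", input_scope="global", output_scope="global")

plan = plan_copies([conv, relu, dense])
for step in plan:
    print(step)
```

In this sketch no copy is needed between `conv2d` and `relu` because their scopes already agree; the only copy lands before `dense`. Flowing scopes into PrimFuncs, as speculated above, would amount to letting `dense` pick its input scope after seeing what `relu` produces.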

xqdan commented Dec 9, 2021

Have you considered also passing how much memory each PrimFunc can use?

@mbs-octoml

Hi @xqdan, thanks for the comment. That takes us more into memory planning and what @manupa-arm is doing with USMP (including a representation of available memory pools, an analysis to determine the conflict set for every abstract buffer, and a realization pass to resolve abstract buffers to physical buffers within a pool). Currently the notion of 'memory/storage scope' we are using here is not connected to the USMP memory pools, or indeed to anything at all! It's just a label we push around for use downstream. I see us eventually reconciling the 'flow memory scope constraints' aspect I'm working on here with the 'account for memory scope constraints during scheduling' work, but we'll need to get there gradually.
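
For readers unfamiliar with the planning steps mentioned above, here is a rough sketch of what "conflict set" and "realization within a pool" can mean. This is plain Python and not USMP's actual implementation or API; the function names, the liveness intervals, the buffer sizes, and the first-fit strategy are all assumptions made for illustration.

```python
def conflict_sets(live):
    """Map each buffer to the buffers whose live ranges overlap with it.

    `live` maps a buffer name to an inclusive (first_use, last_use) pair.
    """
    return {
        a: {
            b
            for b in live
            if b != a and live[a][0] <= live[b][1] and live[b][0] <= live[a][1]
        }
        for a in live
    }


def assign_offsets(sizes, conflicts, pool_size):
    """Greedy first-fit placement of abstract buffers into a single pool.

    Buffers are placed largest first. Each one takes the lowest offset that
    does not overlap any already-placed buffer it conflicts with.
    """
    offsets = {}
    for name in sorted(sizes, key=lambda n: -sizes[n]):
        occupied = sorted(
            (offsets[other], offsets[other] + sizes[other])
            for other in conflicts[name]
            if other in offsets
        )
        offset = 0
        for start, end in occupied:
            if offset + sizes[name] <= start:
                break  # the buffer fits in the gap before this neighbour
            offset = max(offset, end)
        if offset + sizes[name] > pool_size:
            raise MemoryError(f"buffer {name!r} does not fit in the pool")
        offsets[name] = offset
    return offsets


# Hypothetical liveness (in program steps) and sizes (in bytes).
live = {"a": (0, 1), "b": (1, 2), "c": (2, 3)}
sizes = {"a": 64, "b": 32, "c": 64}

conflicts = conflict_sets(live)
offsets = assign_offsets(sizes, conflicts, pool_size=128)
print(conflicts)
print(offsets)
```

Here `a` and `c` are never live at the same time, so they can share offset 0, while `b` overlaps both and is placed after them. None of this is wired to the memory scope labels discussed in this PR; it only illustrates the kind of analysis the planning work involves.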


@junrushao junrushao left a comment

Overall looks good. Just a few nits. CC: @vinx13

@mbs-octoml mbs-octoml force-pushed the mbs-primfunc-constraints branch from 3398250 to 4468dd8 Compare December 9, 2021 20:11
@mbs-octoml

PTAL

@mbrookhart mbrookhart merged commit e785b26 into apache:main Dec 10, 2021
@mbrookhart

Thanks @mbs-octoml @xqdan @junrushao1994

@mbs-octoml mbs-octoml deleted the mbs-primfunc-constraints branch December 10, 2021 17:34
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 7, 2022
…imFuncs (apache#9689)

* [TIR] Allow memory (aka storage) scopes to be retrieved/applied to PrimFuncs

* [checkpoint] Junru's comments.
yangulei pushed a commit to yangulei/tvm that referenced this pull request Jan 11, 2022
yangulei pushed a commit to yangulei/tvm that referenced this pull request Jan 12, 2022
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 13, 2022
qsqqsqqsq-intellif pushed a commit to qsqqsqqsq-intellif/tvm that referenced this pull request Apr 29, 2022