
Conversation

@tkonolige
Contributor

Subexpression elimination used a comparison of each subexpression to every other subexpression to determine which to eliminate. This resulted in an O(N^2) algorithm that was slow when there were many LetStmts. Instead we now hash each subexpression and compare the hashes. We reuse structural hashing without remapping free variables to uniquely determine each subexpression.
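In other words, the change replaces the all-pairs comparison with hash-based grouping. Roughly, the idea is the following (a minimal standalone sketch only, with toy std::string expressions standing in for PrimExpr and std::hash standing in for TVM's structural hash):

```cpp
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for a TIR subexpression (a PrimExpr in TVM).
using Expr = std::string;

int main() {
  std::vector<Expr> subexprs = {"x+y", "a*b", "x+y", "a*b", "x+y"};

  // Bucket subexpressions by hash instead of comparing every pair:
  // O(N) expected instead of O(N^2). The unordered_map hashes each key
  // once and only deep-compares keys that land in the same bucket.
  std::unordered_map<Expr, size_t> counts;
  for (const Expr& e : subexprs) ++counts[e];

  for (const auto& entry : counts)
    if (entry.second > 1)
      std::cout << entry.first << " appears " << entry.second
                << " times -> candidate for elimination\n";
}
```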

@masahi @FranckQC

Contributor

@AndrewZhaoLuo AndrewZhaoLuo left a comment


Thanks for doing this. I have noticed this pass is a little slow when running some synthetic tests with lots of subexpressions.

@tkonolige
Contributor Author

@AndrewZhaoLuo comments addressed

Contributor

@AndrewZhaoLuo AndrewZhaoLuo left a comment


LGTM

@FranckQC
Contributor

Hi,

Thank you for your interest in improving the pass!
I haven't worked on the CSE pass for a while, so I am probably a little bit rusty on it, but I think there are quite a few problems with this PR.

1)
The role of the function SyntacticToSemanticComputations() is to transform a ComputationTable (that is to say, a std::unordered_map<PrimExpr, size_t, StructuralHash, ExprDeepEqual>, i.e. a hashtable mapping PrimExpr to size_t, using StructuralHash for the hashing and ExprDeepEqual for the equality test) into a vector of std::pair<PrimExpr, size_t> where semantically equivalent elements have been merged.

The whole need for this function comes from having collected syntactical entities in a structure that is efficient for that purpose (the ComputationTable, a hashtable where retrieving an element is fast, in constant time), which later of course needs to be transformed into a collection (a vector) where equivalent terms (like x+y and y+x) are merged together (and their counters added). These merged entries are what I call semantical entities.

This notion of being semantically equivalent could be anything: identifying terms modulo associativity [(x+y)+z with x+(y+z)], modulo commutativity [x+y with y+x], or anything else, with any equivalence relation. It's completely customizable by just changing the EquivalentTerms() function.

Sure, at the moment this function EquivalentTerms() just calls the syntactical equality EqualTerms(), but the whole pass has been written with the idea that we could later on replace it with anything else, to make the pass even more powerful. You can see that the function SyntacticToSemanticComputations() that you have changed used to call std::find_if with the predicate being precisely this very customizable EquivalentTerms(). This is lost with these changes, and the pass no longer identifies equivalent terms.
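To make the structure concrete, here is a minimal standalone sketch of the shape of this function (toy std::string expressions instead of PrimExpr; an illustration only, not TVM's actual code):

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Expr = std::string;  // toy stand-in for PrimExpr

// Customizable equivalence relation; today it is plain syntactic equality,
// but it could identify x+y with y+x, etc.
bool EquivalentTerms(const Expr& a, const Expr& b) { return a == b; }

// Merge equivalent entries of the table into a vector, adding their counters.
// The std::find_if scan over already-merged entries is what makes this O(N^2).
std::vector<std::pair<Expr, size_t>> SyntacticToSemantic(
    const std::unordered_map<Expr, size_t>& table) {
  std::vector<std::pair<Expr, size_t>> result;
  for (const auto& kv : table) {
    auto it = std::find_if(result.begin(), result.end(), [&](const auto& p) {
      return EquivalentTerms(p.first, kv.first);
    });
    if (it != result.end()) it->second += kv.second;  // merge with its class
    else result.push_back(kv);                        // first of its class
  }
  return result;
}
```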

2)
There is something else which I don't quite get about your changes. Why do you even bother to transform the ComputationTable (which is a std::unordered_map<PrimExpr, size_t, StructuralHash, ExprDeepEqual>) into a std::unordered_map<PrimExpr, size_t, ExprDeepHashStruct, ExprDeepEqual>?
The lines 753 to 763 (https://github.com/apache/tvm/pull/11423/files#diff-f83c46530e92c628fd309499ceffa32cb2fb0505633ac0e754e2bdef4d518962R753-R762) are basically transforming a hashtable (table) into the exact same hashtable (equiv_computations). This code is doing nothing.

If the pass is really taking too long for many people, I could try to help improve it, if that's needed. There will necessarily be a limit to what we can gain at compile time, as improving runtime speed (or creating opportunities for it, which is what CSE does) often has a price to pay at compile time. But we can certainly see what we can do.

What do you think?

@tkonolige
Contributor Author

Thanks for helping to clarify things @FranckQC! I think I'm a little confused still.

Regarding 1), you are saying that right now SyntacticToSemanticComputations just happens to use ExprDeepEqual, but we intend to change it in the future? And so, right now (in main), SyntacticToSemanticComputations is essentially doing nothing? If so, could we have SyntacticToSemanticComputations just be a no-op for now?

Even if we switch SyntacticToSemanticComputations to a no-op, I worry that we are going to hit this O(N^2) problem in the future. Is there a way to get around the all-pairs comparison?

Regarding 2), I didn't realize ComputationTable was already a std::unordered_map<PrimExpr, size_t, StructuralHash, ExprDeepEqual>. Whoops :).

@AndrewZhaoLuo
Contributor

AndrewZhaoLuo commented May 26, 2022

@FranckQC thanks for elucidating the intention of the original design. I see why, for an arbitrary semantic equivalence function, you may need to have N^2 comparisons.

Is it possible to reduce the number of comparisons by using a canonical form? E.g. (x + y) + z and z + (y + x) would always reduce to (x + y) + z. Or is this too limiting, or impossible for catching some optimizations? If we could do things this way we would have O(N) comparisons (see the sketch below).
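Something like this toy sketch, say (flat sums of variables only; a real version would need a recursive canonicalizer over TIR, and all the names here are made up):

```cpp
#include <algorithm>
#include <iostream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Toy canonicalization: sort the operands of a flat commutative sum, so
// "z+y+x" and "x+z+y" both map to "x+y+z". Real TIR terms would need a
// recursive rewrite and care around non-commutative operators.
std::string Canonicalize(const std::string& sum) {
  std::vector<std::string> ops;
  std::stringstream ss(sum);
  for (std::string tok; std::getline(ss, tok, '+');) ops.push_back(tok);
  std::sort(ops.begin(), ops.end());
  std::string out;
  for (size_t i = 0; i < ops.size(); ++i) out += (i ? "+" : "") + ops[i];
  return out;
}

int main() {
  std::vector<std::string> terms = {"x+y+z", "z+y+x", "a+b"};
  // Hash the canonical form: equivalent terms collide in O(N) expected
  // time, with no pairwise comparison needed.
  std::unordered_map<std::string, size_t> groups;
  for (const auto& t : terms) ++groups[Canonicalize(t)];
  for (const auto& g : groups) std::cout << g.first << " x" << g.second << "\n";
}
```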

Regardless, at the moment this has O(N^2) cost for little benefit (as we only do equality checks), so I propose to comment out the original implementation, replace it with a fast O(N) one / a no-op, and add a suitable comment.

Contributor

@AndrewZhaoLuo AndrewZhaoLuo left a comment


Turning off the merge approval until we hash this out.

@FranckQC
Contributor

FranckQC commented May 26, 2022

Hi everyone,

@tkonolige :

Regarding 1), you are saying that right now SyntacticToSemanticComputations just happens to use ExprDeepEqual, but we intend to change it in the future? And so, right now (in main), SyntacticToSemanticComputations is essentially doing nothing? If so, could we have SyntacticToSemanticComputations just be a no-op for now?

Hum, no, I wouldn't say it does nothing. It still transforms a hashtable into a vector. How it does so depends on the EquivalentTerms() function.

SyntacticToSemanticComputations() transforms a table of syntactic entities (where x+y is not the same thing as y+x, even if you were to work modulo commutativity) into a vector of "semantical entities", where equivalent computations have been merged together, according to the notion of "being equivalent". This function is building equivalence classes, if you like. And this notion of "being equivalent" is customizable, implemented by the EquivalentTerms() function, which at the moment simply calls EqualTerms(). So at the moment, the only way to have x Equiv y is for x and y to be syntactically the same term. But that's not a requirement, and it could be any equivalence relation, because the pass has been written to work with any equivalence relation, in order to be able to produce even more commonings.

As a use case of this kind of thing:
For instance, in just 3 lines of code, @yuanfz98 proposed in a PR to change the equivalence relation to identify terms modulo associativity and distributivity (of * over +). Sadly I didn't have enough time to discuss this PR at the time (it got quickly closed). Some people were afraid it would add too much computational complexity to the pass. The complexity of the pass itself wouldn't change (we already look at each pair!), but this time it would take advantage of that. However, this new equivalence function relied on a pseudo-normalisation function (arith::Analyzer::Simplify, see discussion here), which would necessarily need some time to (try to) normalize the terms being manipulated. It should not take too long, as they did not implement a full decision procedure, just some quick attempts at applying rewrite rules (known patterns that often lead to the most simplified term, though that's not guaranteed). Said differently, arith::Analyzer::Simplify is correct but not complete.
It could still take too long in its current form, I don't know, I didn't try it as it was. But I think there's a way to make that more efficient.
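For what it's worth, the core of that proposed equivalence was essentially of this shape (an illustrative sketch built on TVM's existing arith::Analyzer and ExprDeepEqual; the function name is made up, and this is not the exact code from that PR):

```cpp
#include <tvm/arith/analyzer.h>
#include <tvm/tir/analysis.h>

// Two terms are considered equivalent when their simplified forms are
// structurally equal. Since arith::Analyzer::Simplify is correct but not
// complete, this only finds equivalences the rewriter happens to normalize.
bool EquivalentModuloSimplify(const tvm::PrimExpr& a, const tvm::PrimExpr& b,
                              tvm::arith::Analyzer* analyzer) {
  return tvm::tir::ExprDeepEqual()(analyzer->Simplify(a),
                                   analyzer->Simplify(b));
}
```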

@AndrewZhaoLuo and @tkonolige

I think I should be able to address the issue without completely removing the possibility of having more interesting comparisons in the future.
I'll give it a go this weekend if you're happy with that.

Thanks.

@tkonolige
Contributor Author

@FranckQC since this is currently causing issues, how about we remove the code for now? When you figure out a solution we can add it back in.

@FranckQC
Contributor

If that's solved by tomorrow, would that work for you?

@tkonolige
Contributor Author

Sure, that is probably fine. Thank you for the quick turnaround.

@AndrewZhaoLuo
Contributor

I think the main issue I have is the O(N^2) runtime on a default pass. We can get linear time if we plan things well, e.g. by using syntactic canonical forms. I haven't dug too deeply, but if this normalization stuff does something like that, I'm fine with it.

@tkonolige
Contributor Author

@FranckQC any update on a solution?

@FranckQC
Contributor

Just starting to work on it now. Was busy with other things earlier today.
A PR will come this evening/night.

@tkonolige
Contributor Author

tkonolige commented May 27, 2022

I don't think @AndrewZhaoLuo or I will be back to review this until Tuesday so don't feel rushed.

@FranckQC
Contributor

Just a small update to let you know that I have a patch almost ready. I'm testing it now (both functionally and performance-wise) to make sure everything is OK.
I will update again soon.

@tkonolige
Contributor Author

@FranckQC any update?

@FranckQC
Contributor

FranckQC commented Jun 3, 2022

Sorry, it's currently being reviewed in our downstream repo. I hope to open the PR here this afternoon!

@FranckQC
Contributor

FranckQC commented Jun 5, 2022

Hi,

The PR is ready here: #11574
Apologies, it took me a little longer than I expected.
The only failing test seems to be flaky, as it's unrelated to the changes introduced.

Thanks!

@tkonolige
Contributor Author

Thanks @FranckQC.

@tkonolige tkonolige closed this Jun 6, 2022