JIT: Optimize for cost instead of score in 3-opt layout #109741
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch

cc @dotnet/jit-contrib, @AndyAyersMS PTAL. No diffs. TP diffs on Windows aren't useful since the MSVC version changed, though the Linux TP diffs suggest this change has negligible cost. Thanks!
I'm a bit puzzled why this is no diff, but it looks good and I think makes more sense as cost minimization.
src/coreclr/jit/fgopt.cpp (Outdated)

```cpp
if (fallthroughEdge != nullptr)
{
    return maxCost - fallthroughEdge->getLikelyWeight();
```
I guess since getLikelyWeight is derived from bbWeight and we're checking likelihoods, the result here can't ever be negative?
Though I wonder if we should have a max(0.0, ...) just in case there is some odd rounding.
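For illustration, here is a minimal standalone sketch of the kind of guard being suggested. The names `weight_t`, `GetCost`, and the parameters are simplified stand-ins, not the actual fgopt.cpp API:

```cpp
#include <algorithm>

// Illustrative stand-in for the JIT's block weight type.
using weight_t = double;

// Cost contribution at a block boundary: the flow weight that does NOT
// fall through. Clamping at zero guards against a tiny negative result
// if rounding makes the fall-through weight slightly exceed maxCost.
weight_t GetCost(weight_t maxCost, weight_t fallthroughWeight)
{
    return std::max(0.0, maxCost - fallthroughWeight);
}
```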
> I guess since getLikelyWeight is derived from bbWeight and we're checking likelihoods, the result here can't ever be negative?
That's correct, though it seems reasonable to defend against rounding errors here.
src/coreclr/jit/fgopt.cpp (Outdated)

```diff
-// Continue evaluating candidates if this one isn't profitable
-if ((newScore <= currScore) || Compiler::fgProfileWeightsEqual(newScore, currScore, 0.001))
+// Continue evaluating partitions if this one isn't profitable
+if ((currCost <= newCost) || Compiler::fgProfileWeightsEqual(currCost, newCost, 0.001))
```
Nit, but I think this reads better...
```diff
-if ((currCost <= newCost) || Compiler::fgProfileWeightsEqual(currCost, newCost, 0.001))
+if ((newCost >= currCost) || Compiler::fgProfileWeightsEqual(currCost, newCost, 0.001))
```
```cpp
#ifdef DEBUG
// Ensure the swap improved the overall layout
const weight_t newLayoutCost = GetLayoutCost(startPos, endPos);
assert(newLayoutCost < currLayoutCost);
```
Seems like we could end up equal?
```diff
-assert(newLayoutCost < currLayoutCost);
+assert(newLayoutCost <= currLayoutCost);
```
We'll only perform a swap if the new cost is strictly less than the original cost (plus epsilon), so I don't think it should be possible for the new layout cost to be the same as the old one after swapping, right?
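To make the invariant concrete, here is a hypothetical standalone sketch of the acceptance test: a swap is rejected unless the new cost is strictly lower than the current cost by more than the epsilon band. `WeightsEqual` is an illustrative approximation of `Compiler::fgProfileWeightsEqual`, not the real implementation:

```cpp
#include <cmath>

using weight_t = double;

// Illustrative approximation of Compiler::fgProfileWeightsEqual.
bool WeightsEqual(weight_t a, weight_t b, weight_t eps)
{
    return std::fabs(a - b) < eps;
}

// Mirrors the rejection condition discussed above: keep evaluating
// partitions unless the new cost beats the current cost by more than eps.
bool ShouldRejectSwap(weight_t currCost, weight_t newCost)
{
    return (currCost <= newCost) || WeightsEqual(currCost, newCost, 0.001);
}
```

Under this condition, an accepted swap always has `newCost` strictly below `currCost`, which is why the debug assertion can use a strict comparison.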
I was puzzled too, though I suppose it's nice to know we were modeling cost/score one-to-one. I think our cost model is sufficiently powerful for the time being, and the true limiting factor is our greedy strategy for selecting cut points. #109534 might help 3-opt catch a few more cut points, though the diffs aren't dramatic. I think part of the problem is how we partition backward jumps. It's not obvious to me what the partition should look like: We can move the partition
Thanks for this -- I'll give this a go locally. Are you ok with me merging this as-is? Or am I missing something with the
Yes |
Follow-up to #109741. Fixes #109831. Fixes #109905. Fixes #109903. Tolerate some floating-point imprecision when comparing the cost of a new layout produced by 3-opt to the previous layout's cost. The JitStress failures I looked at all stressed JitOptRepeat, which can produce absurdly large loop block weights by rerunning optSetBlockWeights. Manually inspecting the layout decisions made for these methods shows 3-opt making reasonable transformations, but the assertion `newLayoutCost < currLayoutCost` fails due to imprecision from summing large block weights. Having a backup check that the layout costs are within some epsilon seems like a reasonable fix.
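A standalone sketch of the relaxed check this follow-up describes. The names and the epsilon are illustrative; the real code compares with `Compiler::fgProfileWeightsEqual`:

```cpp
#include <cmath>

using weight_t = double;

// Relaxed debug invariant: either the swap lowered the layout cost, or the
// two costs differ only by floating-point noise from summing very large
// block weights (as JitOptRepeat stress can produce).
bool LayoutCostImproved(weight_t newLayoutCost, weight_t currLayoutCost)
{
    return (newLayoutCost < currLayoutCost) ||
           std::fabs(newLayoutCost - currLayoutCost) < 0.001;
}
```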
Part of dotnet#107749. Follow-up to dotnet#103450. This refactors 3-opt to minimize layout cost instead of maximizing layout score. This is arguably more intuitive, and it should facilitate implementing a more sophisticated cost model. This PR also adds a mechanism for evaluating the total cost of a given layout, which means we can assert at each move that we actually improved the global layout cost.