Skip to content

Conversation

@amanasifkhalid
Copy link
Contributor

Part of #107749. Follow-up to #103450. This refactors 3-opt to minimize layout cost instead of maximizing layout score. This is arguably more intuitive, and it should facilitate implementing a more sophisticated cost model. This PR also adds a mechanism for evaluating the total cost of a given layout, which means we can assert at each move that we actually improved the global layout cost.

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 12, 2024
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@amanasifkhalid
Copy link
Contributor Author

cc @dotnet/jit-contrib, @AndyAyersMS PTAL. No diffs. TP diffs on Windows aren't useful since the MSVC version changed, though the Linux TP diffs suggest this change has negligible cost. Thanks!

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit puzzled why this is no diff, but it looks good and I think makes more sense as cost minimization.


if (fallthroughEdge != nullptr)
{
return maxCost - fallthroughEdge->getLikelyWeight();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess since getLikelyWeight is derived from bbWeight and we're checking likelihoods, the result here can't ever be negative?

Though I wonder if we should have a max(0.0, ...) just in case there is some odd rounding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess since getLikelyWeight is derived from bbWeight and we're checking likelihoods, the result here can't ever be negative?

That's correct, though it seems reasonable to defend against rounding errors here.

// Continue evaluating candidates if this one isn't profitable
if ((newScore <= currScore) || Compiler::fgProfileWeightsEqual(newScore, currScore, 0.001))
// Continue evaluating partitions if this one isn't profitable
if ((currCost <= newCost) || Compiler::fgProfileWeightsEqual(currCost, newCost, 0.001))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, but I think this reads better...

Suggested change
if ((currCost <= newCost) || Compiler::fgProfileWeightsEqual(currCost, newCost, 0.001))
if ((newCost >= currCost) || Compiler::fgProfileWeightsEqual(currCost, newCost, 0.001))

#ifdef DEBUG
// Ensure the swap improved the overall layout
const weight_t newLayoutCost = GetLayoutCost(startPos, endPos);
assert(newLayoutCost < currLayoutCost);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we could end up equal?

Suggested change
assert(newLayoutCost < currLayoutCost);
assert(newLayoutCost <= currLayoutCost);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll only perform a swap if the new cost is strictly less than the original cost (plus epsilon), so I don't think it should be possible for the new layout cost to be the same as the old one after swapping, right?

@amanasifkhalid
Copy link
Contributor Author

I'm a bit puzzled why this is no diff

I was puzzled too, though I suppose it's nice to know we were modeling cost/score one-to-one. I think our cost model is sufficiently powerful for the time being, and the true limiting factor is in our greedy strategy for selecting cut points. #109534 might help 3-opt catch a few more cut points, though the diffs aren't dramatic.

I think part of the problem is how we partition backward jumps. It's not obvious to me what the partition should look like: We can move the partition [targetBlk...srcBlk) to in front of srcBlk, or we can move the partition (targetBlk...srcBlk] to behind targetBlk, or we could split somewhere in between targetBlk and srcBlk, etc. Right now, we do only the first approach, and not for any particular reason -- it was just easy to implement. A truly global 3-opt implementation would eventually consider the best cut point for a given backward jump; I wonder if we can afford to do a more rigorous search for a cut point in the case of backward jumps, though I imagine 3-opt considers such jumps frequently enough that this may become too expensive. I'm afraid I don't know of an obvious approach that's also cheap at the moment.

@AndyAyersMS
Copy link
Member

If we can afford to walk the [target .. src) range one idea would be to look for blocks in that range with two successors (BBJ_COND) such that one successor is src and the other is after src somewhere; these blocks would be a plausible choices for loop exits. Then pick the one with the highest weight transfer to src as a trial cut (or maybe pick each successively, if that's possible)...

Failing that, any block in the range that conditionally branches beyond "src" might also be a reasonable choice (though further moves might be needed after that).

Something like this -- scan the range looking for the "best" G ...

========>

@amanasifkhalid
Copy link
Contributor Author

If we can afford to walk the [target .. src) range one idea would be to look for blocks in that range with two successors (BBJ_COND) such that one successor is src and the other is after src somewhere; these blocks would be a plausible choices for loop exits. Then pick the one with the highest weight transfer to src as a trial cut (or maybe pick each successively, if that's possible)...

Thanks for this -- I'll give this a go locally. Are you ok with me merging this as-is? Or am I missing something with the newLayoutCost < currLayoutCost assert (comment)?

@AndyAyersMS
Copy link
Member

Are you ok with me merging this as-is?

Yes

@amanasifkhalid amanasifkhalid merged commit f9b2a6d into dotnet:main Nov 13, 2024
108 checks passed
@amanasifkhalid amanasifkhalid deleted the 3-opt-model-cost branch November 13, 2024 16:58
amanasifkhalid added a commit that referenced this pull request Nov 18, 2024
Follow-up to #109741. Fixes #109831. Fixes #109905. Fixes #109903.

Tolerate some floating-point imprecision when comparing the cost of a new layout produced by 3-opt to the previous layout's cost. The JitStress failures I looked all stressed JitOptRepeat, which can produce absurdly large loop block weights by rerunning optSetBlockWeights. Manually inspecting the layout decisions made for these methods shows 3-opt making reasonable transformations, but the assertion newLayoutCost < currLayoutCost fails due to imprecision from summing large block weights. Having a backup check that the layout costs are within some epsilon seems like a reasonable fix.
mikelle-rogers pushed a commit to mikelle-rogers/runtime that referenced this pull request Dec 10, 2024
Part of dotnet#107749. Follow-up to dotnet#103450. This refactors 3-opt to minimize layout cost instead of maximizing layout score. This is arguably more intuitive, and it should facilitate implementing a more sophisticated cost model. This PR also adds a mechanism for evaluating the total cost of a given layout, which means we can assert at each move that we actually improved the global layout cost.
mikelle-rogers pushed a commit to mikelle-rogers/runtime that referenced this pull request Dec 10, 2024
Follow-up to dotnet#109741. Fixes dotnet#109831. Fixes dotnet#109905. Fixes dotnet#109903.

Tolerate some floating-point imprecision when comparing the cost of a new layout produced by 3-opt to the previous layout's cost. The JitStress failures I looked all stressed JitOptRepeat, which can produce absurdly large loop block weights by rerunning optSetBlockWeights. Manually inspecting the layout decisions made for these methods shows 3-opt making reasonable transformations, but the assertion newLayoutCost < currLayoutCost fails due to imprecision from summing large block weights. Having a backup check that the layout costs are within some epsilon seems like a reasonable fix.
@github-actions github-actions bot locked and limited conversation to collaborators Dec 14, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants