-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Move the IsLeft/IsRight decision out of the loop and use computed substring set #88516
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Tagging subscribers to this area: @dotnet/area-system-collections Issue DetailsWhile reviewing #87510, I noticed the inline code in the Comparers seems like since it's in the hot-loop path, might be faster to move the IsLeft conditionalized code out of the loop by adding a Slicer to the comparator. The Slicer is the same logic as was in the bodies of the Equals and GetHashCode methods, and indeed also matches the delegate that was passed to the internal CreateAnalysisResults method. The slicing is really only changed once per Count, so move the Also made a subtle tweak because we know that BenchmarkDotNet=v0.13.2.2052-nightly, OS=Windows 11 (10.0.22631.1972)
Intel Core i7-10875H CPU 2.30GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK=8.0.100-preview.7.23322.33
[Host] : .NET 8.0.0 (8.0.23.32106), X64 RyuJIT AVX2
Job-ZWELZX : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2
Job-KUEFVA : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2
PowerPlanMode=00000000-0000-0000-0000-000000000000 Arguments=/p:EnableUnsafeBinaryFormatterSerialization=true IterationTime=250.0000 ms
MaxIterationCount=20 MinIterationCount=15 WarmupCount=1
In the
|
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/String/KeyAnalyzer.cs
Outdated
Show resolved
Hide resolved
00e52d1 to
d9d188b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Slicer does the left/right based on the sign of the Index value now, which should JIT down better.
I really wish that String.AsSpan understood the use of -1 start intrinsically as string.Length - 1... that would make the ternary unneeded.
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/String/KeyAnalyzer.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/String/KeyAnalyzer.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a real improvement. We do the math for setting a negative index (e.g. from the right side) starting with count characters from the right (as before). Then on line 81 we keep decrementing comparer.Index as we go on moving the "cursor" left.
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/String/KeyAnalyzer.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/String/KeyAnalyzer.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/String/KeyAnalyzer.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/String/KeyAnalyzer.cs
Outdated
Show resolved
Hide resolved
8c2b2e4 to
865a9ee
Compare
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/String/KeyAnalyzer.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/String/KeyAnalyzer.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/String/KeyAnalyzer.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ternary is the sole remaining conditional jump in the hot loop, but there's no good way to avoid that simultaneous avoiding delegate overhead so make it as simple as possible. The number of jumps (old vs. new) is identical, but this is a tiny tiny block of JITtable goodness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tried running this as a jumpless method body, but it makes little difference and is harder to grok; passing fromRight = 0 for left, or fromRight = 1 for right which requires the SubstringComparers to carry the left/right multiplier with them (so more state...):
public static ReadOnlySpan<char> Slicer(this string s, byte fromRight, int index, int count) => s.AsSpan((s.Length * fromRight) + index), count);Also, if we COULD swap the HashSet<string>'s comparer out for left vs. right the we could just have that knowledge embedded with a trait and thus fully jumpless, but that would require allocating two HashSet<string>s which might be an allocation regression nobody wants :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did something like that in PR #89689 which is a huge win... since we can't swap the comparer, I ended up creating both a left and right HashSet with backing comparer that "hard codes" the left/right logic. HUGE WINS, see that PR.
865a9ee to
b11d49f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to re-slice the string again here. It would be awesome if we could have a HashSet<ReadOnlySpan<char> but that's not going to happen as those would be structs not objects.
The slicing is really only changed once per Count, so move the IsLeft-dependent logic into `Slicer` method and eliminate the `GetSpan` delegate. Changed to also pass the already-computed `set` of unique substrings to the `CreateAnalysisResults` method, so we don't recompute the slices twice. In order than either the set or the original `uniqueStrings` can be passed, swapped that argument for the `Analyze` method to take the `uniqueStrings` as a `string[]` (which it already is at all call-sites).
Subtle bug in that the entire string is being placed in the set, not the span.
b11d49f to
c8c8801
Compare
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/String/KeyAnalyzer.cs
Outdated
Show resolved
Hide resolved
Since we are working with the same set of input strings in each strategy there's no reason to take the length every time we make an attempt (per count, both left and right justified). Hoist the calculation of the acceptable number of collisions out to the top, do it once, and pass that number into the `HasSufficientUniquenessFactor` method for it to (locally) use up.
f97e491 to
a168767
Compare
Benchmarks ever so slightly better.
Looks like the overhead of IEnumerable<string> is not worth the savings for the benchmark test data. Perhaps it would matter less if we were freezing more strings, but not likely
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/String/KeyAnalyzer.cs
Show resolved
Hide resolved
src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/String/KeyAnalyzer.cs
Show resolved
Hide resolved
|
Performance tests (note, PR #89689 wipes the floor on this, so we should merge that instead)
BenchmarkDotNet=v0.13.2.2052-nightly, OS=Windows 11 (10.0.22631.2115)
Intel Core i7-10875H CPU 2.30GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK=8.0.100-preview.7.23322.33
[Host] : .NET 8.0.0 (8.0.23.32106), X64 RyuJIT AVX2
Job-PMFJSX : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2
Job-XUZNBJ : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2
Job-ORKLIM : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2
PowerPlanMode=00000000-0000-0000-0000-000000000000 Arguments=/p:EnableUnsafeBinaryFormatterSerialization=true IterationTime=250.0000 ms
MaxIterationCount=20 MinIterationCount=15 WarmupCount=1
|
|
Did much better in PR #89689 |
While reviewing #87510, I noticed the inline code in the comparers seems like since it's in the hot-loop path, might be faster to move the
IsLeftconditionalized code out of the loop by adding astatic Slicerto the comparator. TheSliceris the same logic as was originally in the bodies of theEqualsandGetHashCodemethods, and also matches thedelegates that were being passed to the internalCreateAnalysisResultsmethod.The slicing is really only changed once per Count, so move the
IsLeft-dependent logic into aggressively inlinedSlicerextension method because that makes things a bit faster and slightly reduces allocations.Builds on #88709 as that's a trivially true change.Has been merged now.The summary of changes:
TryUseSubstringbecause we can just early return the calculated results as we build themacceptableNonUniqueCountout to the top level since it never changes (which means we pass that into theHasSufficientUniquenessFactormethod for it to "use up" internally (passed by value, so unchanged at call-site)delegate ReadOnlySpan<char> GetSpanand use, which helps reduce dynamic dispatch overhead in theCreateAnalysisResultsmethodIsLeftfield of theSubstringComparersince we can tell by the Index being negative that we're doing right-justified slicing (and documented that on the class)IndexandCounton the comparer for right-justified substrings.[MethodImpl(MethodImplOptions.AggressiveInlining)]to theEqualsandGetHashCodeoverrides.