-
Notifications
You must be signed in to change notification settings - Fork 5.2k
[SGen/Tarjan] Handle edge case with node heaviness changing due to deduplication #113044
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The fix is incomplete, we would likely report incorrect SCCs now instead of hitting the assert, and previous XREFs would likely be wrong too. I'm keeping this as draft until I come up with a fix. |
The fix needs to be done differently. We basically have three options:
|
This was referenced Mar 2, 2025
This was referenced Mar 2, 2025
…duplication Do early deduplication Fix Windows build Add test cases to sgen-bridge-pathologies
Rebased and updated for the new test infrastructure. |
BrzVlad
approved these changes
Apr 2, 2025
filipnavara
added a commit
to filipnavara/runtime
that referenced
this pull request
Apr 8, 2025
…duplication (dotnet#113044) * [SGen/Tarjan] Handle edge case with node heaviness changing due to deduplication Do early deduplication Fix Windows build Add test cases to sgen-bridge-pathologies * Move test code * Remove old code * Add extra check (no change to functionality)
4 tasks
akoeplinger
added a commit
that referenced
this pull request
Apr 11, 2025
* Fix dump_processor_state debug code to compile and work on Android (#112970) Fix typo in GC bridge comparison message (SCCS -> XREFS) * [mono] Add a few bridge tests (#113703) * [mono][sgen] Fix DUMP_GRAPH debug option build for tarjan bridge * [mono][sgen] Don't create ScanData* during debug dumping of SCCs It serves no purpose and it would later crash the runtime since we didn't patch the lockword back in place. * [mono][sgen] Fix some null deref crashes in DUMP_GRAPH debug option * [mono][tests] Add bridge tests These are ported from some of the bridge tests we had on mono/mono. In order to test them we compare between the output of the new and the tarjan bridge. * Fix an edge case in the Tarjan GC bridge that leads to losing xref information (#112825) * Fix an edge case in the Tarjan SCC that lead to losing xref information In the Tarjan SCC bridge processing there's a color graph used to find out connections between SCCs. There was a rare case which only manifested when a cycle in the object graph points to another cycle that points to a bridge object. We only recognized direct bridge pointers but not pointers to other non-bridge SCCs that in turn point to bridges and where we already calculated the xrefs. These xrefs were then lost. * Add test case to sgen-bridge-pathologies and add an assert to catch the original bug * Add review --------- Co-authored-by: Vlad Brezae <[email protected]> * [SGen/Tarjan] Handle edge case with node heaviness changing due to deduplication (#113044) * [SGen/Tarjan] Handle edge case with node heaviness changing due to deduplication Do early deduplication Fix Windows build Add test cases to sgen-bridge-pathologies * Move test code * Remove old code * Add extra check (no change to functionality) * Disable test on wasm --------- Co-authored-by: Vlad Brezae <[email protected]> Co-authored-by: Alexander Köplinger <[email protected]>
filipnavara
added a commit
to filipnavara/runtime
that referenced
this pull request
Apr 15, 2025
…duplication (dotnet#113044) * [SGen/Tarjan] Handle edge case with node heaviness changing due to deduplication Do early deduplication Fix Windows build Add test cases to sgen-bridge-pathologies * Move test code * Remove old code * Add extra check (no change to functionality)
4 tasks
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Labels
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This code path can introduce duplicates into the
color->other_colors
array:runtime/src/mono/mono/metadata/sgen-tarjan-bridge.c
Lines 795 to 798 in b658cd2
Then we get to the code that counts
xref_count
:runtime/src/mono/mono/metadata/sgen-tarjan-bridge.c
Lines 1101 to 1114 in b658cd2
It uses the
color_visible_to_client
helper method which in turn callsbridgeless_color_is_heavy
:runtime/src/mono/mono/metadata/sgen-tarjan-bridge.c
Lines 131 to 139 in b658cd2
The
fanout
metric is computed from the count ofother_colors
which contain duplicates. Once the references are gathered withgather_xrefs
they get deduplicated. This can change the number of fanout nodes just enough that it no longer satisfies the "heavy" property. We still increase thexref_count
but in the second loop that iterates over the same data it will get skipped and result in a mismatch:runtime/src/mono/mono/metadata/sgen-tarjan-bridge.c
Lines 1126 to 1130 in b658cd2
Repro: https://github.com/user-attachments/files/19044193/Archive.zip
Fixes #106410