@cshung
Contributor

@cshung cshung commented May 26, 2023

Together with #87533, this PR fixes #76929.

Causes

The root cause of the bug is that small POH objects may share the same mark word. Before these changes, that mark word could be accessed concurrently by the background mark phase and by the various allocating threads.
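To illustrate how two small objects can end up in the same mark word, here is a minimal sketch. The constants (`kMarkBitPitch`, `kMarkWordWidth`) and the helper names are assumptions chosen for illustration, not values or functions taken from this PR.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical constants for illustration only; the real GC values may differ.
constexpr std::size_t kMarkBitPitch  = 16;  // bytes of heap covered by one mark bit
constexpr std::size_t kMarkWordWidth = 32;  // mark bits stored in one mark word
constexpr std::size_t kBytesPerMarkWord = kMarkBitPitch * kMarkWordWidth;

// Index of the mark word that covers the given address.
constexpr std::size_t mark_word_of(std::uintptr_t addr) {
    return addr / kBytesPerMarkWord;
}

// Position of the address's mark bit inside its mark word.
constexpr std::size_t mark_bit_of(std::uintptr_t addr) {
    return (addr / kMarkBitPitch) % kMarkWordWidth;
}

// True when both addresses are tracked by the same mark word, so marking
// either object performs a read-modify-write on the same memory location.
constexpr bool share_mark_word(std::uintptr_t a, std::uintptr_t b) {
    return mark_word_of(a) == mark_word_of(b);
}
```

Under these assumed constants, two objects allocated 64 bytes apart use different bits of the same word, which is why an unsynchronized update from two threads can lose one of the bits.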

Fixes

@Maoni0's fix avoids marking the object while the background mark phase is in progress, which leaves only the concurrent marking among allocating threads.

This fix moves the marking of the objects inside more_space_lock. This guarantees that no two allocating threads mark the same word at the same time, and therefore eliminates all concurrent accesses.
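The idea of marking under the lock can be sketched as follows. This is a simplified model, not the code in this PR: `PohAllocatorSketch`, `set_mark_bit_locked`, and `mark_from_threads` are hypothetical names, and a `std::mutex` stands in for more_space_lock.

```cpp
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical model: one mark word shared by several small objects,
// guarded by a lock that plays the role of more_space_lock.
struct PohAllocatorSketch {
    std::mutex more_space_lock;
    std::uint32_t mark_word = 0;
};

// Sets one bit of the shared word. The read-modify-write happens while the
// lock is held, so concurrent callers cannot overwrite each other's bits.
void set_mark_bit_locked(PohAllocatorSketch& alloc, unsigned bit) {
    std::lock_guard<std::mutex> guard(alloc.more_space_lock);
    alloc.mark_word |= (std::uint32_t{1} << bit);
}

// Simulates thread_count allocating threads (at most 32), each marking a
// different object that maps to the same mark word.
std::uint32_t mark_from_threads(unsigned thread_count) {
    PohAllocatorSketch alloc;
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < thread_count; ++i) {
        threads.emplace_back([&alloc, i] { set_mark_bit_locked(alloc, i); });
    }
    for (auto& t : threads) {
        t.join();
    }
    return alloc.mark_word;
}
```

Without the lock, the plain `|=` would be an unsynchronized read-modify-write, and one thread's store could discard a bit set by another thread.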

Testing

The customer repro was run in workstation mode for 8,000 iterations without crashing.
ReliabilityFramework was run for 13 hours for both workstation/server and Debug/Release. The Release server run crashed with OutOfMemory. The OOM is unrelated to the fix; I just don't have enough memory.

@ghost ghost added the area-GC-coreclr label May 26, 2023
@cshung cshung marked this pull request as draft May 26, 2023 16:30
@ghost ghost assigned cshung May 26, 2023
@ghost

ghost commented May 26, 2023

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Issue Details

null

Author: cshung
Assignees: -
Labels:

area-GC-coreclr

Milestone: -

@Maoni0
Member

Maoni0 commented Jun 1, 2023

It appears you are not setting mark array bits at all for Server GC?

@ghost ghost closed this Jul 5, 2023
@ghost

ghost commented Jul 5, 2023

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

@cshung cshung force-pushed the public/poh-allocation-fix branch from baa025a to 2ae3ce0 Compare August 1, 2023 21:37
@cshung cshung marked this pull request as ready for review August 3, 2023 18:42
@cshung cshung merged commit bb0be13 into dotnet:main Aug 4, 2023
@cshung cshung deleted the public/poh-allocation-fix branch August 4, 2023 22:54
@ghost ghost locked as resolved and limited conversation to collaborators Sep 4, 2023
Development

Successfully merging this pull request may close these issues.

Data Corruption With Ref Locals, Punning, and Pinned Object Heap

2 participants