Skip to content

Conversation

@davmason
Copy link
Contributor

@davmason davmason commented Jan 8, 2021

It's always been possible for managed code to continue running on background threads after EE shutdown, but the profiler tests have never needed to deal with it before because all the profilees were single threaded. That means that when the main method returned no more managed code would run.

Now that the managed thread pool is in place there can be background threads running managed code during shutdown. If something happens that triggers a call in to a profiler (ELT hook, JIT notification, etc) after shutdown, the test profiler will AV and fail the test due to trying to use a null ICorProfilerInfo*.

This adds synchronization so that the profile will block at shutdown if any runtime calls in the to profiler in progress, and then block future calls from doing anything.

@davmason davmason added this to the 6.0.0 milestone Jan 8, 2021
@davmason davmason requested a review from a team January 8, 2021 09:41
@davmason davmason self-assigned this Jan 8, 2021
@ghost
Copy link

ghost commented Jan 8, 2021

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

Issue Details

It's always been possible for managed code to continue running on background threads after EE shutdown, but the profiler tests have never needed to deal with it before because all the profilees were single threaded. That means that when the main method returned no more managed code would run.

Now that the managed thread pool is in place there can be background threads running managed code during shutdown. If something happens that triggers a call in to a profiler (ELT hook, JIT notification, etc) after shutdown, the test profiler will AV and fail the test due to trying to use a null ICorProfilerInfo*.

This adds synchronization so that the profile will block at shutdown if any runtime calls in the to profiler in progress, and then block future calls from doing anything.

Author: davmason
Assignees: davmason
Labels:

area-Diagnostics-coreclr

Milestone: 6.0.0

Copy link
Contributor

@sdmaclea sdmaclea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Briefly reviewed the code restructuring. Nice reuse of your previous solution to generalize the solution.

Curious why I got an email with three Fixes in it, but I don't see the comment.

I am hopeful that this will have a positive impact on the Apple Silicon GCStress tests which are failing.

profiler.elt.slowpatheltenter.slowpatheltenter.sh
profiler.elt.slowpatheltleave.slowpatheltleave.sh
profiler.transitions.transitions.transitions.sh
profiler.unittest.releaseondetach.releaseondetach.sh

@davmason
Copy link
Contributor Author

davmason commented Jan 8, 2021

This should solve the Apple Silicon failures, unless there is some other bug there too. I suspect the GCStress made them more vulnerable to the timing issues here

@davmason davmason merged commit cb7c824 into dotnet:master Jan 8, 2021
@davmason davmason deleted the profiler_shutdown branch January 20, 2021 08:58
@ghost ghost locked as resolved and limited conversation to collaborators Feb 19, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

3 participants