-
Notifications
You must be signed in to change notification settings - Fork 5.2k
[NativeAOT] do not do shutdown for the main thread on Windows #74679
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| // initialization and false on failure. | ||
| REDHAWK_PALEXPORT bool REDHAWK_PALAPI PalInit() | ||
| { | ||
| g_mainThreadId = GetCurrentThreadId(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initialization can run on any thread when NativeAOT is used to produce libraries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is inconvenient.
There is no easy way to identify the main thread.
Online samples suggest iterating through all threads and look for the earliest start time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A part of the problem is that the thread object in NativeAOT lives directly in a TSL slot and thus is owned by the thread, so we must unlink the threads from the list before the thread dies and that requires taking a lock.
|
The alternative solution that I tried was to switch Windows to the same plan as on Unix with That did not work. The reason is that on Windows TLS destructors run while holding system loader lock and we want to call managed cleanup routines, which makes it prone to deadlock.
|
| // The current fiber is the home fiber of a thread, so the thread is shutting down | ||
| RuntimeThreadShutdown(lpFlsData); | ||
| // Do not do shutdown for the main thread though. | ||
| // other threads are terminated before the main one and could leave TLS locked, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not how shutdown works. The shutdown first terminates all threads except the thread that called ExitProcess. ExitProcess can be called by any thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a lot of vague information on the details of process/thread shutdown. It looks like in a completely broad sense it is hard to handle as there is always some crazy scenario such as someone starting new threads in a DLLMain detach handler.
I think we should strive for handling "common" cases.
However using NativeAOT in a library and calling ExitProcess from non-main thread would seem reasonable.
What would it take to not call any managed code during thread exit? Calling managed code during process exit sounds like an endless source of problems. |
I agree on this. It seems to be more useful on Unix than on Windows. On Unix we need to tell the managed wait subsystem to release all locks associated with the thread. We do not have a problem on Unix though. |
|
There is some stuff on Windows too. Less than on Unix, but we still decrement a number of foreground threads, possibly release an event used in |
|
Or just do that via a bit of unmanaged code. |
|
It is actually trickier than it seems. We need to interact with managed code eventually (more on Unix, less on Windows) to inform that the thread has stopped, and we need to do it fairly promptly upon the thread's exit, thus finalizer might not work well due to unknown latency. I see how the current scheme was attractive - as long as the calling code does not hold system locks, calling a managed helper from the native callback is a natural solution, not much different from other cases when managed code is called from native. |
and not just that. If all threads were killed during shutdown, it would probably be ok, but the last one is not killed, so if it blocks, the process does not exit. |
|
Another possibility is to keep a count of attached threads and if we are in shutdown mode, do not perform cleanup for the last one. |
|
I think we can specialcase the thread that called For v.Next we can think of a better solution that would handle cases like NativeAOT libraries, ExitProcess called from native code or multiple times, and preferably find a way to not call managed code at process shutdown. Although, if fiber detach on Windows and TLS dtor on Unix are guaranteed to not hold locks, they may still be a part of the solution. |
jkotas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a complete fix for the problem, but it is definitely improvement. We should look at the full fix in .NET 8 as you have said. Thank you!
|
Thanks! |
|
@VSadov I think that this is candidate for .NET 7 backport. Do you agree? |
yes, I will initiate a backport. |
|
/backport to release/7.0 |
|
Started backporting to release/7.0: https://github.com/dotnet/runtime/actions/runs/2939886329 |
Fixes:#74678
There is not a lot of value in gracefully removing the main thread from the thread list. By the time it gets to detaching of the main thread it is typically the only one still running. Besides some thread may be terminated while holding the thread list spinlock, so attempt to do shutdown for the main thread will likely deadlock.