Fix handling exceptions from native code when interpreted #121212

janvorli · 2025-10-30T18:51:11Z

This fixes two issues. First, in some cases SfiNextWorker incorrectly advanced the stack frame iterator when the interpreter was called from CallDescrWorkerInternal.
Second, the call to MethodDesc::GetMethodDescOfVirtualizedCode made by InterpExecMethod can end up running managed code and throwing a managed exception. The invocation therefore needs PAL_TRY/PAL_EXCEPT around it, since the main loop of the interpreter can only catch and process C++ exceptions.
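As a rough illustration of the second point, the sketch below shows the general idea of translating one failure kind into another at a call boundary, so that a dispatch loop only ever sees the kind it can handle. It is not the runtime code: the real change uses the PAL_TRY/PAL_EXCEPT macros, which are not reproduced here, and plain C++ `try`/`catch` stands in for them. `ForeignFault`, `InterpreterException`, `ResolveTarget`, `CallResolveTarget` and `RunOne` are all hypothetical names invented for this example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a failure kind the dispatch loop cannot process
// directly (in the PR this role is played by a managed exception).
struct ForeignFault {
    int code;
};

// Hypothetical exception type that the dispatch loop does know how to handle.
struct InterpreterException : std::runtime_error {
    explicit InterpreterException(const std::string& message)
        : std::runtime_error(message) {}
};

// Hypothetical resolver that may fail with the foreign failure kind.
int ResolveTarget(int slot) {
    if (slot < 0) {
        throw ForeignFault{slot};
    }
    return slot * 2;
}

// Wrapper: every call site goes through this function, so the translation
// from the foreign failure kind happens in exactly one place.
int CallResolveTarget(int slot) {
    try {
        return ResolveTarget(slot);
    } catch (const ForeignFault& fault) {
        throw InterpreterException("resolver failed with code " +
                                   std::to_string(fault.code));
    }
}

// Simplified dispatch step: it only catches InterpreterException.
bool RunOne(int slot, int& result) {
    try {
        result = CallResolveTarget(slot);
        return true;
    } catch (const InterpreterException&) {
        return false;
    }
}
```

Without the wrapper, a `ForeignFault` thrown by `ResolveTarget` would pass straight through `RunOne`, which is the analogue of the problem described above.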

This fixes the Loader\classloader\regressions\vsw529206\vsw529206ModuleCctor test.

dotnet-policy-service · 2025-10-30T18:51:56Z

Tagging subscribers to this area: @BrzVlad, @janvorli, @kg
See info in area-owners.md if you want to be subscribed.

Copilot

Pull Request Overview

This PR improves exception handling in the CoreCLR interpreter by:

- Wrapping calls to GetMethodDescOfVirtualizedCode in a new helper function that properly handles both C++ and managed exceptions
- Refining the stack frame iteration logic to correctly distinguish between managed and native callers during exception unwinding

Key changes:

- Introduces CallGetMethodDescOfVirtualizedCode wrapper function for proper exception handling
- Updates virtual call and delegate invocation sites to use the new wrapper
- Improves exception handling logic to check if return addresses point to managed code before advancing the stack frame iterator
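The last point can be pictured with a small sketch, given here only as an assumption-laden illustration. The thread does not show the signature of `ExecutionManager::IsManagedCode` or the actual iterator code, so `CodeRange`, `IsManagedAddress`, `Frame`, `Step` and `NextFrame` below are hypothetical stand-ins: a frame walker advances to the caller only when the return address falls inside a known managed code range.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical half-open address range [start, end) holding managed code.
struct CodeRange {
    std::uintptr_t start;
    std::uintptr_t end;
};

// Toy stand-in for a "is this address managed code?" lookup.
bool IsManagedAddress(const std::vector<CodeRange>& managedRanges,
                      std::uintptr_t address) {
    for (const CodeRange& range : managedRanges) {
        if (address >= range.start && address < range.end) {
            return true;
        }
    }
    return false;
}

// Hypothetical frame record: only the return address matters here.
struct Frame {
    std::uintptr_t returnAddress;
};

enum class Step { AdvancedToManagedCaller, StoppedAtNativeBoundary };

// Advance the iterator only when the caller is managed; otherwise leave the
// index untouched and report that a native caller was reached.
Step NextFrame(const std::vector<CodeRange>& managedRanges,
               const std::vector<Frame>& frames,
               std::size_t& index) {
    if (IsManagedAddress(managedRanges, frames[index].returnAddress)) {
        ++index;
        return Step::AdvancedToManagedCaller;
    }
    return Step::StoppedAtNativeBoundary;
}
```

The design point is that the decision is driven by where the return address points, not by which entry helper was used to reach the interpreter.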

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/coreclr/vm/interpexec.cpp | Adds `CallGetMethodDescOfVirtualizedCode` wrapper function and updates two call sites to use it for proper exception handling |
| src/coreclr/vm/exceptionhandling.cpp | Refines stack frame unwinding logic to properly distinguish managed from native callers using `ExecutionManager::IsManagedCode` |

davidwrighton · 2025-11-01T00:47:46Z

@janvorli it looks like the stackoverflowtester failure may be related. I saw this fail on the previous iteration of this PR, and my PRs have not been failing with that issue.

janvorli · 2025-11-01T00:51:03Z

> @janvorli it looks like the stackoverflowtester failure may be related. I saw this fail on the previous iteration of this PR, and my PRs have not been failing with that issue.

This change doesn't touch any code path that is executed without the interpreter. This stack overflow test issue has been occurring on a regular basis in the CI for more than a year, but I was never able to repro it locally on any of my devices.

am11 · 2025-11-01T13:01:09Z

> This change doesn't touch any code path that is executed without the interpreter. This stack overflow test issue has been occurring on a regular basis in the CI for more than a year, but I was never able to repro it locally on any of my devices.

@janvorli, is it #110173? Just curious, if async handling is the culprit, should we queue and process SIGSEGVs in FIFO order? The end result should still feel organic in both scenarios; either one of the SIGSEGVs will eventually abort the process (the most likely outcome), or it will be handled gracefully and execution will continue.

janvorli · 2025-11-03T12:13:19Z

> @janvorli, is it #110173? Just curious, if async handling is the culprit, should we queue and process SIGSEGVs in FIFO order? The end result should still feel organic in both scenarios; either one of the SIGSEGVs will eventually abort the process (the most likely outcome), or it will be handled gracefully and execution will continue.

Yes, it is that one. We actually handle only the first SIGSEGV and let it dump the stack and then abort the process. Handlers for all subsequent SIGSEGVs on other threads are just left spinning in a loop.
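A minimal sketch of that "first fault wins, the rest spin" policy, assuming a lock-free `std::atomic<bool>` is available on the target. This is not the runtime's handler; `g_faultHandlingStarted`, `TryClaimFaultHandling`, `ParkThread` and `OnFault` are hypothetical names used only to illustrate the idea.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical process-wide flag: set once the first fault is being handled.
std::atomic<bool> g_faultHandlingStarted{false};

// Returns true only for the first caller; every later caller gets false.
bool TryClaimFaultHandling() {
    return !g_faultHandlingStarted.exchange(true, std::memory_order_acq_rel);
}

// Later faulting threads are parked here. The flag is never cleared in this
// sketch, so the loop does not exit; the process is expected to abort first.
void ParkThread() {
    while (g_faultHandlingStarted.load(std::memory_order_acquire)) {
    }
}

// Hypothetical entry point called from a fault handler. dumpAndAbort is
// supplied by the caller and is expected to report the stack and abort.
void OnFault(void (*dumpAndAbort)()) {
    if (TryClaimFaultHandling()) {
        dumpAndAbort();
    } else {
        ParkThread();
    }
}
```

Note that this only arbitrates between threads. It does nothing about a second fault on the thread that already claimed handling, which is the case the logging described in this thread points at.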

What happens is that we actually get a SIGSEGV on the same thread again (added logging shows that). I can see that the secondary thread we start to dump the stack trace dumped all the frames. So the main thread gets a SIGSEGV while waiting for this thread to complete or right after the wait completes. I was thinking that the activation async signal might occur during that wait and require more space than is available for the stack overflow handling. But the issue persisted even after that signal was left disabled for the stack overflow handling.

This issue never reproduces on any of my local devices. I've left it running in a loop for a week with no repro. So it is something that only happens in the CI for some reason.

janvorli · 2025-11-03T12:51:46Z

/ba-g the arm64 legs are down

janvorli added this to the 11.0.0 milestone Oct 30, 2025

janvorli requested a review from davidwrighton October 30, 2025 18:51

janvorli self-assigned this Oct 30, 2025

Copilot AI review requested due to automatic review settings October 30, 2025 18:51

janvorli requested review from BrzVlad and kg as code owners October 30, 2025 18:51

janvorli added the area-CodeGen-Interpreter-coreclr label Oct 30, 2025

Copilot AI reviewed Oct 30, 2025

This was referenced Oct 31, 2025

The Operation will be canceled. The next steps may not contain expected logs. dotnet/dnceng#3008

Open

Test failure: baseservices/exceptions/stackoverflow/stackoverflowtester/stackoverflowtester.cmd #110173

Open

davidwrighton approved these changes Oct 31, 2025

Update comment

8ccafaf

janvorli merged commit 60d14f8 into dotnet:main Nov 3, 2025
92 of 98 checks passed

janvorli deleted the fix-eh-flowing-to-native-in-interpreter branch November 3, 2025 12:52

dotnet-maestro bot mentioned this pull request Nov 6, 2025

[main] Source code updates from dotnet/runtime dotnet/dotnet#3258

Merged
