Skip to content

Conversation

@AndyAyersMS
Copy link
Member

OSR and PGO both rely on the fact that the early flowgraph the JIT builds is the same flowgraph as the one seen in a previous JIT complation of that method, since there is metadata (patchpoint offset, pgo schema) that refers to the flowgraph. Previous work here (#85860) didn't ensure this for enough cases.

In particular if Tier0 does not do early block merging, but OSR does, it's possible for the OSR entry point to fall within a merged block range, which the JIT is not prepared to handle. This lead to the asserts seen in #86125.

The fix is to enable early block merging unless we're truly in a minopts or debug codegen mode (previous to this Tier0 would not merge unless it also was instrumenting; now it will always merge).

An alternative fix would be to find the block containing the OSR entry IL offset, scan its statements, and split the block at the right spot, but that seemed more involved.

Fixes #86125.

OSR and PGO both rely on the fact that the early flowgraph the JIT builds is
the same flowgraph as the one seen in a previous JIT complation of that method,
since there is metadata (patchpoint offset, pgo schema) that refers to the
flowgraph. Previous work here (dotnet#85860) didn't ensure this for enough cases.

In particular if Tier0 does not do early block merging, but OSR does, it's possible
for the OSR entry point to fall within a merged block range, which the JIT is not
prepared to handle. This lead to the asserts seen in dotnet#86125.

The fix is to enable early block merging unless we're truly in a minopts or
debug codegen mode (previous to this Tier0 would not merge unless it also
was instrumenting; now it will always merge).

An alternative fix would be to find the block containing the OSR entry IL
offset, scan its statements, and split the block at the right spot, but that
seemed more involved.

Fixes dotnet#86125.
@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 12, 2023
@ghost ghost assigned AndyAyersMS May 12, 2023
@ghost
Copy link

ghost commented May 12, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

OSR and PGO both rely on the fact that the early flowgraph the JIT builds is the same flowgraph as the one seen in a previous JIT complation of that method, since there is metadata (patchpoint offset, pgo schema) that refers to the flowgraph. Previous work here (#85860) didn't ensure this for enough cases.

In particular if Tier0 does not do early block merging, but OSR does, it's possible for the OSR entry point to fall within a merged block range, which the JIT is not prepared to handle. This lead to the asserts seen in #86125.

The fix is to enable early block merging unless we're truly in a minopts or debug codegen mode (previous to this Tier0 would not merge unless it also was instrumenting; now it will always merge).

An alternative fix would be to find the block containing the OSR entry IL offset, scan its statements, and split the block at the right spot, but that seemed more involved.

Fixes #86125.

Author: AndyAyersMS
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@AndyAyersMS
Copy link
Member Author

@EgorBo PTAL
cc @dotnet/jit-contrib

Some diffs in Tier0 codegen.

@AndyAyersMS AndyAyersMS requested a review from EgorBo May 12, 2023 15:53
@AndyAyersMS
Copy link
Member Author

/azp run runtime-jit-experimental

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@AndyAyersMS
Copy link
Member Author

Will fix formatting after the experimental leg completes.

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented May 12, 2023

Experimental had what looks like two unrelated failures in jit-stress-splitting. Will ignore for now as the OSR failures are fixed.
(I'll open an issue if this pops up again in weekend runs)

Process terminated. Parallel crash in main thread
at System.Environment.FailFast(System.String)
at ParallelCrash.Main(System.String[])
[createdump] The app model does not support crash report generation
[createdump] Failure took 0ms
waitpid() returned successfully (wstatus 0000ff00) WEXITSTATUS ff WTERMSIG 0


cmdLine:/private/tmp/helix/working/A9500944/w/A32E08EB/e/baseservices/exceptions/simple/ParallelCrashMainThread/ParallelCrashMainThread.sh Timed Out (timeout in milliseconds: 600000 from variable __TestTimeout, start: 5/12/2023 2:05:16 PM, end: 5/12/2023 2:15:16 PM)
``

@AndyAyersMS
Copy link
Member Author

Had to revamp this a bit, turns out BBOPT is set even for debuggable and EnC cases, and so the first change I made here (#85860) broke some debugger tests.

This will cause some large negative diffs in minotps for clr_tests that undoes the large positive diff from #85860.

@AndyAyersMS
Copy link
Member Author

Replay failures are related to collections issues.

@AndyAyersMS AndyAyersMS merged commit aa38055 into dotnet:main May 13, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Jun 12, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

OSR tests failing in jit-experimental pipeline

2 participants