Skip to content

Conversation

@VSadov
Copy link
Member

@VSadov VSadov commented Sep 22, 2024

There should not be any functional changes. The most affected part is GC stress as there is a lot of dead code there now.
Some code moved around to tease apart reachable and unreachable scenarios.

  • remove code specific to no longer used GC info versions
  • separate major parts that deal with X86 specific scenarios from everything else
  • once we have X86 and "everything else", we can make "everything else" much simpler.
    I.E. the figuring the targets of calls and what they may return is no longer needed on non-X86 - neither at instrumentation time nor at fault time.
  • once separated, X86 part can be somewhat simpler too. (no multireg returns, etc)
  • look for more things, address TODOs

@VSadov
Copy link
Member Author

VSadov commented Sep 23, 2024

/azp run runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@VSadov
Copy link
Member Author

VSadov commented Sep 23, 2024

/azp run runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@VSadov
Copy link
Member Author

VSadov commented Sep 24, 2024

/azp run runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@VSadov VSadov marked this pull request as ready for review September 24, 2024 01:19
@VSadov
Copy link
Member Author

VSadov commented Sep 24, 2024

I think this is ready for review.
Cc: @jkotas @janvorli @jakobbotsch

{
_ASSERTE(!nativeCodeVersion.IsNull());

// CONSIDER: does anyone call this with fZapped == true ? are there plans?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zap was a code name for fragile NGen.

Does the GC stress instrumentation work for R2R code? If it does, this flag can be deleted. If it does not, what needs to happen to enable the GC stress instrumentation for R2R code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two theories - either R2R code comes on the JIT path (and JIT does not really JIT much, maybe some fixups), then we could remove fZapped path as a part this change.
Or we do not stress R2R. There is nothing that would make that impossible, I think R2R code is writeable in the same sense as JIT code is writeable. Enabling stress over R2R would be a separate change though.

I will check what happens with R2R.

@VSadov
Copy link
Member Author

VSadov commented Sep 24, 2024

/azp run runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@janvorli
Copy link
Member

@VSadov I wonder if it would be possible to reorder stuff in the gccover.cpp so that the diff github shows would make sense. Looking at it now, it is very difficult to see what was changed and what was just moved.

@VSadov
Copy link
Member Author

VSadov commented Sep 24, 2024

@VSadov I wonder if it would be possible to reorder stuff in the gccover.cpp so that the diff github shows would make sense. Looking at it now, it is very difficult to see what was changed and what was just moved.

Some of that is because a few functions make sense only for x86, and a few are very different between x86 and "the rest", so I brought x86-specific parts closer to each other under one #if defined(TARGET_X86). I think it is easier to work with this code by concentrating on X86 or "everything else" parts separately. Moving code did contribute to extra diffs.
I'll see if I could rearrange the X86 part a bit so diffs are smaller.

@VSadov
Copy link
Member Author

VSadov commented Sep 24, 2024

/azp run runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

// 5) Now, thread T can modify the stack (ex: RedirectionFrame setup) while the GC thread is scanning it.
//
// This race is now mitigated below. Where we won't initiate a stress mode GC
// for a thread in cooperative mode with an active ICF, if g_TrapReturningThreads is true.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original explanation for the g_TrapReturningThreads + active ICF scenario causing a race was somewhat questionable, so I initially removed the mitigation, but the race appears to be real.

Corrected the explanation of how the race may happen and brought the mitigation back.

@VSadov
Copy link
Member Author

VSadov commented Sep 25, 2024

Does the GC stress instrumentation work for R2R code? If it does, this flag can be deleted. If it does not, what needs to happen to enable the GC stress instrumentation for R2R code?

It does not look like we instrument R2R code for GC stress. The instrumentation (SetupGcCoverage) is called from JitCompileCodeLocked and that happens only on JIT-compiled code path.

I will log a follow-up issue. We should be able to GC-stress R2R code. (#108117)

@VSadov
Copy link
Member Author

VSadov commented Sep 25, 2024

I'll see if I could rearrange the X86 part a bit so diffs are smaller.

@janvorli - We had a lot of platform specific code and now a lot of that is gone, while X86 specific code is still in. Moving code roughly to where it was resulted in many #if defined(TARGET_X86), with no corresponding "everything else" code.

I tried to move code around to reduce diffs, but it made things mostly worse. The diffs were still big enough that VS differ could not make a lot of sense of it (I assume github diff will not be any better). So basically the rearrangements were making the file harder to read without much improvement to the diff.

The code that I moved generally stayed the same, modulo removing dead code. I was particularly careful that X86 code path did not change. In couple places I made separate X86 and non-X86 copies (DoGcStress, GCCoverageInfo::SprinkleBreakpoints) - just to make sure that X86 stays the same.

I think the easiest way to review this file is basically by reading the new code, and when in doubt finding old version and check with that.

I think the most "interesting" part is that AMD64 code path is now merged with ARM,RISC-V,etc.. and not a part of X86 path.
That is where I had to add some code - like to figure lengths of x64 instructions we need to ask disassembler. However we no longer go over all the instructions for the function/funclets, we just iterate over safe points or over interruptible ranges - the same as we did on RISC architectures. Thus we no longer need a funclet iterator and do not need to worry about disassembler being confused by codegen for switches.

The part that the stress runs are green and take roughly the same time as before give me a good deal of confidence that the code is functionally the same as before.
(and I did verify in debugger that we actually stress things :-)

@VSadov
Copy link
Member Author

VSadov commented Oct 15, 2024

Rebased onto recent main.

@VSadov
Copy link
Member Author

VSadov commented Oct 15, 2024

/azp run runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@VSadov
Copy link
Member Author

VSadov commented Oct 28, 2024

/azp run runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@VSadov
Copy link
Member Author

VSadov commented Oct 30, 2024

/azp run runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@VSadov
Copy link
Member Author

VSadov commented Nov 1, 2024

/azp run runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

Copy link
Member

@janvorli janvorli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@VSadov
Copy link
Member Author

VSadov commented Nov 5, 2024

Thanks!!

@VSadov VSadov merged commit 552782f into dotnet:main Nov 5, 2024
122 checks passed
@VSadov VSadov deleted the gcVerCleanup branch November 5, 2024 19:06
@github-actions github-actions bot locked and limited conversation to collaborators Dec 6, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants