@kg
Member

kg commented Aug 7, 2024

Should fix #80393

I've never really touched CoreCLR so I'm not certain this is right.

@azure-pipelines

No pipelines are associated with this pull request.

@kg
Member Author

kg commented Aug 12, 2024

/azp list

@azure-pipelines

CI/CD Pipelines for this repository:

@kg
Member Author

kg commented Aug 12, 2024

/azp run runtime-coreclr crossgen2 outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@davidwrighton
Member

@steveisok Who is working on fixing the crossgen2 outerloop? It appears that determinism in the crossgen2 compiler is broken for x64 to x86 builds, and there is a fundamental problem with composite R2R causing lock order violations? I don't see bugs filed here.

@kg I'd like to see a cleaner crossgen2 outerloop pass than what you're seeing here. This lock order problem is NOT related to what you are doing, but if you could modify CrstBase::IsSafeToTake to always just return TRUE and rerun the crossgen2 outerloop, I'd be obliged.

@EgorBo
Member

EgorBo commented Aug 23, 2024

cc @dotnet/jit-contrib if anyone can assist here

@kg
Member Author

kg commented Aug 23, 2024

I spent a while figuring out how to run the ARM32 JIT tests and was able to confirm that they don't pass, but the steps needed to iterate on it are pretty time-consuming (>1 hr per iteration), and I don't really know enough about ARM or the JIT to make quick progress here. So it would be awesome if someone could help by digging into the ARM issue, even if they don't have time to fix it.

I mentioned this in an email but essentially, it looks like on ARM32 specifically, the HFA argument is passed incorrectly. The HFA return value works correctly, and on ARM64 both HFA arguments and return values appear to work.

@AndyAyersMS
Member

Also re being productive on arm32, I have had pretty decent luck cross-building from an x64 linux host (via docker) and then mounting the repo + artifacts via SSHFS on an arm64 host (RPI4 in my case) and then running arm32 under docker there ... then if you need to rebuild anything then there's no need to copy files around. And I remote to the boxes via VS code so can direct it all from the same machine.

The only real pain is that debugging requires a specially built lldb that @janvorli created and getting the right flavor of the lldb SOS plugin for that lldb takes a bit of care.

@kg
Member Author

kg commented Sep 11, 2024

Added some instrumentation into morph.cpp locally to examine the arg ABI info for the call into native code:

            printf(
                "%s.%s arg[%u] numRegs=%u byteOffset=%u byteSize=%u argType=%s passedByRef=%u hfaElemKind=%u IsSplitAcrossRegistersAndStack=%u\n",
                comp->info.compClassName, comp->info.compMethodName,
                GetIndex(&arg),
                arg.AbiInfo.NumRegs,
                arg.AbiInfo.ByteOffset,
                arg.AbiInfo.ByteSize,
                varTypeName(arg.AbiInfo.ArgType),
                (unsigned)arg.AbiInfo.PassedByRef,
                (unsigned)arg.AbiInfo.GetHfaElemKind(),
                abiInfo.IsSplitAcrossRegistersAndStack()
            );

StructABI.Issue80393Wrapper arg[0] numRegs=1 byteOffset=0 byteSize=4 argType=int passedByRef=0 hfaElemKind=0 IsSplitAcrossRegistersAndStack=0
StructABI.Issue80393Wrapper arg[1] numRegs=2 byteOffset=0 byteSize=16 argType=struct passedByRef=0 hfaElemKind=0 IsSplitAcrossRegistersAndStack=1

Based on what arm32 clang on godbolt generates for the target function, numRegs looks suspect. The clang code appears to expect the struct to be passed in r1, r2, r3 with the last 4 bytes stored at sp. But if numRegs=2 is accurate, that would imply we're passing the first double in r1, r2 and then probably spilling the whole second double to the stack.

@kg
Member Author

kg commented Sep 11, 2024

Dumped the segments for the argument and I'm not sure how to interpret them:

StructABI.Issue80393Wrapper arg[0] numRegs=1 byteOffset=0 byteSize=4 argType=int passedByRef=0 hfaElemKind=0 IsSplitAcrossRegistersAndStack=0
  segment[0] regType=int stackOffset=0 register=0
StructABI.Issue80393Wrapper arg[1] numRegs=2 byteOffset=0 byteSize=16 argType=struct passedByRef=0 hfaElemKind=0 IsSplitAcrossRegistersAndStack=1
  segment[0] regType=int stackOffset=0 register=2
  segment[1] regType=int stackOffset=0 register=3
  segment[2] regType=stack stackOffset=0 register=0

The first argument being an int in register 0 makes sense if it's the return address in r0. But then it's weird that the 2nd argument has two int segments in r2/r3 and then a stack segment. Why isn't r1 being used? If I understand this system right, I would expect to see four segments here: the first three for r1/r2/r3 and then a stack segment. I'm not sure what happened to r1 here.

@kg
Member Author

kg commented Sep 11, 2024

I feel like it's probably here, but I'm not sure how to confirm it or what the fix would be:

if ((type == TYP_LONG) || (type == TYP_DOUBLE) ||
    ((type == TYP_STRUCT) &&
     (comp->info.compCompHnd->getClassAlignmentRequirement(structLayout->GetClassHandle()) == 8)))
{
    alignment = 8;
    m_nextIntReg = roundUp(m_nextIntReg, 2);

@kg
Member Author

kg commented Sep 11, 2024

This issue was a bug in the C side of the test. According to the MS version of the arm32 ABI:

If the argument requires 8-byte alignment, the NCRN is rounded up to the next even register number.
...
If the NCRN is less than r4 and the NSAA is equal to the SP, the argument is split between core registers and the stack. The first part of the argument is copied into the core registers, starting at the NCRN, up to and including r3. The rest of the argument is copied onto the stack, starting at the NSAA. The NCRN is set to r4 and the NSAA is incremented by the size of the argument minus the amount passed in registers.
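The two quoted rules can be sketched as a small C model. This is only an illustration under stated assumptions: `Placement` and `place_arg` are hypothetical names (not JIT or ABI identifiers), the NSAA is assumed to equal SP, and only the core registers r0-r3 are modeled.

```c
/* Where one argument lands under the simplified rules quoted above. */
typedef struct {
    unsigned first_reg;   /* first core register used (4 means none) */
    unsigned reg_count;   /* number of core registers used */
    unsigned stack_bytes; /* bytes that spill to the stack */
} Placement;

/*
 * Hypothetical helper: place an argument of `size` bytes with alignment
 * `align`, given the next core register number `ncrn` (expected in 0..4).
 * Assumes nothing has been pushed to the stack yet (NSAA == SP).
 */
Placement place_arg(unsigned ncrn, unsigned size, unsigned align)
{
    Placement p;
    unsigned words;
    unsigned avail;

    if (align == 8)
        ncrn = (ncrn + 1u) & ~1u;       /* round NCRN up to an even register */

    words = (size + 3u) / 4u;           /* argument size in 4-byte words */
    avail = 4u - ncrn;                  /* registers remaining, up to r3 */

    p.first_reg = ncrn;
    p.reg_count = (words < avail) ? words : avail;
    p.stack_bytes = (words - p.reg_count) * 4u;  /* remainder goes to the stack */
    return p;
}
```

With one int already in r0, `place_arg(1, 16, 8)` gives r2 and r3 plus 8 stack bytes, which matches the segment dump above, while `place_arg(1, 16, 1)` gives r1 through r3 plus 4 stack bytes, which matches what clang generated.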

In order to approximate the StructLayout with explicit offsets, I had wrapped all the struct/union definitions in #pragma pack(1). This overrode the natural 8-byte alignment (due to the doubles) of all the structs, instead of just the overlay struct with the byte padding in it. As a result, clang decided not to round the NCRN up to the next even register number. CoreCLR was (correctly?) rounding it up because the natural alignment on the managed side was 8 bytes.
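A minimal sketch of the packing effect described above, assuming a C11 compiler that supports `#pragma pack` (clang, GCC, MSVC). The struct and function names are hypothetical and are not the ones used in StructABI.c.

```c
#include <stddef.h>

/* Two doubles with the default layout: alignment follows the double members. */
struct NaturalPair {
    double a;
    double b;
};

/* Same members, but forced to 1-byte alignment by the pack pragma. */
#pragma pack(push, 1)
struct PackedPair {
    double a;
    double b;
};
#pragma pack(pop)

/* Alignment the compiler assigns to each struct type. */
size_t natural_pair_alignment(void) { return _Alignof(struct NaturalPair); }
size_t packed_pair_alignment(void)  { return _Alignof(struct PackedPair); }

/* Packing changes the alignment here, not the size: there is no padding to remove. */
size_t natural_pair_size(void) { return sizeof(struct NaturalPair); }
size_t packed_pair_size(void)  { return sizeof(struct PackedPair); }
```

Both structs have the same size, so the difference only shows up in the calling convention: the packed one no longer asks for 8-byte alignment, and the compiler skips the even-register rounding.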

This does raise the question of whether the natural alignment of the C# struct should still be 8 bytes even though it has a field with a weird offset. I don't know how to answer the question, but it seems like what we do right now is consistent across targets and can be reproduced in C. The question is what the real-world scenario(s) for this sort of struct look like, I think.

@kg
Member Author

kg commented Sep 11, 2024

/azp run runtime-coreclr crossgen2 outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@kg
Member Author

kg commented Sep 12, 2024

Looks like the windows x86 crossgen2 comparison is failing:

Comparing crossgen results in "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" and "C:\h\w\AB58096F\w\B0FA09B5\uploads" directories for files of type "NativeOrReadyToRunImage":
File hash sum mismatch for "System.CodeDom" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "9fd1469fb9acdc6be19fb93e1dec2d839a402e396ff9bbb530fe1e23ef456783"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "7662b50a21a6d79f8739a3c947905c0e3c2b3baf6c5bef0969dab590750ff300"
File hash sum mismatch for "System.Collections.Concurrent" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "cd5dc836490b00509e6ece3f753556a1e3ee34be31a8b22f599995dd1b923842"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "e933ed9fd4224bd0c6143ac45359dbbc54692e74bce059a908c45984dd2d35b7"
File hash sum mismatch for "System.Collections.Immutable" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "7e288252a0e077d568e7c184a9842ebf4d1b4624f40a0e2bcddfac17869f14f8"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "0a8e1d73b47835e6e737ce74b8dccb4099dd98203b0da1765887d429e29f6cfa"
File hash sum mismatch for "System.Data.Common" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "4d6ed54e10401006ac0bccce69e930f101d72b9a7a9af7a7e219576c153aedb5"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "2c437e15413ad8192f3dd8b29be0be50cbc5f4f1d9bea4aae4236fe9f4675d3d"
File hash sum mismatch for "System.Data.OleDb" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "c1fd1bc0dabc2bc1bed299615cd686ee9b0869805f50f432df612fe0a6931c77"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "99f6101671ef04564897919f11b5c4079e89b75cffe2106415369d978a313a89"
File hash sum mismatch for "System.DirectoryServices.Protocols" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "158fa748a983ae40998650378d2c0f0765244f3b32dadb41dee2227add6c277a"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "dbef09ab8e7094fce3b96c42e244921702d7962fe3e891d47abf7749f3a1accc"
File hash sum mismatch for "System.Linq" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "296fbf3f48dad26ded62cc86ccfaa0f2c59cb9c0eaf48cca1a840aff2397f51b"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "4edf8d4a19c413c4510f375aadc85f91df14be66c191f207c7a1cd452390a7f3"
File hash sum mismatch for "System.Net.Http" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "119ec8640fbd8bc9990bb1f7b964f6acaf1ecbcbf17457bdf17284fa509b6eab"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "6b28f08f8737b8b11050cf3659160587994e5259755e78df2494e77d2cbbd435"
File hash sum mismatch for "System.Numerics.Tensors" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "31e98f958e111395fc9e0e87e6c0d3be03fc7a6cf2643cc3325b13f979eea367"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "1a3cd3595957b5ae5f8f99a67c315251efa9c88c5d82c81c075cc3bfb05951d3"
File hash sum mismatch for "System.Private.CoreLib" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "77972f917542d5a89450290e0a680da3d0ae9873a21d40f1d77b8ed9f3d9b0fd"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "6aa6cc3c15e8aebf8bfa9ea380b454c711d3a8721072ae236041d9250b27c163"
File hash sum mismatch for "System.Private.DataContractSerialization" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "1a17a9d5eb62e67b59c5797a69348ff79d8e40f1dc856bfe0435ddbd920c029d"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "8bb659bd3c8149d4bf0d5c1db341093de4e9d394b6a7a992e50a3d68e057ddd0"
File hash sum mismatch for "System.Private.Xml" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "adc43a5b1c069ee2ee76d7be703dce5b4dd33e81becb7fcc2c6bf32ea47efd4f"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "fc8dde4ff02ecaf2eaf83c0108987937439c5fa182e97ca4c939a267c75f93b1"
File hash sum mismatch for "System.Reflection.Emit" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "fbef91e4943b6ab5c89c7cfc4f53374041b03949f048ab5b7fec879b10a1e257"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "de6524b7f97e46c0a35c32d4ef1b52f5e2c034dd9cd7d81fb2ff050eef35ece2"
File hash sum mismatch for "System.Runtime.Caching" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "fd7aa41e14aa96c94f0de4000e679c033c08753f0f780d92c2656fbc22aa265a"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "fe4888a5cc1df51ea0bfa502d84ed6af3317bf233fb84d9c9449eb51822b554e"
File hash sum mismatch for "System.Security.Cryptography.Pkcs" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "1718989bb378cc9d6eb90f5f80ce6b0ae24ad80def9654124f7ac4383f418e7e"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "7e8db4c914429c47f32fd9be66e97bea163327a2a678f46eb5c5c53965c2a3de"
File hash sum mismatch for "System.Text.RegularExpressions" assembly for files of type "NativeOrReadyToRunImage":
 - "C:\h\w\AB58096F\w\B0FA09B5\u\prebuiltWork\log" has "235f819defd26ff3e97afba7e6eb54e14b27753b7704c5439e66aa8c0bdbbaa9"
 - "C:\h\w\AB58096F\w\B0FA09B5\uploads" has "3736ca14a35490dcd5f60f3b027d56fe5de9d0d4a9e9bc618a1fa78dbd5443f5"

@kg
Member Author

kg commented Sep 12, 2024

The R2R composite lanes appear to be failing with the same error across linux/osx on both arm64 and x64, so I don't think my HFA changes can be responsible:

Running CrossGen2:  /datadisks/disk1/work/AEBC0975/p/crossgen2/crossgen2 @/datadisks/disk1/work/AEBC0975/w/B8A509B5/e/JIT/Methodical/xxobj/sizeof/sizeof64_Target_64Bit_and_arm_d/composite-r2r.dll.rsp   --composite
Emitting R2R PE file: /datadisks/disk1/work/AEBC0975/w/B8A509B5/e/JIT/Methodical/xxobj/sizeof/sizeof64_Target_64Bit_and_arm_d/composite-r2r.dll
Emitting R2R PE file: /datadisks/disk1/work/AEBC0975/w/B8A509B5/e/JIT/Methodical/xxobj/sizeof/sizeof64_Target_64Bit_and_arm_d/sizeof64_Target_64Bit_and_arm_d.dll
Running R2RDump:  dotnet /datadisks/disk1/work/AEBC0975/p/R2RDump/R2RDump.dll --header --sc --in /datadisks/disk1/work/AEBC0975/w/B8A509B5/e/JIT/Methodical/xxobj/sizeof/sizeof64_Target_64Bit_and_arm_d/composite-r2r.dll --out /datadisks/disk1/work/AEBC0975/w/B8A509B5/e/JIT/Methodical/xxobj/sizeof/sizeof64_Target_64Bit_and_arm_d/composite-r2r.dll.r2rdump --val
Error: System.BadImageFormatException: Failed to convert invalid RVA to offset: 67222
   at ILCompiler.Reflection.ReadyToRun.PEReaderExtensions.GetOffset(PEReader reader, Int32 rva) in /_/src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/PEReaderExtensions.cs:line 113
   at ILCompiler.Reflection.ReadyToRun.ReadyToRunReader.EnsureImportSections() in /_/src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ReadyToRunReader.cs:line 1456
   at R2RDump.TextDumper.DumpSectionContents(ReadyToRunSection section) in /_/src/coreclr/tools/r2rdump/TextDumper.cs:line 435
   at R2RDump.TextDumper.DumpSection(ReadyToRunSection section) in /_/src/coreclr/tools/r2rdump/TextDumper.cs:line 120
   at R2RDump.TextDumper.DumpHeader(Boolean dumpSections) in /_/src/coreclr/tools/r2rdump/TextDumper.cs:line 81
   at R2RDump.Program.Dump(ReadyToRunReader r2r) in /_/src/coreclr/tools/r2rdump/Program.cs:line 166
   at R2RDump.Program.Run() in /_/src/coreclr/tools/r2rdump/Program.cs:line 460
23:29:30
R2RDump failed with exitcode: 1
in ReleaseLock
Test failed. Trying to see if dump file was created in /home/helixbot/dotnetbuild/dumps since 9/11/2024 11:29:29 PM
Test Harness Exitcode is : 1
To run the test:
Set up CORE_ROOT and run.

@kg kg marked this pull request as ready for review September 12, 2024 11:36
@kg
Member Author

kg commented Sep 13, 2024

Another category of failure on the R2R lanes that looks unrelated:

cp: target '/root/helix/work/correlation/lib*.dylib' is not a directory
DOTNET_DbgEnableMiniDump is set and the createdump binary does not exist: /root/helix/work/correlation/crossgen2/createdump

Return code:      1
Raw output file:      /root/helix/work/workitem/uploads/Intrinsics/BitCast/output.txt
Raw output:
BEGIN EXECUTION
in takeLock
23:42:39
Response file: /root/helix/work/workitem/e/JIT/Intrinsics/BitCast/composite-r2r.dll.rsp
/root/helix/work/workitem/e/JIT/Intrinsics/BitCast/IL-CG2/BitCast.dll
-o:/root/helix/work/workitem/e/JIT/Intrinsics/BitCast/composite-r2r.dll
-r:/root/helix/work/correlation/System.*.dll
-r:/root/helix/work/correlation/Microsoft.*.dll
-r:/root/helix/work/correlation/xunit.*.dll
-r:/root/helix/work/correlation/mscorlib.dll
--verify-type-and-field-layout
--method-layout:random
--targetarch:arm64
--targetos:linux
Running CrossGen2:  /root/helix/work/correlation/crossgen2/crossgen2 @/root/helix/work/workitem/e/JIT/Intrinsics/BitCast/composite-r2r.dll.rsp   --composite
Emitting R2R PE file: /root/helix/work/workitem/e/JIT/Intrinsics/BitCast/composite-r2r.dll
Emitting R2R PE file: /root/helix/work/workitem/e/JIT/Intrinsics/BitCast/BitCast.dll
Running R2RDump:  dotnet /root/helix/work/correlation/R2RDump/R2RDump.dll --header --sc --in /root/helix/work/workitem/e/JIT/Intrinsics/BitCast/composite-r2r.dll --out /root/helix/work/workitem/e/JIT/Intrinsics/BitCast/composite-r2r.dll.r2rdump --val
Error: System.Exception: Missing reference assembly: Bitcast
   at ILCompiler.Reflection.ReadyToRun.ReadyToRunReader.OpenReferenceAssembly(Int32 refAsmIndex) in /_/src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ReadyToRunReader.cs:line 1589
   at ILCompiler.Reflection.ReadyToRun.MetadataNameFormatter.FormatSignature(IAssemblyResolver assemblyResolver, ReadyToRunReader r2rReader, Int32 imageOffset) in /_/src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ReadyToRunSignature.cs:line 96
   at ILCompiler.Reflection.ReadyToRun.ReadyToRunReader.EnsureImportSections() in /_/src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ReadyToRunReader.cs:line 1447
   at R2RDump.TextDumper.DumpSectionContents(ReadyToRunSection section) in /_/src/coreclr/tools/r2rdump/TextDumper.cs:line 435
   at R2RDump.TextDumper.DumpSection(ReadyToRunSection section) in /_/src/coreclr/tools/r2rdump/TextDumper.cs:line 120
   at R2RDump.TextDumper.DumpHeader(Boolean dumpSections) in /_/src/coreclr/tools/r2rdump/TextDumper.cs:line 81
   at R2RDump.Program.Dump(ReadyToRunReader r2r) in /_/src/coreclr/tools/r2rdump/Program.cs:line 166
   at R2RDump.Program.Run() in /_/src/coreclr/tools/r2rdump/Program.cs:line 460
23:42:39
R2RDump failed with exitcode: 1
in ReleaseLock
Test failed. Trying to see if dump file was created in /home/helixbot/dotnetbuild/dumps since 9/11/2024 11:42:39 PM
Test Harness Exitcode is : 1
To run the test:
Set up CORE_ROOT and run.


    /root/helix/work/workitem/e/JIT/JIT_others/../Intrinsics/BitCast/BitCast.sh


@EgorBo
Member

EgorBo commented Jul 25, 2025

/azp run runtime-coreclr outerloop, runtime-coreclr jitstress, runtime-coreclr r2r-extra

@azure-pipelines

Azure Pipelines successfully started running 3 pipeline(s).

@EgorBo
Member

EgorBo commented Jul 26, 2025

All failures seem to be #118006

@Copilot Copilot AI review requested due to automatic review settings August 14, 2025 20:37
Contributor

Copilot AI left a comment

Pull Request Overview

This PR addresses issue #80393, which concerns HFA (Homogeneous Floating-point Aggregate) alignment validation in CoreCLR. The change ensures that HFA classification rejects structs whose value type fields are misaligned with respect to the ARM32/ARM64 ABI requirements.

  • Adds alignment validation for HFA value type fields in the EEClass::CheckForHFA() method
  • Introduces a comprehensive test case that validates HFA behavior with deliberately misaligned struct fields
  • Includes documentation for testing ARM32 code on ARM64 hardware

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File                                                   Description
src/coreclr/vm/class.cpp                               Adds alignment checks for HFA value type fields based on their element type
src/tests/JIT/Directed/StructABI/StructABI.cs          Adds test structs and wrapper function to validate HFA alignment behavior
src/tests/JIT/Directed/StructABI/StructABI.c           Implements native C function to test HFA parameter passing with misaligned fields
docs/workflow/testing/coreclr/running-arm32-tests.md   Documents process for running ARM32 tests on ARM64 hardware

@kg
Member Author

kg commented Aug 14, 2025

/azp run runtime-coreclr outerloop, runtime-coreclr jitstress, runtime-coreclr r2r-extra

@azure-pipelines

Azure Pipelines successfully started running 3 pipeline(s).

@kg kg merged commit 5a7ac76 into dotnet:main Aug 19, 2025
103 of 105 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Sep 19, 2025

Development

Successfully merging this pull request may close these issues.

CoreCLR runtime seems to have a subtle bug in HFA flag calculation

6 participants