Use .alt_entry/.private_extern on Apple platforms #106224
…native symbol names
the Mach-O linker. We depend on the layout when rehydrating it using relative offsets. This was previously achieved using the N_NO_DEAD_STRIP flag but N_ALT_ENTRY is a stronger guarantee. Also, emit N_PEXT (.private_extern) flag for all non-global symbols to align the behavior with ELF targets.
| Type = N_SECT | N_EXT, | ||
| Descriptor = | ||
| definition.AltEntry || definition.SectionIndex == _hydrationTargetSectionIndex ? | ||
| N_ALT_ENTRY : N_NO_DEAD_STRIP, | 
Strictly speaking, N_NO_DEAD_STRIP should no longer be required with the changes in this PR. I left it in out of an abundance of caution and to minimize the chance of something going wrong.
Turns out, caution was warranted. New Xcode linker still messes up the .hydrated section even if it's completely composed of N_ALT_ENTRY symbols. I changed it to N_ALT_ENTRY | N_NO_DEAD_STRIP and I will probably follow up with a bug report to Apple if I can make a small repro.
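A minimal sketch of the flag selection described above, written in Python rather than the C# of the actual emitter. The numeric values are the n_desc bits from <mach-o/nlist.h>; the function name and its parameters are hypothetical and only mirror the condition in the quoted diff.

```python
# n_desc flag bits as defined in <mach-o/nlist.h>.
N_NO_DEAD_STRIP = 0x0020  # symbol must not be dead-stripped
N_ALT_ENTRY = 0x0200      # symbol does not start a new atom


def symbol_descriptor(alt_entry: bool, in_hydration_section: bool) -> int:
    """Hypothetical helper: compute n_desc for one emitted symbol.

    Every symbol keeps N_NO_DEAD_STRIP; symbols that must not split an
    atom additionally get N_ALT_ENTRY.
    """
    desc = N_NO_DEAD_STRIP
    if alt_entry or in_hydration_section:
        desc |= N_ALT_ENTRY
    return desc


print(hex(symbol_descriptor(False, False)))  # 0x20
print(hex(symbol_descriptor(False, True)))   # 0x220
```

Keeping N_NO_DEAD_STRIP on every symbol is the conservative choice discussed in this thread; it costs nothing when the linker honors N_ALT_ENTRY and acts as a fallback when it does not.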
I found the pattern, will fix ILC to avoid producing it:
test_data.S:
.subsections_via_symbols
.section __DATA,hydrated
.global .hydrated // <-- symbol we generate at beginning of .hydrated section
.hydrated:
lsection6: // <-- local symbol for beginning of section
.alt_entry _foo_data
.private_extern _foo_data
_foo_data:
  .dword 0
.alt_entry _bar_data
.private_extern _bar_data
_bar_data:
  .dword 1
test_code.c:
extern char *foo_data;
int main() { return (int)(long)&foo_data; }
Build with clang test_data.S test_code.c -o ./test -Wl,-dead_strip [-Wl,-ld_classic]. Check output with nm -nm ./test
For new linker we get:
0000000100000000 (__TEXT,__text) [referenced dynamically] external __mh_execute_header
0000000100003f8c (__TEXT,__text) external _main
0000000100004000 (__DATA,hydrated) external .hydrated
0000000100004000 (__DATA,hydrated) non-external (was a private external) _foo_data
For old linker we get:
0000000100000000 (__TEXT,__text) [referenced dynamically] external __mh_execute_header
0000000100003f9c (__TEXT,__text) external _main
0000000100004000 (__DATA,hydrated) external .hydrated
0000000100004000 (__DATA,hydrated) non-external (was a private external) _foo_data
0000000100004008 (__DATA,hydrated) non-external (was a private external) _bar_data
This only happens if we produce the local section symbol.
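The difference between the two listings can be checked mechanically. The sketch below is only an illustration: it embeds the two `nm -nm` outputs quoted above as strings and reports which symbols of the `__DATA,hydrated` section are present with the old linker but missing with the new one. The helper name is made up for this example.

```python
NEW_LINKER = """\
0000000100000000 (__TEXT,__text) [referenced dynamically] external __mh_execute_header
0000000100003f8c (__TEXT,__text) external _main
0000000100004000 (__DATA,hydrated) external .hydrated
0000000100004000 (__DATA,hydrated) non-external (was a private external) _foo_data
"""

OLD_LINKER = """\
0000000100000000 (__TEXT,__text) [referenced dynamically] external __mh_execute_header
0000000100003f9c (__TEXT,__text) external _main
0000000100004000 (__DATA,hydrated) external .hydrated
0000000100004000 (__DATA,hydrated) non-external (was a private external) _foo_data
0000000100004008 (__DATA,hydrated) non-external (was a private external) _bar_data
"""


def symbols_in_section(nm_output: str, section: str) -> set:
    """Collect symbol names (last column) for lines that mention the section."""
    names = set()
    for line in nm_output.splitlines():
        if f"({section})" in line:
            names.add(line.split()[-1])
    return names


missing = (symbols_in_section(OLD_LINKER, "__DATA,hydrated")
           - symbols_in_section(NEW_LINKER, "__DATA,hydrated"))
print(sorted(missing))  # ['_bar_data']
```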
Filed as Apple bug FB14743667. The workaround in ILC is orthogonal to the changes in this PR, so I will likely submit it separately after I do sufficient testing. Keeping N_NO_DEAD_STRIP is sufficient workaround for now.
| Temporarily switched back to draft while I try to validate what is happening with the new linker (Xcode 15/16). While we currently disable it I want to ensure that this doesn't introduce any additional regression that we would have to deal with. Seems like the  | 
| Re-tested with the following scenarios: 
 | 
| Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas | 
| 
 Does this work if dehydration is turned off? I don't fully understand why the hydrated section is special. Is it because we happen to place everything that cannot be broken up into that section right now (when dehydration is turned on, that is)? | 
| 
 Yes 
 It's special because the dehydrated data indexes the …

 In the atom linking model the native linker is free to reorder atoms (subsections), or discard them if they are unreferenced (from the linker's point of view, i.e., as relocation targets). Thus the whole "hydrated" section is by default considered a bunch of atoms that can be reordered or removed. If that happens, the offsets in the dehydrated data no longer match the symbols and relocations that point to them.

 The idea is to turn the whole section into one single non-breakable and non-relocatable atom for the native linker. That's accomplished by making the whole section start with the ".hydrated" symbol, which is the start of the atom, and then marking every other symbol in the section as an "alternative entry" symbol. "Alternative entry" symbols don't break up atoms (*).

 We previously started to mark all the symbols with the "no dead strip" marker. That guarantees that the symbol is preserved in the final linked image. It doesn't guarantee that the symbol stays in its original position; that is the stronger guarantee this PR is trying to enforce with documented behavior instead of relying on undefined behavior.

 (*) Unless you hit a linker bug, like with the new Xcode 16 linker | 
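The atom model described in this comment can be sketched as a toy simulation. This is a deliberately simplified model and not how ld64 is implemented: the `Symbol` class, the offsets, and both helper functions are assumptions made for illustration. A regular symbol starts a new atom, an alternative-entry symbol joins the current one, and dead stripping drops every atom that contains no referenced symbol.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str
    offset: int          # hypothetical offset within the section
    alt_entry: bool = False


def split_into_atoms(symbols):
    """Group symbol names into atoms, in section order."""
    atoms = []
    for sym in sorted(symbols, key=lambda s: s.offset):
        if sym.alt_entry and atoms:
            atoms[-1].append(sym.name)   # stays inside the current atom
        else:
            atoms.append([sym.name])     # begins a new atom
    return atoms


def dead_strip(atoms, referenced):
    """Keep only atoms that contain at least one referenced symbol."""
    return [atom for atom in atoms if any(name in referenced for name in atom)]


referenced = {"_foo_data"}

# Without .alt_entry every symbol is its own atom, so unreferenced data goes away.
plain = [Symbol(".hydrated", 0), Symbol("_foo_data", 8), Symbol("_bar_data", 16)]
print(dead_strip(split_into_atoms(plain), referenced))
# [['_foo_data']]

# With .alt_entry the section is one atom that is kept or dropped as a whole.
pinned = [Symbol(".hydrated", 0),
          Symbol("_foo_data", 8, alt_entry=True),
          Symbol("_bar_data", 16, alt_entry=True)]
print(dead_strip(split_into_atoms(pinned), referenced))
# [['.hydrated', '_foo_data', '_bar_data']]
```

In the first case the surviving data also changes position, which is exactly what invalidates the relative offsets stored in the dehydrated data.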
| 
 Don't we do the same thing when dehydration is turned off? E.g. the fact that MethodTable symbols are prefixed by GCDesc, or non-GC static bases are prefixed by the class constructor context. If the linker decides to shuffle these, bad things will happen. I would expect we do the same thing for all sections we emit. The compiler doesn't expect linker messing with section contents. | 
| 
 That largely depends on which symbols are produced in the object file and whether they are referenced or not. As far as I can tell,  For places which call  (*) ...or rather it may result in more atoms in the linker. The heuristics for code also look at unwinding information, so I am more concerned with DATA. | 
| 
 EETypeNode sets ISymbolDefinition.Offset to the GCDesc size: runtime/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/EETypeNode.cs Lines 247 to 248 in 15e96fa 
 So the symbol is only defined GCDesc-size-bytes into the ObjectData blob. Would the data still be emitted correctly for this? If I understand this correctly, we'd need object writer to emit a (made up?) symbol at the beginning of object data, (make sure it's referenced somehow?) and generate the actual MethodTable symbol as alt-entry. (Or just do the same thing to all section that this PR is currently doing for the hydrated section only?) | 
…assumptions about method symbols
…accidentally strip them
Looks good to me otherwise! Thank you!
| // We emit a local N_NO_DEAD_STRIP symbol for the beginning of all sections marked with | ||
| // KeepDataLayout to ensure the final layout is kept intact by the linker. This works in | ||
| // tandem with setting the N_ALT_ENTRY flag for all other symbols in the same section in | ||
| // the loop below. | 
Is the PR version still doing this?
Yes, it is.
| // Match the logic for KeepDataLayout in MachSection.CreateSection. If we are in a data section | ||
| // we don't need to have a symbol for beginning of the node data. For other nodes, particularly | ||
| // executable code, we enforce it though. | ||
| Debug.Assert(hasInitialEntrypoint || section.Type is SectionType.ReadOnly or SectionType.Writeable or SectionType.Uninitialized); | 
Can we avoid duplicating the section.Type is SectionType.ReadOnly or SectionType.Writeable or SectionType.Uninitialized part?
I can add a public getter.
Done
| /azp run runtime-extra-platforms | 
| Azure Pipelines successfully started running 1 pipeline(s). | 
| /azp run runtime-nativeaot-outerloop | 
| Azure Pipelines successfully started running 1 pipeline(s). | 
| /azp run runtime-extra-platforms, runtime-nativeaot-outerloop | 
| Azure Pipelines successfully started running 2 pipeline(s). | 
| clang or linker seems to be segfaulting in the outerloop runs for at least one of the larger tests :(  | 
| 
 I'll check it locally but likely not until tomorrow :-/ | 
| For posterity, the linker crash is stack overflow in the recursive call: I am not sure there's an easy solution without some out-of-the-box thinking. (The extreme measure would be to resolve the data symbols to section-based relocations with offset, but that seems like quite a harsh solution.) | 
| We've also reached out to Apple about this issue and hope they will get back to us | 
| I will try to extract the non-problematic bits of this PR tomorrow into separate one(s). I have some idea how to make the data section layout reliable without triggering the ld64 crash. In any case it’s going to be a non-trivial change and I need to test it properly. | 
| 
 I came up with two hacky prototypes: 
 Neither of those solutions sounds like a slam dunk, but I wanted to share them anyway for posterity. Lastly, there's always the option of relying on the undefined behavior that the linker doesn't do the reordering for our code/data and that  I extracted the more obvious bits from this PR into #106442, #106444, and #106446. I'll close this PR since the general approach is a dead-end but I am open to continue discussion if there's any interest or guidance from Apple. | 
…e beginning of the section are not stripped if there's no other N_NO_DEAD_STRIP symbol referencing them (#106444) Extracted from #106224. PR #103039 added the `N_NO_DEAD_STRIP` flag to all symbols emitted by ILC and enabled dead code stripping in the native linker. It failed to handle one specific edge case that is luckily not happening in the wild. If the first node emitted into a section has a symbol with a non-zero offset NN, the first `N_NO_DEAD_STRIP` symbol does not point at the start of the section. The native linker then splits up the section into atoms, and the first atom, from offset 0 to offset NN, is never referenced and becomes eligible for dead code stripping. Since we emit a symbol for each section start (for use in section-relative relocations), we can just mark that symbol with `N_NO_DEAD_STRIP` to resolve the issue.
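The edge case in this commit message can be illustrated with a small model. It is a sketch under simplifying assumptions: atoms begin at offset 0 and at every symbol offset, and an atom survives only if a symbol at its start carries `N_NO_DEAD_STRIP` (references between atoms are ignored). The symbol names, the section size, and the value NN = 16 are hypothetical.

```python
N_NO_DEAD_STRIP = 0x0020  # n_desc bit from <mach-o/nlist.h>


def surviving_ranges(symbols, section_size):
    """symbols: list of (name, offset, n_desc). Returns the kept (start, end) ranges."""
    starts = sorted({0} | {offset for _, offset, _ in symbols})
    ends = starts[1:] + [section_size]
    kept = []
    for start, end in zip(starts, ends):
        rooted = any(offset == start and desc & N_NO_DEAD_STRIP
                     for _, offset, desc in symbols)
        if rooted:
            kept.append((start, end))
    return kept


# Before the fix: the section-start symbol carries no flag, the first node's
# symbol sits NN = 16 bytes into the section.
before = [("lsection", 0, 0), ("_first_node", 16, N_NO_DEAD_STRIP)]
print(surviving_ranges(before, 64))  # [(16, 64)]

# After the fix: the section-start symbol is marked as well.
after = [("lsection", 0, N_NO_DEAD_STRIP), ("_first_node", 16, N_NO_DEAD_STRIP)]
print(surviving_ranges(after, 64))   # [(0, 16), (16, 64)]
```

In the first case the bytes in front of the first node's symbol are dropped, which corresponds to the atom from offset 0 to offset NN becoming eligible for dead stripping.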
…h ELF (#106446) Extracted from #106224 `.private_extern` is the logical equivalent of `.hidden`+`.global` in ELF. We already emit those flags in ELF, so do it in Mach-O too. Documentation for `.private_extern`: > It's used to mark symbols with limited visibility. When the file is fed to the static linker, it clears the N_EXT bit for each symbol with the N_PEXT bit set. The ld option -keep_private_externs turns off this behavior.
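The quoted documentation for `.private_extern` amounts to a small bit manipulation on the symbol's n_type field. The sketch below uses the constants from <mach-o/nlist.h>; the function name and the `keep_private_externs` parameter (standing in for the ld option `-keep_private_externs`) are assumptions made for illustration.

```python
# n_type bits as defined in <mach-o/nlist.h>.
N_EXT = 0x01   # external symbol
N_SECT = 0x0E  # symbol is defined in a section
N_PEXT = 0x10  # private external symbol


def linked_symbol_type(n_type: int, keep_private_externs: bool = False) -> int:
    """Model what the static linker does to n_type for one symbol."""
    if n_type & N_PEXT and not keep_private_externs:
        return n_type & ~N_EXT   # visible inside the image only
    return n_type


private = N_SECT | N_EXT | N_PEXT
print(hex(linked_symbol_type(private)))                             # 0x1e
print(hex(linked_symbol_type(private, keep_private_externs=True)))  # 0x1f
print(hex(linked_symbol_type(N_SECT | N_EXT)))                      # 0xf
```

A symbol with N_PEXT set and N_EXT cleared is what `nm -m` reports as "non-external (was a private external)" in the listings earlier in this conversation.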
| For posterity's sake, I received some feedback from Apple: 
 TL;DR: 
 I filed another issue (FB14897581) for the old linker (ld64) crashing with stack overflow for huge number of .alt_entry symbols and dead stripping. I asked for guidance on producing an unbreakable atom in DATA/BSS section with symbols that would be good enough not to break the "image lookup" experience in  | 
| Thanks for following up on this! 
 Is there an official statement from Apple on this? | 
| 
 No official statement, as far as I know. The best I can tell is that I still keep receiving guidance  (That said, the new linker works for the NativeAOT smoke tests now, so we are already in a better place than with Xcode 15.) | 
Background
Unlike ELF, which can support a pretty much arbitrary number of sections, Mach-O object files are limited to ~255 sections. Apple uses the .subsections_via_symbols linking approach to circumvent this limitation. In this model the external symbols are used to slice the section into subsections, or atoms, which are then used as the linking unit (for the purposes of dead stripping, reordering, etc.).

.alt_entry is an escape hatch to introduce symbols that don't divide a (sub)section into multiple atoms. It can be combined with .global or .private_extern to create symbols pointing into the middle of a method or data while making sure that the linker doesn't try to slice the method/data and place it into two different locations in the output.

.private_extern is similar to marking a symbol with both .global and .hidden on ELF platforms. It's used to mark symbols with limited visibility. When the file is fed to the static linker, it clears the N_EXT bit for each symbol with the N_PEXT bit set. The ld option -keep_private_externs turns off this behavior.

Problem
NativeAOT and CoreCLR have assembly files that need to reference instructions in the middle of a method. This is used for marking places recognized by the SIGSEGV handler as valid instructions that are expected to produce a signal which can be transformed into NullReferenceException. CoreCLR also uses it in parts of code that are patchable at runtime (applies only to macOS x64 for the purpose of this PR).

Additionally, NativeAOT has a concept of dehydrated data that are unpacked at runtime into the .hydrated section. The dehydrated data use relative offsets and rely on the linker keeping the layout of the .hydrated section intact.

.subsections_via_symbols has been emitted by Clang for over a decade now, and the new Apple linker introduced in Xcode 15 no longer correctly supports object files without it. The old linker is scheduled to be removed in Xcode 16, which is currently in beta. Additionally, Xcode 16 is stricter about enforcing the rules for .global labels inside assembly files. While the current Xcode 16 betas have additional issues (upstream LLVM links: llvm/llvm-project#82261, llvm/llvm-project#97116), it is prudent to change the code to emit .alt_entry symbols instead of depending on undefined linker behavior.

Solution
This PR changes the code to correctly use .alt_entry/.private_extern and compile with Xcode 14/15.

It's best reviewed commit-by-commit:
- … .alt_entry + .private_extern instead of .global to mark code location labels.
- … N_ALT_ENTRY flag for any symbols pointing to the middle of the node.
- … .hydrated section as single unbreakable atom for the linker. It also updates the Mach-O emitter to emit the N_PEXT (.private_extern) flag to match the ELF behavior.