- 
                Notifications
    You must be signed in to change notification settings 
- Fork 15k
ELF: Add support for R_AARCH64_PATCHINST relocation type. #133534
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ELF: Add support for R_AARCH64_PATCHINST relocation type. #133534
Conversation
Created using spr 1.3.6-beta.1 [skip ci]
Created using spr 1.3.6-beta.1
| @llvm/pr-subscribers-mc @llvm/pr-subscribers-lld Author: Peter Collingbourne (pcc) ChangesThe R_AARCH64_INST32 relocation type is to support deactivation TODO: 
 Full diff: https://github.com/llvm/llvm-project/pull/133534.diff 2 Files Affected: 
 diff --git a/lld/ELF/Arch/AArch64.cpp b/lld/ELF/Arch/AArch64.cpp
index 9538dd4a70bae..110d087230a9c 100644
--- a/lld/ELF/Arch/AArch64.cpp
+++ b/lld/ELF/Arch/AArch64.cpp
@@ -135,6 +135,7 @@ RelExpr AArch64::getRelExpr(RelType type, const Symbol &s,
   switch (type) {
   case R_AARCH64_ABS16:
   case R_AARCH64_ABS32:
+  case R_AARCH64_INST32:
   case R_AARCH64_ABS64:
   case R_AARCH64_ADD_ABS_LO12_NC:
   case R_AARCH64_LDST128_ABS_LO12_NC:
@@ -278,6 +279,7 @@ int64_t AArch64::getImplicitAddend(const uint8_t *buf, RelType type) const {
   case R_AARCH64_PREL16:
     return SignExtend64<16>(read16(ctx, buf));
   case R_AARCH64_ABS32:
+  case R_AARCH64_INST32:
   case R_AARCH64_PREL32:
     return SignExtend64<32>(read32(ctx, buf));
   case R_AARCH64_ABS64:
@@ -505,6 +507,12 @@ void AArch64::relocate(uint8_t *loc, const Relocation &rel,
     checkIntUInt(ctx, loc, val, 32, rel);
     write32(ctx, loc, val);
     break;
+  case R_AARCH64_INST32:
+    if (!rel.sym->isUndefined()) {
+      checkIntUInt(ctx, loc, val, 32, rel);
+      write32(ctx, loc, val);
+    }
+    break;
   case R_AARCH64_PLT32:
   case R_AARCH64_GOTPCREL32:
     checkInt(ctx, loc, val, 32, rel);
diff --git a/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def b/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def
index 05b79eae573f7..c9f17ee4e0c7a 100644
--- a/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def
+++ b/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def
@@ -61,6 +61,7 @@ ELF_RELOC(R_AARCH64_LD64_GOT_LO12_NC,                0x138)
 ELF_RELOC(R_AARCH64_LD64_GOTPAGE_LO15,               0x139)
 ELF_RELOC(R_AARCH64_PLT32,                           0x13a)
 ELF_RELOC(R_AARCH64_GOTPCREL32,                      0x13b)
+ELF_RELOC(R_AARCH64_INST32,                          0x13c)
 // General dynamic TLS relocations
 ELF_RELOC(R_AARCH64_TLSGD_ADR_PREL21,                0x200)
 ELF_RELOC(R_AARCH64_TLSGD_ADR_PAGE21,                0x201)
 | 
| @llvm/pr-subscribers-lld-elf Author: Peter Collingbourne (pcc) ChangesThe R_AARCH64_INST32 relocation type is to support deactivation TODO: 
 Full diff: https://github.com/llvm/llvm-project/pull/133534.diff 2 Files Affected: 
 diff --git a/lld/ELF/Arch/AArch64.cpp b/lld/ELF/Arch/AArch64.cpp
index 9538dd4a70bae..110d087230a9c 100644
--- a/lld/ELF/Arch/AArch64.cpp
+++ b/lld/ELF/Arch/AArch64.cpp
@@ -135,6 +135,7 @@ RelExpr AArch64::getRelExpr(RelType type, const Symbol &s,
   switch (type) {
   case R_AARCH64_ABS16:
   case R_AARCH64_ABS32:
+  case R_AARCH64_INST32:
   case R_AARCH64_ABS64:
   case R_AARCH64_ADD_ABS_LO12_NC:
   case R_AARCH64_LDST128_ABS_LO12_NC:
@@ -278,6 +279,7 @@ int64_t AArch64::getImplicitAddend(const uint8_t *buf, RelType type) const {
   case R_AARCH64_PREL16:
     return SignExtend64<16>(read16(ctx, buf));
   case R_AARCH64_ABS32:
+  case R_AARCH64_INST32:
   case R_AARCH64_PREL32:
     return SignExtend64<32>(read32(ctx, buf));
   case R_AARCH64_ABS64:
@@ -505,6 +507,12 @@ void AArch64::relocate(uint8_t *loc, const Relocation &rel,
     checkIntUInt(ctx, loc, val, 32, rel);
     write32(ctx, loc, val);
     break;
+  case R_AARCH64_INST32:
+    if (!rel.sym->isUndefined()) {
+      checkIntUInt(ctx, loc, val, 32, rel);
+      write32(ctx, loc, val);
+    }
+    break;
   case R_AARCH64_PLT32:
   case R_AARCH64_GOTPCREL32:
     checkInt(ctx, loc, val, 32, rel);
diff --git a/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def b/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def
index 05b79eae573f7..c9f17ee4e0c7a 100644
--- a/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def
+++ b/llvm/include/llvm/BinaryFormat/ELFRelocs/AArch64.def
@@ -61,6 +61,7 @@ ELF_RELOC(R_AARCH64_LD64_GOT_LO12_NC,                0x138)
 ELF_RELOC(R_AARCH64_LD64_GOTPAGE_LO15,               0x139)
 ELF_RELOC(R_AARCH64_PLT32,                           0x13a)
 ELF_RELOC(R_AARCH64_GOTPCREL32,                      0x13b)
+ELF_RELOC(R_AARCH64_INST32,                          0x13c)
 // General dynamic TLS relocations
 ELF_RELOC(R_AARCH64_TLSGD_ADR_PREL21,                0x200)
 ELF_RELOC(R_AARCH64_TLSGD_ADR_PAGE21,                0x201)
 | 
| Apologies for the delay in responding, just come back from vacation. Not had a chance to look through the Discourse posts yet, will do so this week. While it may seem a bit premature, would you be able to open an issue in the AArch64 ABI (https://github.com/ARM-software/abi-aa/issues) with a request to add a new relocation directive, citing the Discourse posts and PRs? | 
| 
 Sure, created ARM-software/abi-aa#317 | 
The R_AARCH64_INST32 relocation type is to support deactivation symbols. For more information, see the RFC: https://discourse.llvm.org/t/rfc-deactivation-symbols/85556 TODO: - Agree on semantics and relocation type number. - Add tests. Pull Request: llvm#133534
The R_AARCH64_INST32 relocation type is to support deactivation symbols. For more information, see the RFC: https://discourse.llvm.org/t/rfc-deactivation-symbols/85556 TODO: - Agree on semantics and relocation type number. - Add tests. Pull Request: llvm#133534
Created using spr 1.3.6-beta.1 [skip ci]
Created using spr 1.3.6-beta.1 [skip ci]
Created using spr 1.3.6-beta.1 [skip ci]
Created using spr 1.3.6-beta.1 [skip ci]
Created using spr 1.3.6-beta.1
        
          
                lld/ELF/Arch/AArch64.cpp
              
                Outdated
          
        
      | switch (type) { | ||
| case R_AARCH64_ABS16: | ||
| case R_AARCH64_ABS32: | ||
| case R_AARCH64_PATCHINST: | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the relocation is used as described in the RFC, with the target symbol absolute all looks well. If a non ABS symbol is used in a position independent context I think you'll get a recompile with PIC error message. If not position independent then I think the relocation will pass through and give a likely unpredictable value to patch the instruction.
Could it be worth putting some checking (likely in Relocation.cpp) to make sure the target symbol is SHN_ABS?
A related question is whether non-zero addends should be supported? For most instructions the addend doesn't make logical sense. There could be some instructions where the immediate field is in the right place to make it work, but I think these cases may not be worth supporting.
An alternative formulation for RELA is to completely ignore the symbol value and just use the bottom 32-bits of the addend field. That would give the object producer that defines the relocations more control over what is patched in. Not got a strong opinion on whether that is better though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it be worth putting some checking (likely in Relocation.cpp) to make sure the target symbol is SHN_ABS?
That makes sense. getRelExpr in Arch/AArch64.cpp seems like a slightly better place since it's arch-specific semantics, so I added it there.
A related question is whether non-zero addends should be supported? For most instructions the addend doesn't make logical sense. There could be some instructions where the immediate field is in the right place to make it work, but I think these cases may not be worth supporting.
An alternative formulation for RELA is to completely ignore the symbol value and just use the bottom 32-bits of the addend field. That would give the object producer that defines the relocations more control over what is patched in. Not got a strong opinion on whether that is better though.
This could be useful for something like an hypothetical PAC variant that takes three operands like most other instructions:
pacda x0, x1, x2
meaning sign x1 using x2 and store the result in x0. Then we could encode the instruction
mov x0, x1
in the relocation addend and have the object defining the deactivation symbol define it to absolute 0. And then of course you can have the replacement instruction use whichever registers the original instruction did (or replace it entirely with NOP if the source and destination registers are the same). This doesn't seem useful for PAC given how the existing instructions work but maybe there's a non-PAC use case for this.
It doesn't seem harmful to have the flexibility of letting the instruction be defined by the symbol, the relocation or both. On the psABI issue we discussed restrictions for things like erratum fixes but for that purpose it doesn't seem to matter how the instruction word is computed. We can make it the responsibility of the object file producer(s) to ensure that the instruction formed by value+addend follows the rules.
| add x0, x1, x2 | ||
| # This instruction has two relocations: the DS relocation and the JUMP26 to f1. | ||
| # Make sure that the DS relocation takes precedence. | ||
| .reloc ., R_AARCH64_PATCHINST, ds | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be worth a test with emit-relocs to show both relocations coming out.
Thinking of Bolt, which relies on emit-relocs, I expect that it would just ignore the R_AARCH_PATCHINST relocations on their own as it wouldn't know how to recreate the original value [1]. It would have to discard any relocation at the same location. We're hoping to create a binary analysis ABI supplement soon to document conventions that binary analysis tools are using. First step ARM-software/abi-aa#333
[1] In theory if we did want to let Bolt reverse a patch, the emit-relocs output could give the reverse patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1 [skip ci]
Created using spr 1.3.6-beta.1 [skip ci]
Created using spr 1.3.6-beta.1 [skip ci]
The R_AARCH64_PATCHINST relocation type is to support deactivation symbols. For more information, see the RFC: https://discourse.llvm.org/t/rfc-deactivation-symbols/85556 TODO: - Agree on semantics and relocation type number. Pull Request: llvm#133534
| psABI update landed, this is now unblocked. | 
| // CHECK: R_AARCH64_JUMP26 | ||
| // CHECK-NEXT: R_AARCH64_PATCHINST | ||
| .reloc ., R_AARCH64_PATCHINST, ds | ||
| b f1 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improve the test to test that with more than one fragments and more than one PATCHINST, the relocations are still ordered.
.reloc ., R_AARCH64_PATCHINST, ds
b f1
.balign 8
.reloc ., R_AARCH64_PATCHINST, ds
b f2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 55e71c0
| @@ -0,0 +1,87 @@ | |||
| # RUN: rm -rf %t && split-file %s %t | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider adding && cd %t so that we can remove %t/ below, which clutter up the commands...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer this form so that the commands are copy-pastable into the shell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assembler part (BinaryFormat / Target/AArch64 changes) look good. The linker change should be made separate. But thank for combining this in a single PR, making the full picture clear:)
| ELF_RELOC(R_AARCH64_LD64_GOTPAGE_LO15, 0x139) | ||
| ELF_RELOC(R_AARCH64_PLT32, 0x13a) | ||
| ELF_RELOC(R_AARCH64_GOTPCREL32, 0x13b) | ||
| ELF_RELOC(R_AARCH64_PATCHINST, 0x13c) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also add a test to tools/llvm-readobj/ELF/reloc-types-aarch64.test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| 
 The RFC is quite long and doesn't come up with an assembly example. Adding some description of how the relocation works will help future readers. | 
| Need an aarch64 maintainer's signoff on the llvm part. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left my approval on the AArch64 side, the relocation id matches the ABI. I'm not a an official maintainer, so please leave a few days to see if @davemgreen has any objections.
| 
 Yeah you are. If you say this LGTY then it SGTM :) | 
Created using spr 1.3.6-beta.1 [skip ci]
Created using spr 1.3.6-beta.1
| 
 MC parts were moved to a922731, LLD parts were landed in this PR. | 
The R_AARCH64_PATCHINST relocation type is to support deactivation symbols. For more information, see the RFC: https://discourse.llvm.org/t/rfc-deactivation-symbols/85556 Part of the AArch64 psABI extension: ARM-software/abi-aa#340 Reviewers: smithp35, davemgreen, MaskRay Reviewed By: MaskRay, davemgreen, smithp35 Pull Request: llvm/llvm-project#133534
The R_AARCH64_PATCHINST relocation type is to support deactivation
symbols. For more information, see the RFC:
https://discourse.llvm.org/t/rfc-deactivation-symbols/85556
Part of the AArch64 psABI extension:
ARM-software/abi-aa#340