- 
                Notifications
    You must be signed in to change notification settings 
- Fork 15k
[MachineLICM] Do not rely on hasSideEffect when handling impdefs #145379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| @llvm/pr-subscribers-backend-aarch64 Author: Anatoly Trosinenko (atrosinenko) ChangesTake clobbered registers described as implicit-def operands into When hasSideEffect was set to 0 in the definition of LOADgotAUTH, like the following: Full diff: https://github.com/llvm/llvm-project/pull/145379.diff 2 Files Affected: 
 diff --git a/llvm/lib/CodeGen/MachineLICM.cpp b/llvm/lib/CodeGen/MachineLICM.cpp
index c9079170ca575..5841a9ffa7a99 100644
--- a/llvm/lib/CodeGen/MachineLICM.cpp
+++ b/llvm/lib/CodeGen/MachineLICM.cpp
@@ -559,18 +559,15 @@ void MachineLICMImpl::ProcessMI(MachineInstr *MI, BitVector &RUDefs,
       if (!MO.isDead())
         // Non-dead implicit def? This cannot be hoisted.
         RuledOut = true;
-      // No need to check if a dead implicit def is also defined by
-      // another instruction.
-      continue;
+    } else {
+      // FIXME: For now, avoid instructions with multiple defs, unless
+      // it's a dead implicit def.
+      if (Def)
+        RuledOut = true;
+      else
+        Def = Reg;
     }
 
-    // FIXME: For now, avoid instructions with multiple defs, unless
-    // it's a dead implicit def.
-    if (Def)
-      RuledOut = true;
-    else
-      Def = Reg;
-
     // If we have already seen another instruction that defines the same
     // register, then this is not safe.  Two defs is indicated by setting a
     // PhysRegClobbers bit.
diff --git a/llvm/test/CodeGen/AArch64/mlicm-implicit-defs.mir b/llvm/test/CodeGen/AArch64/mlicm-implicit-defs.mir
new file mode 100644
index 0000000000000..51cb35116dc62
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/mlicm-implicit-defs.mir
@@ -0,0 +1,53 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=aarch64 -run-pass machinelicm -verify-machineinstrs -o - %s | FileCheck %s
+# RUN: llc -mtriple=aarch64 -passes machinelicm -o - %s | FileCheck %s
+
+---
+name: test
+tracksRegLiveness: true
+body: |
+  ; CHECK-LABEL: name: test
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x80000000)
+  ; CHECK-NEXT:   liveins: $x0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   $x16 = COPY killed $x0
+  ; CHECK-NEXT:   B %bb.1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT:   liveins: $x16
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   $x1 = COPY killed $x16
+  ; CHECK-NEXT:   $x2 = MOVi64imm 1024, implicit-def dead $x16
+  ; CHECK-NEXT:   $x16 = LDRXroX killed $x1, killed $x2, 0, 0
+  ; CHECK-NEXT:   $xzr = SUBSXri $x16, 0, 0, implicit-def $nzcv
+  ; CHECK-NEXT:   Bcc 1, %bb.1, implicit $nzcv
+  ; CHECK-NEXT:   B %bb.2
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   liveins: $x1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   $x0 = COPY killed $x1
+  ; CHECK-NEXT:   RET_ReallyLR
+  bb.0:
+    liveins: $x0
+    $x16 = COPY killed $x0
+    B %bb.1
+
+  bb.1:
+    liveins: $x16
+    $x1 = COPY killed $x16
+    /* MOVi64imm below mimics a pseudo instruction that doesn't have any */
+    /* unmodelled side effects, but uses x16 as a scratch register.      */
+    $x2 = MOVi64imm 1024, implicit-def dead $x16
+    $x16 = LDRXroX killed $x1, killed $x2, 0, 0
+    $xzr = SUBSXri $x16, 0, 0, implicit-def $nzcv
+    Bcc 1, %bb.1, implicit $nzcv
+    B %bb.2
+
+  bb.2:
+    liveins: $x1
+    $x0 = COPY killed $x1
+    RET_ReallyLR
+...
 | 
05c3da9    to
    079d654      
    Compare
  
    | Fixed the test failures: the original fix ruled out safe-to-move instructions with  | 
Take clobbered registers described as implicit-def operands into
account when deciding if an instruction can be moved.
When hasSideEffect was set to 0 in the definition of LOADgotAUTH,
MultiSource/Benchmarks/Ptrdist/ks/ks test from llvm-test-suite started
to crash. The issue was traced down to MachineLICM pass placing
LOADgotAUTH right after an unrelated copy to x16 like rewriting this
code:
    bb.0:
      renamable $x16 = COPY renamable $x12
      B %bb.1
    bb.1:
      ...
      /* use $x16 */
      ...
      renamable $x20 = LOADgotAUTH target-flags(aarch64-got) @some_variable, implicit-def dead $x16, implicit-def dead $x17, implicit-def dead $nzcv
      /* use $x20 */
      ...
like the following:
    bb.0:
      renamable $x16 = COPY renamable $x12
      renamable $x20 = LOADgotAUTH target-flags(aarch64-got) @some_variable, implicit-def dead $x16, implicit-def dead $x17, implicit-def dead $nzcv
      B %bb.1
    bb.1:
      ...
      /* use $x16 */
      ...
      /* use $x20 */
      ...
    079d654    to
    707a36c      
    Compare
  
    | Ping. | 
        
          
                llvm/lib/CodeGen/MachineLICM.cpp
              
                Outdated
          
        
      | } | ||
|  | ||
| RUDefs.set(Unit); | ||
| if (MO.isImplicit()) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What makes implicit defs special here (compared to explicit defs)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dropped the special case in 4c29db1, thank you!
After investigating HoistRegionPostRA and ProcessMI functions for a while, it feels like the description of RUDefs and RUClobbers are hardly accurate:
  BitVector RUDefs(NumRegUnits);     // RUs defined once in the loop.
  BitVector RUClobbers(NumRegUnits); // RUs defined more than once.These two variables are defined in HoistRegionPostRA and used both in HoistRegionPostRA and ProcessMI by setting and testing bits as well as by mass-settint bits in applyBitsNotInRegMaskToRegUnitsMask.
I have an impression that RUClobbers is sometimes set while it would be better to first check RUDefs (and possibly set corresponding bit there instead), such as when handling MO.isRegMask() case a few lines above - it looks harmless, even though slightly pessimistic, to me.
At the same time, RUDefs seems to mix two different properties of a register: "the register is live" vs. "we have observed an instruction that defined the register that was initially dead" - while both should transition to "clobbered" state after defining that register one more time, only the latter implies some sort of non-uniformity. Thus, when HoistRegionPostRA rejects candidate instructions as follows:
    MachineInstr *MI = Candidate.MI;
    for (const MachineOperand &MO : MI->all_uses()) {
      if (!MO.getReg())
        continue;
      for (MCRegUnit Unit : TRI->regunits(MO.getReg())) {
        if (RUDefs.test(Unit) || RUClobbers.test(Unit)) {
          // If it's using a non-loop-invariant register, then it's obviously
          // not safe to hoist.
          Safe = false;
          break;
        }
      }
      if (!Safe)
        break;
    }it looks like rejecting any instructions having any register inputs given that
    // Conservatively treat live-in's as an external def.
    // FIXME: That means a reload that're reused in successor block(s) will not
    // be LICM'ed.
    for (const auto &LI : BB->liveins()) {
      for (MCRegUnit Unit : TRI->regunits(LI.PhysReg))
        RUDefs.set(Unit);
    }Quite surprisingly, it is not, because on X86 something like this is possible (here $rip is not a live-in):
renamable $rcx = MOV64rm $rip, 1, $noreg, target-flags(x86-gotpcrel) @x, $noreg, debug-location !22 :: (load (s64) from got); t.ll:5:7
Nevertheless, assuming that the only reason to treat implicit register operands specially is to ignore dead implicit defs for simplicity, it looks harmless to handle any defined register identically, whether it is implicit or explicit.
| The test failure looks unrelated: By the way, the PRs stacked on top of this one passed the tests - maybe  | 
| continue; | ||
| } | ||
|  | ||
| if (MO.isImplicit()) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's unclear to me is why implicit defs are special here as well. I uploaded #147624 which merges the logic for implicit and explicit defs, and it passes the new test from this PR. I also ran a stage2 check-llvm and check-clang and they both pass. It requires updates to some target-specific tests; I'm not an expert on all of the affected ISAs but they seem correct.
| Superseded by #147624. | 

Take clobbered registers described as implicit-def operands into
account when deciding if an instruction can be moved.
When hasSideEffect was set to 0 in the definition of LOADgotAUTH,
MultiSource/Benchmarks/Ptrdist/ks/ks test from llvm-test-suite started
to crash. The issue was traced down to MachineLICM pass placing
LOADgotAUTH right after an unrelated copy to x16 like rewriting this
code:
like the following: