Skip to content

Conversation

aheejin
Copy link
Member

@aheejin aheejin commented Jul 18, 2025

There are cases we end up removing a conditional branch with a stackified register condition operand. For example:

bb.0:
  %0 = ...    ;; %0 is stackified
  br_if %bb.1, %0
bb.1:

In this code, br_if will be removed, so we should unstackify %0 so that it can be correctly dropped in ExplicitLocals.

There seems no good method to determine which registers to unstackify without analyzing branches thoroughly, which is supposed to be done within updateTerminator(). So we just unstackify all of them and restackify the still-remaining registers after updateTerminator().

Fixes #149097.

There are cases we end up removing a conditional branch with a stackified
register condition operand. For example:
```wasm
bb.0:
  %0 = ...    ;; %0 is stackified
  br_if %bb.1, %0
bb.1:
```wasm

In this code, br_if will be removed, so we should unstackify %0 so that
it can be correctly dropped in ExplicitLocals.

There seems no good method to determine which registers to unstackify
without analyzing branches thoroughly, which is supposed to be done
within `updateTerminator()`. So we just unstackify all of them and
restackify the still-remaining registers after `updateTerminator()`.

Fixes llvm#149097.
@llvmbot
Copy link
Member

llvmbot commented Jul 18, 2025

@llvm/pr-subscribers-backend-webassembly

Author: Heejin Ahn (aheejin)

Changes

There are cases we end up removing a conditional branch with a stackified register condition operand. For example:

bb.0:
  %0 = ...    ;; %0 is stackified
  br_if %bb.1, %0
bb.1:
```wasm

In this code, br_if will be removed, so we should unstackify %0 so that it can be correctly dropped in ExplicitLocals.

There seems no good method to determine which registers to unstackify without analyzing branches thoroughly, which is supposed to be done within `updateTerminator()`. So we just unstackify all of them and restackify the still-remaining registers after `updateTerminator()`.

Fixes #<!-- -->149097.

---
Full diff: https://github.com/llvm/llvm-project/pull/149432.diff


2 Files Affected:

- (modified) llvm/lib/Target/WebAssembly/WebAssemblyCFGSort.cpp (+34-1) 
- (added) llvm/test/CodeGen/WebAssembly/removed-terminator.ll (+15) 


``````````diff
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyCFGSort.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyCFGSort.cpp
index 6525c6d93bee8..378579d8776bb 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyCFGSort.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyCFGSort.cpp
@@ -18,6 +18,7 @@
 
 #include "WebAssembly.h"
 #include "WebAssemblyExceptionInfo.h"
+#include "WebAssemblyMachineFunctionInfo.h"
 #include "WebAssemblySortRegion.h"
 #include "WebAssemblyUtilities.h"
 #include "llvm/ADT/PriorityQueue.h"
@@ -97,8 +98,40 @@ static void maybeUpdateTerminator(MachineBasicBlock *MBB) {
           ? MF->getBlockNumbered(MBB->getNumber() + 1)
           : nullptr;
 
-  if (AllAnalyzable)
+  if (AllAnalyzable) {
+    // There are cases we end up removing a conditional branch with a stackified
+    // register condition operand. For example:
+    // bb.0:
+    //   %0 = ...    ;; %0 is stackified
+    //   br_if %bb.1, %0
+    // bb.1:
+    //
+    // In this code, br_if will be removed, so we should unstackify %0 so that
+    // it can be correctly dropped in ExplicitLocals.
+    //
+    // There seems no good method to determine which registers to unstackify
+    // without analyzing branches thoroughly, which is supposed to be done
+    // within updateTerminator(). So we just unstackify all of them and
+    // restackify the still-remaining registers after updateTerminator().
+    SmallSet<Register, 2> StackifiedRegs;
+    auto *MF = MBB->getParent();
+    WebAssemblyFunctionInfo *MFI = MF->getInfo<WebAssemblyFunctionInfo>();
+    for (const MachineInstr &Term : MBB->terminators()) {
+      for (auto &MO : Term.explicit_uses()) {
+        if (MO.isReg() && MFI->isVRegStackified(MO.getReg())) {
+          StackifiedRegs.insert(MO.getReg());
+          MFI->unstackifyVReg(MO.getReg());
+        }
+      }
+    }
+
     MBB->updateTerminator(OriginalSuccessor);
+
+    for (const MachineInstr &Term : MBB->terminators())
+      for (auto &MO : Term.explicit_uses())
+        if (MO.isReg() && StackifiedRegs.contains(MO.getReg()))
+          MFI->stackifyVReg(MF->getRegInfo(), MO.getReg());
+  }
 }
 
 namespace {
diff --git a/llvm/test/CodeGen/WebAssembly/removed-terminator.ll b/llvm/test/CodeGen/WebAssembly/removed-terminator.ll
new file mode 100644
index 0000000000000..4a3c590ea13f7
--- /dev/null
+++ b/llvm/test/CodeGen/WebAssembly/removed-terminator.ll
@@ -0,0 +1,15 @@
+; RUN: llc -O0 < %s
+
+target triple = "wasm32-unknown-unknown"
+
+define void @test(i1 %x) {
+  %y = xor i1 %x, true
+  ; We now do a limited amount of register stackification in RegStackify even in
+  ; -O0, so its operand (%y) is stackified. But this terminator will be removed
+  ; in CFGSort after that. We need to make sure we unstackify %y so that it can
+  ; be dropped in ExplicitLocals.
+  br i1 %y, label %exit, label %exit
+
+exit:
+  ret void
+}

@aheejin aheejin requested a review from nikic July 18, 2025 00:42
@aheejin
Copy link
Member Author

aheejin commented Jul 18, 2025

@SingleAccretion (you don't show up in the reviewers list)

@aheejin
Copy link
Member Author

aheejin commented Jul 18, 2025

Some background: #149097 (comment)

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, couldn't this still potentially cause issues with tee instructions? That is, if a register does stay unstackified and it happens to be the first result of tee?

Otherwise this looks reasonable to me.

@SingleAccretion
Copy link
Contributor

Did you consider the fix I was thinking about (dropping all unused defs not just those not-stackified) in explicit-locals?

@aheejin
Copy link
Member Author

aheejin commented Jul 19, 2025

Did you consider the fix I was thinking about (dropping all unused defs not just those not-stackified) in explicit-locals?

I think that is a better idea. See #149626.

@aheejin
Copy link
Member Author

aheejin commented Jul 19, 2025

Closing in favor of #149626.

@aheejin aheejin closed this Jul 19, 2025
@aheejin
Copy link
Member Author

aheejin commented Jul 19, 2025

@nikic

Hm, couldn't this still potentially cause issues with tee instructions? That is, if a register does stay unstackified and it happens to be the first result of tee?

The only case I can think of is a really contrived case like this:
(I've removed all implicit arguments for readability, so this may not run with llc out of the box)

bb.0:
  successors: %bb.1; %bb.1
  %0:i32 = CALL @foo
  %1:i32 = CONST_I32 -1
  %2:i32 = XOR_I32 %0:i32, %1:i32
  BR_IF %bb.1, %2:i32
  BR_IF %bb.1, %2:i32
  BR_IF %bb.1, %2:i32

bb.1:
  ; predecessors: %bb.0
  RETURN

Because %2 is used in both CALL @use and BR_IF, we will get a TEE:

bb.0:                                                                        
  successors: %bb.1; %bb.1                                                            
  %0:i32 = CALL @foo
  %1:i32 = CONST_I32 -1
  %4:i32 = XOR_I32 %0:i32, %1:i32
  %3:i32, %2:i32 = TEE_I32 %4:i32
  BR_IF %bb.1, %3:i32
  BR_IF %bb.1, %2:i32
  BR_IF %bb.1, %2:i32

bb.1:
; predecessors: %bb.0
  RETURN

with %4 and %3 stackified. (See this for how it's done)

If all these BR_IFs get removed in CFGSort I think the situation you are worried about can occur. If I run this from RegStackify, this fails in here:

assert(!B && "UpdateTerminators requires analyzable predecessors!");

which is called from here:
MBB->updateTerminator(OriginalSuccessor);

So I think it's OK to assume that this kind of code is not generated without manually modifying mir files.

In valid code, there will be only one BR_IF, so the code could be something like

bb.0:
  successors: %bb.1; %bb.1
  %0:i32 = CALL @foo
  %1:i32 = CONST_I32 -1
  %2:i32 = XOR_I32 %0:i32, %1:i32
  CALL @use, %2:i32
  CALL @use, %2:i32
  BR_IF %bb.1, %2:i32

bb.1:
  ; predecessors: %bb.0
  RETURN
bb.0:                                                                        
  successors: %bb.1; %bb.1                                                            
  %0:i32 = CALL @foo
  %1:i32 = CONST_I32 -1
  %4:i32 = XOR_I32 %0:i32, %1:i32
  %3:i32, %2:i32 = TEE_I32 %4:i32
  CALL @use, %3:i32
  CALL @use, %2:i32
  BR_IF %bb.1, %2:i32

bb.1:
; predecessors: %bb.0
  RETURN

with %4 and %3 stackified.

In this case, even if the BR_IF is removed, its operand (%2) is not the stackified register.

aheejin added a commit to aheejin/llvm-project that referenced this pull request Sep 25, 2025
`FAKE_USE`s are essentially no-ops, so they have to be removed before
running ExplicitLocals so that `drop`s will be correctly inserted to
drop those values used by the `FAKE_USE`s.

---

This is reapplication of llvm#160228, which broke Wasm waterfall. This PR
additionally prevents `FAKE_USE`s uses from being stackified.

Previously, a 'def' whose first use was a `FAKE_USE` was able to be
stackified as `TEE`:
- Before
```
Reg = INST ...            // Def
FAKE_USE ..., Reg, ...    // Insert
INST ..., Reg, ...
INST ..., Reg, ...
```

- After RegStackify
```
DefReg = INST ...            // Def
TeeReg, Reg = TEE ... DefReg
FAKE_USE ..., TeeReg, ...    // Insert
INST ..., Reg, ...
INST ..., Reg, ...
```
And this assumes `DefReg` and `TeeReg` are stackified.

But this PR removes `FAKE_USE`s in the beginning of ExplicitLocals. And
later in ExplicitLocals we have a routine to unstackify registers that
have no uses left:
https://github.com/llvm/llvm-project/blob/7b28fcd2b182ba2c9d2d71c386be92fc0ee3cc9d/llvm/lib/Target/WebAssembly/WebAssemblyExplicitLocals.cpp#L257-L269
(This was added in llvm#149626. Then it didn't seem it would trigger the
same assertions for `TEE`s because it was fixing the bug where a
terminator was removed in CFGSort (llvm#149097).
Details here:
llvm#149432 (comment))

- After `FAKE_USE` removal and unstackification
```
DefReg = INST ...
TeeReg, Reg = TEE ... DefReg
INST ..., Reg, ...
INST ..., Reg, ...
```
And now `TeeReg` is unstackified. This triggered the assertion here,
that `TeeReg` should be stackified:
https://github.com/llvm/llvm-project/blob/7b28fcd2b182ba2c9d2d71c386be92fc0ee3cc9d/llvm/lib/Target/WebAssembly/WebAssemblyExplicitLocals.cpp#L316

This prevents `FAKE_USE`s' uses from being stackified altogether,
including `TEE` transformation. Even when it is not a `TEE` transformation
and just a single use stackification, it does not trigger the assertion
but there's no point stackifying it given that it will be deleted.

---

Fixes emscripten-core/emscripten#25301.
aheejin added a commit that referenced this pull request Sep 25, 2025
`FAKE_USE`s are essentially no-ops, so they have to be removed before
running ExplicitLocals so that `drop`s will be correctly inserted to
drop those values used by the `FAKE_USE`s.

---

This is reapplication of #160228, which broke Wasm waterfall. This PR
additionally prevents `FAKE_USE`s uses from being stackified.

Previously, a 'def' whose first use was a `FAKE_USE` was able to be
stackified as `TEE`:
- Before
```
Reg = INST ...            // Def
FAKE_USE ..., Reg, ...    // Insert
INST ..., Reg, ...
INST ..., Reg, ...
```

- After RegStackify
```
DefReg = INST ...            // Def
TeeReg, Reg = TEE ... DefReg
FAKE_USE ..., TeeReg, ...    // Insert
INST ..., Reg, ...
INST ..., Reg, ...
```
And this assumes `DefReg` and `TeeReg` are stackified.

But this PR removes `FAKE_USE`s in the beginning of ExplicitLocals. And
later in ExplicitLocals we have a routine to unstackify registers that
have no uses left:

https://github.com/llvm/llvm-project/blob/7b28fcd2b182ba2c9d2d71c386be92fc0ee3cc9d/llvm/lib/Target/WebAssembly/WebAssemblyExplicitLocals.cpp#L257-L269
(This was added in #149626. Then it didn't seem it would trigger the
same assertions for `TEE`s because it was fixing the bug where a
terminator was removed in CFGSort (#149097).
Details here:
#149432 (comment))

- After `FAKE_USE` removal and unstackification
```
DefReg = INST ...
TeeReg, Reg = TEE ... DefReg
INST ..., Reg, ...
INST ..., Reg, ...
```
And now `TeeReg` is unstackified. This triggered the assertion here,
that `TeeReg` should be stackified:

https://github.com/llvm/llvm-project/blob/7b28fcd2b182ba2c9d2d71c386be92fc0ee3cc9d/llvm/lib/Target/WebAssembly/WebAssemblyExplicitLocals.cpp#L316

This prevents `FAKE_USE`s' uses from being stackified altogether,
including `TEE` transformation. Even when it is not a `TEE`
transformation and just a single use stackification, it does not trigger
the assertion but there's no point stackifying it given that it will be
deleted.

---

Fixes emscripten-core/emscripten#25301.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Sep 25, 2025
…768)

`FAKE_USE`s are essentially no-ops, so they have to be removed before
running ExplicitLocals so that `drop`s will be correctly inserted to
drop those values used by the `FAKE_USE`s.

---

This is reapplication of #160228, which broke Wasm waterfall. This PR
additionally prevents `FAKE_USE`s uses from being stackified.

Previously, a 'def' whose first use was a `FAKE_USE` was able to be
stackified as `TEE`:
- Before
```
Reg = INST ...            // Def
FAKE_USE ..., Reg, ...    // Insert
INST ..., Reg, ...
INST ..., Reg, ...
```

- After RegStackify
```
DefReg = INST ...            // Def
TeeReg, Reg = TEE ... DefReg
FAKE_USE ..., TeeReg, ...    // Insert
INST ..., Reg, ...
INST ..., Reg, ...
```
And this assumes `DefReg` and `TeeReg` are stackified.

But this PR removes `FAKE_USE`s in the beginning of ExplicitLocals. And
later in ExplicitLocals we have a routine to unstackify registers that
have no uses left:

https://github.com/llvm/llvm-project/blob/7b28fcd2b182ba2c9d2d71c386be92fc0ee3cc9d/llvm/lib/Target/WebAssembly/WebAssemblyExplicitLocals.cpp#L257-L269
(This was added in #149626. Then it didn't seem it would trigger the
same assertions for `TEE`s because it was fixing the bug where a
terminator was removed in CFGSort (#149097).
Details here:
llvm/llvm-project#149432 (comment))

- After `FAKE_USE` removal and unstackification
```
DefReg = INST ...
TeeReg, Reg = TEE ... DefReg
INST ..., Reg, ...
INST ..., Reg, ...
```
And now `TeeReg` is unstackified. This triggered the assertion here,
that `TeeReg` should be stackified:

https://github.com/llvm/llvm-project/blob/7b28fcd2b182ba2c9d2d71c386be92fc0ee3cc9d/llvm/lib/Target/WebAssembly/WebAssemblyExplicitLocals.cpp#L316

This prevents `FAKE_USE`s' uses from being stackified altogether,
including `TEE` transformation. Even when it is not a `TEE`
transformation and just a single use stackification, it does not trigger
the assertion but there's no point stackifying it given that it will be
deleted.

---

Fixes emscripten-core/emscripten#25301.
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025
`FAKE_USE`s are essentially no-ops, so they have to be removed before
running ExplicitLocals so that `drop`s will be correctly inserted to
drop those values used by the `FAKE_USE`s.

---

This is reapplication of llvm#160228, which broke Wasm waterfall. This PR
additionally prevents `FAKE_USE`s uses from being stackified.

Previously, a 'def' whose first use was a `FAKE_USE` was able to be
stackified as `TEE`:
- Before
```
Reg = INST ...            // Def
FAKE_USE ..., Reg, ...    // Insert
INST ..., Reg, ...
INST ..., Reg, ...
```

- After RegStackify
```
DefReg = INST ...            // Def
TeeReg, Reg = TEE ... DefReg
FAKE_USE ..., TeeReg, ...    // Insert
INST ..., Reg, ...
INST ..., Reg, ...
```
And this assumes `DefReg` and `TeeReg` are stackified.

But this PR removes `FAKE_USE`s in the beginning of ExplicitLocals. And
later in ExplicitLocals we have a routine to unstackify registers that
have no uses left:

https://github.com/llvm/llvm-project/blob/7b28fcd2b182ba2c9d2d71c386be92fc0ee3cc9d/llvm/lib/Target/WebAssembly/WebAssemblyExplicitLocals.cpp#L257-L269
(This was added in llvm#149626. Then it didn't seem it would trigger the
same assertions for `TEE`s because it was fixing the bug where a
terminator was removed in CFGSort (llvm#149097).
Details here:
llvm#149432 (comment))

- After `FAKE_USE` removal and unstackification
```
DefReg = INST ...
TeeReg, Reg = TEE ... DefReg
INST ..., Reg, ...
INST ..., Reg, ...
```
And now `TeeReg` is unstackified. This triggered the assertion here,
that `TeeReg` should be stackified:

https://github.com/llvm/llvm-project/blob/7b28fcd2b182ba2c9d2d71c386be92fc0ee3cc9d/llvm/lib/Target/WebAssembly/WebAssemblyExplicitLocals.cpp#L316

This prevents `FAKE_USE`s' uses from being stackified altogether,
including `TEE` transformation. Even when it is not a `TEE`
transformation and just a single use stackification, it does not trigger
the assertion but there's no point stackifying it given that it will be
deleted.

---

Fixes emscripten-core/emscripten#25301.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[WebAssembly] Assertion failure at -O0

4 participants