@etiotto
Contributor

@etiotto etiotto commented Oct 24, 2025

This PR extends the RemoveMask pass to handle masks on load and select operations that evaluate to true (or false) across the entire loop iteration space. This allows masked loads to be transformed into unmasked ones, and the mask condition may become dead if no other operation uses it, which can reduce arithmetic complexity.
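
To illustrate the idea only, here is a minimal Python sketch of the kind of reasoning involved. It is not the pass's implementation: `LoopRange`, `mask_value_over_loop`, and `simplify_masked_load` are hypothetical names, and the sketch assumes the mask has the simple form `iv < bound`, with `iv` the loop induction variable and a positive constant step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LoopRange:
    """Hypothetical model of a counted loop: for iv in range(lower, upper, step)."""
    lower: int  # inclusive lower bound
    upper: int  # exclusive upper bound
    step: int   # assumed positive


def mask_value_over_loop(loop: LoopRange, bound: int) -> Optional[bool]:
    """Classify the mask `iv < bound` over the whole iteration space.

    Returns True or False if the mask is constant for every iteration,
    or None if it varies (or the loop is empty or unsupported).
    """
    if loop.step <= 0 or loop.lower >= loop.upper:
        return None
    # Number of iterations, then the last value the induction variable takes.
    trip_count = (loop.upper - loop.lower + loop.step - 1) // loop.step
    last_iv = loop.lower + (trip_count - 1) * loop.step
    if last_iv < bound:
        return True   # even the largest iv satisfies the mask
    if loop.lower >= bound:
        return False  # even the smallest iv fails the mask
    return None


def simplify_masked_load(loop: LoopRange, bound: int) -> str:
    """Describe what a masked load could become under this sketch."""
    value = mask_value_over_loop(loop, bound)
    if value is True:
        return "load"         # mask is redundant, drop it
    if value is False:
        return "other"        # load never executes, use the fallback value
    return "masked_load"      # mask varies, keep it


if __name__ == "__main__":
    print(simplify_masked_load(LoopRange(0, 64, 16), 64))
    print(simplify_masked_load(LoopRange(0, 64, 16), 48))
    print(simplify_masked_load(LoopRange(64, 128, 16), 64))
```

When the mask is always true, the comparison that produced it has one fewer user, so it can be removed as dead code if nothing else depends on it.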

@etiotto etiotto self-assigned this Oct 24, 2025
Signed-off-by: Ettore Tiotto <[email protected]>
@etiotto etiotto requested a review from wdziurdz October 30, 2025 13:46
@etiotto etiotto linked an issue Oct 30, 2025 that may be closed by this pull request
@etiotto
Contributor Author

etiotto commented Nov 3, 2025

Ran the micro-benchmarks on B580; this PR did not regress any of them.

@etiotto
Contributor Author

etiotto commented Nov 4, 2025

No PyTorch inductor or accuracy test failures detected. This is good to go.

@etiotto etiotto merged commit 0e374bf into main Nov 4, 2025
73 of 76 checks passed
@etiotto etiotto deleted the etiotto.remove_masks branch November 4, 2025 17:39

Development

Successfully merging this pull request may close these issues.

[pytorch upstream] softmax on BMG is slow

3 participants