optimizations: better modeling and codegen for apply and svec calls #59548
Conversation
    elseif is_known_call(stmt, Core._apply_iterate, compact)
        length(stmt.args) >= 4 || continue
        lift_apply_args!(compact, idx, stmt, 𝕃ₒ)
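For context, a splatted call `f(xs...)` lowers to `Core._apply_iterate(Base.iterate, f, xs)`, so a well-formed statement carries at least four arguments (the `_apply_iterate` callee, the iteration function, the target function, and one or more argument collections), which is what the `length(stmt.args) >= 4` guard checks. A minimal sketch for inspecting that lowering:

```julia
# Lower a splatted call and print the resulting CodeInfo; it contains a
# Core._apply_iterate(Base.iterate, f, xs) statement, i.e. four arguments.
lowered = Meta.lower(Main, :(f(xs...)))
println(lowered)
```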
Just commenting for reference, and we don't have to do this in this PR, but I started to think it'd be better to make this kind of optimization independent of `sroa_pass!`.
We might want to instead just rename the pass to GVN or MemSSAOpt, since doing everything in one pass is probably a lot more efficient, and either alternative name would reflect that this pass does general memory-value-replacement optimizations.
Force-pushed from fde4599 to 66687c1.
- Use `svec` instead of `tuple` for arguments (a better match for the ABI, which will require boxes).
- Directly forward a single `svec` argument, both in the runtime and in codegen, without copying.
- Optimize all consistent builtin functions of constant arguments, not just the ones with special tfuncs, reducing code duplication and divergence.
- Codegen for `svec()` directly, so the optimizer can see each store (and doesn't have to build the whole thing on the stack first).
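For context on the first point, `Core.svec` builds a `Core.SimpleVector` of boxed elements, which is why it matches the boxed argument-passing ABI better than a `tuple`; a short illustration:

```julia
# Core.svec creates a SimpleVector; every element is stored as a boxed
# value, matching the boxed argument-array ABI used for apply calls.
sv = Core.svec(1, 2.0, "three")
@assert sv isa Core.SimpleVector
@assert length(sv) == 3
@assert sv[1] === 1
```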
Force-pushed from 66687c1 to 19ad3be.
Without a release store, it seems LLVM considers it a data race to have read the initial state on another thread. Marking this as a release store seems sufficient to prevent that optimization. It is also more consistent with how we initialize and write to most other structs, particularly since #55767. Fixes #59547.
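To illustrate the memory-ordering point at the Julia level (a conceptual analogue only, with made-up names, not the codegen change itself):

```julia
# A release store pairs with an acquire load: a thread that acquires the
# stored value is guaranteed to also observe everything written before
# the release, so the compiler cannot treat the read as a data race.
mutable struct Box
    @atomic value::Int
end

b = Box(0)
@atomic :release b.value = 42   # writer publishes with release ordering
v = @atomic :acquire b.value    # reader pairs with an acquire load
@assert v == 0 || v == 42
```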
Backported to 1.12.
Further improves the implementation from #59548. Specifically, uses `widenconst` to enable conversion of `tuple` calls that have become `PartialStruct`, and removes incorrect comments and unused arguments. Also adds some Julia-IR level tests.
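For readers unfamiliar with the inference lattice, `widenconst` collapses extended lattice elements such as `Const` and `PartialStruct` down to a plain Julia type, which is what lets the pass recognize a partially-constant `tuple` call like any other; a minimal sketch, assuming the `Core.Compiler` names:

```julia
# widenconst maps inference lattice elements back to ordinary types; a
# PartialStruct widens to its declared struct/tuple type the same way.
const CC = Core.Compiler
@assert CC.widenconst(Core.Const(1)) === Int
```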
Adds a dedicated `_svec_len_nothrow` function that does more precise `:nothrow` modeling of `Core._svec_len`, which was introduced in #59548.
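A hedged sketch of what such a dedicated predicate can look like (hypothetical code, not the actual implementation):

```julia
# Core._svec_len only throws when its argument is not a SimpleVector, so
# :nothrow can be concluded whenever the argument type is precise enough.
function _svec_len_nothrow(argtypes::Vector{Any})
    length(argtypes) == 1 || return false
    return Core.Compiler.widenconst(argtypes[1]) <: Core.SimpleVector
end

_svec_len_nothrow(Any[Core.SimpleVector])  # true
_svec_len_nothrow(Any[Any])                # false
```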