- 
          
- 
        Couldn't load subscription status. 
- Fork 5.7k
          optimizer: eliminate isdefined check
          #44708
        
          New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
sroa_mutables!isdefined check
      | -└──       goto https://github.com/JuliaLang/julia/issues/3 if not %5
-2 ─       goto https://github.com/JuliaLang/julia/issues/4Hmm 😄 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first commit SGTM. The second one seems to be a whole lot more allocations, and I question the value of it. I'd think a tag based system would be better, and avoid creating lots of random types to dispatch on just for tagging data with a kind.
This commit implements a simple optimization within `sroa_mutables!` to
eliminate `isdefined` call by checking load-forwardability of the field.
This optimization may be especially useful to eliminate extra allocation
of `Core.Box` involved with a capturing closure, e.g.:
```julia
julia> callit(f, args...) = f(args...);
julia> function isdefined_elim()
           local arr::Vector{Any}
           callit() do
               arr = Any[]
           end
           return arr
       end;
julia> code_typed(isdefined_elim)
```
```diff
diff --git a/_master.jl b/_pr.jl
index 3aa40ba20e5..11eccf65f32 100644
--- a/_master.jl
+++ b/_pr.jl
@@ -1,15 +1,8 @@
 1-element Vector{Any}:
  CodeInfo(
-1 ─ %1  = Core.Box::Type{Core.Box}
-│   %2  = %new(%1)::Core.Box
-│   %3  = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Vector{Any}, svec(Any, Int64), 0, :(:ccall), Vector{Any}, 0, 0))::Vector{Any}
-│         Core.setfield!(%2, :contents, %3)::Vector{Any}
-│   %5  = Core.isdefined(%2, :contents)::Bool
-└──       goto #3 if not %5
-2 ─       goto #4
-3 ─       $(Expr(:throw_undef_if_not, :arr, false))::Any
-4 ┄ %9  = Core.getfield(%2, :contents)::Any
-│         Core.typeassert(%9, Vector{Any})::Vector{Any}
-│   %11 = π (%9, Vector{Any})
-└──       return %11
+1 ─ %1 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Vector{Any}, svec(Any, Int64), 0, :(:ccall), Vector{Any}, 0, 0))::Vector{Any}
+└──      goto #3 if not true
+2 ─      goto #4
+3 ─      $(Expr(:throw_undef_if_not, :arr, false))::Any
+4 ┄      return %1
 ) => Vector{Any}
```
    1f7dda1    to
    10cf29d      
    Compare
  
    | 
 Ok, let's go only with the first commit for now, and puts off the discussion on the second commit to another PR. | 
Follows up #44708 -- in that PR I missed the most obvious optimization opportunity, i.e. we can safely eliminate `isdefined` checks when all fields are defined at allocation site. This change allows us to eliminate capturing closure constructions when the body and callsite of capture closure is available within a optimized frame, e.g.: ```julia function abmult(r::Int, x0) if r < 0 r = -r end f = x -> x * r return @inline f(x0) end ``` ```diff diff --git a/_master.jl b/_pr.jl index ea06d865b75..c38f221090f 100644 --- a/_master.jl +++ b/_pr.jl @@ -1,24 +1,19 @@ julia> @code_typed abmult(-3, 3) CodeInfo( -1 ── %1 = Core.Box::Type{Core.Box} -│ %2 = %new(%1, r@_2)::Core.Box -│ %3 = Core.isdefined(%2, :contents)::Bool -└─── goto #3 if not %3 +1 ── goto #3 if not true 2 ── goto #4 3 ── $(Expr(:throw_undef_if_not, :r, false))::Any -4 ┄─ %7 = (r@_2 < 0)::Any -└─── goto #9 if not %7 -5 ── %9 = Core.isdefined(%2, :contents)::Bool -└─── goto #7 if not %9 +4 ┄─ %4 = (r@_2 < 0)::Any +└─── goto #9 if not %4 +5 ── goto #7 if not true 6 ── goto #8 7 ── $(Expr(:throw_undef_if_not, :r, false))::Any -8 ┄─ %13 = -r@_2::Any -9 ┄─ %14 = φ (#4 => r@_2, #8 => %13)::Any -│ %15 = Core.isdefined(%2, :contents)::Bool -└─── goto #11 if not %15 +8 ┄─ %9 = -r@_2::Any +9 ┄─ %10 = φ (#4 => r@_2, #8 => %9)::Any +└─── goto #11 if not true 10 ─ goto #12 11 ─ $(Expr(:throw_undef_if_not, :r, false))::Any -12 ┄ %19 = (x0 * %14)::Any +12 ┄ %14 = (x0 * %10)::Any └─── goto #13 -13 ─ return %19 +13 ─ return %14 ) => Any ```
Follows up #44708 -- in that PR I missed the most obvious optimization opportunity, i.e. we can safely eliminate `isdefined` checks when all fields are defined at allocation site. This change allows us to eliminate capturing closure constructions when the body and callsite of capture closure is available within a optimized frame, e.g.: ```julia function abmult(r::Int, x0) if r < 0 r = -r end f = x -> x * r return @inline f(x0) end ``` ```diff diff --git a/_master.jl b/_pr.jl index ea06d865b75..c38f221090f 100644 --- a/_master.jl +++ b/_pr.jl @@ -1,24 +1,19 @@ julia> @code_typed abmult(-3, 3) CodeInfo( -1 ── %1 = Core.Box::Type{Core.Box} -│ %2 = %new(%1, r@_2)::Core.Box -│ %3 = Core.isdefined(%2, :contents)::Bool -└─── goto #3 if not %3 +1 ── goto #3 if not true 2 ── goto #4 3 ── $(Expr(:throw_undef_if_not, :r, false))::Any -4 ┄─ %7 = (r@_2 < 0)::Any -└─── goto #9 if not %7 -5 ── %9 = Core.isdefined(%2, :contents)::Bool -└─── goto #7 if not %9 +4 ┄─ %4 = (r@_2 < 0)::Any +└─── goto #9 if not %4 +5 ── goto #7 if not true 6 ── goto #8 7 ── $(Expr(:throw_undef_if_not, :r, false))::Any -8 ┄─ %13 = -r@_2::Any -9 ┄─ %14 = φ (#4 => r@_2, #8 => %13)::Any -│ %15 = Core.isdefined(%2, :contents)::Bool -└─── goto #11 if not %15 +8 ┄─ %9 = -r@_2::Any +9 ┄─ %10 = φ (#4 => r@_2, #8 => %9)::Any +└─── goto #11 if not true 10 ─ goto #12 11 ─ $(Expr(:throw_undef_if_not, :r, false))::Any -12 ┄ %19 = (x0 * %14)::Any +12 ┄ %14 = (x0 * %10)::Any └─── goto #13 -13 ─ return %19 +13 ─ return %14 ) => Any ```
Follows up #44708 -- in that PR I missed the most obvious optimization opportunity, i.e. we can safely eliminate `isdefined` checks when all fields are defined at allocation site. This change allows us to eliminate capturing closure constructions when the body and callsite of capture closure is available within a optimized frame, e.g.: ```julia function abmult(r::Int, x0) if r < 0 r = -r end f = x -> x * r return @inline f(x0) end ``` ```diff diff --git a/_master.jl b/_pr.jl index ea06d865b75..c38f221090f 100644 --- a/_master.jl +++ b/_pr.jl @@ -1,24 +1,19 @@ julia> @code_typed abmult(-3, 3) CodeInfo( -1 ── %1 = Core.Box::Type{Core.Box} -│ %2 = %new(%1, r@_2)::Core.Box -│ %3 = Core.isdefined(%2, :contents)::Bool -└─── goto #3 if not %3 +1 ── goto #3 if not true 2 ── goto #4 3 ── $(Expr(:throw_undef_if_not, :r, false))::Any -4 ┄─ %7 = (r@_2 < 0)::Any -└─── goto #9 if not %7 -5 ── %9 = Core.isdefined(%2, :contents)::Bool -└─── goto #7 if not %9 +4 ┄─ %4 = (r@_2 < 0)::Any +└─── goto #9 if not %4 +5 ── goto #7 if not true 6 ── goto #8 7 ── $(Expr(:throw_undef_if_not, :r, false))::Any -8 ┄─ %13 = -r@_2::Any -9 ┄─ %14 = φ (#4 => r@_2, #8 => %13)::Any -│ %15 = Core.isdefined(%2, :contents)::Bool -└─── goto #11 if not %15 +8 ┄─ %9 = -r@_2::Any +9 ┄─ %10 = φ (#4 => r@_2, #8 => %9)::Any +└─── goto #11 if not true 10 ─ goto #12 11 ─ $(Expr(:throw_undef_if_not, :r, false))::Any -12 ┄ %19 = (x0 * %14)::Any +12 ┄ %14 = (x0 * %10)::Any └─── goto #13 -13 ─ return %19 +13 ─ return %14 ) => Any ```
This PR is based on #44706 and consists of two separate commits.The primary change is the first commit (e205e50),
which implements additional
isdefined-elimination pass withinsroa_mutables!,and the second commit (1f7dda1) does some refactoring on top of the first change.
EDIT: now this PR only implements the first commit, and the second commit is put off to another PR.
Their commit messages follow:
optimizer: add additional
isdefined-elimination pass withinsroa_mutables!This commit implements a simple optimization within
sroa_mutables!toeliminate
isdefinedcall by checking load-forwardability of the field.This optimization may be especially useful to eliminate extra allocation
of
Core.Boxinvolved with a capturing closure, e.g.:optimizer: switch to more general data structure of
SSADefUseNow
(du::SSADefUse).usesfield can contain arbitrary "usages" to be eliminated.This structure might be helpful for implementing array SROA, for example.
Also slightly tweaks the implementation of
ccallpreserve elimination.EDIT: will not be merged in this PR.