Type inference optimize hang on master: generator stuck perform_lifting / simple_walk #51694

@NHDaly

Okay, we have a standalone isolated MRE for the hang on master!

This reliably hangs on master (but not on Julia 1.10), so I'm breaking it out from #51603 into a separate issue, since it is a distinct problem.


I don't think we could have gotten this without @vtjnash walking us through some very gnarly hackery to find which function was stuck in inference (including reading some pointers off of registers and some manual pointer arithmetic and memory reads).

One of my takeaways here is that it would be nice to be able to set some kind of "debug mode" flag to have julia log which function it's inferring before it starts and stops inference / optimization, so that we could have found where it was stuck much more easily. Once we found that, isolating an MRE wasn't too hard.
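
Until something like that exists in the compiler, a rough approximation from user code is to wrap each `code_typed` call in a helper that prints and flushes a line before and after the call. This is only a sketch: `logged_code_typed` is a hypothetical name, not an existing Julia API, and it only narrows a hang down to the top-level signature you pass in, not to the callee the compiler is actually stuck on.

```julia
using InteractiveUtils  # provides code_typed outside the REPL

# Hypothetical helper (not part of Julia): log before and after inferring
# a signature, so the last "start" line without a matching "done" line
# points at the call that never returned.
function logged_code_typed(f, types; io::IO = stderr)
    println(io, "start: ", f, " ", types)
    flush(io)  # make sure the line is written before a potential hang
    t0 = time()
    result = code_typed(f, types)
    println(io, "done:  ", f, " ", types, " (", round(time() - t0; digits = 3), " s)")
    flush(io)
    return result
end

# Usage on a signature that is known to terminate:
logged_code_typed(+, (Int, Int))
```

Calling this over a list of candidate signatures gives a coarse, manual version of the logging described above.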

@aviatesk and @vtjnash: Please try to reproduce the hang from this, and see if you can work out the issue from there!

Once you find the culprit, can you also give a time estimate for the fix? I still think that if it will take more than a couple of days, we should start by reverting the culprit commits so that master is unbroken until the fix lands.

const Iterator = Any

abstract type Shape end

struct ShapeUnion <: Shape
    args::Vector{Shape}
end
shape_null(::Type{Shape}) = SHAPE_NULL
const SHAPE_NULL = ShapeUnion(Shape[])

const CANCELLED = Ref(false)
throw_if_cancelled() = if CANCELLED[] throw(ErrorException("cancelled")) end

@noinline Base.@nospecializeinfer function _union_vec_no_union(args)
    return args
end

# WEIRDLY, this also reproduces if shape_disjuncts is not defined!
function shape_disjuncts end
shape_disjuncts(s::ShapeUnion) = s.args
shape_disjuncts(s::Shape) = Shape[s]
# ^ (You can try commenting out the above three lines to produce another hang as well).

function shape_union(::Type{Shape}, args::Iterator)
    return _union(args)
end
function _union(args::Iterator)
    if any(arg -> arg isa ShapeUnion, args)
        # Call the entry point rather than `_union_vec_no_union` because we are uncertain
        # about the size and because there might be some optimization opportunity.
        return shape_union(Shape, (disj for arg in args for disj in shape_disjuncts(arg)))
    end

    if !(args isa Vector)
        args = collect(Shape, args)
        # respect _union_vec_no_union's assumption
        isempty(args) && return shape_null(Shape)
        length(args) == 1 && return only(args)
    end

    # If this is a big union, check for cancellation
    length(args) > 100 && throw_if_cancelled()
    return _union_vec_no_union(args)
end

# Reproduction:
#=
julia> code_typed(
           _union,
           (Vector{Shape},),
       )
=#
julia> VERSION
v"1.11.0-DEV.638"

julia> include("inference_hang_repro.jl")
_union (generic function with 1 method)

julia> code_typed(
           _union,
           (Vector{Shape},),
       )
# it is hanging here....

Originally posted by @NHDaly in #51603 (comment)

Interrupting it after it has run for long enough gives this stack trace:

julia> code_typed(
           _union,
           (Vector{Shape},),
       )
^C
ERROR: InterruptException:
Stacktrace:
   [1] getindex(compact::Core.Compiler.IncrementalCompact, ssa::Core.SSAValue)
     @ Core.Compiler ./compiler/ssair/ir.jl:701
   [2] simple_walk(compact::Core.Compiler.IncrementalCompact, defssa::Any, callback::Core.Compiler.var"#476#477")
     @ Core.Compiler ./compiler/ssair/passes.jl:202
   [3] simple_walk
     @ Core.Compiler ./compiler/ssair/passes.jl:188 [inlined]
   [4] lifted_value(compact::Core.Compiler.IncrementalCompact, old_node_ssa::Any, old_value::Any, lifted_philikes::Vector{…}, lifted_leaves::Core.Compiler.IdDict{…}, reverse_mapping::Core.Compiler.IdDict{…})
     @ Core.Compiler ./compiler/ssair/passes.jl:623
   [5] perform_lifting!(compact::Core.Compiler.IncrementalCompact, visited_philikes::Vector{…}, cache_key::Any, result_t::Any, lifted_leaves::Core.Compiler.IdDict{…}, stmt_val::Any, lazydomtree::Core.Compiler.LazyGenericDomtree{…})
     @ Core.Compiler ./compiler/ssair/passes.jl:731
   [6] sroa_pass!(ir::Core.Compiler.IRCode, inlining::Core.Compiler.InliningState{Core.Compiler.NativeInterpreter})
     @ Core.Compiler ./compiler/ssair/passes.jl:1170
   [7] run_passes_ipo_safe(ci::Core.CodeInfo, sv::Core.Compiler.OptimizationState{…}, caller::Core.Compiler.InferenceResult, optimize_until::Nothing)
     @ Core.Compiler ./compiler/optimize.jl:797
   [8] run_passes_ipo_safe
     @ Core.Compiler ./compiler/optimize.jl:812 [inlined]
   [9] optimize(interp::Core.Compiler.NativeInterpreter, opt::Core.Compiler.OptimizationState{…}, caller::Core.Compiler.InferenceResult)
     @ Core.Compiler ./compiler/optimize.jl:786
  [10] _typeinf(interp::Core.Compiler.NativeInterpreter, frame::Core.Compiler.InferenceState)
     @ Core.Compiler ./compiler/typeinfer.jl:265
  [11] typeinf(interp::Core.Compiler.NativeInterpreter, frame::Core.Compiler.InferenceState)
     @ Core.Compiler ./compiler/typeinfer.jl:216
  [12]
     @ Core.Compiler ./compiler/typeinfer.jl:863
  [13]
     @ Core.Compiler ./compiler/abstractinterpretation.jl:617
  [14]
     @ Core.Compiler ./compiler/abstractinterpretation.jl:89
  [15]
     @ Core.Compiler ./compiler/abstractinterpretation.jl:2080
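
Because the hang never returns on its own, checking for it automatically (for example while bisecting) probably needs a subprocess with a timeout. A minimal sketch, assuming the repro script ends with the `code_typed` call uncommented and loads `InteractiveUtils` itself; the name `hangs` and the 60 second default are made up for illustration:

```julia
# Hypothetical helper: run `script` in a fresh Julia process and report
# whether it failed to finish within `timeout` seconds.
function hangs(script::AbstractString; timeout::Real = 60)
    cmd = `$(Base.julia_cmd()) --startup-file=no $script`
    # Start the child without blocking; discard its output.
    proc = run(pipeline(cmd; stdout = devnull, stderr = devnull); wait = false)
    # Poll until the child exits or the timeout elapses.
    status = timedwait(() -> process_exited(proc), timeout)
    if status === :timed_out
        kill(proc, Base.SIGKILL)  # the child is stuck, so terminate it forcibly
        wait(proc)                # reap the killed process
        return true
    end
    return false
end

# Example (assumes the file exists and triggers the code_typed call):
# hangs("inference_hang_repro.jl"; timeout = 120)
```

A `true` result only means the script did not finish in time; a slow but terminating compile would look the same, so the timeout should be generous compared with the runtime on a known-good build.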

Labels: bug, compiler:inference