@graalvmbot
Collaborator

The PR introduces Transitive Effect Summary Analysis (TESA):

TESA in a Nutshell

  • Key idea: Give compiler phases the ability to peek behind an invoke without inlining.
  • Context: Native Image, closed-world assumption, sound (over-approximating) call graph available.
  • How: Compute an effect summary for each method, then propagate the effects through the call graph from callees to callers until a fixed point is reached.

Overview

  • TESA is a reverse (callees→callers), interprocedural, summary-based analysis that propagates “may” effects from callees back to callers over the call graph.
  • TESA computes concise effect summaries per method and uses them to optimize invokes without requiring inlining.
  • TESA provides a generic TesaEffect interface and a concrete analysis, KilledLocationTesa, wired into the Native Image build. Other analyses (e.g., for side effects, exceptions) are being tested at the moment and may be included in future PRs.
  • The results are applied before compilation (to the killed locations of invokes).

What can be computed by TESA?

  • Memory effects: accessed/written location identities.
  • Side effects: whether the method may have observable side effects.
  • Exceptions: whether the method may throw.
  • Allocations: whether the method may allocate (and related hints).
  • Loops: whether the method contains loops (and a simple depth signal).
  • Size/pressure: code size, register pressure/clobber hints.
  • Anything that can be easily computed from the Graal IR and represented as a finite lattice with low height.

What could be enabled at invoke sites?

  • Code motion when invokes are side-effect-free and/or do not touch memory.
  • Redundant field-load removal when invoke memory effects are disjoint from nearby loads/stores.
  • Exception-path elimination: convert InvokeWithExceptionNode to InvokeNode for non-throwing sites.
  • Hints for inlining: info about allocations, exceptions, loops, ...
  • Fewer spills: exploit low clobber/pressure signals around invokes.
  • Safepoint policy: space/avoid safepoints where callers/callees run fast without loops.

Effect Summary Representation

  • one state per analysis per method
  • effect summaries represented as finite-height lattices with small domains (for fast convergence)
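
As a rough illustration of such a lattice, the sketch below models a summary as a capped set of killed location names with an explicit Top element. The class name LocationSummary, the cap of 4, and the use of strings for locations are assumptions made for this example; they are not the LocationEffect implementation from the PR.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical summary: a small set of killed location names, or Top ("may kill anything"). */
final class LocationSummary {
    static final int MAX_LOCATIONS = 4; // assumed cap; it bounds the height of the lattice
    static final LocationSummary EMPTY = new LocationSummary(false, Collections.emptySet());
    static final LocationSummary TOP = new LocationSummary(true, Collections.emptySet());

    final boolean top;
    final Set<String> locations;

    private LocationSummary(boolean top, Set<String> locations) {
        this.top = top;
        this.locations = locations;
    }

    /** Collapses to Top once the set grows past the cap. */
    private static LocationSummary capped(Set<String> set) {
        return set.size() > MAX_LOCATIONS ? TOP : new LocationSummary(false, Collections.unmodifiableSet(set));
    }

    static LocationSummary of(String... names) {
        return capped(new TreeSet<>(Arrays.asList(names)));
    }

    /** Union-like merge: the result is at least as conservative as both inputs. */
    LocationSummary merge(LocationSummary other) {
        if (top || other.top) {
            return TOP;
        }
        Set<String> union = new TreeSet<>(locations);
        union.addAll(other.locations);
        return capped(union);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof LocationSummary
                && top == ((LocationSummary) o).top
                && locations.equals(((LocationSummary) o).locations);
    }

    @Override
    public int hashCode() {
        return Objects.hash(top, locations);
    }
}
```

Because every merge either leaves a summary unchanged, adds at least one location, or jumps to Top, a single method's state can change at most MAX_LOCATIONS + 1 times in this sketch, which is what keeps propagation short.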

Core Algorithm

  1. Save the call graph computed by reachability analysis:
    • Compute reverse (callee to caller) edges.
    • A single pass over the call graph.
    • May be parallelized.
    • Make sure everything TESA needs survives the after-analysis cleanup.
  2. Compute the initial state for each method:
    • Runs after StrengthenGraphs, so the initial state already benefits from the results of reachability analysis.
    • Each method processed in parallel.
    • Ideally a single pass over the Graal IR (or something "linear-enough").
  3. Fixed-point algorithm:
    • Propagate the effect summaries from callees to callers until a fixed-point is reached.
    • May be parallelized.
    • If the lattices have low height, convergence should be fast.
    • Fixed-point could be avoided by SCC-condensation.
  4. Optimizations:
    • Apply the results by modifying the Invokes and/or their surrounding nodes.
    • Each method processed in parallel.
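
Step 3 can be sketched as a sequential worklist loop over the reverse call graph. This is a minimal illustration under stated assumptions: methods are plain strings, the reverse edges are a map from callee to callers, and the names WorklistPropagation and propagate are hypothetical. It does not reproduce TesaReverseCallGraph or the parallel implementation in the PR.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

final class WorklistPropagation {
    /**
     * Propagates summaries from callees to callers until nothing changes.
     *
     * @param callersOf reverse call graph (callee to its callers)
     * @param state     initial summary for every method; updated in place
     * @param merge     monotone, union-like merge of two summaries
     */
    static <T> void propagate(Map<String, List<String>> callersOf,
                              Map<String, T> state,
                              BinaryOperator<T> merge) {
        // Start with every method so each initial summary reaches its callers at least once.
        Deque<String> worklist = new ArrayDeque<>(state.keySet());
        while (!worklist.isEmpty()) {
            String callee = worklist.poll();
            T calleeState = state.get(callee);
            for (String caller : callersOf.getOrDefault(callee, List.of())) {
                T old = state.get(caller);
                T merged = merge.apply(old, calleeState);
                if (!merged.equals(old)) {
                    // The caller's summary grew, so its own callers must be revisited.
                    state.put(caller, merged);
                    worklist.add(caller);
                }
            }
        }
    }
}
```

With a boolean "may throw" fact and `Boolean::logicalOr` as the merge, a method that throws marks all of its transitive callers, including those inside recursive cycles, while unrelated methods keep their initial state.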

Safety and Soundness

  • May-fact design: meets are conservative (union-like); unknown or native/unsafe sites fall back to Top.
  • Polymorphism: non-direct invokes meet across all reachable targets.
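
The two rules above can be illustrated with a two-element lattice. In this sketch the enum SideEffect, the class InvokeSummary, and the choice to treat an unknown or empty target set as Top are assumptions made for the example, not code from the PR.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical two-point lattice; MAY plays the role of Top. */
enum SideEffect {
    NONE, MAY;

    SideEffect merge(SideEffect other) {
        return (this == MAY || other == MAY) ? MAY : NONE;
    }
}

final class InvokeSummary {
    /**
     * Summary of one invoke: the merge over all of its reachable targets.
     * Anything the analysis cannot see is treated as Top.
     */
    static SideEffect summarize(List<String> targets, Map<String, SideEffect> methodState) {
        if (targets == null || targets.isEmpty()) {
            return SideEffect.MAY; // unknown target set: stay conservative
        }
        SideEffect result = SideEffect.NONE;
        for (String target : targets) {
            // A target without a summary (for example a native method) counts as Top.
            result = result.merge(methodState.getOrDefault(target, SideEffect.MAY));
        }
        return result;
    }
}
```

A virtual call is therefore only as good as its worst reachable target: one side-effecting or unanalyzable implementation is enough to make the whole invoke MAY.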

Current State of TESA

  • Prototype available in this PR.
  • Enabled with -H:+TransitiveEffectSummaryAnalysis (default true).
  • The core infrastructure is in place and an example analysis KilledLocationTesa is implemented and open for review.

Architecture Overview

  • Effects: A small generic interface TesaEffect<T>, with LocationEffect as an initial implementation.
  • Analyses: Each effect type is a separate TESA instance extending AbstractTesa<T>, e.g., KilledLocationTesa, or (in future PRs) SideEffectTesa, ThrowsExceptionTesa, AllocationTesa, LoopTesa, ...
  • Call graph: TesaReverseCallGraph records per-invoke target info and caller edges, built from the analysis universe and saved during image build.
  • Propagation: Worklist-driven, callees-to-callers, monotone “may” facts with union-like merges; polymorphic calls merge across all targets.
  • Application: Analyses compute initial per-method facts with a single IR scan, propagate to fixpoint, then apply
    results to compilation graphs before optimization.

Performance Characteristics

  • Single-pass local fact extraction per method.
  • Finite-height lattices with small domains.
  • Worklist propagation converges quickly (200-300 ms); SCC condensation can be added if needed for large recursive regions.

Extensibility

  • A new analysis can be added by:
    • Defining an effect for your fact domain.
    • Implementing computeInitialState and optimizeInvoke in a subclass of AbstractTesa<T>.
    • Registering it in TesaEngine.
  • Example use cases: synchronization/blocking flags; refined exception sets; return-value properties (ranges; a fresh, unaliased object returned on every call).
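
The shape of such an extension might look like the sketch below, using the synchronization/blocking flag as the fact. Only the method names computeInitialState and optimizeInvoke come from the description above. Their signatures, the stand-in types SketchMethod, SketchInvoke and SketchTesa, and the node-kind strings are all hypothetical, chosen so the example is self-contained; the real AbstractTesa<T> works on Graal IR and may look quite different.

```java
import java.util.List;

/** Stand-in for a method body; a real analysis would scan Graal IR nodes instead. */
final class SketchMethod {
    final String name;
    final List<String> nodeKinds; // hypothetical node-kind labels, e.g. "MonitorEnter"

    SketchMethod(String name, List<String> nodeKinds) {
        this.name = name;
        this.nodeKinds = nodeKinds;
    }
}

/** Stand-in for an invoke site that carries a conservative default. */
final class SketchInvoke {
    boolean mayBlock = true;
}

/** Stand-in for the analysis base class; the signatures are assumptions. */
abstract class SketchTesa<T> {
    /** Local fact for one method, before any propagation. */
    abstract T computeInitialState(SketchMethod method);

    /** Applies the propagated fact of the invoke's targets to the invoke site. */
    abstract void optimizeInvoke(SketchInvoke invoke, T targetState);
}

/** Example analysis: may the method (transitively) block on a monitor? */
final class MayBlockTesa extends SketchTesa<Boolean> {
    @Override
    Boolean computeInitialState(SketchMethod method) {
        // Single linear scan over the method's nodes.
        return method.nodeKinds.contains("MonitorEnter");
    }

    @Override
    void optimizeInvoke(SketchInvoke invoke, Boolean targetMayBlock) {
        if (!targetMayBlock) {
            invoke.mayBlock = false; // only ever refine away from the conservative default
        }
    }
}
```

The registration step in TesaEngine is omitted here because its API is not shown in this description.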

Limitations and future work

  • Location identities: start coarse and refine as needed; cap set sizes to avoid “Top” blow-up.
  • Exception precision: currently boolean; type sets could be added with careful closed-world handling.
  • Optional SCC condensation/topological order for predictable iteration counts on large graphs.
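
One way the optional SCC ordering could be obtained is Tarjan's algorithm run on the forward (caller to callee) edges: it emits each strongly connected component only after every component reachable from it, so the output is already in callees-first order. The sketch below is a plain recursive version with hypothetical names (SccOrder, calleesFirst) and string method identifiers. It is not part of the PR, and a production version would need an iterative traversal to avoid deep recursion on large call graphs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SccOrder {
    private final Map<String, List<String>> calleesOf;
    private final Map<String, Integer> index = new HashMap<>();
    private final Map<String, Integer> lowLink = new HashMap<>();
    private final Deque<String> stack = new ArrayDeque<>();
    private final Set<String> onStack = new HashSet<>();
    private final List<List<String>> components = new ArrayList<>();
    private int counter = 0;

    private SccOrder(Map<String, List<String>> calleesOf) {
        this.calleesOf = calleesOf;
    }

    /** Returns the SCCs of the call graph, callees before their callers. */
    static List<List<String>> calleesFirst(Map<String, List<String>> calleesOf) {
        SccOrder scc = new SccOrder(calleesOf);
        for (String method : calleesOf.keySet()) {
            if (!scc.index.containsKey(method)) {
                scc.visit(method);
            }
        }
        return scc.components;
    }

    private void visit(String method) {
        index.put(method, counter);
        lowLink.put(method, counter);
        counter++;
        stack.push(method);
        onStack.add(method);
        for (String callee : calleesOf.getOrDefault(method, List.of())) {
            if (!index.containsKey(callee)) {
                visit(callee);
                lowLink.put(method, Math.min(lowLink.get(method), lowLink.get(callee)));
            } else if (onStack.contains(callee)) {
                // Back edge into the component currently being built.
                lowLink.put(method, Math.min(lowLink.get(method), index.get(callee)));
            }
        }
        if (lowLink.get(method).equals(index.get(method))) {
            // 'method' is the root of a component: pop its members off the stack.
            List<String> component = new ArrayList<>();
            String member;
            do {
                member = stack.pop();
                onStack.remove(member);
                component.add(member);
            } while (!member.equals(method));
            components.add(component);
        }
    }
}
```

Processing the components in the returned order means each acyclic part of the graph is visited once, and iteration to a fixed point is only needed inside components with more than one method (or a self-recursive one).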

Evaluation

While TESA can typically improve more than 5% of invokes and 10% of methods (see the Spring Petclinic results below), the effect is not visible in any of the peak metrics apart from a very slight, inconclusive binary size reduction that is within the level of noise. Note that the table also includes experimental analyses that are not part of this PR; they will be tested more extensively and included in follow-up PRs if they prove useful.

Transitive Effect Summary Analysis Results:
  - Scope: 115611 methods, 428755 invokes
  - Call graph initialization: 267.36 ms
  - Worst case TESA time (analyses run in parallel): 301.77 ms
  - Details:
    ------------------------- | ------------------- | ------------------- | ---------
    Analysis                  |    Methods (%)      |    Invokes (%)      | Time (ms)
    ------------------------- | ------------------- | ------------------- | ---------
    KilledLocationTesa        |      18975 (16.41%) |      36127 ( 8.43%) |   265.99
    KilledLocationSetTesa     |      19470 (16.84%) |      37137 ( 8.66%) |   266.50
    ThrowsExceptionTesa       |      16630 (14.38%) |      29857 ( 6.96%) |   251.88
    AllocationTesa            |      22347 (19.33%) |      40134 ( 9.36%) |   287.81
    LoopTesa                  |      28708 (24.83%) |      72177 (16.83%) |   301.77
    SideEffectTesa            |      17724 (15.33%) |      34625 ( 8.08%) |   266.02

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Nov 18, 2025