aviatesk commented Dec 1, 2021

This commit sets up basic infrastructure for benchmarking the
Julia-level compilation pipeline.
`InferenceBenchmarks` is based on `InferenceBenchmarker <: AbstractInterpreter`,
which maintains its own global inference cache, so it allows us to
run the compilation pipeline multiple times without reusing caches
generated by previous runs.
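For reference, here is a heavily simplified sketch of what such a custom interpreter can look like; everything except the `InferenceBenchmarker` name is illustrative, the `Core.Compiler` interface is version-dependent and only partially shown, and the actual definition in this PR differs:

```julia
# Minimal sketch (not the actual code added in this PR) of an interpreter
# that carries its own inference cache; the Core.Compiler interface is
# version-dependent and only partially shown here.
using Core.Compiler: AbstractInterpreter, NativeInterpreter,
    InferenceParams, OptimizationParams, InferenceResult, get_world_counter

struct InferenceBenchmarker <: AbstractInterpreter
    native::NativeInterpreter            # reuse the default parameters
    inf_cache::Vector{InferenceResult}   # local inference cache
end
InferenceBenchmarker() = InferenceBenchmarker(NativeInterpreter(), InferenceResult[])

# Forward the required queries to the wrapped NativeInterpreter, except for
# the inference cache, which is kept local so that constructing a fresh
# InferenceBenchmarker for each run discards all previously cached results.
Core.Compiler.InferenceParams(interp::InferenceBenchmarker) = InferenceParams(interp.native)
Core.Compiler.OptimizationParams(interp::InferenceBenchmarker) = OptimizationParams(interp.native)
Core.Compiler.get_world_counter(interp::InferenceBenchmarker) = get_world_counter(interp.native)
Core.Compiler.get_inference_cache(interp::InferenceBenchmarker) = interp.inf_cache
```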

I set up a top-level benchmark group named `"inference"` (`InferenceBenchmarks`),
which is composed of the following subgroups:

- `"inference"`: benchmarks the overall Julia-level compilation pipeline
- `"abstract interpretation"`: benchmarks only abstract interpretation,
  i.e. without optimization
- `"optimization"`: benchmarks only optimization

Here is an example of a benchmark result obtained by comparing two
`JuliaLang/julia` commits, `5c357e9` and `d515f05`:

```julia
# built on 5c357e9
using BenchmarkTools, BaseBenchmarks
BaseBenchmarks.load!("inference")
results = run(BaseBenchmarks.SUITE; verbose = true)
BenchmarkTools.save("5c357e9.json", results)

# built on d515f05
using BenchmarkTools, BaseBenchmarks
BaseBenchmarks.load!("inference")
results = run(BaseBenchmarks.SUITE; verbose = true)
BenchmarkTools.save("d515f05.json", results)

# compare
using BenchmarkTools, BaseBenchmarks
base = BenchmarkTools.load("5c357e9.json")[1]
target = BenchmarkTools.load("d515f05.json")[1]
```
```
julia> leaves(regressions(judge(minimum(target), minimum(base))))
Any[]

julia> leaves(improvements(judge(minimum(target), minimum(base))))
6-element Vector{Any}:
 (Any["inference", "inference", "rand(Float64)"], TrialJudgement(-2.85% => invariant))
 (Any["inference", "inference", "sin(42)"], TrialJudgement(-2.44% => invariant))
 (Any["inference", "inference", "abstract_call_gf_by_type"], TrialJudgement(-1.97% => invariant))
 (Any["inference", "inference", "println(::QuoteNode)"], TrialJudgement(-0.96% => invariant))
 (Any["inference", "optimization", "sin(42)"], TrialJudgement(+1.26% => invariant))
 (Any["inference", "optimization", "println(::QuoteNode)"], TrialJudgement(-6.97% => improvement))
```

This result is very satisfying: the refactor in `d515f05` indeed
improved Julia-level compilation performance by avoiding domtree
construction in the SROA pass in many cases.

aviatesk force-pushed the inf branch 4 times, most recently from a8f0a4a to d386165 on December 1, 2021
aviatesk commented Dec 1, 2021

The failure on Julia nightly is because this newly added benchmark suite hasn't been tuned yet, so it gets tuned to something like `evals=2`: https://github.com/JuliaCI/BaseBenchmarks.jl/runs/4379154336?check_suite_focus=true#step:5:7094
With more than one evaluation per sample, the `setup` phase is not re-run before each evaluation, which causes the failure.

I confirmed this benchmark suite works correctly on my machine.
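
For illustration (a generic BenchmarkTools example, not one of the benchmarks in this suite), this is how a setup-dependent benchmark breaks once more than one evaluation runs per `setup`:

```julia
using BenchmarkTools

# `setup` runs once per sample, so with `evals = 2` the second evaluation
# sees the state the first evaluation left behind and errors.
b = @benchmarkable pop!(v) setup=(v = [1])
run(b; samples = 1, evals = 2)  # second `pop!` hits an empty vector
```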

vtjnash commented Dec 1, 2021

I think you need to specify `evals=1` to `@benchmarkable`
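
For example (again a generic sketch, not the exact benchmark definitions in this suite), `evals` can be fixed when the benchmark is declared:

```julia
using BenchmarkTools

# With `evals = 1` every evaluation is preceded by its own `setup` run,
# so each measurement starts from fresh state.
b = @benchmarkable pop!(v) setup=(v = [1]) evals=1
run(b; samples = 100)
```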

aviatesk commented Dec 2, 2021

Even though I set it manually here?

vtjnash commented Dec 2, 2021

That will work, assuming no other code later calls tune

aviatesk commented Dec 2, 2021

Ah, `evals = 2` is specified for our test case:

```julia
@test begin
    run(BaseBenchmarks.SUITE, verbose = true, samples = 1,
        evals = 2, gctrial = false, gcsample = false);
    true
end
```
