performance regression on high-dimensional array iteration using CartesianIndices (no simd) #38073

@johnnychen94

It turns out that #37829 improved iteration performance for 2D arrays but slowed down iteration for higher-dimensional (>= 4) arrays...

julia> using BenchmarkTools

julia> function arr_sum(X)
           val = zero(eltype(X))
           R = CartesianIndices(X)
           for i in R
               @inbounds val += X[i]
           end
           val
       end
arr_sum (generic function with 1 method)

julia> X = rand(4, 4, 4, 4, 4, 4);

julia> @btime arr_sum($X)
  5.584 μs (0 allocations: 0 bytes) # 1.6.0-DEV.1262
  5.790 μs (0 allocations: 0 bytes) # 17a3c7702e2cb20171d1211606343fc50533a588
  3.575 μs (0 allocations: 0 bytes) # 9405bf51a726a6383e6911eeb4235ba21ab3daee
  3.572 μs (0 allocations: 0 bytes) # 1.5.2
  5.959 μs (0 allocations: 0 bytes) # 1.0.5

julia> X = rand(64, 64);

julia> @btime arr_sum($X)
  3.627 μs (0 allocations: 0 bytes) # 17a3c7702e2cb20171d1211606343fc50533a588
  3.734 μs (0 allocations: 0 bytes) # 1.5.2

SIMD and LinearIndices are not affected.

simd
julia> using BenchmarkTools

julia> function arr_sum_simd(X)
           val = zero(eltype(X))
           R = CartesianIndices(X)
           @simd for i in R
               @inbounds val += X[i]
           end
           val
       end
arr_sum_simd (generic function with 1 method)

julia> X = rand(4, 4, 4, 4, 4, 4);

julia> @btime arr_sum_simd($X)
  3.593 μs (0 allocations: 0 bytes) # 1.6.0-DEV.1262
  3.827 μs (0 allocations: 0 bytes) # 1.5.2
  3.585 μs (0 allocations: 0 bytes) # 1.0.5

LinearIndices
julia> using BenchmarkTools

julia> function arr_sum_linear(X)
           val = zero(eltype(X))
           R = LinearIndices(X)
           for i in R
               @inbounds val += X[i]
           end
           val
       end
arr_sum_linear (generic function with 1 method)

julia> X = rand(4, 4, 4, 4, 4, 4);

julia> @btime arr_sum_linear($X)
  3.707 μs (0 allocations: 0 bytes) # 1.6.0-DEV.1262
  3.626 μs (0 allocations: 0 bytes) # 1.5.2
  3.796 μs (0 allocations: 0 bytes) # 1.0.5
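
As a sanity check that the three variants above compute the same thing, so that the timings are comparable, something like the following can be run. It uses only Base (no BenchmarkTools). The comparison uses `≈` rather than `==` because `@simd` allows the floating-point additions to be reordered, which may change the result in the last bits.

```julia
# Same three functions as in the issue, repeated so the snippet runs on its own.
function arr_sum(X)
    val = zero(eltype(X))
    for i in CartesianIndices(X)
        @inbounds val += X[i]
    end
    val
end

function arr_sum_simd(X)
    val = zero(eltype(X))
    @simd for i in CartesianIndices(X)
        @inbounds val += X[i]
    end
    val
end

function arr_sum_linear(X)
    val = zero(eltype(X))
    for i in LinearIndices(X)
        @inbounds val += X[i]
    end
    val
end

X = rand(4, 4, 4, 4, 4, 4)

# All three should agree up to floating-point rounding.
@assert arr_sum(X) ≈ arr_sum_linear(X)
@assert arr_sum(X) ≈ arr_sum_simd(X)
```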
