Skip to content

Unnecessary allocations in sorting #45326

@LilithHafner

Description

@LilithHafner
@time sort(rand(Int, 3000, 3000); dims=1);
v1.7.2 => 0.715428 seconds (7 allocations: 137.329 MiB, 21.84% gc time)
v1.9.0 => 9.917015 seconds (12.01 k allocations: 201.306 GiB, 92.10% gc time)

The problem is that radix_sort needs scratch space to work, sort!(v, lo, hi, ::AdaptiveSort) allocates similar(v) to create that scratch space, and sort!(A::AbstractArray; dims) repeatedly calls sort!(v, lo, hi, ::AdaptiveSort) with hi - lo << length(v) == length(A). This results in repeated allocations of size length(A) when hi-lo is all that is needed.

Possible solutions:

  1. Continue using QuickSort for sort!(A::AbstractArray; dims). In 1.7.2, we use QuickSort by default to avoid this issue at the cost of losing stability. This is a 1 line of code solution.
  2. Replace similar(v) with similar(v, lo:hi) in sort!(v, lo, hi, ::AdaptiveSort). This would add a dependency on OffsetArrays which would be tricky. This is the only solution that also fixes sort!(rand(100000), 1, 100, AdaptiveSort, Forward).
  3. Support passing in a scratch-space vector into sort!(v, lo, hi, ::AdaptiveSort) and use that feature in sort!(A::AbstractArray; dims). Adaptive sort does not always use a scratch space vector, so there would have to be some overhead. This has the potential to be the most performant solution for sort!(A::AbstractArray; dims).

Metadata

Metadata

Assignees

No one assigned

    Labels

    performanceMust go fasterregressionRegression in behavior compared to a previous versionsortingPut things in order

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions