-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
Closed
Labels
docsThis change adds or pertains to documentationThis change adds or pertains to documentationsortingPut things in orderPut things in order
Description
Currently, when sorting a collection with sort(collection; by=transformation), the by callable can be called multiple times on the same element (essentially whenever it is compared to another element within the sort algorithm).
MWE
using Random
struct Counter
d::Dict
end
Counter() = Counter(Dict{Any, Int}())
updatecount(counter::Counter, x) = counter.d[x] = get(counter.d, x, 0) + 1
getcount(counter::Counter, x) = get(counter.d, x, 0)
function counted_transformation(x; counter)
updatecount(counter, x)
return x
end
@show collection = shuffle(1:10)
counter = Counter()
sort(collection; by=x->counted_transformation(x; counter=counter))
@show [getcount(counter, x) for x in collection]Output:
collection = shuffle(1:10) = [2, 6, 9, 10, 3, 5, 7, 8, 1, 4]
[getcount(counter, x) for x = collection] = [3, 7, 8, 7, 7, 6, 6, 5, 8, 7]As you can see, for some elements by has been called 7 or 8 times.
This can be quite expensive if the by transformation involves some non-trivial computation.
IMHO, it would be preferable if sort transforms these values only once before sorting.
From the high-level, this could look like this (this allocates a bunch, so this is not a good workaround)
# single transformation sort
function sort_custom(collection; by=identity, kwargs...)
collection_with_transformation = map(x->x=>by(x), collection)
sort!(collection_with_transformation; by=x->last(x))
return map(x -> first(x), collection_with_transformation)
endMetadata
Metadata
Assignees
Labels
docsThis change adds or pertains to documentationThis change adds or pertains to documentationsortingPut things in orderPut things in order