Skip to content

Conversation

quinnj
Copy link
Member

@quinnj quinnj commented Oct 15, 2021

Fixes #3. A few notes on what's proposed in this PR:

  • It defines a Base.Sort.defalg on
    AbstractVector{<:Union{InlineString, Missing}} to be RadixSort;
    this means when someone calls sort(x) where x is some kind of
    InlineString array, it will use the RadixSort implementation also
    included in this PR
  • The implementation is based on the one defined in
    SortingAlgorithms.jl, but updated to be a bit more modern
    stylistically, improved performance, and more generically work on
    primitive types so that InlineStrings can be sorted appropriately
  • I've also added quite a few code comments to try and explain what
    exactly is going on in various parts of the code to the best of my
    knowledge

Still to do in this PR:

  • Add tests; I've done initial testing locally and it seems to be
    working, but obviously we need actual rigorous testing here
  • Performance tuning; I think there are several knobs we can leverage
    to help fine-tune performance for various cases. The current
    implementation does show big speedups for large arrays, but is
    dramatically slower on very small arrays (< 20 elements). I'd like to
    play around with ways we can gain back some of that performance. I'd
    also like to experiment with different radix sizes and some
    short-circuiting ideas that may help speed things up even further.

Fixes #3. A few notes on what's proposed in this PR:

  * It defines a `Base.Sort.defalg` on
  `AbstractVector{<:Union{InlineString, Missing}}` to be `RadixSort`;
  this means when someone calls `sort(x)` where `x` is some kind of
  `InlineString` array, it will use the `RadixSort` implementation also
  included in this PR
  * The implementation is based on the one defined in
  SortingAlgorithms.jl, but updated to be a bit more modern
  stylistically, improved performance, and more generically work on
  primitive types so that `InlineString`s can be sorted appropriately
  * I've also added quite a few code comments to try and explain what
  exactly is going on in various parts of the code to the best of my
  knowledge

Still to do in this PR:
  * Add tests; I've done initial testing locally and it seems to be
  working, but obviously we need actual rigorous testing here
  * Performance tuning; I think there are several knobs we can leverage
  to help fine-tune performance for various cases. The current
  implementation does show big speedups for large arrays, but is
  dramatically slower on very small arrays (< 20 elements). I'd like to
  play around with ways we can gain back some of that performance. I'd
  also like to experiment with different radix sizes and some
  short-circuiting ideas that may help speed things up even further.
@codecov-commenter
Copy link

codecov-commenter commented Oct 15, 2021

Codecov Report

Merging #10 (a07e2b5) into main (c733724) will increase coverage by 0.66%.
The diff coverage is 93.75%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #10      +/-   ##
==========================================
+ Coverage   89.21%   89.88%   +0.66%     
==========================================
  Files           1        1              
  Lines         371      435      +64     
==========================================
+ Hits          331      391      +60     
- Misses         40       44       +4     
Impacted Files Coverage Δ
src/InlineStrings.jl 89.88% <93.75%> (+0.66%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c733724...a07e2b5. Read the comment docs.

@quinnj
Copy link
Member Author

quinnj commented Oct 16, 2021

cc: @evalparse, @nalimilan @bkamins

Quick update: I've pushed an implementation of cmp for InlineStrings that allows them to avoid a slower generic fallback in Base. I should probably split that out and merge it directly, but we'll see what I can figure out w/ this radix sort stuff in the next day or so.

I did some more benchmarking; RadixSort indeed has some very nice speedups, but not in all cases. The slow cases I could find is when there are < 500 elements in an array, or when the InlineString type is String31 or larger when the # of elements is even up to 50_000 or so. I'm trying to dig into why the performance penalty for String31 and larger. For the small array case, I'm wondering if we should just do a quick check at the start of the RadixSort sort! definition and if it's small enough, switch to MergeSort.

@bkamins
Copy link

bkamins commented Oct 16, 2021

cmp looks good. Thank you.

Regarding performance - I guess we can use some heuristic thresholds to switch between algorithms. Right?

@nalimilan
Copy link
Member

nalimilan commented Oct 16, 2021

Cool, thanks! The fact that radix sort is slow for small vectors has also been spotted at JuliaCollections/SortingAlgorithms.jl#50. It would make sense to adopt a threshold like JuliaCollections/SortingAlgorithms.jl#51 does (comments welcome there BTW).

Have you considered using the RadixSort type from SortingAlgorithms? Ideally we would have a trait like SortingAlgoritmhs.supports_radixsort that we can use e.g. in countmap to use radix supports for all types that support it.

quinnj and others added 2 commits October 16, 2021 21:33
@quinnj
Copy link
Member Author

quinnj commented Oct 17, 2021

I realized we can speed cmp up even further by comparing InlineStrings like integers instead of strings, which makes InlineString MergeSort faster than String MergeSort.

RadixSort is still way faster for larger arrays, except for large InlineString types, like String63, String127, and String255. I want to explore if there are some ways we could optimize radixsort for those types, but we might just bail to MergeSort for those as well.

@quinnj
Copy link
Member Author

quinnj commented Oct 19, 2021

Ok, I think this is ready to go. I played around with different radix sizes for all InlineString types, but the radix size surprisingly didn't make that much of a difference; or at least a dramatic and consistent enough difference to fiddle around with. I was hoping it would make a bigger difference for larger InlineString types, but for those (String31 and larger), MergeSort is consistently the fastest, so I've updated the radix code to only operate on smaller InlineString types. I also updated the MergeSort thresholds, since I was doing so much empirical testing on stuff anyway.

@quinnj quinnj merged commit 431cce5 into main Oct 19, 2021
@quinnj quinnj deleted the jq/radixsort branch October 19, 2021 12:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Faster sorting of InlineStrings

4 participants