Unify Sorting Implementations #5879

@tustvold

Description

Is your feature request related to a problem or challenge?

Currently there is separate logic to handle in-memory sorts, spilling sorts, and merging sorts, spread across ExternalSorter and SortPreservingMergeStream. This logic is incredibly hard to follow and maintain, and there is a high likelihood of inconsistency between the implementations.

Additionally, the in-memory sort implementation currently relies on concatenating batches, which is extremely memory-inefficient for dictionaries because it concatenates the underlying dictionary values as well.
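To illustrate the memory cost, here is a minimal sketch of naive dictionary concatenation. It is a toy model only: `DictColumn` and `concat_naive` are hypothetical names made up for this example, and it does not use or reproduce the arrow crate's actual concatenation code.

```rust
/// Toy dictionary-encoded column (hypothetical type, not an arrow type):
/// `keys[i]` is an index into `values`.
#[derive(Debug, Clone, PartialEq)]
struct DictColumn {
    keys: Vec<usize>,
    values: Vec<String>,
}

/// Naive concatenation: append every batch's dictionary values wholesale
/// and shift that batch's keys by the number of values already present.
/// No deduplication happens, so repeated dictionaries are stored repeatedly.
fn concat_naive(batches: &[DictColumn]) -> DictColumn {
    let mut out = DictColumn { keys: Vec::new(), values: Vec::new() };
    for batch in batches {
        // Keys of this batch must now point past the values copied so far.
        let offset = out.values.len();
        out.keys.extend(batch.keys.iter().map(|k| k + offset));
        out.values.extend(batch.values.iter().cloned());
    }
    out
}
```

In this model, concatenating N batches that all share the same dictionary yields a values vector N times larger than necessary, even though the set of distinct values has not grown.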

Describe the solution you'd like

I would like in-memory sort to proceed by first sorting each batch individually, and then performing a sort-preserving merge over the sorted batches.
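A rough sketch of that shape, assuming plain `Vec<i64>` batches in place of real record batches and a binary heap as the merge structure. The function name `sort_then_merge` is hypothetical, and this is not the ExternalSorter or SortPreservingMergeStream code.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Sort every batch independently, then k-way merge the sorted runs.
/// Batches are modelled as Vec<i64> purely for illustration.
fn sort_then_merge(mut batches: Vec<Vec<i64>>) -> Vec<i64> {
    // Step 1: sort each batch on its own; nothing is concatenated.
    for batch in batches.iter_mut() {
        batch.sort_unstable();
    }

    // Step 2: sort-preserving merge. The heap holds one cursor per
    // non-empty batch as (value, batch index, row index); wrapping the
    // tuple in Reverse turns the max-heap into a min-heap.
    let mut heap = BinaryHeap::new();
    for (batch_idx, batch) in batches.iter().enumerate() {
        if let Some(&first) = batch.first() {
            heap.push(Reverse((first, batch_idx, 0usize)));
        }
    }

    let total: usize = batches.iter().map(Vec::len).sum();
    let mut merged = Vec::with_capacity(total);
    while let Some(Reverse((value, batch_idx, row_idx))) = heap.pop() {
        merged.push(value);
        // Advance the cursor of the batch that was just consumed from.
        if let Some(&next) = batches[batch_idx].get(row_idx + 1) {
            heap.push(Reverse((next, batch_idx, row_idx + 1)));
        }
    }
    merged
}
```

Because the merge only ever reads one row per batch at a time, the same second step could in principle serve sorted in-memory batches and sorted spill files alike, which is the unification this issue is after.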

Describe alternatives you've considered

No response

Additional context

This would help with #5230.

Labels

enhancement: New feature or request
