-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Closed
Labels
enhancementNew feature or requestNew feature or request
Description
Is your feature request related to a problem or challenge?
This pattern is common:
SELECT c1, c2
FROM t
ORDER BY c3
LIMIT 10
For example we have queries in IOx like the following (this is the same pattern @NGA-TRAN describes on #7162)
SELECT tag, value1, ...
FROM t
WHERE other_column = 'foo'
ORDER BY time
LIMIT 10
Describe the solution you'd like
If the data IS NOT already sorted, what happens today is a plan like
LIMIT(fetch=10)
SORT(sort_exprs=[c3] fetch=10)
SCAN(...)
And the Sort can take partial advantage of the fetch -- and it will be better after @gruuya 's change in #7180
We can probably do better still with a special operator like the following that uses some specialized structure (perhaps some type of heap)
TOPK(fetch=10, sort_exprs=[c3])
SCAN(...)
Describe alternatives you've considered
If the data is already sorted the right way, DataFusion can just read first N values and stop as described on #7162
Additional context
No response
NGA-TRAN and gruuya
Metadata
Metadata
Assignees
Labels
enhancementNew feature or requestNew feature or request