Skip to content

Optimize SELECT min/max queries with limit #7198

@alamb

Description

@alamb

Is your feature request related to a problem or challenge?

The following query pattern shows up in many of our usecases:

SELECT tag, max(time)
FROM t
GROUP BY tag
ORDER BY max(time) DESC
LIMIT 10

There may also be predicates

In English this query returns the top 10 groups that had the most recent values

A more specific example, @JayjeetAtGithub found that the Jaeger tool issues this query to show the top ten most recent queries

SELECT "trace_id", MAX("time") AS t FROM 'spans' WHERE "service.name" = 'frontend' AND "time" >= to_timestamp(1688713200000000000) AND "time" <= to_timestamp(1689000240000000000) GROUP BY "trace_id" ORDER BY t DESC LIMIT 20;

Describe the solution you'd like

Implement some sort of optimization for this query

Describe alternatives you've considered

I believe #7191 / #7192 from @avantgardnerio is designed for this use case, so that may be sufficient. I did think it was worth documenting the actual end user effect of the change as a separate item which is why I filed this ticket

Additional context

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions