Skip to content

WITHIN GROUP query syntax #433

@samuelcolvin

Description

@samuelcolvin

@@frankie567:

I just tried with this new database engine and it's clearly day-and-night in terms of performance. We struggled before to load our dashboards for timespans > 3 hours; now it's able to load 30 days span pretty quickly. Really great move 👏

I've an existing query that's no longer compatible. The goal was to get request durations percentiles. It looks like this:

WITH duration AS (
  SELECT
    time_bucket('%time_bucket_duration%', start_timestamp) AS x,
    extract(milliseconds from end_timestamp - start_timestamp) as duration
  FROM records
  WHERE otel_scope_name = 'opentelemetry.instrumentation.asgi' and parent_span_id is null
)
SELECT
  x,
  percentile_cont(0.99) WITHIN GROUP (ORDER BY duration) as percentile_99,
  percentile_cont(0.95) WITHIN GROUP (ORDER BY duration) as percentile_95
FROM duration
GROUP BY x
ORDER BY x DESC;

But DataFusion states it doesn't support WITHIN GROUP:

Capture d’écran 2024-09-18 à 17 19 35

I'm pretty sure there is another way to achieve that query but don't have time right now to explore this ☺️

Originally posted by @frankie567 in #408 (comment)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions