
Conversation

@adelapena

What is the issue

The slow query logger is one of the most useful tools when dealing with performance-related incidents. Currently it simply prints the CQL representation of the slow query and its running time, so while it lets us identify problematic SAI queries, we then have to go to the aggregated metrics, where it's difficult to see which values correspond to the problematic query.

I think it would be useful if the slow query logger also printed the SAI query metrics and plan. That way we could examine log records at the time of the incident to see what's going on with individual slow queries, without needing to extract conclusions from aggregated metrics.

What does this PR fix and why was it fixed

This PR adds SAI-specific information to the log messages produced when slow queries are detected. That information consists of the metrics in the QueryContext and the tree view of the query Plan. There is a cassandra.sai.slow_query_log.details_enabled property to disable this extended logging.
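
For example, assuming the property name as given in this description (standard JVM system property syntax, nothing PR-specific), the extended logging could be switched off on a node with:

    -Dcassandra.sai.slow_query_log.details_enabled=false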

This is how slow queries are currently logged:

DEBUG [node1_isolatedExecutor:1] node1 2025-10-09 11:14:57,354 MonitoringTask.java:175 - 2 operations were slow in the last 131 msecs:
<SELECT s FROM distributed_test_keyspace.t WHERE (s = s_2 OR s = s_3) AND  ALLOW FILTERING>, time 107 msec - slow timeout 100 msec
<SELECT n FROM distributed_test_keyspace.t WHERE n > 1 AND  ALLOW FILTERING>, was slow 2 times: avg/min/max 104/104/105 msec - slow timeout 100 msec

And this is how they would look with SAI details:

DEBUG [node1_isolatedExecutor:1] node1 2025-10-09 11:14:57,354 MonitoringTask.java:175 - 2 operations were slow in the last 131 msecs:
<SELECT * FROM distributed_test_keyspace.t WHERE (s = s_2 OR s = s_3) AND  ALLOW FILTERING>, time 111 msec - slow timeout 100 msec
  SAI slow query metrics:
    sstablesHit: 2
    segmentsHit: 2
    partitionsRead: 2
    rowsFiltered: 2
    rowsPreFiltered: 0
    trieSegmentsHit: 2
    bkdPostingListsHit: 0
    bkdSegmentsHit: 0
    bkdPostingsSkips: 0
    bkdPostingsDecodes: 0
    triePostingsSkips: 0
    triePostingsDecodes: 2
    annGraphSearchLatencyNanos: 0
    shadowedPrimaryKeyCount: 0
  SAI slow query plan:
    Limit 2147483647 (rows: 1.8, cost/row: 101.5, cost: 3000.0..3177.6)
     └─ Filter (s = s_2 OR s = s_3) (sel: 1.000000000) (rows: 1.8, cost/row: 101.5, cost: 3000.0..3177.6)
         └─ Fetch (rows: 1.8, cost/row: 101.5, cost: 3000.0..3177.6)
             └─ Union (keys: 1.8, cost/key: 0.1, cost: 3000.0..3000.2)
                 ├─ LiteralIndexScan of t_s_idx (sel: 0.250000000, step: 1.0) (keys: 1.0, cost/key: 0.1, cost: 1500.0..1500.1)
                 │  predicate: Expression{name: s, op: EQ, lower: (s_2, true), upper: (s_2, true), exclusions: []}
                 └─ LiteralIndexScan of t_s_idx (sel: 0.250000000, step: 1.0) (keys: 1.0, cost/key: 0.1, cost: 1500.0..1500.1)
                    predicate: Expression{name: s, op: EQ, lower: (s_3, true), upper: (s_3, true), exclusions: []}
<SELECT * FROM distributed_test_keyspace.t WHERE n > 1 AND  ALLOW FILTERING>, was slow 2 times: avg/min/max 258/103/413 msec - slow timeout 100 msec
  SAI slowest query metrics:
    sstablesHit: 3
    segmentsHit: 3
    partitionsRead: 3
    rowsFiltered: 6
    rowsPreFiltered: 0
    trieSegmentsHit: 0
    bkdPostingListsHit: 3
    bkdSegmentsHit: 3
    bkdPostingsSkips: 0
    bkdPostingsDecodes: 7
    triePostingsSkips: 0
    triePostingsDecodes: 0
    annGraphSearchLatencyNanos: 0
    shadowedPrimaryKeyCount: 0
  SAI slowest query plan:
    Limit 2147483647 (rows: 6.0, cost/row: 101.1, cost: 4500.0..5106.8)
     └─ Filter n > 1 (sel: 1.000000000) (rows: 6.0, cost/row: 101.1, cost: 4500.0..5106.8)
         └─ Fetch (rows: 6.0, cost/row: 101.1, cost: 4500.0..5106.8)
             └─ NumericIndexScan of t_n_idx (sel: 0.857142857, step: 1.0) (keys: 6.0, cost/key: 0.1, cost: 4500.0..4500.6)
                predicate: Expression{name: n, op: RANGE, lower: (1, false), upper: (null, false), exclusions: []}

Note that the queries are printed as invalid CQL and include user data. That is being dealt with in CNDB-15280, which is under review.

@adelapena adelapena self-assigned this Oct 9, 2025
@github-actions

github-actions bot commented Oct 9, 2025

Checklist before you submit for review

  • This PR adheres to the Definition of Done
  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit starting with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one


@ekaterinadimitrova2 left a comment


I did a first round of review and left some questions and suggestions, mostly trivial.
I am wondering how good of an idea it is to have to restart a node to disable it if we sense too much noise or performance degradation once enabled?

Sonar didn't run, but we can check it with the next CI run; it didn't prevent me from doing an initial review.

public Index.Searcher indexSearcher()
{
    return indexQueryPlan == null ? null : indexQueryPlan.searcherFor(this);
    if (indexQueryPlan == null)


I did not see tests that show that things look OK if we have queries involving non-SAI indexes. Can you add some, please?

@adelapena (Author)


Added, and also for regular queries.

@ekaterinadimitrova2

It seems Sonar is still complaining about the test coverage for this one and the next check. Can you take a look, please?

@adelapena (Author)


I've added an artificial test for the sake of increasing coverage from the current 96.6%.

@ekaterinadimitrova2

Why artificial? I see queries and a real test case.

for the sake of increasing coverage from the current 96.6%.

This was not about the number at all. There are valid cases, such as unreachable checks, which would exhibit a bug, but this wasn't one of them. Thanks for the added test.

@ekaterinadimitrova2 Oct 23, 2025


Btw, looking into the new test - this was probably just missing queries in the already existing test, which would have hit those paths. As this is about testing the slow query logger, they should probably be checking whether we log as expected or not.

@adelapena (Author)

I am wondering how good of an idea it is to have to restart a node to disable it if we sense too much noise or performance degradation once enabled?

We don't need to restart the node with the current approach. The system property that toggles logging SAI execution details is read every time we detect a slow query, and the system property can be changed at any time. Relevant bits here:

public Monitorable.ExecutionInfo monitorableExecutionInfo()
{
    return CassandraRelevantProperties.SAI_SLOW_QUERY_LOG_EXECUTION_INFO_ENABLED.getBoolean()
           ? new QueryMonitorableExecutionInfo(queryContext, controller.buildPlan())
           : Monitorable.ExecutionInfo.EMPTY;
}

@ekaterinadimitrova2 left a comment


I think we are almost ready.
A few small comments, plus a complaint about test coverage from the tool. I feel it is worth checking, as even before the tool flagged it I mentioned we have to test those if clauses. Thanks.
The test failures seem to be flakies known to me, not related to what we do here.

About the comment below:

I am wondering how good of an idea it is to have to restart a node to disable it if we sense too much noise or performance degradation once enabled?

We don't need to restart the node with the current approach. The system property that toggles logging SAI execution details is read every time we detect a slow query, and the system property can be changed at any time. Relevant bits here:

I am clear on how we check it to decide whether to log or not, but how do you set that property at runtime? I don't see any setter or anything. What did I miss?

@adelapena (Author)

I am clear on how we check it to decide whether to log or not, but how do you set that property at runtime? I don't see any setter or anything. What did I miss?

You didn't miss anything, I did. I have added a new MBean just for SAI and added the new config property there. Hopefully we can incrementally keep adding other similar past and future properties to it.

I have named it StorageAttachedIndexConfig because of a lack of imagination, but any suggestions are welcome.
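
A hypothetical sketch of the kind of bean being described, with illustrative method names (not the PR's actual API) and assuming CassandraRelevantProperties' usual getBoolean/setBoolean accessors; note that, per the following comment, the bean was removed again before merge:

import org.apache.cassandra.config.CassandraRelevantProperties;

// Hypothetical sketch only: a thin JMX wrapper around the relevant
// CassandraRelevantProperties entry, so the flag can be flipped at runtime.
public class StorageAttachedIndexConfig implements StorageAttachedIndexConfigMBean
{
    @Override
    public boolean isSlowQueryLogExecutionInfoEnabled()
    {
        return CassandraRelevantProperties.SAI_SLOW_QUERY_LOG_EXECUTION_INFO_ENABLED.getBoolean();
    }

    @Override
    public void setSlowQueryLogExecutionInfoEnabled(boolean enabled)
    {
        CassandraRelevantProperties.SAI_SLOW_QUERY_LOG_EXECUTION_INFO_ENABLED.setBoolean(enabled);
    }
}

interface StorageAttachedIndexConfigMBean
{
    boolean isSlowQueryLogExecutionInfoEnabled();
    void setSlowQueryLogExecutionInfoEnabled(boolean enabled);
}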

@adelapena (Author)

After this Slack discussion, I'm removing the SAI MBean, since we have other ways to manipulate CassandraRelevantProperties.

@ekaterinadimitrova2 ekaterinadimitrova2 changed the title CNDB-15260: Add SAI-specific details to slow query logger CNDB-15260: Add SAI-specific execution info to slow query logger Oct 24, 2025
@ekaterinadimitrova2 left a comment


The supplier seems very neat, but I do not think volatile is enough for it now. Please check the comment.

I am surprised Sonar complains about test coverage for the empty execution info; didn't we have it covered already on the previous run?

About the other issue flagged by Sonar - instance can definitely not be final in MonitoringTask.


@VisibleForTesting
static MonitoringTask instance = make(REPORT_INTERVAL_MS, MAX_OPERATIONS);
public static MonitoringTask instance = make(Math.max(0, CassandraRelevantProperties.SLOW_QUERY_LOG_MONITORING_REPORT_INTERVAL_IN_MS.getInt()),
@ekaterinadimitrova2 Oct 29, 2025


nit: there is an unused import now

@Nullable
protected final Index.QueryPlan indexQueryPlan;

private volatile Supplier<ExecutionInfo> executionInfoSupplier = ExecutionInfo.EMPTY_SUPPLIER;


The supplier is neat, but now volatile is not enough. This is also pointed out by Sonar. On a quick look, just switching to an AtomicReference should probably be enough? The fields in QueryMonitorableExecutionInfo are final.

@adelapena (Author)

The supplier seems very neat, but I do not think volatile is enough for it now. Please check the comment.

I have also read that generic warning in Sonar, but wouldn't volatile be enough? There is a single writer thread when the query is executed, and a single read when the monitor picks it up.
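
A minimal sketch of that safe-publication argument, with illustrative names rather than the PR's actual classes: a single query thread publishes an immutable value through a volatile field, and the monitoring thread later reads it; the volatile write/read pair plus the final fields make the object safely visible, with no CAS needed.

import java.util.function.Supplier;

// Illustrative only: single writer (the query thread), single reader (the monitor).
final class SlowQueryInfoHolder
{
    // Immutable value object: final fields are safely visible to any thread
    // that sees the reference.
    static final class Info
    {
        final String metrics;
        final String plan;

        Info(String metrics, String plan)
        {
            this.metrics = metrics;
            this.plan = plan;
        }
    }

    private static final Supplier<Info> EMPTY = () -> null;

    // Written once by the query thread, read later by the monitoring thread.
    private volatile Supplier<Info> supplier = EMPTY;

    void publish(String metrics, String plan) // query thread
    {
        Info info = new Info(metrics, plan);
        supplier = () -> info; // volatile write publishes the immutable Info
    }

    Info read() // monitoring thread
    {
        return supplier.get(); // volatile read; sees either EMPTY or the published Info
    }
}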

@ekaterinadimitrova2

There is a single writer thread when the query is executed, and a single read when the monitor picks it up.

Agreed, volatile should be enough actually

@ekaterinadimitrova2

Let's check the test failures on the other PR and I can approve this one.

@adelapena (Author)

Thanks for the feedback. I have removed that unused import and rebased without conflicts. I don't see related failures in CI in either this PR or the one for CNDB.

@ekaterinadimitrova2 commented Oct 30, 2025

Note that the queries are printed as invalid CQL and include user data. That is being dealt with in https://github.com/riptano/cndb/issues/15280, which is under review.

Just to confirm - after #15280 slow query logging will always redact user data? I did not see any parameters added in CassandraRelevantProperties (I did not review the full PR - only looked at the description and searched for that class)

@adelapena (Author) commented Oct 31, 2025

Just to confirm - after #15280 slow query logging will always redact user data? I did not see any parameters added in CassandraRelevantProperties (I did not review the full PR - only looked at the description and searched for that class)

Yes, #15280 will always redact the query user data printed to logs. When either this issue or #15280 gets merged, the other will need an update, whichever is merged second. I thought #15280 was going to be first, but it is taking longer than initially expected. I understand that while #15280 is not merged, redaction doesn't exist for this ticket. The tests added here will fail when redaction is ready, and I have just modified the one in the CNDB PR so it will also fail with redaction, so we don't miss it.

Note by the way that here we are not printing any additional user data to the logs, since the slow queries were already being logged, the metrics don't contain user data, and the plan contains the same user data as the query (#15280 also redacts the plan).

@adelapena (Author)

@ekaterinadimitrova2 I have added the requested JMH microbenchmark to measure the impact of the extended logging. Here are the results:

Benchmark                                            (enabled)  Mode  Cnt          Score        Error  Units
QueryMonitorableExecutionInfoBench.executionInfo         false  avgt   10         20.180 ±      0.522  ns/op
QueryMonitorableExecutionInfoBench.executionInfo          true  avgt   10       1077.049 ±     19.155  ns/op
QueryMonitorableExecutionInfoBench.logSlowOperation      false  avgt   10      12245.476 ±   1375.457  ns/op
QueryMonitorableExecutionInfoBench.logSlowOperation       true  avgt   10      30474.713 ±  56262.347  ns/op
QueryMonitorableExecutionInfoBench.slowQuery             false  avgt   10  499650847.910 ± 117548.616  ns/op
QueryMonitorableExecutionInfoBench.slowQuery              true  avgt   10  499665972.905 ± 126953.319  ns/op

We can see that the generation of the extended SAI ExecutionInfo takes around one microsecond.

Producing the larger log messages adds some ~20 microseconds to the previous ~10 microseconds per query, making them ~3x slower.

However, the cost of the slow query logging is ~4 orders of magnitude lower than that of the fastest slow queries, which take at least 500 ms by default. In fact, the impact of the extended logging is lower than the error of the benchmark.

So, as expected, the extended logging dilutes into almost nothing when compared to the cost of the reported slow queries themselves.
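
For reference, a parameterised JMH benchmark of this kind has roughly the following shape (a self-contained sketch: the enabled parameter and the average-time mode mirror the results table above, but the class name and the measured workload are stand-ins, not the actual SAI code):

import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(1)
@Warmup(iterations = 5)
@Measurement(iterations = 10)
public class ExecutionInfoBenchSketch
{
    // JMH runs the benchmark once per listed value, producing the enabled=false/true rows.
    @Param({ "false", "true" })
    public boolean enabled;

    @Benchmark
    public String executionInfo()
    {
        // Stand-in workload: only build the detailed text when the flag is on,
        // mirroring how the real code returns an empty ExecutionInfo when disabled.
        if (!enabled)
            return "";
        StringBuilder sb = new StringBuilder("SAI slow query metrics:\n");
        for (int i = 0; i < 14; i++)
            sb.append("  metric").append(i).append(": ").append(i).append('\n');
        return sb.toString();
    }
}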

@ekaterinadimitrova2 left a comment


Reviewed the benchmark test, LGTM

Adds the metrics in IndexContext and the tree view of the query's Plan
to the log messages produced for slow SAI queries.

There is a new system property named cassandra.sai.slow_query_log.execution_info_enabled
to toggle this feature, which is enabled by default. This property can be changed
dynamically at any time, without a restart.

When SAI-specific information is added to the slow query logger, the log messages for
slow queries can take up almost 10x the (uncompressed) disk space. In the worst of the
worst cases, with all queries being permanently slower than 500ms, and at least 50
different queries, reporting every 5 seconds (the default), the slow query logger can
produce around 120MB of text per day and instance. With the detailed logging, it would
be 1.14GB per day.

# Conflicts:
#	src/java/org/apache/cassandra/db/monitoring/MonitoringTask.java
#	src/java/org/apache/cassandra/index/sai/plan/Plan.java
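
As a back-of-envelope check of those figures (using assumed per-entry sizes, which the commit message does not state): reporting every 5 seconds gives 86400 / 5 = 17,280 reports per day; 50 entries per report at roughly 140 bytes each is about 7 KB per report, i.e. ~120 MB per day, and entries almost 10x larger (the metrics block plus the plan tree, roughly 1.3-1.5 KB each) bring that to roughly 1.1-1.2 GB per day, consistent with the 1.14 GB figure.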
@adelapena (Author)

I have rebased this to include the redaction of logged SELECT queries introduced by CNDB-15280, and updated SlowSAIQueryLoggerTest to consider that redaction.

@cassci-bot

✔️ Build ds-cassandra-pr-gate/PR-2044 approved by Butler



@ekaterinadimitrova2 left a comment


Looks great!

@adelapena adelapena merged commit 27689d3 into main Nov 10, 2025
494 checks passed
@adelapena adelapena deleted the CNDB-15260-main branch November 10, 2025 11:29
michaelsembwever pushed a commit that referenced this pull request Nov 17, 2025
michaelsembwever pushed a commit that referenced this pull request Nov 17, 2025