michaeljmarshall

…g reads

(cherry picked from commit ae4f187)

What is the issue

...

What does this PR fix and why was it fixed

...

Checklist before you submit for review

  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit starting with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one

@eolivelli

I have confirmed that this patch solves the issue. We will try this Docker image with Shadow Proxy.

…g reads

Port of CASSANDRA-19497.

Co-authored-by: Caleb Rackliffe <[email protected]>
Co-authored-by: Michael Marshall <[email protected]>
Co-authored-by: Andrés de la Peña <[email protected]>
@adelapena adelapena force-pushed the cndb-11666-may-release branch from f61de5b to 5e29022 Compare July 21, 2025 12:00
@adelapena adelapena marked this pull request as ready for review July 21, 2025 12:00
@eolivelli eolivelli left a comment

LGTM

Code is the same as #1883 (+ Version#after)

@cassci-bot

❌ Build ds-cassandra-pr-gate/PR-1884 rejected by Butler


2 new test failure(s) in 1 builds
See build details here


Found 2 new test failures

Test Explanation Branch history Upstream history
....r.PendingAntiCompactionTest.testRetriesTimeout regression 🔴
o.a.c.u.b.BinLogTest.testTruncationReleasesLogS... regression 🔴

No known test failures found

@adelapena adelapena merged commit 6db0e9b into cndb-main-release-202505 Jul 23, 2025
483 of 489 checks passed
@adelapena adelapena deleted the cndb-11666-may-release branch July 23, 2025 11:11
adelapena added a commit that referenced this pull request Sep 29, 2025
…g reads (#1884)

michaeljmarshall added a commit that referenced this pull request Sep 30, 2025
…sult set (#2024)

(cherry picked from commit ada025c)

Copy of #2023, but targeting `main`.

### What is the issue
riptano/cndb#15485

### What does this PR fix and why was it fixed
This PR fixes a bug introduced to this branch via
#1884. The bug only impacts
SAI file format `aa` when the index file was produced via compaction,
which is why the modified test simply adds coverage to compact the table
and hit the bug.

The bug happens when an iterator produces the same partition across two
different batch fetches from storage. These keys were not collapsed in
the `key.equals(lastKey)` logic because compacted indexes use a row id
per row instead of per partition, and the logic in
`PrimaryKeyWithSource` considers rows with different row ids to be
distinct. However, when we went to materialize a batch from storage, we
hit this code:

```java
ClusteringIndexFilter clusteringIndexFilter = command.clusteringIndexFilter(firstKey.partitionKey());
// No clustering columns, or the key carries no clustering: use the command's own filter for this partition.
if (cfs.metadata().comparator.size() == 0 || firstKey.hasEmptyClustering())
{
    return clusteringIndexFilter;
}
else
{
    // Otherwise, restrict the read to the clusterings of the given keys.
    nextClusterings.clear();
    for (PrimaryKey key : keys)
        nextClusterings.add(key.clustering());
    return new ClusteringIndexNamesFilter(nextClusterings, clusteringIndexFilter.isReversed());
}
```

which returned `clusteringIndexFilter` for `aa` because those indexes do
not have the clustering information. Therefore, each batch fetched the
whole partition (which was subsequently filtered to the proper results),
and produced a multiplier effect where we saw `batch` many duplicates.
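
To make the multiplier effect concrete, here is a minimal, self-contained sketch. It is not the SAI code: `KeyWithRowId`, `read`, and the in-memory `storage` map are hypothetical stand-ins, and the model is simplified so that every key that survives the `equals(lastKey)` check triggers its own whole-partition read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class DuplicateRowsSketch
{
    // Hypothetical stand-in for a key that remembers which index row it came from.
    static final class KeyWithRowId
    {
        final String partition;
        final long rowId;

        KeyWithRowId(String partition, long rowId)
        {
            this.partition = partition;
            this.rowId = rowId;
        }

        // Equality includes the row id, so two rows of one partition never compare equal.
        @Override
        public boolean equals(Object o)
        {
            if (!(o instanceof KeyWithRowId))
                return false;
            KeyWithRowId other = (KeyWithRowId) o;
            return partition.equals(other.partition) && rowId == other.rowId;
        }

        @Override
        public int hashCode()
        {
            return Objects.hash(partition, rowId);
        }
    }

    // Assumption: keys carry no clustering, so each surviving key reads the whole partition.
    static List<String> read(List<KeyWithRowId> keys, Map<String, List<String>> storage)
    {
        List<String> rows = new ArrayList<>();
        KeyWithRowId lastKey = null;
        for (KeyWithRowId key : keys)
        {
            if (key.equals(lastKey))
                continue; // intended de-duplication; never fires when row ids differ
            lastKey = key;
            rows.addAll(storage.get(key.partition));
        }
        return rows;
    }

    public static void main(String[] args)
    {
        Map<String, List<String>> storage = Map.of("p1", List.of("r1", "r2", "r3"));
        List<KeyWithRowId> keys = List.of(new KeyWithRowId("p1", 0),
                                          new KeyWithRowId("p1", 1),
                                          new KeyWithRowId("p1", 2));
        System.out.println(read(keys, storage).size()); // prints 9: three rows, each returned three times
    }
}
```

In this toy model the partition holds three rows, yet nine come back, because each of the three keys is treated as distinct and re-reads the full partition.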

This fix works by comparing partition keys and clustering keys directly,
which is a return to the comparison logic used before
#1884. There was a discussion about this in the PR to `main`
(#1883 (comment)), but unfortunately we missed this case.
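
As a rough illustration of what "comparing partition keys and clustering keys directly" means, the sketch below collapses consecutive keys while ignoring the source row id. The `Key` class, `sameRow`, and `collapse` are hypothetical names invented for this example, and an empty string stands in for a missing clustering; the actual patch may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

public class DirectKeyComparisonSketch
{
    // Hypothetical key: clustering is "" when the index does not store it.
    static final class Key
    {
        final String partition;
        final String clustering;
        final long rowId;

        Key(String partition, String clustering, long rowId)
        {
            this.partition = partition;
            this.clustering = clustering;
            this.rowId = rowId;
        }
    }

    // Compares only the partition key and the clustering; the source row id is ignored.
    static boolean sameRow(Key a, Key b)
    {
        return a != null && b != null
               && a.partition.equals(b.partition)
               && a.clustering.equals(b.clustering);
    }

    // Drops consecutive keys that point at the same partition and clustering.
    static List<Key> collapse(List<Key> sortedKeys)
    {
        List<Key> result = new ArrayList<>();
        Key lastKey = null;
        for (Key key : sortedKeys)
        {
            if (sameRow(key, lastKey))
                continue;
            result.add(key);
            lastKey = key;
        }
        return result;
    }

    public static void main(String[] args)
    {
        List<Key> keys = List.of(new Key("p1", "", 0),
                                 new Key("p1", "", 1),
                                 new Key("p2", "", 2));
        System.out.println(collapse(keys).size()); // prints 2: the two p1 keys collapse into one
    }
}
```

With this comparison, keys that differ only in row id collapse into one, so a partition without clustering information is read once rather than once per key.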

A more proper long-term fix might be to stop creating a
`PrimaryKeyWithSource` for AA indexes. However, I preferred this
approach because it is essentially a revert rather than a fix-forward
solution.
michaeljmarshall added a commit that referenced this pull request Sep 30, 2025
…sult set (#2023)
