@blambov blambov commented Sep 17, 2025

What is the issue

https://github.com/riptano/cndb/issues/10302

What does this PR fix and why was it fixed

Implements the necessary trie machinery to work with trie sets, range and deletion-aware tries, and a memtable that uses it to store deletions in separate per-partition branches of the memtable trie.

Implements a method of skipping over tombstones when converting UnfilteredRowIterator to the filtered RowIterator, which has the effect of ignoring all tombstones when looking for data and speeds up next-live lookups dramatically. Adds a test to demonstrate this effect with the new memtable.

@blambov blambov force-pushed the CNDB-10302 branch 3 times, most recently from e358a60 to 6914cba Compare September 29, 2025 13:00
blambov added 20 commits October 3, 2025 17:47
This also changes the behaviour of subtries to always
include boundaries, their prefixes and their descendant
branches.

This is necessary for well-defined reverse walks and helps
present metadata on the path of queried ranges, and is not
a real limitation for the prefix-free keys that we use.

Range tries are tries made of ranges of coverage, which
track applicable ranges and are mainly to be used to store
deletions and deletion ranges.
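
As a rough illustration of what "ranges of coverage" means for deletions, the sketch below answers the question "which deletion, if any, covers this key?". It is not the PR's range trie: a sorted map stands in for the trie, keys are plain strings, and the names `RangeCoverageSketch`, `addRange` and `deletionTimeFor` are hypothetical. It also assumes the stored ranges never overlap.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical sketch of range coverage for deletions.
 * A TreeMap of range starts stands in for the range trie described in the PR.
 */
final class RangeCoverageSketch {
    /** End bound (exclusive) and deletion timestamp of one covered range. */
    private static final class Coverage {
        final String endExclusive;
        final long deletedAt;

        Coverage(String endExclusive, long deletedAt) {
            this.endExclusive = endExclusive;
            this.deletedAt = deletedAt;
        }
    }

    // Maps each range start (inclusive) to its coverage. Assumption: ranges do not overlap.
    private final TreeMap<String, Coverage> ranges = new TreeMap<>();

    /** Records that all keys in [startInclusive, endExclusive) were deleted at deletedAt. */
    void addRange(String startInclusive, String endExclusive, long deletedAt) {
        ranges.put(startInclusive, new Coverage(endExclusive, deletedAt));
    }

    /** Returns the deletion timestamp covering the key, or Long.MIN_VALUE if nothing covers it. */
    long deletionTimeFor(String key) {
        // The only candidate is the range with the greatest start <= key.
        Map.Entry<String, Coverage> floor = ranges.floorEntry(key);
        if (floor != null && key.compareTo(floor.getValue().endExclusive) < 0)
            return floor.getValue().deletedAt;
        return Long.MIN_VALUE;
    }
}
```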

Deletion-aware tries combine data and deletion tries. The cursor
of a deletion-aware trie walks the data part of the trie, but
also provides a `deletionBranchCursor` that can return a deletion/
tombstone branch covering the current position and the branch below
it as a range trie. Such a branch can be given only once for any
path in the trie (i.e. there cannot be a deletion branch covering
another deletion branch).

Deletion-aware merges and updates to in-memory tries take deletion
branches into account when merging data so that deleted data is
not produced in the resulting merge.
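
The effect of a deletion-aware merge can be sketched with ordinary sorted maps. The example below is only an illustration under stated assumptions: it is not the trie merge from this PR, the class and method names are hypothetical, and it assumes the usual Cassandra rule that a deletion removes data whose write timestamp is less than or equal to the deletion timestamp. Two data sources are merged (newest write wins per key), then anything covered by a deletion range is dropped from the result.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: merging data while applying deletion ranges. */
final class DeletionAwareMergeSketch {
    /** A live value with its write timestamp. */
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Deletion of all keys in [startInclusive, endExclusive) as of the given timestamp. */
    static final class RangeDeletion {
        final String startInclusive;
        final String endExclusive;
        final long timestamp;

        RangeDeletion(String startInclusive, String endExclusive, long timestamp) {
            this.startInclusive = startInclusive;
            this.endExclusive = endExclusive;
            this.timestamp = timestamp;
        }
    }

    /** Merges two data sources, keeping the newest cell per key and dropping deleted cells. */
    static TreeMap<String, Cell> merge(Map<String, Cell> a,
                                       Map<String, Cell> b,
                                       List<RangeDeletion> deletions) {
        TreeMap<String, Cell> merged = new TreeMap<>();
        for (Map<String, Cell> source : List.of(a, b))
            for (Map.Entry<String, Cell> e : source.entrySet())
                merged.merge(e.getKey(), e.getValue(),
                             (x, y) -> x.timestamp >= y.timestamp ? x : y);

        // Deleted data must not appear in the merged result.
        merged.entrySet().removeIf(e -> isDeleted(e.getKey(), e.getValue().timestamp, deletions));
        return merged;
    }

    /** Assumption: a deletion applies to data written at or before the deletion timestamp. */
    static boolean isDeleted(String key, long writeTimestamp, List<RangeDeletion> deletions) {
        for (RangeDeletion d : deletions)
            if (key.compareTo(d.startInclusive) >= 0
                && key.compareTo(d.endExclusive) < 0
                && writeTimestamp <= d.timestamp)
                return true;
        return false;
    }
}
```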

Implements a row-level trie memtable that uses deletion-aware
tries to store deletions separately from live data, together
with the associated TrieBackedPartition and TriePartitionUpdate.

Every deletion is first converted to its range version and then
stored in the deletion path of the trie: a deleted row is
represented as a range deletion WHERE ck >= x AND ck <= x, and a
deleted partition as a deletion covering from LT_EXCLUDED to
GT_NEXT_COMPONENT, so that it includes the static row and all
normal rows.
To make tests work, all such ranges are converted back to deleted rows
and partition deletion times on conversion to UnfilteredPartitionIterator.
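
The row-to-range conversion described in this commit message can be pictured as follows. This is a minimal sketch, not the memtable code: clustering keys are reduced to plain integers, and `RowDeletionAsRangeSketch`, `fromRowDeletion`, `covers` and `isSingleRow` are hypothetical names. A row deletion at `ck = x` becomes a range with both bounds inclusive at `x`, and a range whose bounds coincide can be recognised and turned back into a row deletion.

```java
/** Hypothetical sketch: representing a row deletion as a degenerate range deletion. */
final class RowDeletionAsRangeSketch {
    /** A deletion covering startInclusive <= ck <= endInclusive. */
    static final class RangeDeletion {
        final int startInclusive;
        final int endInclusive;
        final long deletedAt;

        RangeDeletion(int startInclusive, int endInclusive, long deletedAt) {
            this.startInclusive = startInclusive;
            this.endInclusive = endInclusive;
            this.deletedAt = deletedAt;
        }

        /** True if the clustering key falls inside the deleted range. */
        boolean covers(int ck) {
            return ck >= startInclusive && ck <= endInclusive;
        }

        /** True if the range covers exactly one row, so it can be reported as a row deletion again. */
        boolean isSingleRow() {
            return startInclusive == endInclusive;
        }
    }

    /** Converts "delete row ck" into the range "ck >= x AND ck <= x". */
    static RangeDeletion fromRowDeletion(int ck, long deletedAt) {
        return new RangeDeletion(ck, ck, deletedAt);
    }
}
```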

Adds a new method to UnfilteredRowIterator that is implemented
by the new trie-backed partitions to ask them to stop issuing
tombstones. This is done on filtering (i.e. conversion from
UnfilteredRowIterator to RowIterator) where tombstones have already
done their job and are no longer needed.

Adds JMH tests of tombstones that demonstrate the improvement.
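
The contract described in this commit message might look roughly like the sketch below. All names are assumptions: the PR text does not name the new method, so `stopIssuingTombstones` is hypothetical, and `Unfiltered`, `UnfilteredSource` and `ListSource` are simplified stand-ins rather than Cassandra's classes. The list-backed source still steps over tombstones one by one, so it shows the interface only; the speedup reported in the PR would come from a trie-backed source that never visits the deletion branch at all.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical sketch: asking a source to stop issuing tombstones during filtering. */
final class TombstoneSkippingSketch {
    /** Simplified stand-in for a row or tombstone in an unfiltered stream. */
    static final class Unfiltered {
        final String name;
        final boolean tombstone;

        Unfiltered(String name, boolean tombstone) {
            this.name = name;
            this.tombstone = tombstone;
        }
    }

    interface UnfilteredSource extends Iterator<Unfiltered> {
        /** Hint that tombstones are no longer needed. Sources may ignore it (default no-op). */
        default void stopIssuingTombstones() {}
    }

    /** A source that honours the hint by never returning tombstones once asked. */
    static final class ListSource implements UnfilteredSource {
        private final List<Unfiltered> items;
        private int pos = 0;
        private boolean skipTombstones = false;
        int tombstonesIssued = 0; // counted only to make the effect observable

        ListSource(List<Unfiltered> items) {
            this.items = items;
        }

        @Override
        public void stopIssuingTombstones() {
            skipTombstones = true;
        }

        @Override
        public boolean hasNext() {
            while (skipTombstones && pos < items.size() && items.get(pos).tombstone)
                pos++;
            return pos < items.size();
        }

        @Override
        public Unfiltered next() {
            if (!hasNext())
                throw new NoSuchElementException();
            Unfiltered u = items.get(pos++);
            if (u.tombstone)
                tombstonesIssued++;
            return u;
        }
    }

    /** Converts an unfiltered stream to live row names, as filtering would. */
    static List<String> filter(UnfilteredSource source) {
        source.stopIssuingTombstones();
        List<String> live = new ArrayList<>();
        while (source.hasNext()) {
            Unfiltered u = source.next();
            if (!u.tombstone) // still required for sources that ignore the hint
                live.add(u.name);
        }
        return live;
    }
}
```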

In the initial implementation row deletions were mapped to range tombstones.
This works, but is incompatible with the many existing tests that require
deletions to be returned in the form in which they were made.

This commit changes the representation of deleted rows to use point tombstones.
In addition to making the tests pass, this improves the memory usage of memtables
with row deletions.

Although they only add complexity at this stage, point tombstones (expanded to
apply to the covered branch) will be needed in the next stage of development.

sonarqubecloud bot commented Oct 3, 2025

@cassci-bot

❌ Build ds-cassandra-pr-gate/PR-2005 rejected by Butler


2 regressions found
See build details here


Found 2 new test failures

| Test | Explanation | Runs | Upstream |
| --- | --- | --- | --- |
| o.a.c.db.repair.PendingAntiCompactionTest.testRetriesTimeout | REGRESSION | 🔴🔴 | 0 / 7 |
| o.a.c.utils.binlog.BinLogTest.testTruncationReleasesLogSpace (compression) | REGRESSION | 🔵🔴 | 0 / 7 |

Found 1 known test failures
