forked from apache/cassandra
Cndb 10302 #2005
Open: blambov wants to merge 22 commits into main from CNDB-10302
+24,616
−5,893
e358a60 to 6914cba
This also changes the behaviour of subtries to always include boundaries, their prefixes and their descendant branches. This is necessary for well-defined reverse walks and helps present metadata on the path of queried ranges, and is not a real limitation for the prefix-free keys that we use.
Range tries are tries made of ranges of coverage. They track which ranges apply at each position and are mainly intended for storing deletions and deletion ranges.
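To make the idea of "ranges of coverage" concrete, here is a minimal sketch that uses a flat sorted map of boundaries rather than a trie. It is not the PR's implementation: the class `RangeCoverageSketch`, its methods, and the `NOT_COVERED` sentinel are all hypothetical names chosen for illustration, and the sketch assumes the ranges added do not overlap.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: a sorted boundary map, not a trie.
class RangeCoverageSketch {
    /** Sentinel meaning "no deletion applies here". */
    static final long NOT_COVERED = Long.MIN_VALUE;

    // Each boundary maps to the deletion time that applies from that
    // boundary up to (but excluding) the next boundary.
    private final TreeMap<String, Long> boundaries = new TreeMap<>();

    /** Marks [start, end) as deleted at deletionTime. Assumes non-overlapping ranges. */
    void addRange(String start, String end, long deletionTime) {
        boundaries.put(start, deletionTime);
        // Do not overwrite the start of an adjacent range that begins at 'end'.
        boundaries.putIfAbsent(end, NOT_COVERED);
    }

    /** Returns the deletion time covering the key, or NOT_COVERED. */
    long coveringDeletion(String key) {
        Map.Entry<String, Long> floor = boundaries.floorEntry(key);
        return floor == null ? NOT_COVERED : floor.getValue();
    }

    public static void main(String[] args) {
        RangeCoverageSketch ranges = new RangeCoverageSketch();
        ranges.addRange("b", "d", 10L);
        System.out.println(ranges.coveringDeletion("c") != NOT_COVERED);
        System.out.println(ranges.coveringDeletion("d") != NOT_COVERED);
    }
}
```

A lookup finds the closest boundary at or before the key and reads the state that applies after it, which is the same "what covers this position" question a range trie answers along a path.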
Deletion-aware tries combine data and deletion tries. The cursor of a deletion-aware trie walks the data part of the trie, but also provides a `deletionBranchCursor` that can return a deletion/tombstone branch, as a range trie, covering the current position and the branch below it. Such a branch can be given only once for any path in the trie (i.e. a deletion branch cannot cover another deletion branch). Deletion-aware merges and updates to in-memory tries take deletion branches into account when merging data, so that deleted data is not produced in the resulting merge.
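The effect of a deletion-aware merge can be sketched with plain sorted maps. This is an illustration under stated assumptions, not the trie code from this PR: `Cell`, `RangeDeletion` and `merge` are hypothetical names, keys are plain strings, and a deletion is assumed to remove any covered cell whose timestamp is not newer than the deletion time.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: sorted maps stand in for tries.
class DeletionAwareMergeSketch {
    /** Hypothetical live cell: a value and its write timestamp. */
    static final class Cell {
        final String value;
        final long timestamp;
        Cell(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    /** Hypothetical deletion covering keys in [start, end], both inclusive. */
    static final class RangeDeletion {
        final String start;
        final String end;
        final long deletionTime;
        RangeDeletion(String start, String end, long deletionTime) {
            this.start = start; this.end = end; this.deletionTime = deletionTime;
        }

        /** True if the key is inside the range and the cell is not newer than the deletion. */
        boolean deletes(String key, Cell cell) {
            return key.compareTo(start) >= 0
                && key.compareTo(end) <= 0
                && cell.timestamp <= deletionTime;
        }
    }

    /** Merges two data sets (newest timestamp wins) and drops cells covered by a deletion. */
    static TreeMap<String, Cell> merge(Map<String, Cell> a, Map<String, Cell> b,
                                       List<RangeDeletion> deletions) {
        TreeMap<String, Cell> merged = new TreeMap<>(a);
        for (Map.Entry<String, Cell> entry : b.entrySet())
            merged.merge(entry.getKey(), entry.getValue(),
                         (x, y) -> x.timestamp >= y.timestamp ? x : y);
        merged.entrySet().removeIf(
            e -> deletions.stream().anyMatch(d -> d.deletes(e.getKey(), e.getValue())));
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Cell> existing = Map.of("a", new Cell("1", 5), "b", new Cell("2", 5));
        Map<String, Cell> update = Map.of("b", new Cell("3", 20), "c", new Cell("4", 7));
        List<RangeDeletion> deletions = List.of(new RangeDeletion("a", "c", 10));
        System.out.println(merge(existing, update, deletions).keySet());
    }
}
```

In the example in `main`, only the cell for `b` written at timestamp 20 survives, because it is newer than the deletion at time 10; the other cells fall inside the deleted range and are dropped during the merge rather than being returned alongside a tombstone.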
Implements a row-level trie memtable that uses deletion-aware tries to store deletions separately from live data, together with the associated TrieBackedPartition and TriePartitionUpdate. Every deletion is first converted to its range version (e.g. deleted rows are now represented as WHERE ck <= x AND ck >= x, and deleted partitions as deletions covering from LT_EXCLUDED to GT_NEXT_COMPONENT, so that static and all normal rows are included) and then stored in the deletion path of the trie. To make tests work, all such ranges are converted back to rows and partition deletion times on conversion to UnfilteredPartitionIterator.
Adds a new method to UnfilteredRowIterator that is implemented by the new trie-backed partitions to ask them to stop issuing tombstones. This is done on filtering (i.e. conversion from UnfilteredRowIterator to RowIterator) where tombstones have already done their job and are no longer needed. Adds JMH tests of tombstones that demonstrate the improvement.
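The following sketch illustrates why asking the source to stop issuing tombstones helps when deletions are stored separately from live data. It is a toy model, not the code in this PR: the commit message does not name the new method, so `stopIssuingTombstones`, `Source` and `filter` are hypothetical, and rows and tombstones are reduced to sorted lists of string keys.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration only: not the UnfilteredRowIterator API.
class TombstoneSkipSketch {
    /** Toy source that keeps live rows and tombstones in separate sorted lists. */
    static final class Source implements Iterator<String> {
        private final List<String> rows;
        private final List<String> tombstones;
        private int rowIndex = 0;
        private int tombstoneIndex = 0;
        private boolean issueTombstones = true;
        int produced = 0; // number of items handed to the consumer

        Source(List<String> rows, List<String> tombstones) {
            this.rows = rows;
            this.tombstones = tombstones;
        }

        /** Hypothetical name: after this call only live rows are produced. */
        void stopIssuingTombstones() { issueTombstones = false; }

        @Override
        public boolean hasNext() {
            return rowIndex < rows.size()
                || (issueTombstones && tombstoneIndex < tombstones.size());
        }

        @Override
        public String next() {
            if (!hasNext()) throw new NoSuchElementException();
            produced++;
            // Interleave in key order while tombstones are still being issued.
            boolean takeTombstone = issueTombstones
                && tombstoneIndex < tombstones.size()
                && (rowIndex >= rows.size()
                    || tombstones.get(tombstoneIndex).compareTo(rows.get(rowIndex)) < 0);
            return takeTombstone ? "del:" + tombstones.get(tombstoneIndex++)
                                 : rows.get(rowIndex++);
        }
    }

    /** Filtering step: tombstones are not needed, so ask the source to skip them. */
    static List<String> filter(Source source) {
        source.stopIssuingTombstones();
        List<String> live = new ArrayList<>();
        while (source.hasNext()) {
            String item = source.next();
            if (!item.startsWith("del:")) live.add(item);
        }
        return live;
    }

    public static void main(String[] args) {
        Source source = new Source(List.of("d"), List.of("a", "b", "c"));
        System.out.println(filter(source) + " after producing " + source.produced + " item(s)");
    }
}
```

Without the call to `stopIssuingTombstones`, the consumer in this toy model would have to receive and discard every tombstone that sorts before the next live row; with it, the source walks only the live data.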
In the initial implementation, row deletions were mapped to range tombstones. That works, but it is incompatible with the many tests that require deletions to be returned in the form in which they were made. This commit changes the representation of deleted rows to use point tombstones. In addition to making the tests pass, this improves the memory usage of memtables with row deletions. Although they only add complexity at this stage, point tombstones (expanded to apply to the covered branch) will be needed in the next stage of development.
❌ Build ds-cassandra-pr-gate/PR-2005 rejected by Butler2: regressions found (2 new test failures, 1 known test failure).
What is the issue
https://github.com/riptano/cndb/issues/10302
What does this PR fix and why was it fixed
Implements the necessary trie machinery to work with trie sets, range and deletion-aware tries, and a memtable that uses it to store deletions in separate per-partition branches of the memtable trie.
Implements a method of skipping over tombstones when converting an `UnfilteredRowIterator` to the filtered `RowIterator`, which has the effect of ignoring all tombstones when looking for data and speeds up next-live lookups dramatically. Adds a test to demonstrate this effect with the new memtable.