-
Notifications
You must be signed in to change notification settings - Fork 21
CNDB-13693: Make SAI's view referenceable to simplify locking query view #1700
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Checklist before you submit for review
|
src/java/org/apache/cassandra/index/sai/view/IndexViewManager.java
Outdated
Show resolved
Hide resolved
|
I will review it tmr. Update: I won't be able to review it on Tuesday. Will do it within this week. |
| if (matcher.apply(cv)) | ||
| { | ||
| // We can exit now because we won't find a better candidate | ||
| var candidate = new PqInfo(searcher.getPQ(), searcher.containsUnitVectors(), segment.metadata.numRows); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: make sure the PQ is safe to use after releasing the view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about adding a SSTableIndex in PqInfo and release it when PqInfo is done
| ((StorageAttachedIndex) cfs.getIndexManager().getIndex(indexMetadata)).getIndexContext().prepareSSTablesForRebuild(sstables); | ||
| assertEmpty(execute("SELECT * FROM %s WHERE a=1")); | ||
| assertEmpty(execute("SELECT * FROM %s WHERE c=1")); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test testIndexRebuildWhenAddingSStableViaRemoteReload was added to test legacy writer behavior that built index on writer. Now writer only builds index on flush, not reload.
During sstable reload, reloaded sstable should have index attached (compaction task would abort if failed to build index; if initial index build is not completed, index is not-queryable, it's fine); unless it's added as shallow sstable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test can be removed if you prefer
src/java/org/apache/cassandra/index/sai/view/IndexViewManager.java
Outdated
Show resolved
Hide resolved
|
|
||
| var sstableReaders = new ArrayList<SSTableReader>(saiView.size()); | ||
| // These are already referenced because they are referenced by the same view we just referenced. | ||
| // TODO review saiView.match() method for boolean predicates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is an issue about boolean predicate support?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll create it. The point is that we have a data structure in the view to efficiently get the indexes that might satisfy the predicate. Without using match, we're just searching all indexes.
src/java/org/apache/cassandra/index/sai/view/IndexViewManager.java
Outdated
Show resolved
Hide resolved
pkolaczk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very nice simplification and this avoids all the looping with matching the indexes to sstables. However I have a concern if we don't miss any data in a very narrow race window when flush happens.
jasonstack
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left one comment on potential race with shallow sstable, the rest LGTM
| newViewIndexes.clear(); | ||
|
|
||
| // Throw after releasing already referenced indexes | ||
| if (iterations++ > 1000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: would be nice have a config property
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is likely that we'll hit this unless there is a bug in the logic. We have a similar ceiling at 1000 for the chunk cache:
cassandra/src/java/org/apache/cassandra/cache/ChunkCache.java
Lines 654 to 662 in 62e1537
| if (buf == null && ++spin == 1000) | |
| { | |
| String msg = String.format("Could not acquire a reference to for %s after 1000 attempts. " + | |
| "This is likely due to the chunk cache being too small for the " + | |
| "number of concurrently running requests.", chunkKey); | |
| throw new RuntimeException(msg); | |
| // Note: this might also be caused by reference counting errors, especially double release of | |
| // chunks. | |
| } |
|
❌ Build ds-cassandra-pr-gate/PR-1700 rejected by Butler2 new test failure(s) in 10 builds Found 2 new test failures
Found 22 known test failures |
…iew (#1700) Fixes: riptano/cndb#13693 The new design is to: 1. Make sai `View` reference-able so that it holds references to the underlying `SSTableIndex`s. When the `view` is released the final time, it releases the indexes. This moves all of the complexity of grabbing references to sstable update time and out of the query path, which seems like a generally good improvement. 2. Observe that `SSTableIndex` holds a reference to its associated sstable reader. Note also that we need to create the memtable index on deletions in order to make the index-first solution work. Also note that sstables indexes are added before they are removed on flush, so this change in design is safe for flush. (cherry picked from commit 8f31782)
…iew (#1700) Fixes: riptano/cndb#13693 The new design is to: 1. Make sai `View` reference-able so that it holds references to the underlying `SSTableIndex`s. When the `view` is released the final time, it releases the indexes. This moves all of the complexity of grabbing references to sstable update time and out of the query path, which seems like a generally good improvement. 2. Observe that `SSTableIndex` holds a reference to its associated sstable reader. Note also that we need to create the memtable index on deletions in order to make the index-first solution work. Also note that sstables indexes are added before they are removed on flush, so this change in design is safe for flush.
…iew (#1700) Fixes: riptano/cndb#13693 The new design is to: 1. Make sai `View` reference-able so that it holds references to the underlying `SSTableIndex`s. When the `view` is released the final time, it releases the indexes. This moves all of the complexity of grabbing references to sstable update time and out of the query path, which seems like a generally good improvement. 2. Observe that `SSTableIndex` holds a reference to its associated sstable reader. Note also that we need to create the memtable index on deletions in order to make the index-first solution work. Also note that sstables indexes are added before they are removed on flush, so this change in design is safe for flush.



What is the issue
Fixes: https://github.com/riptano/cndb/issues/13693
What does this PR fix and why was it fixed
The new design is to:
Viewreference-able so that it holds references to the underlyingSSTableIndexs. When theviewis released the final time, it releases the indexes. This moves all of the complexity of grabbing references to sstable update time and out of the query path, which seems like a generally good improvement.SSTableIndexholds a reference to its associated sstable reader.Note also that we need to create the memtable index on deletions in order to make the index-first solution work.
Also note that sstables indexes are added before they are removed on flush, so this change in design is safe for flush.