Implements a more efficient design for the history index table #29
Add unit tests for `OneRange` and `HistoryIndices`.
Reviewable status: 0 of 4 files reviewed, 3 unresolved discussions
src/middlewares/versioned_flat_key_value/history_indices/mod.rs
line 13 at r1 (raw file):
pub const ONE_RANGE_BYTES: usize = 1 << ONE_RANGE_BYTES_LOG;
const _: () = assert!(ONE_RANGE_BYTES == 64 || ONE_RANGE_BYTES == 128);
Use `const_assert!` here.
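For illustration, a minimal sketch of what the suggested form could look like. `const_assert!` is provided by the `static_assertions` crate; to keep the snippet dependency-free, a hypothetical local macro with the same shape is defined here, and the value of `ONE_RANGE_BYTES_LOG` is an assumption, not taken from the PR.

```rust
// Hypothetical local stand-in for `static_assertions::const_assert!`.
// It expands to an anonymous constant whose initializer is evaluated at
// compile time, so a false condition fails the build.
macro_rules! const_assert {
    ($cond:expr $(,)?) => {
        const _: () = assert!($cond);
    };
}

// Assumed value for illustration only; 6 gives 64-byte ranges.
pub const ONE_RANGE_BYTES_LOG: usize = 6;
pub const ONE_RANGE_BYTES: usize = 1 << ONE_RANGE_BYTES_LOG;

// Compile-time check: only 64-byte and 128-byte ranges are accepted.
const_assert!(ONE_RANGE_BYTES == 64 || ONE_RANGE_BYTES == 128);
```

Setting `ONE_RANGE_BYTES_LOG` to any value other than 6 or 7 would make this sketch fail to compile rather than fail at run time.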
src/middlewares/versioned_flat_key_value/history_indices/one_range.rs
line 34 at r1 (raw file):
/// - There are more than `(ONE_RANGE_BYTES / 2)` bits.
#[derive(Debug, Clone, PartialEq)]
pub enum OneRange {
Consider a better name for `OneRange`, `Four`, `Two`.
src/middlewares/versioned_flat_key_value/history_indices/one_range.rs
line 225 at r1 (raw file):
}
pub trait Max {
Don't define so many traits, each for just one function. You can define a single trait that includes everything required of an item in the integer list, e.g., `trait MyTrait: Ord + Copy + Into<u64>`, with one function `saturating_from` that combines `min(T::MAX.into())` and `from_u64_unchecked`.
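A rough sketch of how this suggestion might look, not the code that was eventually merged. The trait name `RangeItem`, the helper `clamp_all`, and the set of implementing types are hypothetical; only the supertrait bounds and the name `saturating_from` come from the comment above.

```rust
/// Hypothetical single trait for items of the integer list.
/// The bounds mirror the suggestion: `Ord + Copy + Into<u64>`.
pub trait RangeItem: Ord + Copy + Into<u64> {
    /// Converts a `u64` into `Self`, clamping to `Self::MAX` on overflow.
    /// This folds the "min with MAX" step and the unchecked conversion
    /// into one function.
    fn saturating_from(v: u64) -> Self;
}

// Implement the trait for the unsigned widths a compact encoding might use.
macro_rules! impl_range_item {
    ($($t:ty),* $(,)?) => {$(
        impl RangeItem for $t {
            fn saturating_from(v: u64) -> Self {
                // Clamp first, then narrow; the cast cannot truncate.
                v.min(<$t>::MAX as u64) as $t
            }
        }
    )*};
}
impl_range_item!(u8, u16, u32, u64);

/// Hypothetical helper showing generic use of the trait.
pub fn clamp_all<T: RangeItem>(values: &[u64]) -> Vec<T> {
    values.iter().map(|&v| T::saturating_from(v)).collect()
}
```

With this shape, generic code needs only the one bound `T: RangeItem` instead of a separate bound per helper trait.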
…cro for better readability
… method to handle clamping and conversion, simplifying dependencies.
…ask calculation by using right shift instead of left shift
3989cd2 to 84fdbd7
…VersionRange to cover Bitmap cases with varying maximum bit indices.
Reviewable status: 0 of 5 files reviewed, 3 unresolved discussions (waiting on @ChenxingLi)
src/middlewares/versioned_flat_key_value/history_indices/mod.rs
line 13 at r1 (raw file):
Previously, ChenxingLi (Chenxing Li) wrote…
Use `const_assert!` here.
Done.
src/middlewares/versioned_flat_key_value/history_indices/one_range.rs
line 34 at r1 (raw file):
Previously, ChenxingLi (Chenxing Li) wrote…
Consider a better name for `OneRange`, `Four`, `Two`.
Done.
src/middlewares/versioned_flat_key_value/history_indices/one_range.rs
line 225 at r1 (raw file):
Previously, ChenxingLi (Chenxing Li) wrote…
Don't define so many traits, each for just one function. You can define a single trait that includes everything required of an item in the integer list, e.g., `trait MyTrait: Ord + Copy + Into<u64>`, with one function `saturating_from` that combines `min(T::MAX.into())` and `from_u64_unchecked`.
Done.
This PR implements a more efficient design for the history index table in our multi-version storage system. The current system uses two tables: an index table that tracks which versions contain changes for each key, and a change table that stores the actual values. This redesign focuses on the index table to improve access patterns and reduce storage overhead.
Key Improvements
- Version Range Optimization: Instead of storing each `<key, version>` pair individually, the new design groups versions into ranges for each key. This significantly reduces the number of records in the index table.
- Latest Version Optimization: A special `LATEST` marker is used to quickly identify and access the most recent versions, which are accessed most frequently.
- Space-efficient Encoding: The `OffsetBasedVersionRange` structure provides a flexible encoding scheme that adapts to the density and magnitude of version numbers.
Implementation Details
The new index table uses two types of keys:
- `<key, LATEST>` for the latest (mutable) record
- `<key, end_version_number>` for previous (immutable) records
The values in the index table are represented by the `HistoryIndices` enum, which has two variants:
- `Latest(start_version_number, range_encoding, latest_value)` for the most recent versions
- `Previous(range_encoding)` for previous versions
Core Functionality
This PR implements the `HistoryIndices` structure as a standalone component with the interfaces needed for integration with the storage system:
- `get_latest_value`: Directly retrieves the latest value from the index table
- `last_le`: Finds the most recent version number less than or equal to a given version
- `collect_versions_le`: Collects all version numbers less than or equal to a given version
All errors related to index table corruption are now consistently defined as `StorageError::CorruptedHistoryIndices`.
Next Steps
This PR focuses on the data structure implementation. The integration with the actual storage system will be addressed in a subsequent PR.
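As a rough illustration of the three query interfaces listed under Core Functionality, here is a simplified sketch. It is not the PR's implementation: the type `HistoryIndicesSketch` is hypothetical, the compact `OffsetBasedVersionRange` encoding is replaced by a plain sorted `Vec<u64>` of absolute version numbers, `start_version_number` is dropped, and error handling via `StorageError::CorruptedHistoryIndices` is omitted.

```rust
/// Hypothetical, simplified stand-in for the `HistoryIndices` enum.
/// Assumption: `versions` is sorted in ascending order.
#[derive(Debug, Clone, PartialEq)]
pub enum HistoryIndicesSketch<V> {
    /// Record stored under `<key, LATEST>`; also carries the latest value.
    Latest { versions: Vec<u64>, latest_value: V },
    /// Record stored under `<key, end_version_number>`.
    Previous { versions: Vec<u64> },
}

impl<V> HistoryIndicesSketch<V> {
    /// Returns the sorted version numbers held by either variant.
    fn versions(&self) -> &[u64] {
        match self {
            Self::Latest { versions, .. } | Self::Previous { versions } => {
                versions.as_slice()
            }
        }
    }

    /// The latest value is only available on the `Latest` record.
    pub fn get_latest_value(&self) -> Option<&V> {
        match self {
            Self::Latest { latest_value, .. } => Some(latest_value),
            Self::Previous { .. } => None,
        }
    }

    /// Most recent version number that is `<= version`, if any.
    pub fn last_le(&self, version: u64) -> Option<u64> {
        let vs = self.versions();
        // Binary search for the count of entries that are <= version.
        let n = vs.partition_point(|&v| v <= version);
        n.checked_sub(1).map(|i| vs[i])
    }

    /// All version numbers that are `<= version`, in ascending order.
    pub fn collect_versions_le(&self, version: u64) -> Vec<u64> {
        let vs = self.versions();
        let n = vs.partition_point(|&v| v <= version);
        vs[..n].to_vec()
    }
}
```

Because the list is assumed sorted, both lookups reduce to one binary search; the real encoding presumably answers the same queries over its compact representation.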