Introduce custom table catalog format #561
This cannot query any data that was ingested by older versions.
Use documentation comments.
This PR introduces a new table format that keeps track of data files in the data storage. The format is inspired by Apache Iceberg, so it uses a similar naming scheme. A snapshot is the main entry point to a table. It consists of a list of manifest file URLs together with primary time statistics used for pruning. A manifest file lists all the actual data files along with their file-level statistics. Currently, one manifest file is generated per top-level partition (i.e. date).
Update server/src/catalog.rs
Co-authored-by: Nick <[email protected]>
Signed-off-by: Satyam Singh <[email protected]>

Update server/src/catalog/manifest.rs
Co-authored-by: Nick <[email protected]>
Signed-off-by: Satyam Singh <[email protected]>

Update server/src/catalog/column.rs
Co-authored-by: Nick <[email protected]>
Signed-off-by: Satyam Singh <[email protected]>

Update server/src/handlers/http/query.rs
Co-authored-by: Nick <[email protected]>
Signed-off-by: Satyam Singh <[email protected]>

Update server/src/catalog/manifest.rs
Co-authored-by: Nick <[email protected]>
Signed-off-by: Satyam Singh <[email protected]>
LGTM & Tested
Fixes #XXXX.
Description
This PR introduces a new table format that keeps track of all the data files in the data storage. The format is inspired by Apache Iceberg, so it uses a similar naming scheme.
Currently, one manifest file is generated per top-level partition (i.e. date).
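For readers unfamiliar with the Iceberg-style layout, the following is a minimal sketch of how a snapshot, its manifest list, and a manifest could relate. Every name here (`TimeBounds`, `ManifestItem`, `Snapshot`, `Manifest`, `DataFile`, `manifests_for`) is hypothetical and chosen for illustration; it is not the code in `server/src/catalog`. It also assumes that time statistics are stored per manifest entry as epoch-millisecond bounds, which the description does not state.

```rust
// Illustrative sketch only: all type, field, and method names are assumptions,
// not the definitions introduced by this PR.

/// Inclusive time range in epoch milliseconds (assumed representation).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct TimeBounds {
    pub lower: i64,
    pub upper: i64,
}

impl TimeBounds {
    /// Two inclusive ranges overlap when each starts before the other ends.
    pub fn overlaps(&self, other: &TimeBounds) -> bool {
        self.lower <= other.upper && other.lower <= self.upper
    }
}

/// One snapshot entry: where a manifest lives and the time range it covers.
#[derive(Debug, Clone, PartialEq)]
pub struct ManifestItem {
    pub manifest_path: String,
    pub time: TimeBounds,
}

/// Entry point to a table: a list of manifest entries.
#[derive(Debug, Clone, Default)]
pub struct Snapshot {
    pub manifest_list: Vec<ManifestItem>,
}

impl Snapshot {
    /// Prune by time: keep only manifests whose range overlaps the query range.
    pub fn manifests_for(&self, query: &TimeBounds) -> Vec<&ManifestItem> {
        self.manifest_list
            .iter()
            .filter(|item| item.time.overlaps(query))
            .collect()
    }
}

/// A data file entry with file-level statistics (fields are assumptions).
#[derive(Debug, Clone)]
pub struct DataFile {
    pub file_path: String,
    pub num_rows: u64,
    pub file_size: u64,
}

/// One manifest per top-level partition (i.e. date), listing its data files.
#[derive(Debug, Clone, Default)]
pub struct Manifest {
    pub files: Vec<DataFile>,
}
```

With a layout along these lines, a query would first call something like `manifests_for` on the snapshot, then open only the surviving manifests and apply file-level statistics to decide which data files to read.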
This PR has: