-
Notifications
You must be signed in to change notification settings - Fork 426
[server] Recover log and index file for unclean shutdown #1749
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
b9b5f2d to
ceaf704
Compare
|
Ready for CR @wuchong @swuferhong |
swuferhong
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LiebingYu Thanks for your work. I left some comments.
fluss-server/src/test/java/org/apache/fluss/server/log/LogLoaderTest.java
Show resolved
Hide resolved
fluss-server/src/test/java/org/apache/fluss/server/log/LogLoaderTest.java
Show resolved
Hide resolved
fluss-server/src/test/java/org/apache/fluss/server/log/LogLoaderTest.java
Show resolved
Hide resolved
ceaf704 to
1e8a582
Compare
| // can potentially skip over more segments. | ||
| LogTablet.rebuildWriterState( | ||
| writerStateManager, logSegments, 0, segment.getBaseOffset(), false); | ||
| int bytesTruncated = segment.recover(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to pass writerStateManager into segment.recover() to update writer state like how kafka does? cc @swuferhong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kafka use writerStateManager in segment.recover() mainly for txn operations. I think Fluss doen't need here. What's your opinion @swuferhong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing offline with @LiebingYu , we confirmed that it‘s necessary to rebuild the writerState1. Although it is not used during recovery, this code path ensures that when writerStateManager.takeSnapshot()` is called, a correct snapshot up to the current segment is written to disk. This way, during recovery, each segment can be restored based on a fully accurate writerState snapshot, without requiring a full global reconstruction.
| segment.getFileLogRecords().file().getAbsoluteFile()); | ||
| } | ||
| recoverSegment(segment); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we catch CorruptIndexException and recover segment for this case like how kafka does? And we should rethrow other exceptions in else branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Kafka throw
CorruptIndexExceptionbecause Kafka will do sanity check for transaction file which may throwCorruptIndexException. But here in fluss we don't do sanity check of index files insegment.sanityCheck. So I think we don't need to catchCorruptIndexExceptionhere.
fluss-server/src/test/java/org/apache/fluss/server/log/LogLoaderTest.java
Outdated
Show resolved
Hide resolved
fluss-server/src/test/java/org/apache/fluss/server/log/LogLoaderTest.java
Outdated
Show resolved
Hide resolved
fluss-server/src/main/java/org/apache/fluss/server/log/LogSegment.java
Outdated
Show resolved
Hide resolved
fluss-server/src/main/java/org/apache/fluss/server/log/LogLoader.java
Outdated
Show resolved
Hide resolved
|
I added a commit to fix the |
Purpose
Linked issue: close #1716
Brief change log
Tests
API and Format
Documentation