
Conversation

Contributor

@gguptp gguptp commented Mar 25, 2025

Purpose of the change

Today Flink does not support distributed consistency of events from subtask (Task Manager) to coordinator (Job Manager) - https://issues.apache.org/jira/browse/FLINK-28639. As a result, we have a race condition that can cause a shard and its child shards to stop being processed after a job restart:

  • A checkpoint started
  • The enumerator took its checkpoint (the shard was assigned at this point)
  • The enumerator sent the checkpoint event to the reader
  • Before the reader took its checkpoint, a SplitsFinishedEvent arrived in the reader
  • The reader took its checkpoint
  • Just after the checkpoint completed, the job restarted

This can cause a shard lineage to be lost, because the shard is in ASSIGNED state in the enumerator but is not part of any Task Manager state.
This PR changes the behaviour by also checkpointing the finished-split events received between two checkpoints and replaying those events on restore.
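A rough, hypothetical sketch of the buffering-and-replay idea (class and method names below are made up for illustration and are not the actual connector code): the reader keeps the split IDs whose finished events were observed since the last checkpoint, includes them in its checkpointed state, and re-sends them to the enumerator on restore.

import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class FinishedSplitsBuffer {
    // Split IDs whose finished event was observed since the last successful checkpoint.
    private final List<String> finishedSinceLastCheckpoint = new ArrayList<>();

    // Called when the reader observes a finished split between two checkpoints.
    public void onSplitFinished(String splitId) {
        finishedSinceLastCheckpoint.add(splitId);
    }

    // Included in the reader's checkpointed state.
    public List<String> snapshotState() {
        return new ArrayList<>(finishedSinceLastCheckpoint);
    }

    // On restore, replay the buffered events to the enumerator so it can complete
    // child shard assignment for shards it still considers ASSIGNED.
    public void restoreAndReplay(List<String> restored, Consumer<String> sendFinishedEventToEnumerator) {
        finishedSinceLastCheckpoint.addAll(restored);
        restored.forEach(sendFinishedEventToEnumerator);
    }

    // Once the checkpoint that contained these events completes, drop them.
    public void onCheckpointComplete() {
        finishedSinceLastCheckpoint.clear();
    }
}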

Verifying this change

Please make sure both new and modified tests in this PR follow the conventions defined in our code quality guide: https://flink.apache.org/contributing/code-style-and-quality-common.html#testing

This change added tests and can be verified as follows:

I manually verified this by running the connector on a local Flink cluster that was restarted every 10 minutes. I saw no checkpoint discrepancies and no issues in the cluster.

I also added unit tests for the change.

Significant changes


  • Dependencies have been added or upgraded
  • Public API has been changed (Public API is any class annotated with @Public(Evolving))
  • Serializers have been changed
  • New feature has been introduced
    • If yes, how is this documented? (not applicable / docs / JavaDocs / not documented)

@AHeise AHeise left a comment

Approach LGTM in general and I added some smaller nits.

When we discussed the approach with the TreeMap, I forgot that the state of a reader is just a list of splits. So now you use a special split to track the finished events, which is a plausible solution but also a bit awkward. (We should probably extend the interface to also allow reader-native state.)

So I provided an alternative solution where the information is layered directly inside the split, which makes the special split unnecessary. PTAL.
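A rough illustration of that layering idea (hypothetical names, not the actual split class): each split in the reader state carries a flag saying it finished after the last checkpoint, so the ordinary list-of-splits state suffices and no synthetic split is needed.

public class ShardSplitState {
    private final String splitId;
    private final boolean finishedSinceLastCheckpoint;

    public ShardSplitState(String splitId, boolean finishedSinceLastCheckpoint) {
        this.splitId = splitId;
        this.finishedSinceLastCheckpoint = finishedSinceLastCheckpoint;
    }

    public String splitId() {
        return splitId;
    }

    // On restore, splits carrying this flag are reported back to the enumerator as
    // finished instead of being read again.
    public boolean isFinishedSinceLastCheckpoint() {
        return finishedSinceLastCheckpoint;
    }
}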

@gguptp gguptp force-pushed the main branch 3 times, most recently from 05d00d9 to d086a00 on March 26, 2025 09:54
@gguptp gguptp force-pushed the main branch 4 times, most recently from 98cb457 to 567e5fa on March 26, 2025 12:57
@gguptp gguptp force-pushed the main branch 4 times, most recently from 60833a5 to f518db5 on March 26, 2025 13:50
@AHeise AHeise left a comment

LGTM with some small nits.

Contributor
@hlteoh37 hlteoh37 left a comment

LGTM. Added 1 nit

// to the subtask, but there might be SplitsFinishedEvent from that subtask.
// We will not do child shard assignment if that is the case since that might lead to child
// shards trying to get assigned before there being any readers.
if (splitsAssignment == null) {
Contributor

Why don't we make this a generic WARN if we don't find the finished split in the assigned splits, instead of only when no split is assigned to the subtask?


I'd not weaken consistency checks here for all events. I'd mark the events as recovered and then allow some leniency (not even WARN because this is expected).

Contributor Author

I made this an INFO for now; I'll make the consistency checks stronger in a separate PR, if that works?
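A minimal sketch of that lenient handling (hypothetical handler class, assuming an SLF4J logger; not the actual connector code): a finished-split event for a subtask with no current assignment is treated as an expected, replayed event and logged at INFO rather than failing a consistency check.

import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class SplitsFinishedEventHandler {
    private static final Logger LOG = LoggerFactory.getLogger(SplitsFinishedEventHandler.class);

    void handleSplitsFinished(int subtaskId, List<String> assignedSplits, List<String> finishedSplitIds) {
        if (assignedSplits == null || assignedSplits.isEmpty()) {
            // Expected when a restore replays buffered finished-split events before any
            // splits have been (re)assigned to this subtask, so only log at INFO.
            LOG.info(
                    "No splits assigned to subtask {}; skipping child shard assignment for finished splits {}",
                    subtaskId,
                    finishedSplitIds);
            return;
        }
        // ... normal child shard assignment would follow here (omitted in this sketch).
    }
}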

Contributor

Good spot, Arvid!

@gguptp gguptp force-pushed the main branch 2 times, most recently from f2ec471 to 87fbe9b on March 26, 2025 14:40
…sue in DDB connector when sending split finished event from reader -> enumerator
Contributor
@leekeiabstraction leekeiabstraction left a comment

Thank you for the changes and the cleanup of finishedAfterCheckpoint!

@hlteoh37 hlteoh37 merged commit ca96d84 into apache:main Mar 26, 2025
7 of 9 checks passed