-
Notifications
You must be signed in to change notification settings - Fork 45
chore: Separate status check from synchronizer functionality #373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cbcd3d6 to
7e0dbce
Compare
| synchronizer.stop() | ||
| timer.stop() | ||
|
|
||
| sync_reader.join(0.5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of the half second timeout here? My understanding is that if it doesn't stop in .5 seconds, execution will move on and that thread will still continue to process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker, just curious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are trying to shut down in an orderly fashion. The timeout gives us time to close out correctly, but eventually we do have to move on if it isn't halting as expected. The idea is that if close has been called, the python interpreter is going to exit shortly after anyway.
85f59ba to
8e81cdb
Compare
In the previous setup, we would only check the fallback or recovery conditions once the synchronizer returned an update. If the synchronizer was stuck, or nothing was changing in the environment, we would never check the conditions. This configuration also exposed an interesting behavior. If the synchronizer cannot connect, it will emit error updates. Each time we receive an error, we check if we have failed to initialize for the last 10 seconds. If so, we re-create the primary synchronizer. When it continues to fail, the first update will trigger the condition check. And since it has still failed for 10 seconds, it will immediately error out. With this change, we can be assured a synchronizer is given at least 10 seconds to try before the condition is evaluated.
7e0dbce to
38fb5f4
Compare
| synchronizer.stop() | ||
| timer.stop() | ||
|
|
||
| sync_reader.join(0.5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Thread Termination Failure Leads to Instability
The sync_reader thread is given only a 0.5-second timeout to exit. If it hasn't finished within that time, execution continues while the thread is still running, potentially causing race conditions or resource leaks when a new synchronizer is subsequently created and another reader thread is started.
1799cee to
644cc13
Compare
842e8fa to
e578a2e
Compare
e578a2e to
a37fa56
Compare
ldclient/impl/datasystem/store.py
Outdated
|
|
||
| # Send change events | ||
| if affected_items: | ||
| print("@@@@@ affected items are (delta)", affected_items, "\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a37fa56 to
2ca4af0
Compare
In the previous setup, we would only check the fallback or recovery conditions once the synchronizer returned an update. If the synchronizer was stuck, or nothing was changing in the environment, we would never check the conditions. This configuration also exposed an interesting behavior. If the synchronizer cannot connect, it will emit error updates. Each time we receive an error, we check if we have failed to initialize for the last 10 seconds. If so, we re-create the primary synchronizer. When it continues to fail, the first update will trigger the condition check. And since it has still failed for 10 seconds, it will immediately error out. With this change, we can be assured a synchronizer is given at least 10 seconds to try before the condition is evaluated.
In the previous setup, we would only check the fallback or recovery conditions once the synchronizer returned an update. If the synchronizer was stuck, or nothing was changing in the environment, we would never check the conditions. This configuration also exposed an interesting behavior. If the synchronizer cannot connect, it will emit error updates. Each time we receive an error, we check if we have failed to initialize for the last 10 seconds. If so, we re-create the primary synchronizer. When it continues to fail, the first update will trigger the condition check. And since it has still failed for 10 seconds, it will immediately error out. With this change, we can be assured a synchronizer is given at least 10 seconds to try before the condition is evaluated.
In the previous setup, we would only check the fallback or recovery
conditions once the synchronizer returned an update. If the synchronizer
was stuck, or nothing was changing in the environment, we would never
check the conditions.
This configuration also exposed an interesting behavior. If the
synchronizer cannot connect, it will emit error updates. Each time we
receive an error, we check if we have failed to initialize for the last
10 seconds. If so, we re-create the primary synchronizer.
When it continues to fail, the first update will trigger the condition
check. And since it has still failed for 10 seconds, it will immediately
error out. With this change, we can be assured a synchronizer is given
at least 10 seconds to try before the condition is evaluated.
Note
Separates FDv2 fallback/recovery checks from synchronizer emissions via a timer+reader thread, fixes initial status timestamp, removes streaming context manager hooks, and skips flaky file-source tests in CI.
_consume_synchronizer_resultsnow uses a queue, background reader thread, andRepeatingTasktimer to trigger regular status checks and act without waiting for sync updates.DataSourceStatus.sincewithtime.time()inDataSourceStatusProviderImpl.__enter__,__exit__) fromStreamingDataSource.test_two_phase_initto ensure version progression and add readiness/change events.LD_SKIP_FLAKY_TESTSis set.LD_SKIP_FLAKY_TESTS: truefor test steps on Linux and Windows jobs.Written by Cursor Bugbot for commit fa34244. This will update automatically on new commits. Configure here.