@dpkp dpkp commented Feb 14, 2025

The primary goal of this PR is to reduce busy-polling during broker and network failures. Several incremental changes are included:

  • Merge the BrokerConnection blacked_out() and connection_delay() logic, and eliminate delays / blackouts when cycling through entries from a single DNS lookup.
  • Improve request-timeout handling in client.poll() by checking all connections for the next potential in-flight-request timeout.
  • Drop the 100ms timeout override when there are no in-flight requests (added prior to full async connection support).
  • Do not mark a connection as sending if the future resolves immediately (i.e. with an error).
  • Remove the flat 100ms timeout for unfinished futures (added prior to full async connection support).
  • Respect connection delays during metadata refresh when all brokers are unavailable.
  • Honor reconnect backoff in connection_delay when connecting. This is primarily used by the producer send thread to defer sending accumulated batches. It was originally 0ms but was changed to float('inf') to avoid busy polling. It should instead be the reconnect backoff delay, which avoids busy polling without starving the sender thread.
  • Increase default reconnect_backoff_max_ms to 30000 (30 secs). This should make defaults more appropriate for production use, especially in larger fleets.
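
To make the timing behavior described above concrete, here is a minimal, self-contained sketch. It is not the code from this PR: `Conn`, `poll_timeout_ms`, and the standalone `reconnect_backoff_ms` function are hypothetical names, and the 50ms base and 20% jitter are assumptions chosen for illustration. Only the 30000ms cap comes from the description. It shows a capped exponential backoff, a connection delay that honors the remaining backoff while disconnected or connecting, and a poll timeout taken as the earliest pending event across all connections.

```python
import random
from dataclasses import dataclass

INF = float('inf')


def reconnect_backoff_ms(failures, base_ms=50, max_ms=30000, jitter=0.2):
    """Exponential backoff in ms, capped at max_ms, then randomized by +/- jitter.

    base_ms and jitter are illustrative assumptions; max_ms mirrors the new default.
    """
    if failures <= 0:
        return 0.0
    delay = min(base_ms * 2 ** (failures - 1), max_ms)
    # Jitter is applied after the cap, so the result can exceed max_ms slightly.
    return delay * random.uniform(1 - jitter, 1 + jitter)


@dataclass
class Conn:
    """Hypothetical stand-in for a broker connection (not BrokerConnection)."""
    state: str                            # 'disconnected', 'connecting' or 'connected'
    last_attempt_ms: float = 0.0          # when the last connect attempt started
    backoff_ms: float = 0.0               # current reconnect backoff
    request_timeout_in_ms: float = INF    # time until the oldest in-flight request expires

    def connection_delay(self, now_ms):
        # Return the remaining backoff while disconnected or connecting,
        # rather than 0 (busy polling) or inf (starving the sender thread).
        if self.state in ('disconnected', 'connecting'):
            return max(self.last_attempt_ms + self.backoff_ms - now_ms, 0.0)
        return INF


def poll_timeout_ms(conns, now_ms, requested_ms):
    """Wait no longer than the earliest request timeout or reconnect delay."""
    candidates = [requested_ms]
    for conn in conns:
        candidates.append(conn.request_timeout_in_ms)
        candidates.append(conn.connection_delay(now_ms))
    return max(min(candidates), 0.0)


conns = [
    Conn('connected', request_timeout_in_ms=400.0),
    Conn('disconnected', last_attempt_ms=1000.0, backoff_ms=200.0),
]
print(poll_timeout_ms(conns, now_ms=1050.0, requested_ms=1000.0))  # 150.0
```

In this example the disconnected broker has 150ms of backoff left, which is sooner than both the 400ms in-flight request timeout and the 1000ms requested timeout, so the poll wakes up exactly when a reconnect attempt becomes possible.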

dpkp commented Feb 14, 2025

This should help with #2400

This was referenced Feb 14, 2025
@dpkp dpkp merged commit 252e0bd into master Feb 14, 2025
14 checks passed
@dpkp dpkp deleted the dpkp/refresh-metadata-backoff branch February 14, 2025 23:32