-
Notifications
You must be signed in to change notification settings - Fork 15
APMSP-1930 add retry logic for test setup requests to test agent #1015
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
APMSP-1930 add retry logic for test setup requests to test agent #1015
Conversation
BenchmarksComparisonBenchmark execution time: 2025-04-10 13:17:50 Comparing candidate commit 4f6c368 in PR branch Found 0 performance improvements and 0 performance regressions! Performance is the same for 52 metrics, 2 unstable metrics. CandidateCandidate benchmark detailsGroup 1
Group 2
Group 3
Group 4
Group 5
Group 6
Group 7
Group 8
Group 9
Group 10
Group 11
Group 12
Group 13
BaselineOmitted due to size. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1015 +/- ##
==========================================
- Coverage 71.55% 71.50% -0.05%
==========================================
Files 337 337
Lines 50516 50583 +67
==========================================
+ Hits 36145 36170 +25
- Misses 14371 14413 +42
🚀 New features to boost your workflow:
|
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
paullegranddc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few nits
| /// # Arguments | ||
| /// | ||
| /// * `req` - A `Request<Body>` representing the HTTP request to be sent. | ||
| /// * `max_retries` - An `i32` specifying the maximum number of retry attempts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name of the parameter also needs to be updated here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, thank you.
e9519b5 to
4f6c368
Compare
What does this PR do?
It rarely happens, but it is possible for the container running the test agent for integration tests to reset the network connection even after the agent is ready. This PR adds retry logic for the http requests we send to the agent that aren't part of what's actually being tested. Currently that is starting a test session, validating test snapshots, and getting all the traces that the agent has received. This should hopefully fix flakiness where you see an error like
Request failed: hyper_util::client::legacy::Error(SendRequest, hyper::Error(Io, Os { code: 104, kind: ConnectionReset, message: "Connection reset by peer" }))Motivation
Baklava should be flaky, not our tests
Additional Notes
How to test the change?
I tested the retry logic works by breaking a snapshot and observing that it retried the correct number of times. I don't really know how to test a connection reset by peer scenario easily.