fix(profiler): reduce memory usage for compression #4058
Conversation
Benchmarks
Benchmark execution time: 2025-10-28 12:14:12. Comparing candidate commit a37f42c in PR branch. Found 0 performance improvements and 0 performance regressions! Performance is the same for 24 metrics, 0 unstable metrics.
The zstd compression library uses ~8MiB per compressor by default, primarily for the back-reference window. See this parameter: https://pkg.go.dev/github.com/klauspost/compress/zstd#WithWindowSize

Since we have an encoder per profile type, this leads to a noticeable increase in memory usage after switching to zstd by default. We could make the window smaller, but that can hurt the compression ratio. Instead, we can use a single encoder and share it between the profile types.

This PR does the bare minimum to implement a single encoder. It's a bit kludgy to use a separate global lock to guard access to the encoder, but plumbing the synchronization around in a more encapsulated way would require a bigger refactor.
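As a rough sketch of the shared-encoder idea (the names and structure here are illustrative, not the actual dd-trace-go internals), a single package-level encoder guarded by a mutex might look like this:

```go
package profiler

import (
	"io"
	"sync"

	"github.com/klauspost/compress/zstd"
)

// One shared zstd encoder for all profile types, guarded by a package-level
// mutex so only one profile compresses at a time.
var (
	zstdEncoderMu sync.Mutex
	zstdEncoder   *zstd.Encoder
)

// compressZstd streams src into dst through the shared encoder.
func compressZstd(dst io.Writer, src io.Reader) error {
	zstdEncoderMu.Lock()
	defer zstdEncoderMu.Unlock()

	if zstdEncoder == nil {
		// Lazily create the encoder once; it owns the ~8MiB window.
		enc, err := zstd.NewWriter(nil)
		if err != nil {
			return err
		}
		zstdEncoder = enc
	}

	// Reset points the long-lived encoder at the new destination without
	// reallocating its internal buffers.
	zstdEncoder.Reset(dst)
	if _, err := io.Copy(zstdEncoder, src); err != nil {
		return err
	}
	// Close flushes the frame; the encoder can be Reset and reused afterwards.
	return zstdEncoder.Close()
}
```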
This will probably make our cycle time slightly longer, since we now wait
for all the processing to complete serially before advancing to the next
profile cycle. It's hard to quantify exactly how much since it depends on
how much profiling data the program produces.
Also worth noting: the execution tracer and CPU profile APIs take a writer when they start, rather than when we read the data. The tracer in particular periodically writes out data as it's running; the CPU profiler technically only writes data to the writer when it's stopped. Either way, we don't want to hold the global lock in a way that would block either of these from completing. So this PR collects the data in a separate buffer and (re)compresses it with the lock held after collection stops.
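For illustration, here is a minimal sketch of that buffering pattern for the execution tracer (the function and duration handling are hypothetical, not the PR's actual code; compressZstd is the shared-encoder helper sketched above):

```go
package profiler

import (
	"bytes"
	"runtime/trace"
	"time"
)

// collectTrace buffers tracer output in memory and only compresses it after
// collection has stopped.
func collectTrace(d time.Duration) ([]byte, error) {
	var raw bytes.Buffer
	// The tracer writes into a plain in-memory buffer while it runs, so its
	// periodic writes never contend on the shared compression lock.
	if err := trace.Start(&raw); err != nil {
		return nil, err
	}
	time.Sleep(d) // stand-in for the profiling period
	trace.Stop()

	// Only now, with collection finished, compress the buffered bytes; the
	// global encoder lock is held just for this final step.
	var out bytes.Buffer
	if err := compressZstd(&out, &raw); err != nil {
		return nil, err
	}
	return out.Bytes(), nil
}
```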
We should still come out ahead on memory usage by not paying ~8MiB per profile type for compression.