agents: add runtime control for asset collection #377

santigimeno · 2025-10-10T20:23:33Z

Add comprehensive runtime control for asset collection (CPU/heap profiling, snapshots) with multiple configuration methods.

Implements assetsEnabled flag with full control surface:

NSOLID_ASSETS_ENABLED environment variable
nsolid.start({ assetsEnabled: true/false }) configuration
Runtime helpers: enableAssets() and disableAssets()
Dynamic updates via gRPC reconfigure protocol
Validation in both gRPC and ZMQ agents with proper errors
Continuous profiler integration respects assetsEnabled state

This allows operators to disable expensive profiling operations at runtime without process restarts, improving resource management and production safety.

Also adds missing enableTraces()/disableTraces() helper methods to complete the tracing API surface.

Comprehensive test coverage added for all asset types across gRPC, ZMQ, and parallel test suites.

Summary by CodeRabbit

New Features
- Runtime toggles to enable/disable asset collection and tracing (enableAssets/disableAssets, enableTraces/disableTraces); public getters and config option assetsEnabled.
Behavior Changes
- Asset-related operations (profiling, heap profiles/sampling, snapshots, streams) are gated by assetsEnabled and now return a clear "Assets collection disabled" error when disabled.
- Startup and profiling flows respect assetsEnabled state.
Tests
- Expanded gRPC, ZMQ, and parallel tests to cover toggling, helpers, error responses, restore flows, and tracing enable/disable scenarios.

coderabbitai · 2025-10-10T20:24:11Z

Walkthrough

Adds an assetsEnabled runtime flag across proto, agents (gRPC/ZMQ), core, and JS; introduces an EAssetsDisabled error; adds enable/disable helpers for assets and traces; gates asset collection/startup when disabled; and extends tests and test clients to exercise toggles.

Changes

Cohort / File(s)	Summary
Proto schema `agents/grpc/proto/reconfigure.proto`	Add optional bool field `assetsEnabled = 13` to `ReconfigureBody`.
Generated protobuf (C++) `agents/grpc/src/proto/reconfigure.pb.h`, `agents/grpc/src/proto/reconfigure.pb.cc`	Regenerated to include `assetsEnabled`: new impl member, has-bits/offset updates, accessors/mutators, parse/serialize/descriptor, size/merge/clear/swap changes.
gRPC agent `agents/grpc/src/grpc_agent.h`, `agents/grpc/src/grpc_agent.cc`, `agents/grpc/src/grpc_errors.h`	Add atomic `assets_enabled_` flag (default true); read `assetsEnabled` from config/reconfigure; propagate into reconfigure payloads; short-circuit profiling init when false; add `EAssetsDisabled` error.
ZMQ agent `agents/zmq/src/zmq_agent.h`, `agents/zmq/src/zmq_agent.cc`	Add atomic `assets_enabled_` flag; read from config; gate profiling startup (return error) when assets disabled.
Core config propagation `src/nsolid/nsolid_api.cc`	Include `/assetsEnabled` in diffs; read assetsEnabled and gate reading/applying of continuous profiler settings behind it.
Node runtime / API `lib/nsolid.js`	Add `enableAssets()/disableAssets()` and `enableTraces()/disableTraces()` helpers and `assetsEnabled`/`tracingEnabled` getters; add `optionToBool`; persist/normalize assetsEnabled; throw when starting heap profile/sampling if assets disabled.
Test clients (gRPC & ZMQ) `test/common/nsolid-grpc-agent/`, `test/common/nsolid-zmq-agent/`	Add IPC handlers and TestClient methods: `config()`, `enableAssets()`, `disableAssets()`, `enableTraces()`, `disableTraces()`, `trace()`; centralize send-and-wait messaging; guard Otel registration.
Agent tests (gRPC) `test/agents/test-grpc-*.mjs`	Add tests verifying 500 / EAssetsDisabled when assets disabled; add toggle-via-config and enable/disable helper scenarios; tracing tests updated for phased flows/delays.
Agent tests (ZMQ) `test/agents/test-zmq-*.mjs`	Add toggle-driven tests: expect Invalid arguments when disabled and normal flow when enabled; tracing toggle test added.
Parallel Node tests `test/parallel/test-nsolid-*.js`	Add asset-toggle test sequences for heap profile/streams, sampling/streams, snapshot, profile; add tracing toggle checks and stream toggle helpers.
Other tests & helpers `test/common/*`, various test files	Multiple test helpers and cases added/updated to exercise runtime enable/disable of assets/traces and assert config state after toggles.
Build files `deps/curl/curl.gyp`, `deps/grpc/grpc.gyp`	Add new source additions to included build lists (`lib/curlx/inet_ntop.c`, `src/.../cfsocket_listener.cc`).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant JS as lib/nsolid.js
  participant Core as nsolid_api
  participant Agent as Agent (gRPC/ZMQ)
  participant Prof as Profiler

  rect rgb(250,250,255)
    note over User,JS: Start/update with assetsEnabled
    User->>JS: nsolid.start / nsolid.config (assetsEnabled=X)
    JS->>Core: UpdateConfig({assetsEnabled:X})
    Core->>Agent: Reconfigure(assetsEnabled:X)
    Agent->>Agent: assets_enabled_ = X
  end

  rect rgb(245,255,245)
    note over User,Prof: Asset operation request
    User->>JS: heapProfile()/heapSampling()/snapshot()
    JS->>Agent: request operation
    alt Agent.assets_enabled_ == false
      Agent-->>JS: Error (EAssetsDisabled / 500)
      JS-->>User: Error
    else
      Agent->>Prof: start/init
      Prof-->>Agent: data/result
      Agent-->>JS: Result
      JS-->>User: Result
    end
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Points to review closely:

Proto regeneration diffs in reconfigure.pb.{h,cc} (has-bit/offsets, serialization, merge/clear correctness).
gRPC agent state propagation and concurrency use of assets_enabled_ (atomic interactions and initialization paths).
Error code insertion in grpc_errors.h (macro ordering and enum mapping).
JS runtime changes in lib/nsolid.js that gate APIs and throw when disabled.
Extensive test additions/duplications — ensure no accidental duplicate tests and correct timeouts/delays.

Poem

"I nibble on a tiny flag and twitch my whiskers bright,
Assets sleep when set to off, then leap and dance at night.
Traces pause and hop again, profiles hum back in tune—
Tests clap paws, configs nod, beneath the patchwork moon.
A rabbit grins: the burrow's fixed, the meadow's safe and light." 🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 8.33% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title "agents: add runtime control for asset collection" directly aligns with the PR's primary objective. The changeset comprehensively implements runtime control mechanisms for asset collection across both gRPC and ZMQ agents, including environment variables, configuration options, runtime helper methods (enableAssets/disableAssets), and dynamic reconfigure protocol support. The title is specific and clearly indicates the main change—enabling operators to control asset collection (CPU/heap profiling, snapshots) at runtime without requiring process restarts. While the PR also introduces tracing control methods, the asset collection control is the primary objective and focus, making the title appropriately scoped and accurate.

✨ Finishing touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch santi/assets_enabled

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8a2bb66 and ba2865a.

📒 Files selected for processing (2)

deps/curl/curl.gyp (1 hunks)
deps/grpc/grpc.gyp (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: test-macOS
GitHub Check: lint-js-and-md
GitHub Check: coverage-windows
GitHub Check: build-tarball
GitHub Check: test-linux (ubuntu-24.04)
GitHub Check: test-linux (ubuntu-24.04-arm)

🔇 Additional comments (2)

deps/grpc/grpc.gyp (1)

768-768: I need to check if this file was previously missing from the build configuration or if this is a genuinely new addition.

File exists and is properly integrated; verify the necessity of adding it to the build configuration.

The file cfsocket_listener.cc exists at the expected path and includes trace infrastructure headers (trace_flags.h, trace_impl.h), confirming its connection to gRPC's networking and observability stack. The placement within grpc.gyp is syntactically correct and alphabetically consistent with other cf_engine components.

However, verify that this addition is necessary—specifically, confirm whether this file was previously excluded from the build configuration or is being compiled for the first time, and whether this relates directly to the asset collection feature mentioned in the PR.

deps/curl/curl.gyp (1)

131-131: Based on my verification, the addition of 'lib/curlx/inet_ntop.c' is technically necessary for Windows builds, as Windows configuration deliberately does not define HAVE_INET_NTOP and requires local implementations. The inet_ntop.c source is wrapped with #ifndef HAVE_INET_NTOP and only compiled when the system doesn't provide inet_ntop.

However, this change appears unrelated to the PR's stated purpose (runtime control for asset collection). No references to curl's inet_ntop exist elsewhere in the codebase—Node.js uses libuv's uv_inet_ntop instead. This suggests the change may be a cherry-pick from a separate curl maintenance PR or an accidental merge.

Clarify whether this inet_ntop addition is intentional for this PR or should be moved to a separate curl upgrade PR. If it belongs here, it should be documented in the commit message. If not, revert and submit as a separate PR.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

agents/grpc/src/grpc_agent.cc (1)
1710-1760: Propagate assetsEnabled via reconfigure().

reconfigure() ignores the new proto field, so gRPC-driven config updates can’t flip the assets gate and the feature regresses. Please write the field into the outbound JSON before calling UpdateConfig().
   if (body.has_contcpuprofile()) {
       out["contCpuProfile"] = body.contcpuprofile();
   }
+
+  if (body.has_assetsenabled()) {
+      out["assetsEnabled"] = body.assetsenabled();
+  }

🧹 Nitpick comments (4)

test/parallel/test-nsolid-heap-sampling-stream.js (2)
191-224: Consider refactoring nested callbacks to improve readability.

The test uses four levels of nested setImmediate() callbacks, making the flow difficult to follow. Consider refactoring to async/await or helper functions to flatten the structure.

Example refactor using helper functions:
function runHeapSamplingStreamAssetsToggleTests() {
  // Disable assets through config update
  nsolid.start({
    command: 'localhost:9001',
    data: 'localhost:9002',
    assetsEnabled: false,
  });

  setImmediate(() => {
    testDisabled(() => {
      testEnabled(() => {
        testDisabled(() => {
          testFinalEnabled();
        });
      });
    });
  });
}

function testDisabled(callback) {
  assert.throws(
    () => nsolid.heapSamplingStream(0, 1000),
    { message: 'Heap sampling could not be started' }
  );
  callback();
}

function testEnabled(callback) {
  nsolid.enableAssets();
  setImmediate(() => {
    let profile = '';
    const stream = nsolid.heapSamplingStream(0, 1200);
    stream.on('data', (chunk) => { profile += chunk; });
    stream.on('end', common.mustCall(() => {
      assert(JSON.parse(profile));
      testProfileSchema(JSON.parse(profile));
      callback();
    }));
  });
}

function testFinalEnabled() {
  nsolid.enableAssets();
  setImmediate(() => {
    const stream = nsolid.heapSamplingStream(0, 1200);
    stream.resume();
    stream.on('end', common.mustCall(() => {
      assert.ok(true);
    }));
  });
}
193-199: Consider adding memory allocations for consistency.

The initial test (lines 128-131) performs allocations to ensure there's data to sample. The subsequent heap sampling calls don't do this, which might lead to less deterministic results. While the tests may still pass, adding allocations would make the behavior more consistent.

Add allocations before each heap sampling operation:
     nsolid.enableAssets();
     setImmediate(() => {
+      // Do some allocations to ensure sampling data
+      const arr = [];
+      for (let i = 0; i < 1000; i++) {
+        arr.push(new Array(1000));
+      }
+      
       let profile = '';
       const enabledStream = nsolid.heapSamplingStream(0, 1200);
Apply the same pattern before line 216.

Also applies to: 216-221
test/parallel/test-nsolid-profile.js (1)
120-178: Reduce reliance on fixed sleeps; gate on observed config state

The toggle flow is correct, but hardcoded 100ms delays can flake under slow CI or propagation lag. After each start()/enable/disable step, prefer waiting until nsolid.config.assetsEnabled reflects the expected value before proceeding (or assert it directly), instead of fixed timeouts.

Example pattern:
function waitForAssetsState(expected, deadline = Date.now() + 2000) {
  return new Promise((resolve, reject) => {
    (function poll() {
      if (nsolid.config.assetsEnabled === expected) return resolve();
      if (Date.now() > deadline) return reject(new Error('assetsEnabled not applied'));
      setImmediate(poll);
    })();
  });
}

// Usage:
nsolid.start({ /* … */ assetsEnabled: false });
await waitForAssetsState(false);
// then assert throws, etc.
test/parallel/test-nsolid-heap-profile.js (1)
133-200: Consider using async/await instead of nested setTimeout calls.

The nested setTimeout pattern (lines 141, 167, 173, 193) makes the test flow harder to follow and maintain. Consider refactoring to use async/await with timer promises:
-function runHeapProfileAssetsToggleTests() {
+async function runHeapProfileAssetsToggleTests() {
   // Disable assets via config update
   nsolid.start({
     command: 'localhost:9001',
     data: 'localhost:9002',
     assetsEnabled: false,
   });

-  setTimeout(() => {
+  await setTimeout(100);
     assert.throws(
       () => {
         nsolid.heapProfile();
       },
       {
         message: 'Heap profile could not be started'
       }
     );
     // ... rest of assertions
-  }, 100);
 }
This would improve readability and make the test more maintainable.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 987f908 and da96906.

📒 Files selected for processing (32)

agents/grpc/proto/reconfigure.proto (1 hunks)
agents/grpc/src/grpc_agent.cc (4 hunks)
agents/grpc/src/grpc_agent.h (1 hunks)
agents/grpc/src/grpc_errors.h (1 hunks)
agents/grpc/src/proto/reconfigure.pb.cc (18 hunks)
agents/grpc/src/proto/reconfigure.pb.h (4 hunks)
agents/zmq/src/zmq_agent.cc (2 hunks)
agents/zmq/src/zmq_agent.h (1 hunks)
lib/nsolid.js (8 hunks)
src/nsolid/nsolid_api.cc (1 hunks)
test/agents/test-grpc-continuous-profile.mjs (2 hunks)
test/agents/test-grpc-heap-profile.mjs (1 hunks)
test/agents/test-grpc-heap-sampling.mjs (1 hunks)
test/agents/test-grpc-profile.mjs (1 hunks)
test/agents/test-grpc-snapshot.mjs (1 hunks)
test/agents/test-grpc-tracing.mjs (3 hunks)
test/agents/test-zmq-heap-profile.mjs (1 hunks)
test/agents/test-zmq-heap-sampling.mjs (1 hunks)
test/agents/test-zmq-profile.mjs (1 hunks)
test/agents/test-zmq-snapshot.mjs (1 hunks)
test/agents/test-zmq-tracing.mjs (2 hunks)
test/common/nsolid-grpc-agent/client.js (2 hunks)
test/common/nsolid-grpc-agent/index.js (1 hunks)
test/common/nsolid-zmq-agent/client.js (1 hunks)
test/common/nsolid-zmq-agent/index.js (1 hunks)
test/parallel/test-nsolid-config-tracing.js (1 hunks)
test/parallel/test-nsolid-heap-profile-stream.js (1 hunks)
test/parallel/test-nsolid-heap-profile.js (1 hunks)
test/parallel/test-nsolid-heap-sampling-stream.js (2 hunks)
test/parallel/test-nsolid-heap-sampling.js (1 hunks)
test/parallel/test-nsolid-profile.js (1 hunks)
test/parallel/test-nsolid-snapshot.js (1 hunks)

🔇 Additional comments (32)

test/parallel/test-nsolid-heap-sampling-stream.js (1)

179-226: Verify synchronization guarantees for config changes.

The test uses setImmediate() to sequence operations after calling nsolid.start(), enableAssets(), and disableAssets(). However, there's no explicit guarantee that config changes have fully propagated before the next assertion executes. This could lead to flaky test failures if the timing assumptions don't hold in all environments.

Consider one of the following approaches:

If nsolid provides synchronous config updates or completion callbacks, use those instead of setImmediate().

Add explicit verification that the config state has changed before proceeding (e.g., check a status flag).

Document the timing assumptions if setImmediate() is sufficient based on the nsolid implementation.

Run the following script to check if other nsolid tests use similar patterns:

agents/zmq/src/zmq_agent.h (1)

731-731: LGTM! Atomic flag added for thread-safe asset control.

The assets_enabled_ flag is appropriately declared as std::atomic<bool> for thread-safe access and is correctly placed in the profiling section alongside related state variables.

agents/grpc/src/grpc_errors.h (1)

12-13: LGTM! New error type correctly defined.

The EAssetsDisabled error entry is properly added with:

Sequential error code (1008)

Appropriate HTTP status (500) for internal/configuration errors

Clear error message that aligns with test expectations

agents/zmq/src/zmq_agent.cc (2)

1282-1287: LGTM! Configuration reading follows established pattern.

The assetsEnabled configuration is correctly read from the config and stored in the atomic flag, following the same pattern used for other configuration fields.

2115-2117: LGTM! Asset gating correctly implemented.

The early return when assets_enabled_ is false properly prevents profiling operations when assets are disabled. The UV_EINVAL error code maps to "Invalid arguments" (code 422) in create_command_error, which is appropriate for this validation failure.

test/agents/test-grpc-tracing.mjs (3)

5-5: LGTM! Import added for async delays in test flow.

The delay import from node:timers/promises is appropriately added to support the new phase-based test flow that requires waiting for spans to flush between enable/disable cycles.

205-247: LGTM! Comprehensive phase-based test for HTTP tracing.

The test correctly implements a multi-phase flow to verify:

Initial span collection works

Disabling traces prevents span collection (verified with delay and assertion)

Re-enabling traces restores span collection

The phase state machine and early return on 'done' prevent race conditions and ensure proper test sequencing.

273-315: LGTM! Comprehensive phase-based test for custom tracing.

The custom tracing test mirrors the HTTP tracing test structure, providing consistent coverage for the disable/enable functionality across different tracing types.

lib/nsolid.js (8)

133-136: LGTM! Public API surface extended with asset and trace controls.

The four new functions (enableAssets, disableAssets, enableTraces, disableTraces) are appropriately exported and provide runtime control over asset collection and tracing.

195-206: LGTM! Getters provide appropriate defaults.

The new getters correctly:

Default assetsEnabled to true (assets enabled by default)

Default tracingEnabled to false (tracing disabled by default)

Use nullish coalescing (??) to handle undefined config values

413-415: LGTM! Assets guard correctly prevents heap profiling when disabled.

The early check throws ERR_NSOLID_HEAP_PROFILE_START when assets are disabled, preventing the operation before any profiling setup occurs.

497-498: LGTM! Assets guard correctly prevents heap sampling when disabled.

The early check throws ERR_NSOLID_HEAP_SAMPLING_START when assets are disabled, consistent with the heap profiling guard.

604-633: LGTM! Enable/disable functions correctly check state before updating.

All four functions follow a consistent pattern:

Check current state to avoid unnecessary updates

Call updateConfig with the new value

This prevents redundant configuration updates and ensures idempotent behavior.

761-765: LGTM! Config update handling uses normalization helper.

The assetsEnabled configuration is correctly normalized using optionToBool before storing, which handles various input types (boolean, number, string) consistently.

947-956: LGTM! Config initialization with proper defaults.

The assetsEnabled field is correctly initialized with:

Priority order: environment variable > package.json > default (true)

Proper normalization using optionToBool

Sensible default of true when neither source provides a value

1106-1116: LGTM! Flexible normalization helper for boolean configuration.

The optionToBool function appropriately handles:

undefined/null → undefined (allows detecting absence of value)

Boolean → direct return

Number → falsy check (0 → false, non-zero → true)

String → delegates to envToBool for "0"/"false" handling

Other types → undefined (graceful handling)

agents/grpc/proto/reconfigure.proto (1)

20-20: LGTM! Protocol buffer field correctly defined.

The assetsEnabled field is properly added with:

Sequential field number (13)

Optional modifier for backward compatibility

Appropriate boolean type

Consistent camelCase naming

test/common/nsolid-zmq-agent/client.js (1)

141-148: LGTM! IPC message handlers correctly wire up runtime controls.

The four new message handlers properly:

Call the corresponding nsolid functions to enable/disable assets and traces

Follow the fire-and-forget pattern (no response sent)

Are consistent with the existing message handler structure

agents/grpc/src/grpc_agent.h (1)

352-352: LGTM! Atomic flag added for thread-safe asset control.

The assets_enabled_ flag is appropriately declared as std::atomic<bool> for thread-safe access and is correctly placed in the profiling section alongside related profiling state.

test/parallel/test-nsolid-config-tracing.js (1)

24-29: LGTM: runtime tracing helpers keep config in sync

The toggles and corresponding assertions look correct and consistent with earlier start() checks.

test/agents/test-zmq-profile.mjs (1)

120-177: LGTM: ZMQ profile respects assetsEnabled and event flow

Good coverage of disabled → 422 path, enabled happy-path, and final config verification. Resolve guard avoids double-resolve.

test/parallel/test-nsolid-heap-profile-stream.js (1)

68-70: LGTM: deterministic heap profile stream toggling

Nice sequencing around ‘end’ to avoid races; assertions match the API’s disabled vs enabled behavior.

Also applies to: 72-126

test/agents/test-zmq-heap-profile.mjs (1)

342-408: LGTM: ZMQ heap-profile honors assetsEnabled and validates payload

Covers error path, success path, and final config sync. Bootstrap opts prevent stray events. Looks good.

test/agents/test-zmq-snapshot.mjs (1)

235-296: LGTM: ZMQ snapshot gating via assetsEnabled

Error and success paths validated with proper event sequencing and final config check. All good.

test/parallel/test-nsolid-profile.js (1)

107-111: LGTM: assertion formatting

No behavior change; readability improved.

test/agents/test-zmq-heap-sampling.mjs (1)

135-201: LGTM! Comprehensive test for assetsEnabled toggling.

The test properly validates:

Disabling assets via config results in error code 422

Re-enabling assets allows profiling to proceed

Final config state is verified

The use of the resolved flag (lines 170-171) prevents duplicate event processing, which is a good defensive pattern.

test/agents/test-grpc-heap-sampling.mjs (1)

248-378: LGTM! Comprehensive coverage of asset control methods.

The three new tests thoroughly validate asset gating across different control surfaces:

Environment variable (NSOLID_ASSETS_ENABLED)

Configuration API (child.config())

Helper methods (enableAssets/disableAssets)

Each test properly verifies:

Error 500 with message "Assets collection disabled(1008)" when disabled

Successful profiling when enabled

Final config state consistency

The pattern is consistent and maintainable.

test/agents/test-grpc-snapshot.mjs (1)

246-367: LGTM! Consistent test coverage for snapshot operations.

The tests mirror the heap sampling pattern and properly validate snapshot behavior with asset controls. The consistency across test files makes the test suite easier to understand and maintain.

test/agents/test-grpc-heap-profile.mjs (1)

259-389: LGTM! Complete test coverage for heap profiling with asset controls.

The tests maintain consistency with the other asset control tests and properly validate heap profiling behavior across all three control surfaces.

test/common/nsolid-zmq-agent/index.js (1)

187-233: LGTM! Clean helper methods for runtime control.

The four new methods (enableTraces, disableTraces, enableAssets, disableAssets) follow the established async pattern in the TestClient class. The implementation is straightforward and consistent.

test/common/nsolid-grpc-agent/index.js (1)

346-401: LGTM! Excellent refactoring with the #sendAndWait helper.

The introduction of the #sendAndWait helper method (lines 389-401) is a good abstraction that:

Reduces code duplication

Provides a consistent pattern for message handling

Makes the new helper methods (enableAssets, disableAssets, enableTraces, disableTraces) more concise and maintainable

The refactored config() method now uses this helper, demonstrating its utility.

src/nsolid/nsolid_api.cc (1)

1161-1186: Verify continuous profiler is disabled when assetsEnabled is false.

The logic correctly gates contCpuProfile and contCpuProfileInterval reads behind the assetsEnabled check. However, ensure that update_continuous_profiler properly handles the case when both flags are false (which happens when assetsEnabled is false).

Run the following to verify the continuous profiler handling:

test/agents/test-zmq-tracing.mjs

test/parallel/test-nsolid-heap-sampling-stream.js

test/parallel/test-nsolid-snapshot.js

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

test/parallel/test-nsolid-heap-sampling-stream.js (1)
179-187: Also assert the error code for gating failures

When assets are disabled, heapSamplingStream throws ERR_NSOLID_HEAP_SAMPLING_START synchronously. Assert the code property too for stronger checks.
     assert.throws(
       () => {
         nsolid.heapSamplingStream(0, 1000);
       },
       {
-        message: 'Heap sampling could not be started'
+        message: 'Heap sampling could not be started',
+        code: 'ERR_NSOLID_HEAP_SAMPLING_START'
       }
     );
Apply the same change to Lines 204‑211.

Also applies to: 204-211

🧹 Nitpick comments (10)

test/parallel/test-nsolid-heap-sampling-stream.js (1)
173-177: Avoid starting ZMQ in this unit test

No need to pass command/data to nsolid.start() here. It can introduce unwanted ZMQ side-effects. Set only assetsEnabled.
-  nsolid.start({
-    command: 'localhost:9001',
-    data: 'localhost:9002',
-    assetsEnabled: false,
-  });
+  nsolid.start({ assetsEnabled: false });
lib/nsolid.js (3)
413-416: Validate args before gating; remove unused parameter

Move the assetsEnabled check after validateNumber/validateBoolean to preserve argument error semantics consistently.

The options parameter in heapProfileStream is unused; drop it to avoid confusion.
-function heapProfileStream(threadId, duration, trackAllocations, options) {
-  if (getConfig('assetsEnabled') === false)
-    throw new ERR_NSOLID_HEAP_PROFILE_START();
+function heapProfileStream(threadId, duration, trackAllocations) {
+  // Validate arguments first.
   validateNumber(threadId, 'threadId');
   validateNumber(duration, 'duration');
   validateBoolean(trackAllocations, 'trackAllocations');
+  // Gate after validation.
+  if (getConfig('assetsEnabled') === false)
+    throw new ERR_NSOLID_HEAP_PROFILE_START();
947-957: Initialize assetsEnabled from env/pkg with sane default

Solid defaulting logic. Be mindful that empty-string env values may not normalize to boolean (see optionToBool suggestion below).

Consider how NSOLID_ASSETS_ENABLED="" should behave. If it should be treated as “unset”, ensure optionToBool handles empty strings accordingly (see proposed change on Lines 1106‑1116).

1106-1116: optionToBool: handle empty strings explicitly and return strict booleans

If a string is empty, current envToBool returns a falsy value (empty string), not a strict boolean. When stored in config, later strict comparisons (=== false) can misbehave.
function optionToBool(value) {
  if (value === undefined || value === null)
    return undefined;
  if (typeof value === 'boolean')
    return value;
  if (typeof value === 'number')
    return value !== 0;
-  if (typeof value === 'string')
-    return envToBool(value);
+  if (typeof value === 'string') {
+    if (value === '')
+      return undefined; // treat empty string as “unset”
+    return value !== '0' && value.toLowerCase() !== 'false';
+  }
  return undefined;
}
test/parallel/test-nsolid-profile.js (1)
120-182: Reduce flakiness: replace fixed 100ms delays with platform-aware timeouts

Use common.platformTimeout(100) for the nested setTimeout calls to better tolerate slow CI hosts.

Apply this diff within runAssetsToggleTests():
-    }, 100);
+    }, common.platformTimeout(100));
...
-          }, 100);
+          }, common.platformTimeout(100));
...
-            }, 100);
+            }, common.platformTimeout(100));
...
-    }, 100);
+    }, common.platformTimeout(100));
test/agents/test-zmq-heap-profile.mjs (1)

342-409: Confirm error semantics across protocols

Here ZMQ returns 422 "Invalid arguments" when assets are disabled, while gRPC tests assert 500 "Assets collection disabled(1008)". Is this divergence intentional? If not, consider aligning error codes/messages for consistency across agents.
test/parallel/test-nsolid-heap-sampling.js (1)
177-248: Harden timeouts against CI variance

Use common.platformTimeout(100) instead of raw 100ms in nested setTimeout calls to reduce flakiness.

Suggested changes within runHeapSamplingAssetsToggleTests():
-            }, 100);
+            }, common.platformTimeout(100));
-          }, 100);
+          }, common.platformTimeout(100));
-    }, 100);
+    }, common.platformTimeout(100));
-  }, 100);
+  }, common.platformTimeout(100));
test/parallel/test-nsolid-heap-profile.js (1)
133-205: Use platform-aware timeouts to avoid flakes

Replace fixed 100ms in nested setTimeout calls with common.platformTimeout(100).

Apply within runHeapProfileAssetsToggleTests():
-            }, 100);
+            }, common.platformTimeout(100));
-          }, 100);
+          }, common.platformTimeout(100));
-    }, 100);
+    }, common.platformTimeout(100));
-  }, 100);
+  }, common.platformTimeout(100));
test/agents/test-grpc-snapshot.mjs (1)

282-324: Reconfigure propagation timing

After child.config({ assetsEnabled: false/true }), the test immediately calls heapSnapshot(). If reconfigure is eventually consistent, add a small wait or poll child.config() until it reflects the new value before proceeding to avoid rare races.

test/agents/test-grpc-continuous-profile.mjs (1)

275-432: Consider removing or documenting the commented-out code.

This ~160-line commented-out section contains three test cases. If these tests are obsolete, remove them to improve code clarity. If they're deferred for future work, consider opening an issue to track them and adding a brief comment explaining why they're commented out.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between da96906 and 74e0d6b.

📒 Files selected for processing (32)

agents/grpc/proto/reconfigure.proto (1 hunks)
agents/grpc/src/grpc_agent.cc (4 hunks)
agents/grpc/src/grpc_agent.h (1 hunks)
agents/grpc/src/grpc_errors.h (1 hunks)
agents/grpc/src/proto/reconfigure.pb.cc (18 hunks)
agents/grpc/src/proto/reconfigure.pb.h (4 hunks)
agents/zmq/src/zmq_agent.cc (2 hunks)
agents/zmq/src/zmq_agent.h (1 hunks)
lib/nsolid.js (8 hunks)
src/nsolid/nsolid_api.cc (1 hunks)
test/agents/test-grpc-continuous-profile.mjs (2 hunks)
test/agents/test-grpc-heap-profile.mjs (1 hunks)
test/agents/test-grpc-heap-sampling.mjs (1 hunks)
test/agents/test-grpc-profile.mjs (1 hunks)
test/agents/test-grpc-snapshot.mjs (1 hunks)
test/agents/test-grpc-tracing.mjs (3 hunks)
test/agents/test-zmq-heap-profile.mjs (1 hunks)
test/agents/test-zmq-heap-sampling.mjs (1 hunks)
test/agents/test-zmq-profile.mjs (1 hunks)
test/agents/test-zmq-snapshot.mjs (1 hunks)
test/agents/test-zmq-tracing.mjs (2 hunks)
test/common/nsolid-grpc-agent/client.js (2 hunks)
test/common/nsolid-grpc-agent/index.js (1 hunks)
test/common/nsolid-zmq-agent/client.js (1 hunks)
test/common/nsolid-zmq-agent/index.js (1 hunks)
test/parallel/test-nsolid-config-tracing.js (1 hunks)
test/parallel/test-nsolid-heap-profile-stream.js (1 hunks)
test/parallel/test-nsolid-heap-profile.js (1 hunks)
test/parallel/test-nsolid-heap-sampling-stream.js (2 hunks)
test/parallel/test-nsolid-heap-sampling.js (1 hunks)
test/parallel/test-nsolid-profile.js (1 hunks)
test/parallel/test-nsolid-snapshot.js (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (17)

test/common/nsolid-zmq-agent/client.js
agents/grpc/src/grpc_agent.cc
test/agents/test-zmq-snapshot.mjs
agents/grpc/src/grpc_agent.h
test/parallel/test-nsolid-snapshot.js
test/agents/test-zmq-tracing.mjs
test/agents/test-grpc-profile.mjs
agents/zmq/src/zmq_agent.h
agents/zmq/src/zmq_agent.cc
src/nsolid/nsolid_api.cc
test/parallel/test-nsolid-config-tracing.js
test/agents/test-grpc-heap-profile.mjs
test/common/nsolid-grpc-agent/index.js
agents/grpc/src/grpc_errors.h
test/agents/test-zmq-heap-sampling.mjs
test/parallel/test-nsolid-heap-profile-stream.js
agents/grpc/src/proto/reconfigure.pb.h

🔇 Additional comments (21)

lib/nsolid.js (7)

133-137: Public toggles for assets/traces exposed

Good addition. Clear, minimal API with enable/disable helpers.

195-207: Config getters for assetsEnabled/tracingEnabled

Good defaults and exposure via nsolid.config and getters.

497-499: Gating for heapSamplingStream

Looks correct; early throw prevents stream creation when assets disabled.

761-765: assetsEnabled in updateConfig

Normalization via optionToBool and conditional assignment is correct.

604-618: enableAssets/disableAssets helpers

Simple, idempotent toggles are good.

620-634: enableTraces/disableTraces helpers

Consistent with assets toggles.

637-638: Doc nit

Minor doc change acknowledged. No issues.

agents/grpc/proto/reconfigure.proto (1)

20-20: Add assetsEnabled to ReconfigureBody

Looks good. Field number 13 follows existing numbering without conflict.

test/agents/test-zmq-profile.mjs (1)

120-177: ZMQ profile test for assetsEnabled toggle

Good coverage: disables via config, asserts 422, re-enables and validates profile flow; also verifies config state.

test/common/nsolid-grpc-agent/client.js (2)

64-69: Guard otel.register against duplicate registration

Good defensive handling; ignores expected already-registered error.

124-137: Support runtime toggle messages for assets/traces

Handlers invoke nsolid toggles and send acks. Looks correct.

test/parallel/test-nsolid-profile.js (1)

107-112: LGTM on assertion formatting and flow

Multi-line strictEqual improves readability, and invoking runAssetsToggleTests() extends coverage.

test/parallel/test-nsolid-heap-sampling.js (1)

164-168: ** LGTM on assertion formatting**

Message split across lines is fine and consistent with other tests.

test/parallel/test-nsolid-heap-profile.js (1)

124-124: LGTM

Running runHeapProfileAssetsToggleTests() extends coverage for assets toggling.

test/agents/test-grpc-snapshot.mjs (2)

246-280: LGTM: env-based gating coverage

Good validation of NSOLID_ASSETS_ENABLED=0 yielding 500 "Assets collection disabled(1008)".

325-368: LGTM: helper-based toggling

disableAssets()/enableAssets() paths look correct; asserting currentConfig.assetsEnabled ensures state consistency.

test/agents/test-grpc-heap-sampling.mjs (1)

248-378: LGTM! Comprehensive test coverage for assets gating.

The three new test cases thoroughly validate the assetsEnabled feature across different configuration methods (environment variable, config API, and helper methods). The tests correctly verify that heap sampling returns a 500 error when assets are disabled and succeeds when re-enabled.

test/agents/test-grpc-tracing.mjs (1)

5-5: LGTM! Well-structured phase-based testing for tracing toggle.

The phase-based state machine ('initial' → 'disabled' → 'reenabled' → 'done') effectively validates that tracing can be toggled at runtime. The tests correctly verify that no spans are emitted when tracing is disabled and that spans resume after re-enabling.

Also applies to: 205-247, 273-315

test/agents/test-grpc-continuous-profile.mjs (1)

4-4: LGTM! Thorough validation of continuous profiling with assets gating.

The new and modified tests effectively verify that continuous CPU profiling respects the assetsEnabled flag across different configuration methods. The use of profile counters and timing delays provides clear validation of the gating behavior.

Also applies to: 100-224

test/common/nsolid-zmq-agent/index.js (1)

187-233: LGTM! Helper methods follow established patterns.

The four new methods (enableTraces, disableTraces, enableAssets, disableAssets) correctly follow the existing IPC messaging pattern used throughout the TestClient class. The implementation is consistent with other helper methods.

agents/grpc/src/proto/reconfigure.pb.cc (1)

1-1082: Auto-generated protobuf code.

This file contains auto-generated code from the protocol buffer compiler. The changes mechanically add the assetsEnabled field (field 13) to ReconfigureBody with appropriate initialization, serialization, and accessor logic. No manual review required.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (4)

test/agents/test-zmq-heap-sampling.mjs (1)
135-201: LGTM! Completes consistent coverage across asset types.

The test maintains the established pattern for heap sampling, with appropriate sampling-specific validations (lines 189-191). The consistent implementation across all three asset types (snapshot, heap profile, heap sampling) demonstrates thorough test coverage for the assetsEnabled feature.

Note: The three tests share a nearly identical structure. While this duplication is acceptable and even beneficial for test clarity and isolation, consider extracting a reusable test helper if additional asset types are introduced in the future. For example:
async function testAssetsEnabledToggle(playground, startAssetFn, validateAssetFn) {
  // Common toggle logic
}
However, this refactor is entirely optional given the current scope.
test/parallel/test-nsolid-snapshot.js (3)
102-104: Confirm mustSucceed() without handler; otherwise pass a noop for safety

If common.mustSucceed() requires a function, this could throw. Passing a noop keeps semantics identical and avoids coupling to helper internals.
-              nsolid.snapshot(common.mustSucceed());
+              nsolid.snapshot(common.mustSucceed(() => {}));
60-66: Re-invoking nsolid.start: verify idempotency and side-effects

Starting twice in the same process may reinitialize state/transport. If not guaranteed safe, prefer a single start then drive behavior via enableAssets()/disableAssets() and disableSnapshots flags, or reconfigure if supported.

68-76: Strengthen assertions: check the specific error code when assets are disabled

This PR introduces EAssetsDisabled. Asserting the code (in addition to message) avoids conflating “assets disabled” with other failure causes.

Example tweaks:
-    assert.throws(
+    assert.throws(
       () => {
         nsolid.snapshot();
       },
       {
-        message: 'Heap snapshot could not be generated'
+        message: 'Heap snapshot could not be generated',
+        code: 'EAssetsDisabled'
       }
     );
-    assert.notStrictEqual(err.code, 0);
-    assert.strictEqual(err.message, 'Heap snapshot could not be generated');
+    assert.strictEqual(err.code, 'EAssetsDisabled');
+    assert.strictEqual(err.message, 'Heap snapshot could not be generated');
Apply similarly to the later assert. If codes can vary by transport, consider allowing either the new code or falling back to the message match.

Also applies to: 78-81, 90-97

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74e0d6b and de122d2.

📒 Files selected for processing (32)

agents/grpc/proto/reconfigure.proto (1 hunks)
agents/grpc/src/grpc_agent.cc (4 hunks)
agents/grpc/src/grpc_agent.h (1 hunks)
agents/grpc/src/grpc_errors.h (1 hunks)
agents/grpc/src/proto/reconfigure.pb.cc (18 hunks)
agents/grpc/src/proto/reconfigure.pb.h (4 hunks)
agents/zmq/src/zmq_agent.cc (2 hunks)
agents/zmq/src/zmq_agent.h (1 hunks)
lib/nsolid.js (8 hunks)
src/nsolid/nsolid_api.cc (1 hunks)
test/agents/test-grpc-continuous-profile.mjs (2 hunks)
test/agents/test-grpc-heap-profile.mjs (1 hunks)
test/agents/test-grpc-heap-sampling.mjs (1 hunks)
test/agents/test-grpc-profile.mjs (1 hunks)
test/agents/test-grpc-snapshot.mjs (1 hunks)
test/agents/test-grpc-tracing.mjs (3 hunks)
test/agents/test-zmq-heap-profile.mjs (1 hunks)
test/agents/test-zmq-heap-sampling.mjs (1 hunks)
test/agents/test-zmq-profile.mjs (1 hunks)
test/agents/test-zmq-snapshot.mjs (1 hunks)
test/agents/test-zmq-tracing.mjs (2 hunks)
test/common/nsolid-grpc-agent/client.js (2 hunks)
test/common/nsolid-grpc-agent/index.js (1 hunks)
test/common/nsolid-zmq-agent/client.js (1 hunks)
test/common/nsolid-zmq-agent/index.js (1 hunks)
test/parallel/test-nsolid-config-tracing.js (1 hunks)
test/parallel/test-nsolid-heap-profile-stream.js (1 hunks)
test/parallel/test-nsolid-heap-profile.js (1 hunks)
test/parallel/test-nsolid-heap-sampling-stream.js (2 hunks)
test/parallel/test-nsolid-heap-sampling.js (1 hunks)
test/parallel/test-nsolid-profile.js (1 hunks)
test/parallel/test-nsolid-snapshot.js (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (18)

agents/zmq/src/zmq_agent.h
test/agents/test-zmq-tracing.mjs
test/agents/test-zmq-profile.mjs
agents/grpc/src/grpc_agent.h
agents/grpc/src/grpc_agent.cc
test/agents/test-grpc-heap-profile.mjs
test/parallel/test-nsolid-heap-profile-stream.js
test/common/nsolid-grpc-agent/client.js
agents/grpc/proto/reconfigure.proto
test/common/nsolid-zmq-agent/client.js
test/parallel/test-nsolid-profile.js
agents/zmq/src/zmq_agent.cc
src/nsolid/nsolid_api.cc
test/parallel/test-nsolid-heap-sampling-stream.js
agents/grpc/src/grpc_errors.h
test/common/nsolid-zmq-agent/index.js
test/agents/test-grpc-snapshot.mjs
test/agents/test-grpc-profile.mjs

🔇 Additional comments (36)

test/parallel/test-nsolid-heap-profile.js (2)

124-124: LGTM!

The placement and invocation of the new test function is appropriate, ensuring it runs after the existing test flow completes.

133-204: Verify enableAssets/disableAssets return values and consider timing robustness.

The test provides good coverage of asset toggle scenarios. However, consider these improvements:

Missing verification: The calls to enableAssets() and disableAssets() (lines 166, 172, 193) don't verify success. If these operations can fail or return status, add assertions to confirm the state changes actually occurred.

Timing robustness: The test relies on multiple hardcoded 100ms timeouts (lines 141, 167, 173, 194). In slow CI environments or under load, these may be insufficient and cause flakiness. Consider whether these delays can be replaced with event-driven synchronization or if they need to be more generous.

Callback nesting: The deeply nested structure (5 levels deep starting at line 169) reduces readability. While it follows the existing pattern in this file, consider whether this could be flattened in a future refactor.

To verify the reliability of the timing assumptions, you could run this test repeatedly under load:

test/agents/test-zmq-snapshot.mjs (1)

235-296: LGTM! Well-structured test for runtime asset toggling.

The test correctly validates the assetsEnabled runtime control:

Properly disables assets and verifies the expected 422 error when attempting a snapshot

Re-enables assets and confirms the snapshot succeeds

Verifies the final configuration state

The defensive resolved flag (lines 242, 269-270) appropriately prevents duplicate event processing after test completion.

test/agents/test-zmq-heap-profile.mjs (1)

342-408: LGTM! Consistent implementation for heap profiling.

The test follows the same proven pattern as the snapshot test and includes appropriate heap-profile-specific validations:

Lines 396-398 correctly verify that trackAllocations: false results in empty trace data

Runtime toggling flow is properly implemented with correct error handling

Final state verification confirms assetsEnabled persists after operation completion

agents/grpc/src/proto/reconfigure.pb.cc (6)

49-50: LGTM! Generated field initialization is correct.

The new assetsenabled_ field is properly initialized to false in the constexpr constructor, following the same pattern as other boolean fields like contcpuprofile_.

107-133: LGTM! Has-bit layout correctly updated.

The has-bit index offset (line 107) and field offsets (lines 120, 133) are properly updated to accommodate the new assetsenabled_ field. The hasbit index of 11 for the new field is consistent with it being the 12th optional field (0-indexed).

243-245: LGTM! Memory copy ranges correctly extended.

The copy constructor (lines 243-245) and SharedCtor (lines 263-265) memory ranges are properly updated to include the new assetsenabled_ field as the last member in the contiguous memory block.

Also applies to: 263-265

465-468: LGTM! Clear and serialization logic properly updated.

The Clear operation (lines 465-468) now uses the correct has-bits mask 0x00000f00U (covering bits 8-11), and the serialization code (lines 581-586) correctly emits the new field with tag 13 when present.

Also applies to: 581-586

624-624: LGTM! Size calculation and merge logic correctly updated.

Line 624: The has-bits mask in ByteSizeLong is correctly updated from 0x000005e0U to 0x00000de0U, adding bit 11 for the new field.

Lines 702-714: The MergeImpl function properly handles merging the new assetsenabled_ field when has-bit 0x00000800U is set.

Also applies to: 702-714

739-740: LGTM! Swap logic correctly updated.

The InternalSwap range (lines 739-740) now correctly uses assetsenabled_ as the end marker since it's the last field in the contiguous memory block, replacing the previous end marker contcpuprofile_.

agents/grpc/src/proto/reconfigure.pb.h (4)

233-233: LGTM! Field number and parse table correctly defined.

Line 233: The new field number kAssetsEnabledFieldNumber = 13 is correctly defined in the enum.

Line 408: The TcParseTable template parameter is properly updated from <4, 12, ...> to <4, 13, ...> to reflect the increased field count.

Also applies to: 408-408

392-401: LGTM! Public accessor declarations are complete.

Lines 392-401 declare the standard protobuf accessors for the assetsEnabled field:

Public: has_assetsenabled(), clear_assetsenabled(), assetsenabled(), set_assetsenabled()

Private: _internal_assetsenabled(), _internal_set_assetsenabled()

All follow the standard protobuf API patterns.

442-442: LGTM! Field added to internal struct.

Line 442: The assetsenabled_ field is correctly added as the last boolean member in the Impl_ struct, maintaining consistency with the implementation file.

1175-1201: LGTM! Accessor implementations are complete and correct.

Lines 1175-1201 provide the inline implementations for all assetsEnabled accessors. The implementation:

Uses the correct has-bit mask 0x00000800U (bit 11)

Includes proper TSan annotations for thread safety

Follows the exact pattern of other boolean fields

The generated code is mechanically correct and consistent.

test/parallel/test-nsolid-config-tracing.js (1)

23-29: LGTM!

The test properly validates that the new runtime tracing toggle helpers (disableTraces() and enableTraces()) correctly update the config.tracingEnabled state.

test/agents/test-grpc-tracing.mjs (2)

205-247: Well-structured phase-based testing approach.

The introduction of a phase state machine ('initial' → 'disabled' → 'reenabled' → 'done') properly orchestrates the multi-step tracing toggle test. The 200ms delays ensure spans have time to emit (or not emit when disabled), and the phase guards prevent race conditions after test completion.

273-315: LGTM!

The custom tracing test follows the same solid phase-based pattern as the HTTP tracing test, ensuring consistent validation of the trace enable/disable functionality.

test/parallel/test-nsolid-heap-sampling.js (2)

164-169: LGTM!

The multi-line assertion formatting improves readability, and the subsequent call to runHeapSamplingAssetsToggleTests() properly integrates the new test flow.

177-248: Comprehensive asset toggle testing.

The test function thoroughly validates heap sampling behavior when assets are toggled:

Disables assets and verifies errors are thrown

Re-enables assets and confirms sampling works

Disables again and re-validates error conditions

The timing and sequencing properly ensure each operation completes before the next begins.

test/agents/test-grpc-heap-sampling.mjs (3)

248-285: LGTM!

This test properly validates that heap sampling respects the NSOLID_ASSETS_ENABLED environment variable, returning the expected error when assets are disabled.

287-331: LGTM!

The test correctly validates toggling assetsEnabled via nsolid.start() configuration, verifying both the disabled error state and the successful profile data collection when re-enabled.

333-378: LGTM!

This test properly exercises the runtime helper methods (disableAssets() and enableAssets()), confirming they gate heap sampling operations correctly and update the configuration state.

test/agents/test-grpc-continuous-profile.mjs (4)

100-135: LGTM!

The test properly validates that continuous profiling is suppressed when NSOLID_ASSETS_ENABLED='0', waiting an appropriate duration to confirm no profiles are emitted.

137-180: LGTM!

This test correctly validates that continuous profiling stops/resumes when assetsEnabled is toggled via configuration, with appropriate delays to observe profile emission patterns.

182-224: LGTM!

The test properly exercises the runtime helpers (disableAssets()/enableAssets()) with continuous profiling, confirming that profile emission is gated correctly and configuration state is updated.

275-432: Verify the intent of the extensive commented code.

There are 157 lines of commented-out test code. Please confirm whether:

This code is intentionally disabled for future use

These tests are superseded by the active tests

The commented code should be removed

If this is intentional technical debt or work-in-progress, consider adding a comment explaining the rationale.

test/common/nsolid-grpc-agent/index.js (2)

346-387: Well-designed API extensions for runtime testing.

The new TestClient methods provide a clean interface for testing asset and tracing toggles:

Consistent early returns when #child is null

Unified #sendAndWait helper eliminates code duplication

The config() method's result mapping properly extracts the payload

389-401: LGTM!

The private #sendAndWait helper properly centralizes the request/response pattern with:

Configurable message type expectations

Optional result mapping

Proper listener cleanup after response

lib/nsolid.js (7)

133-136: LGTM!

The new runtime toggle functions (enableAssets, disableAssets, enableTraces, disableTraces) are properly exported on the public API surface.

195-206: LGTM!

The new getters expose the current asset and tracing states with appropriate defaults (assetsEnabled: true, tracingEnabled: false), using nullish coalescing to handle missing config values.

413-416: Proper runtime gating for asset collection.

Both heapProfileStream and heapSamplingStream correctly check assetsEnabled and throw the appropriate error when disabled, preventing expensive profiling operations at runtime.

Also applies to: 496-499

604-633: Well-implemented toggle functions.

All four toggle functions (enableAssets, disableAssets, enableTraces, disableTraces) follow a consistent pattern:

Check current state to avoid unnecessary updates

Call updateConfig with the new state

This ensures idempotent behavior and proper config propagation.

761-765: LGTM!

The updateConfig function properly handles the assetsEnabled option using optionToBool for normalization, ensuring only valid boolean values are assigned.

947-956: Proper configuration initialization for assetsEnabled.

The initialization correctly prioritizes environment variables over package.json settings, with a sensible default of true when neither is specified. This follows the established configuration precedence pattern.

1106-1116: Robust boolean normalization helper.

The optionToBool function properly handles various input types:

Returns undefined for null/undefined (preserving absence)

Passes through booleans unchanged

Converts numbers to boolean (0 = false, non-zero = true)

Uses envToBool for string conversion

This provides consistent boolean normalization across configuration sources.

test/parallel/test-nsolid-snapshot.js (1)

82-107: Good fix: toggles are now chained to snapshot completion, avoiding races

Sequencing enable/disable after the respective snapshot callbacks resolves the prior “in‑flight snapshot” race. This should make the test deterministic.

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (8)

test/parallel/test-nsolid-profile.js (1)
120-182: Consider checking for asset-disabled-specific error codes or messages.

The test comprehensively exercises the asset toggle functionality, but all error assertions verify the generic message "CPU profile could not be started" without distinguishing whether the failure is specifically due to disabled assets versus other causes. Since the PR introduces an EAssetsDisabled error, consider adding more specific assertions to improve test precision and debugging.

For example, check for a specific error code when assets are disabled:
     assert.throws(
       () => {
         nsolid.profile();
       },
       {
+        code: 'ERR_NSOLID_ASSETS_DISABLED', // or whatever the actual code is
         message: 'CPU profile could not be started'
       }
     );
This would make test failures easier to diagnose and catch regressions where the wrong error type is returned.
test/parallel/test-nsolid-heap-profile-stream.js (2)
94-96: Minor: Inconsistent stream consumption pattern.

Lines 94-96 consume the stream by listening to 'data' events, while line 117 uses .resume() to consume the stream without listening to data. This inconsistency makes the test harder to follow.

Consider using a consistent pattern. If you don't need the profile data in the final test, use .resume() consistently:
-      let profile = '';
       const enabledStream = nsolid.heapProfileStream(0, 1200, true);
-      enabledStream.on('data', (chunk) => {
-        profile += chunk;
-      });
+      enabledStream.resume();
       enabledStream.on('end', common.mustCall(() => {
-        assert(JSON.parse(profile));
Also applies to: 117-117

119-119: Minor: Weak assertion.

assert.ok(true) is a tautology that doesn't validate any meaningful outcome. The test should either verify that the stream completed successfully (which common.mustCall on the 'end' handler already does) or be removed entirely.

Remove the redundant assertion:
 finalStream.on('end', common.mustCall(() => {
-  assert.ok(true);
 }));
Alternatively, if you want to verify stream completion, the presence of common.mustCall on the 'end' event handler is already sufficient validation.
test/parallel/test-nsolid-heap-sampling-stream.js (3)
192-200: Consider adding allocation activity before heap sampling.

The original test (lines 128-131) creates allocations to ensure heap sampling has data to sample. The re-enabled heap sampling tests might produce minimal or empty profiles without similar allocation activity.

If you want to ensure non-empty profiles in these tests, apply this diff:
     nsolid.enableAssets();
     setImmediate(() => {
+      const arr = [];
+      for (let i = 0; i < 1000; i++) {
+        arr.push(new Array(1000));
+      }
       let profile = '';
221-221: Remove the meaningless assertion.

The assert.ok(true) assertion doesn't validate anything meaningful. The common.mustCall wrapper already ensures the callback is invoked.

Apply this diff to remove the redundant assertion:
-            finalStream.on('end', common.mustCall(() => {
-              assert.ok(true);
-            }));
+            finalStream.on('end', common.mustCall());
180-188: Consider extracting the repeated error assertion pattern.

The error assertion pattern is duplicated twice. Extracting it into a helper function would reduce duplication and improve maintainability.

Consider refactoring like this:
function assertHeapSamplingDisabled() {
  assert.throws(
    () => {
      nsolid.heapSamplingStream(0, 1000);
    },
    {
      code: 'ERR_NSOLID_HEAP_SAMPLING_START',
      message: 'Heap sampling could not be started'
    }
  );
}
Then replace both assertion blocks with calls to assertHeapSamplingDisabled().

Also applies to: 205-213
test/parallel/test-nsolid-heap-profile.js (2)
130-132: Tight 100 ms waits may flake; prefer longer or state-driven waits

CI can be slower; 100 ms can race with config propagation. Consider increasing timeouts (or polling for state) to reduce flakes.

Apply this minimal diff to relax timing:
-  }, common.platformTimeout(100));
-}, common.platformTimeout(100));
+  }, common.platformTimeout(250));
+}, common.platformTimeout(250));
Optional: replace fixed sleeps with a small poll for the desired outcome to avoid magic delays:
async function waitForHeapProfileStartFailure() {
  // Poll until calling heapProfile() throws synchronously as expected.
  for (let i = 0; i < 25; i++) {
    try {
      nsolid.heapProfile();
    } catch (e) {
      assert.strictEqual(e.message, 'Heap profile could not be started');
      return;
    }
    await new Promise((r) => setTimeout(r, common.platformTimeout(20)));
  }
  assert.fail('heapProfile() did not start failing within the expected time');
}
Please verify on slow builders whether 100 ms is sufficient or adopt one of the above.

133-204: Solid toggle tests; add callback end-error checks and (optionally) “disable while active” case

Great coverage of disabled/enabled transitions. Two small gaps:

Also assert the callback form of heapProfileEnd() fails while disabled (mirrors start path).

Optionally, validate that disabling while an active profile is running still allows ending successfully (important runtime contract per PR).

Apply these small insertions to cover the callback end-error while disabled:
@@
     assert.throws(
       () => {
         nsolid.heapProfileEnd();
       },
       {
         message: 'Heap profile could not be stopped'
       }
     );
+    // Callback variant should also surface the disabled state
+    nsolid.heapProfileEnd(common.mustCall((err) => {
+      assert.notStrictEqual(err.code, 0);
+      assert.strictEqual(err.message, 'Heap profile could not be stopped');
+    }));
@@
             assert.throws(
               () => {
                 nsolid.heapProfileEnd();
               },
               {
                 message: 'Heap profile could not be stopped'
               }
             );
+            // Callback variant should also surface the disabled state
+            nsolid.heapProfileEnd(common.mustCall((err) => {
+              assert.notStrictEqual(err.code, 0);
+              assert.strictEqual(err.message, 'Heap profile could not be stopped');
+            }));
Optional addition to assert “disable while active” still allows end to succeed:
// Start a profile, disable assets while active, ensure end still succeeds.
nsolid.enableAssets();
setTimeout(() => {
  nsolid.heapProfile(common.mustSucceed(() => {
    nsolid.disableAssets();
    nsolid.heapProfileEnd(common.mustSucceed(() => {
      // Re-enable for any subsequent tests
      nsolid.enableAssets();
    }));
  }));
}, common.platformTimeout(100));
If the implementation surfaces a specific disabled error code (e.g., EAssetsDisabled), consider asserting it in these disabled paths to strengthen the check. Based on PR objectives.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between de122d2 and 22ac97c.

📒 Files selected for processing (32)

agents/grpc/proto/reconfigure.proto (1 hunks)
agents/grpc/src/grpc_agent.cc (4 hunks)
agents/grpc/src/grpc_agent.h (1 hunks)
agents/grpc/src/grpc_errors.h (1 hunks)
agents/grpc/src/proto/reconfigure.pb.cc (18 hunks)
agents/grpc/src/proto/reconfigure.pb.h (4 hunks)
agents/zmq/src/zmq_agent.cc (2 hunks)
agents/zmq/src/zmq_agent.h (1 hunks)
lib/nsolid.js (8 hunks)
src/nsolid/nsolid_api.cc (1 hunks)
test/agents/test-grpc-continuous-profile.mjs (6 hunks)
test/agents/test-grpc-heap-profile.mjs (1 hunks)
test/agents/test-grpc-heap-sampling.mjs (1 hunks)
test/agents/test-grpc-profile.mjs (1 hunks)
test/agents/test-grpc-snapshot.mjs (1 hunks)
test/agents/test-grpc-tracing.mjs (3 hunks)
test/agents/test-zmq-heap-profile.mjs (1 hunks)
test/agents/test-zmq-heap-sampling.mjs (1 hunks)
test/agents/test-zmq-profile.mjs (1 hunks)
test/agents/test-zmq-snapshot.mjs (1 hunks)
test/agents/test-zmq-tracing.mjs (2 hunks)
test/common/nsolid-grpc-agent/client.js (2 hunks)
test/common/nsolid-grpc-agent/index.js (1 hunks)
test/common/nsolid-zmq-agent/client.js (1 hunks)
test/common/nsolid-zmq-agent/index.js (1 hunks)
test/parallel/test-nsolid-config-tracing.js (1 hunks)
test/parallel/test-nsolid-heap-profile-stream.js (1 hunks)
test/parallel/test-nsolid-heap-profile.js (1 hunks)
test/parallel/test-nsolid-heap-sampling-stream.js (2 hunks)
test/parallel/test-nsolid-heap-sampling.js (1 hunks)
test/parallel/test-nsolid-profile.js (1 hunks)
test/parallel/test-nsolid-snapshot.js (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (19)

agents/grpc/proto/reconfigure.proto
test/agents/test-grpc-profile.mjs
test/agents/test-zmq-profile.mjs
agents/grpc/src/grpc_agent.cc
test/parallel/test-nsolid-config-tracing.js
test/common/nsolid-zmq-agent/client.js
src/nsolid/nsolid_api.cc
test/agents/test-zmq-tracing.mjs
test/agents/test-zmq-heap-profile.mjs
test/common/nsolid-zmq-agent/index.js
test/agents/test-grpc-snapshot.mjs
test/agents/test-zmq-heap-sampling.mjs
test/agents/test-grpc-heap-sampling.mjs
test/common/nsolid-grpc-agent/client.js
test/agents/test-zmq-snapshot.mjs
agents/grpc/src/grpc_agent.h
lib/nsolid.js
agents/zmq/src/zmq_agent.h
agents/zmq/src/zmq_agent.cc

🔇 Additional comments (24)

test/parallel/test-nsolid-profile.js (2)

107-111: LGTM!

The assertion formatting improves readability, and the integration of asset toggle testing at this point ensures all baseline profiling tests complete before exercising the new asset control surface.

117-118: LGTM!

Wrapping timeouts with common.platformTimeout() is a best practice that accounts for varying platform speeds and CI environments, reducing test flakiness.

test/parallel/test-nsolid-snapshot.js (3)

56-57: LGTM! Proper test integration.

The new test function is correctly integrated into the existing test flow, ensuring it runs after the previous assertions complete.

60-76: LGTM! Clear test setup and validation.

The test correctly configures nsolid with assets disabled but snapshots enabled, then validates that snapshot operations fail with the appropriate error message when assets are disabled.

78-108: LGTM! Race condition properly addressed.

The test correctly chains callbacks to ensure snapshots complete before toggling asset states, preventing the "Snapshot already in progress" race condition flagged in the previous review. The sequence properly:

Waits for the error callback when assets are disabled (line 78)

Enables assets only after the error is confirmed (line 83)

Waits for the successful snapshot to complete before disabling assets (lines 86-88)

Verifies the expected error after disabling (lines 90-97)

Waits for the final snapshot to complete (line 103)

The use of common.mustSucceed() ensures all expected callbacks are invoked, providing deterministic test behavior.

test/parallel/test-nsolid-heap-profile-stream.js (1)

72-126: Major: Timing/synchronization concerns with nested setImmediate.

The test relies on nested setImmediate callbacks to ensure config changes (nsolid.start(), enableAssets(), disableAssets()) take effect before attempting operations. If these configuration updates are truly asynchronous or involve internal state propagation, a single setImmediate tick might be insufficient, leading to race conditions and flaky test behavior.

Verify whether the configuration methods are synchronous and whether setImmediate provides adequate synchronization:

test/parallel/test-nsolid-heap-sampling.js (4)

164-167: LGTM: Formatting improvement for readability.

The multi-line formatting of the error message assertion improves readability without changing the test logic.

174-175: Good: Consistent use of platform timeouts.

Using common.platformTimeout() ensures the test behaves correctly across different platform speeds and CI environments.

168-168: Good integration point for the new test.

Calling runHeapSamplingAssetsToggleTests() after the existing error path tests is a logical placement that extends the test coverage naturally.

177-248: Well-structured asset toggle test with timing considerations.

The test comprehensively covers the asset enable/disable scenarios for heap sampling:

Tests error paths when assets are disabled (both sync and async)

Properly sequences operations to avoid mid-sampling disruption (line 216 only disables after sampling completes)

Verifies successful sampling after re-enabling assets

However, the test relies heavily on setTimeout delays (100ms) for configuration changes to propagate. While common.platformTimeout() helps, this could still be a source of flakiness on heavily loaded systems.

Additionally, the test checks err.code !== 0 but doesn't verify the specific error code for the "assets disabled" scenario. According to the PR summary, an EAssetsDisabled error was added. Consider verifying the specific error code to make the test more precise and easier to debug if the wrong error is returned.

Run the following script to check if there's a specific error code constant for assets disabled that should be verified:

test/parallel/test-nsolid-heap-sampling-stream.js (1)

138-138: LGTM!

The integration point for the asset toggle tests is well-placed after validating the initial profile.

test/agents/test-grpc-tracing.mjs (2)

205-247: LGTM! Well-structured phase-based testing.

The multi-phase flow (initial → disabled → reenabled → done) effectively validates that tracing can be toggled at runtime and that spans are properly suppressed when tracing is disabled.

273-315: LGTM! Consistent test coverage.

The custom tracing test mirrors the http tracing test structure, ensuring comprehensive coverage of the enableTraces/disableTraces API across different span types.

test/agents/test-grpc-heap-profile.mjs (3)

259-296: LGTM! Proper error handling verification.

The test correctly validates that heap profiling returns a 500 error with code 1008 when assets collection is disabled via the environment variable.

298-342: LGTM! Comprehensive config toggle validation.

The test properly verifies that assetsEnabled can be toggled dynamically via nsolid.start() configuration and that the state changes affect profiling behavior as expected.

344-389: LGTM! Helper API coverage is complete.

The test validates the enableAssets/disableAssets convenience methods and confirms the final state with a config check, ensuring the API surface is working correctly.

agents/grpc/src/grpc_errors.h (1)

12-13: LGTM! Properly defined error type.

The new EAssetsDisabled error is correctly added to the GRPC_ERRORS macro with appropriate HTTP status code (500), message, and runtime code (1008) that aligns with the test expectations.

test/agents/test-grpc-continuous-profile.mjs (3)

121-156: LGTM! Proper verification of disabled continuous profiling.

The test correctly validates that continuous CPU profiles are not emitted when assets collection is disabled via environment variable, using an appropriate wait period to confirm no emissions.

158-201: LGTM! Dynamic toggle behavior is well-tested.

The test properly validates that continuous profiling can be stopped and restarted by toggling assetsEnabled via configuration, with appropriate tolerance for profiles in flight during state transitions.

203-245: LGTM! Helper methods are properly validated.

The test confirms that the disableAssets/enableAssets convenience methods correctly control continuous profiling behavior and verifies the final configuration state.

test/common/nsolid-grpc-agent/index.js (1)

346-401: LGTM! Well-structured test client API.

The new methods provide a clean interface for controlling assets and tracing at runtime. The centralized #sendAndWait helper promotes consistency and reduces duplication, with appropriate null checks for the child process.

agents/grpc/src/proto/reconfigure.pb.cc (1)

1-1082: Auto-generated protobuf code.

This file is automatically generated by the protocol buffer compiler from the proto definition. The changes correctly add support for the new assetsEnabled field (field 13) throughout the serialization, deserialization, copy, merge, and swap operations. The field is properly initialized to false and integrated into the has-bits mechanism.

agents/grpc/src/proto/reconfigure.pb.h (1)

233-233: Auto-generated protobuf header.

This file is automatically generated by the protocol buffer compiler. The changes correctly add the assetsEnabled field (field 13) with:

Public field number enum constant kAssetsEnabledFieldNumber = 13

Public accessors: has_assetsenabled(), clear_assetsenabled(), assetsenabled(), set_assetsenabled()

Private internal helpers: _internal_assetsenabled(), _internal_set_assetsenabled()

Private member variable assetsenabled_ in the Impl_ struct

The inline implementations follow standard protobuf patterns for optional boolean fields with proper has-bit management.

Also applies to: 392-401, 442-442, 1175-1201

test/parallel/test-nsolid-heap-profile.js (1)

124-124: Good placement for asset-toggle coverage

Running the toggle suite after the error-path chain is complete is sensible and avoids overlapping async state. LGTM.

test/parallel/test-nsolid-heap-profile-stream.js

test/parallel/test-nsolid-heap-sampling-stream.js

Add comprehensive runtime control for asset collection (CPU/heap profiling, snapshots) with multiple configuration methods. Implements assetsEnabled flag with full control surface: - NSOLID_ASSETS_ENABLED environment variable - nsolid.start({ assetsEnabled: true/false }) configuration - Runtime helpers: enableAssets() and disableAssets() - Dynamic updates via gRPC reconfigure protocol - Validation in both gRPC and ZMQ agents with proper errors - Continuous profiler integration respects assetsEnabled state This allows operators to disable expensive profiling operations at runtime without process restarts, improving resource management and production safety. Also adds missing enableTraces()/disableTraces() helper methods to complete the tracing API surface. Comprehensive test coverage added for all asset types across gRPC, ZMQ, and parallel test suites.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

agents/grpc/src/grpc_errors.h (1)

12-13: New EAssetsDisabled enum looks correct; consider status code semantics.

Runtime code 1008 fits sequence; macro continuation is correct. Optionally consider HTTP-ish code 403 (forbidden) or 409 (conflict) instead of 500 to better signal operator-controlled disablement; align with clients/tests before changing.

Please confirm clients/tests map 1008 → “Assets collection disabled” and no other components expect 500 here.
agents/grpc/src/grpc_agent.cc (1)
1194-1199: Type‑safety when reading JSON; store atomically.

Guard against non-boolean values and prepare for atomic storage (see race comment below).

Apply:
-  {
-    auto it = config_.find("assetsEnabled");
-    if (it != config_.end()) {
-      assets_enabled_ = *it;
-    }
-  }
+  {
+    auto it = config_.find("assetsEnabled");
+    if (it != config_.end()) {
+      if (it->is_boolean()) {
+        assets_enabled_.store(*it, std::memory_order_relaxed);
+      } else {
+        Debug("Invalid assetsEnabled type; expected boolean\n");
+      }
+    }
+  }
Note: assets_enabled_ change assumes std::atomic (see next comment).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 22ac97c and 8a2bb66.

📒 Files selected for processing (8)

agents/grpc/proto/reconfigure.proto (1 hunks)
agents/grpc/src/grpc_agent.cc (5 hunks)
agents/grpc/src/grpc_agent.h (1 hunks)
agents/grpc/src/grpc_errors.h (1 hunks)
agents/grpc/src/proto/reconfigure.pb.cc (18 hunks)
agents/grpc/src/proto/reconfigure.pb.h (4 hunks)
agents/zmq/src/zmq_agent.cc (2 hunks)
agents/zmq/src/zmq_agent.h (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (4)

agents/zmq/src/zmq_agent.h
agents/grpc/src/grpc_agent.h
agents/zmq/src/zmq_agent.cc
agents/grpc/proto/reconfigure.proto

🧰 Additional context used

🧬 Code graph analysis (3)

agents/grpc/src/grpc_agent.cc (1)

agents/zmq/src/zmq_agent.cc (10)

config (430-476)

config (430-430)

config (1263-1378)

config (1263-1263)

config (2760-2795)

config (2760-2760)

config (2888-2905)

config (2888-2888)

config (2983-3000)

config (2983-2983)

agents/grpc/src/proto/reconfigure.pb.h (2)

agents/grpc/src/proto/command.pb.h (5)

google (42-49)

google (72-75)

internal (44-47)

ReconfigureBody (452-582)

ReconfigureBody (837-843)

agents/grpc/src/proto/reconfigure.pb.cc (4)

ReconfigureBody (53-53)

ReconfigureBody (267-270)

_impl_ (59-60)

_impl_ (85-86)

agents/grpc/src/proto/reconfigure.pb.cc (1)

agents/grpc/src/proto/reconfigure.pb.h (16)

ReconfigureBody (94-94)

ReconfigureBody (95-96)

ReconfigureBody (681-693)

ReconfigureBody (694-698)

ReconfigureBody (699-721)

ReconfigureBody (722-726)

ReconfigureBody (727-749)

ReconfigureBody (750-754)

ReconfigureBody (755-777)

ReconfigureBody (778-782)

ReconfigureBody (783-805)

ReconfigureBody (806-810)

ReconfigureBody (811-834)

ReconfigureBody (836-841)

ReconfigureBody (875-887)

ReconfigureBody (890-903)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: lint-js-and-md
GitHub Check: test-linux (ubuntu-24.04-arm)
GitHub Check: test-linux (ubuntu-24.04)
GitHub Check: build-tarball
GitHub Check: coverage-windows
GitHub Check: test-macOS

🔇 Additional comments (5)

agents/grpc/src/grpc_agent.cc (3)

395-398: Good: propagate assetsEnabled in ReconfigureEvent.

Only sets when present; matches ZMQ agent behavior.

Ensure reconfigure.proto field 13 is optional and absent when not set to avoid default false flips.

447-448: Defaulting assets to enabled is fine; confirm env/config precedence.

assets_enabled_ defaults true. Please confirm any NSOLID_ASSETS_ENABLED env or nsolid.start config is applied before first asset command to avoid a short window with the wrong default.

1755-1757: Good: include assetsEnabled in reconfigure output.

Keeps parity with ZMQ agent.

agents/grpc/src/proto/reconfigure.pb.h (1)

233-234: Generated field assetsEnabled = 13 is wired correctly.

Enum, accessors, has-bits (0x00000800U), and parse table size updates look consistent with protoc 6.32.0.

Please confirm these files were re-generated from the updated reconfigure.proto (no manual edits) and that build scripts pin protoc/runtime (6.32.0) to avoid cross-version CI drift.

Also applies to: 392-402, 408-446, 1175-1202

agents/grpc/src/proto/reconfigure.pb.cc (1)

49-51: Proto gencode changes align with assetsEnabled field; no manual edits suggested.

Init/clear/serialize/merge paths and descriptor/table deltas reflect field 13; has-bits masks updated (incl. 0x00000de0 popcount).

Ensure descriptor size/hash changes are committed wherever cached (if any), and that both pb.h/pb.cc are produced by the same protoc (6.32.0).

Also applies to: 107-121, 152-173, 338-429, 465-469, 581-587, 624-625, 702-715, 738-744

coderabbitai · 2025-10-23T21:13:50Z

agents/grpc/src/grpc_agent.cc

+
+  if (!assets_enabled_) {
+    return ErrorType::EAssetsDisabled;
+  }
+


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

*Critical: data race on assets_enabled_ (and on config_ for contCpuProfile) when called from _from_js.

start_*_from_js invokes do_start_prof_init on potentially non‑agent threads. Reading a non‑atomic bool and nlohmann::json without synchronization is undefined behavior.

Fix options (pick one):

Make flags atomic and stop reading config_ off-thread.

Or route the entire do_start_prof_init to the agent loop (enqueue request), returning immediately to JS.

Minimal atomic fix (recommended now), plus JSON type safety:

- if (!assets_enabled_) { + if (!assets_enabled_.load(std::memory_order_acquire)) { return ErrorType::EAssetsDisabled; } - if (type == ProfileType::kCpu) { - auto it = config_.find("contCpuProfile"); - if (it != config_.end() && it->get<bool>() == true) { - return ErrorType::EInProgressError; - } - } + if (type == ProfileType::kCpu && + cont_cpu_profile_enabled_.load(std::memory_order_acquire)) { + return ErrorType::EInProgressError; + }

Header changes (agents/grpc/src/grpc_agent.h):

+#include <atomic> ... - bool assets_enabled_; + std::atomic<bool> assets_enabled_{true}; + std::atomic<bool> cont_cpu_profile_enabled_{false};

Update config() to snapshot contCpuProfile to atomic:

+ { + auto it = config_.find("contCpuProfile"); + if (it != config_.end()) { + if (it->is_boolean()) { + cont_cpu_profile_enabled_.store(*it, std::memory_order_relaxed); + } else { + Debug("Invalid contCpuProfile type; expected boolean\n"); + cont_cpu_profile_enabled_.store(false, std::memory_order_relaxed); + } + } else { + cont_cpu_profile_enabled_.store(false, std::memory_order_relaxed); + } + }

Optional: if assetsEnabled should also suppress continuous profiles, add a fast check at the start of got_continuous_profile to drop or close streams when disabled.

🏁 Script executed:

#!/bin/bash # Locate *_from_js callsites and assess threading context. rg -nP 'start_(cpu|heap)_(profile|sampling|snapshot)_from_js\s*\(' -C3 --type=cpp # Grep for direct calls to do_start_prof_init outside agent loop callbacks. rg -nP '\bdo_start_prof_init\s*\(' -C3 --type=cpp

Length of output: 17670

I need to inspect the actual implementation of do_start_prof_init and check the current state of the code:

Now let me inspect the actual implementation in grpc_agent.cc to see the current state:

Let me now examine the actual implementation in the grpc_agent.cc file around the line numbers indicated:

[/run_scripts]

🏁 Script executed:

#!/bin/bash # Inspect the actual do_start_prof_init implementation in grpc_agent.cc sed -n '2092,2130p' agents/grpc/src/grpc_agent.cc

Length of output: 1197

Now let me check the header file to see the current declarations:

[/run_scripts]

🏁 Script executed:

#!/bin/bash # Check the declaration of assets_enabled_ and related fields in grpc_agent.h grep -n "assets_enabled_\|cont_cpu_profile\|config_" agents/grpc/src/grpc_agent.h | head -30

Length of output: 279

Let me check the config() method and how config_ is synchronized:

[/run_scripts]

🏁 Script executed:

#!/bin/bash # Find the config() method and config_ update patterns rg -n "config_\s*=" agents/grpc/src/grpc_agent.cc -B2 -A2 | head -60 # Also search for config() method rg -n "void.*config\s*\(" agents/grpc/src/grpc_agent.cc -A5 | head -40

Length of output: 182

Let me find what method contains the config assignment and check for synchronization:

[/run_scripts]

🏁 Script executed:

#!/bin/bash # Find the method containing config_ assignment at line 1019 sed -n '990,1030p' agents/grpc/src/grpc_agent.cc # Check for any mutexes in the header grep -n "mutex\|lock\|Lock" agents/grpc/src/grpc_agent.h

Length of output: 1735

Let me verify one more detail about whether config() can be called concurrently with do_start_prof_init by checking the call sites:

[/run_scripts]

🏁 Script executed:

#!/bin/bash # Check where config() method is called from rg -n "\.config\(" agents/grpc/src/grpc_agent.cc -B1 | head -20 # Check if config_msg_cb_ processes config updates asynchronously or on agent loop sed -n '1250,1350p' agents/grpc/src/grpc_agent.cc | grep -A 30 "config_msg_cb_"

Length of output: 0

Let me try a broader search for how config_msg_cb_ is used:

[/run_scripts]

🏁 Script executed:

#!/bin/bash # Search for config_msg_cb_ definition and usage rg -n "config_msg_cb_" agents/grpc/src/grpc_agent.cc -A 15 | head -50

Length of output: 1111

Critical data race confirmed: unprotected config_ access from concurrent threads.

The code modification at line 2097 (if (!assets_enabled_)) exposes a classic data race:

config_msg_cb_ (agent event loop thread) writes config_ unsafely at line 1019 via config() method

do_start_prof_init (arbitrary thread via *_from_js V8 callbacks) reads config_ unsafely at lines 2110–2115

No synchronization between these threads

Additionally, assets_enabled_ is declared std::atomic<bool> but uses implicit operator! conversion instead of explicit .load(std::memory_order_acquire), bypassing memory ordering guarantees.

Fix as suggested in the review comment:

Replace if (!assets_enabled_) with if (!assets_enabled_.load(std::memory_order_acquire))

Add std::atomic<bool> cont_cpu_profile_enabled_{false} to header

Snapshot contCpuProfile from config_ into the atomic in the config() method (with proper null/type checks)

Use cont_cpu_profile_enabled_.load(std::memory_order_acquire) in the check at line 2110–2115

These changes eliminate the data race by preventing unprotected JSON reads from arbitrary threads.

🤖 Prompt for AI Agents

In agents/grpc/src/grpc_agent.cc around lines 2096–2100 (and the read at 2110–2115), the code performs unsynchronized reads of config_ and uses implicit atomic conversion; replace the implicit check if (!assets_enabled_) with if (!assets_enabled_.load(std::memory_order_acquire)), add a new std::atomic<bool> cont_cpu_profile_enabled_{false} to the header, update the config() writer (called at line ~1019) to safely snapshot contCpuProfile from config_ into cont_cpu_profile_enabled_ after doing null/type checks, and change the arbitrary-thread check at lines 2110–2115 to use cont_cpu_profile_enabled_.load(std::memory_order_acquire) so there are no unprotected JSON reads from non-event-loop threads.

The curl change fix a linking error on Windows while the change in grpc fixes a linking error on macOS.

santigimeno requested review from RafaelGSS and juanarbol October 10, 2025 20:23

santigimeno self-assigned this Oct 10, 2025

coderabbitai bot requested changes Oct 10, 2025

View reviewed changes

test/agents/test-zmq-tracing.mjs Show resolved Hide resolved

test/parallel/test-nsolid-heap-sampling-stream.js Show resolved Hide resolved

test/parallel/test-nsolid-snapshot.js Show resolved Hide resolved

santigimeno force-pushed the santi/assets_enabled branch 2 times, most recently from b3ccdf4 to 74e0d6b Compare October 14, 2025 09:15

coderabbitai bot reviewed Oct 14, 2025

View reviewed changes

santigimeno force-pushed the santi/assets_enabled branch from 74e0d6b to de122d2 Compare October 14, 2025 09:35

coderabbitai bot approved these changes Oct 14, 2025

View reviewed changes

coderabbitai bot reviewed Oct 14, 2025

View reviewed changes

santigimeno force-pushed the santi/assets_enabled branch from de122d2 to 22ac97c Compare October 14, 2025 09:51

coderabbitai bot requested changes Oct 14, 2025

View reviewed changes

test/parallel/test-nsolid-heap-profile-stream.js Show resolved Hide resolved

test/parallel/test-nsolid-heap-profile-stream.js Show resolved Hide resolved

test/parallel/test-nsolid-heap-sampling-stream.js Show resolved Hide resolved

coderabbitai bot approved these changes Oct 14, 2025

View reviewed changes

RafaelGSS approved these changes Oct 14, 2025

View reviewed changes

santigimeno force-pushed the santi/assets_enabled branch from 22ac97c to 8a2bb66 Compare October 23, 2025 21:04

coderabbitai bot requested changes Oct 23, 2025

View reviewed changes

deps: add missing files to grpc and curl gyp files

ba2865a

The curl change fix a linking error on Windows while the change in grpc fixes a linking error on macOS.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

agents: add runtime control for asset collection #377

agents: add runtime control for asset collection #377

Uh oh!

santigimeno commented Oct 10, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 10, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Oct 23, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

agents: add runtime control for asset collection #377

Are you sure you want to change the base?

agents: add runtime control for asset collection #377

Uh oh!

Conversation

santigimeno commented Oct 10, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

santigimeno commented Oct 10, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 10, 2025 •

edited

Loading