Skip to content

Conversation

chris-oo
Copy link
Member

@chris-oo chris-oo commented Oct 2, 2025

When the threadpool is already running for a given VP because of a spawned thread, the code to online a sidecar vp incorrectly issues a cancel before run_vp is called, which results in an assertion failure due to unexpected saved state as the first run_vp call will exit with a user cancel. Fix this by adding a new boolean field to the notifier closure that the threadpool and other callers can use to indicate if the threadpool called the notifier, or something else.

This occurs when keep-alive is enabled, as the nvme manager has state to restore which will spawn a task on the remote VP that is currently running sidecar due to nvme being restored before vp_set. Reenable sidecar with keep alive tests now that this is fixed.

Fixes #1345.

@chris-oo chris-oo requested a review from a team as a code owner October 2, 2025 20:52
@Copilot Copilot AI review requested due to automatic review settings October 2, 2025 20:52
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR re-enables sidecar functionality in OpenHCL servicing tests by removing the OPENHCL_SIDECAR=off flag from the command line parameters. The change is being tested to see if it reproduces the same issues as another PR.

Key Changes

  • Removes the sidecar disable flag from OpenHCL command line parameters
  • Updates associated comments that referenced the disabled sidecar functionality
  • Affects both keepalive test functions and the test configuration setup

Copy link

github-actions bot commented Oct 2, 2025

Copy link

github-actions bot commented Oct 3, 2025

@chris-oo chris-oo force-pushed the sidecar-ka-reenable branch from 3f44a06 to e87f8f9 Compare October 10, 2025 17:04
@chris-oo chris-oo added the release-ci-required Add to a PR to trigger PR gates in release mode label Oct 10, 2025
@chris-oo chris-oo force-pushed the sidecar-ka-reenable branch from e87f8f9 to 9983357 Compare October 10, 2025 18:47
@chris-oo chris-oo changed the title [WIP] reenable sidecar tests openhcl/threadpool: fix sidecar spawn when threadpool is already started Oct 10, 2025
Copy link

Copy link
Contributor

@mattkur mattkur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good, so am signing off to unblock and get CI coverage. Please keep an eye on the test loops (here and Microsoft-internal) to see if this regresses scenarios for which we might not have adequate CI coverage. Thanks @chris-oo

@chris-oo chris-oo merged commit 6548d27 into microsoft:main Oct 14, 2025
50 of 52 checks passed
@chris-oo chris-oo deleted the sidecar-ka-reenable branch October 14, 2025 00:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-ci-required Add to a PR to trigger PR gates in release mode

Projects

None yet

Development

Successfully merging this pull request may close these issues.

openhcl_servicing_keepalive test fails with sidecar enabled

3 participants