Skip to content

Conversation

cheenamalhotra
Copy link
Member

@cheenamalhotra cheenamalhotra commented Sep 23, 2025

Fixes connection pool concurrency issue caused due to waiting on multiple semaphore handles, instead of only creation handle during pool create request.

Issue was first introduced in PR #2390 (https://github.com/dotnet/SqlClient/pull/2390/files#diff-d7c1dbddf2a7e205d7a85cbafbef036899799672dd5d1c486a64a05979e92230R1627) and has affected MDS v6 onwards.

Needs to be backported to v6.0 and v6.1 branches as well.

…le semaphore handles, instead of only creation handle during pool create request.
@cheenamalhotra cheenamalhotra requested a review from a team as a code owner September 23, 2025 22:04
@Copilot Copilot AI review requested due to automatic review settings September 23, 2025 22:04
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a concurrency issue in the connection pool by changing how semaphore handles are waited on during pool creation requests. Instead of waiting on multiple semaphore handles simultaneously, it now waits only on the creation handle to prevent potential race conditions.

  • Replaces WaitHandle.WaitAny() with direct WaitOne() on the creation semaphore
  • Simplifies the wait result handling logic by removing the intermediate wait result case
  • Removes unnecessary error tracing for non-timeout wait failures

Copy link

codecov bot commented Sep 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.82%. Comparing base (37d8a3f) to head (0775c85).
⚠️ Report is 18 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #3632       +/-   ##
===========================================
+ Coverage   66.13%   90.82%   +24.69%     
===========================================
  Files         276        6      -270     
  Lines       60765      316    -60449     
===========================================
- Hits        40184      287    -39897     
+ Misses      20581       29    -20552     
Flag Coverage Δ
addons 90.82% <ø> (ø)
netcore ?
netfx ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@cheenamalhotra cheenamalhotra added this to the 7.0-preview2 milestone Sep 23, 2025
@paulmedynski paulmedynski self-assigned this Sep 24, 2025
Copy link
Contributor

@paulmedynski paulmedynski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asking to simplify/clarify the semaphore block.

Also, is there a way to test this?

@edwardneal
Copy link
Contributor

This might also need to be backported to v5.1. #2390 ported the change from netcore, which has exactly the same behaviour:

waitResult = WaitHandle.WaitAny(_waitHandles.GetHandles(withCreate: true), CreationTimeout);

v6.0 is the first point that it impacted netfx, but this has been the netcore behaviour from the outset of the driver (even when going back to the PR which added System.Data.SqlClient to its first version of netcore.)

@samsharma2700 samsharma2700 self-assigned this Sep 26, 2025
@mdaigle mdaigle self-assigned this Sep 30, 2025
@cheenamalhotra cheenamalhotra changed the title WIP: Fix connection pool concurrency issue Fix connection pool concurrency issue Oct 2, 2025
@cheenamalhotra
Copy link
Member Author

This might also need to be backported to v5.1. #2390 ported the change from netcore, which has exactly the same behaviour:

waitResult = WaitHandle.WaitAny(_waitHandles.GetHandles(withCreate: true), CreationTimeout);

v6.0 is the first point that it impacted netfx, but this has been the netcore behaviour from the outset of the driver (even when going back to the PR which added System.Data.SqlClient to its first version of netcore.)

I've verified this directly as well as in the test suite, this doesn't seem to cause any concerns with the repro I have, also added as a test case in the PR.

The same test case fails with NetFx (without fix), but passes in NetCore without any changes.

I'm going to give that a pass unless there are any reported concerns. It maybe possible that there's a different difference between NetFx and NetCore source that is part of the problem. But I'll leave that until code merge happens for all the layers.

The test case will fail if after merge there is inconsistency in functionality and we'll know!

mdaigle
mdaigle previously approved these changes Oct 2, 2025
paulmedynski
paulmedynski previously approved these changes Oct 3, 2025
@cheenamalhotra cheenamalhotra merged commit 12934f4 into main Oct 3, 2025
250 of 252 checks passed
@cheenamalhotra cheenamalhotra deleted the dev/cheena/pool-concurrency-fix branch October 3, 2025 19:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants