-
Notifications
You must be signed in to change notification settings - Fork 315
Add continue request usage #3605
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@Wraith2 This looks awesome - looks like you also fixed the issue with increased allocations.. I will give this a try against the repro VM asap |
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
That was fixed in the original fix so it's already merged to main. 4b891d8 Since i've just reviewed the other PR i'll cc a couple of people who were interested in it and may want to test this if it generates artifacts, @MichelZ @rhuijben |
|
Excellent! I've built the PR locally and kicked off my stress tests, including the XML case. These will likely take a few days to fully run, but I'll post the results when available. |
The performance on the xml is shockingly bad. When I realised just how slow it was going on the current main branch I thought I must have either written the test to loop 1000 times or somehow broken SqlCachedReader. |
|
@backstromjoel @valentiniliescu @nilzzzzzz @CSharpFiasco @luca-domenichini @warappa @stevendarby @wjrogers @AnderssonPeter @p10tyr @Eli-Black-Work @BradBarnich @igbenic You have all expressed an interest in this performance issue - please test this PR build! |
On our side, we tried to repro one issue that was happening with 6.1.0 (randomly receiving null values for a json serialized string field).
From a performance perspective, I cannot add valuable info as our test was quite simple and not benchmarked. |
|
@Wraith2 The |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3605 +/- ##
==========================================
- Coverage 65.48% 0 -65.49%
==========================================
Files 275 0 -275
Lines 61518 0 -61518
==========================================
- Hits 40288 0 -40288
+ Misses 21230 0 -21230
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@Wraith2 How do I set an appcontext swith for an entire xUnit test assembly? |
|
Probably in the ctor but because of the caching in the library once you've set it and the value has been read you can't change the value. To do that you'll need to use private reflection as we do in the tests here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one comment!
src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/TdsParserStateObject.cs
Outdated
Show resolved
Hide resolved
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
|
@Wraith2 The |
|
@Wraith2 FYI, I added this class to the test project: using Xunit.Sdk;
[assembly: TestFramework("Microsoft.EntityFrameworkCore.AssemblyFixture", "Microsoft.EntityFrameworkCore.SqlServer.FunctionalTests")]
namespace Microsoft.EntityFrameworkCore;
public sealed class AssemblyFixture : XunitTestFramework
{
public AssemblyFixture(IMessageSink messageSink)
:base(messageSink)
{
AppContext.SetSwitch("Switch.Microsoft.Data.SqlClient.UseCompatibilityAsyncBehaviour", false);
}
} |
|
The stress-testing passed as expected, but it's worth noting that we currently need two AppContext switches to enable this: AppContext.SetSwitch("Switch.Microsoft.Data.SqlClient.UseCompatibilityAsyncBehaviour", false);
AppContext.SetSwitch("Switch.Microsoft.Data.SqlClient.UseCompatibilityProcessSni", false);If the latter switch is not explicitly set, we see increased memory usage but not the performance improvements. Async and sync are very close to parity in normal circumstances, remote SQL Server instances saw nearly identical results in my benchmarks. I noticed that in situations where the SQL Server is able to keep up with the supply of PLP data, the speed improvements here mean that SqlClient is able to call code paths in TryReadByteArray which copy packet buffers around when building the snapshot. This is the current speed bottleneck as far as I can tell. While nothing's needed for this PR, it looks like SQL Server will try to adjust the size of the PLP data blocks it sends to align with the number of bytes left in each TDS packet. With that in mind, we could avoid copying memory; if a read will take us precisely to the end of the current TDS packet, we could construct a correctly-sized I also saw a minor (~5%) performance bump in XML data when I changed the |
|
@edwardneal Thanks for the info about the switches, I was not aware of that. Let me try to add both to the EF Core test suite |
|
The public static bool UseCompatibilityAsyncBehaviour
{
get
{
if (UseCompatibilityProcessSni)
{
// If ProcessSni compatibility mode has been enabled then the packet
// multiplexer has been disabled. The new async behaviour using continue
// point capture is only stable if the multiplexer is enabled so we must
// return true to enable compatibility async behaviour using only restarts.
return true;
}
if (s_useCompatibilityAsyncBehaviour == Tristate.NotInitialized)
{
if (!AppContext.TryGetSwitch(UseCompatibilityAsyncBehaviourString, out bool returnedValue) || returnedValue)
{
s_useCompatibilityAsyncBehaviour = Tristate.True;
}
else
{
s_useCompatibilityAsyncBehaviour = Tristate.False;
}
}
return s_useCompatibilityAsyncBehaviour == Tristate.True;
}
}which unless I can't mentally process boolean logic (it's happened) means that it defaults to true and you have to turn it off. However because If you're settings both flags explicitly you'll have to manage your combinations and that's likely what's happening in our tests but for casual users it should only require one settings. |
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/TdsParser.cs
Show resolved
Hide resolved
src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/TdsParserStateObject.cs
Show resolved
Hide resolved
|
Is this waiting for anything other than a second review @dotnet/sqlclientdevteam ? |
|
@Wraith2 I've been out for a couple weeks. I'll start taking a look tomorrow! |
|
Thanks for all your work on this! Assuming no problems are found, will this PR be included in a 6.1.x release? Those questions are especially relevant for all users who don't know about this problem and just get SqlClient as an EF dependency. EF 10 uses SqlClient 6.1 and won't update to any non-LTS version. And a small PS: You bolded the wrong lines in the "XmlRead" performance result table in the PR description. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks reasonable, but I'm struggling to see what actually changed in the logic. It seems like the if (isPlp) check just shifted down a level. The request continue behavior you described in the PR description was already present, wasn't it? The only related change I see is the addition of RequestContinue(true) within TryReadByteArrayWithContinue. I don't see any spot that newly checks whether a continue has been requested, unless I'm missing it.
|
The only important change is the addition of the calls to RequestContinue. Everything needed by those calls was already present but could not be activated. The request enables continue mode only for the read of the data that we know supports it. The revert of the flag, the conditional logic around ContinueEnabled, it was all already present but was previously only capable of working in async-retry mode. |
src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/TdsParserStateObject.cs
Show resolved
Hide resolved
|
@Wraith2 where should I add this switch call in my code (not tests)? Which version of SqlClient will take this switch into account? Any timeline when it will be released? Thx for your hard work! Br |
That's your choice and will depend on how you structure your code. As long as the switch value is set before the first read is called then it will be active, the value is cached so once set it cannot be changed.
The version released from the |
Thx for the clarification. |
|
@rizi - I've added this to the 7.0.0 Preview 2 milestone, release expected mid/late October. |
Thx for the info |
See line 6778 of change to netcore TdsParser in dotnet#3605
Builds on the stable base provided by #3534 and adds the optional async-continue capability back in. It is optional, it is not on by default in this build. To use it you must set a switch:
AppContext.SetSwitch("Switch.Microsoft.Data.SqlClient.UseCompatibilityAsyncBehaviour", false);this will use the new multiplexer and async-continue support already defaults to enabled.This changes the way that continue is used from the original implementation. Originally continue was always available which proved to cause problems with routes through the code that did not expect to be able to fail causing lost context. This new approach runs as if continue is disabled until a function is called which explicitly requests continue mode to be enabled. The only functions that can do this are the ones that read multi-packet string or binary data. The request is cleared when the read completes so it cannot affect other reads from the same packet.
Benchmark results:
All Reads were 10 Mib of data.
BinaryRead
StringRead
XmlRead
Here are the benchmark files that I used to get these numbers. You'll need connection strings and to generate a random 10Mib xml files from somewhere on the web. Benchmarks.zip
The async xml numbers are pretty unbelievable. If you replicate or improve the benches it would be good to get perf corroboration.
@ErikEJ this will be the build you'll want to test and then possible make available to people, as discussed on the previous PR.
@edwardneal if you have the resources could you run your fantastic test suite against this version and see if you can identify any new bugs?
@dotnet/sqlclientdevteam can you run CI please.