Description
Background and motivation
ASP.NET Core's StringUtilities considers only the ASCII values in the range (0x00, 0x80) valid, whereas Ascii.ToUtf16 treats the whole ASCII range [0x00, 0x80) as valid. To base StringUtilities on the Ascii APIs and avoid custom vectorized code in ASP.NET Core internals, it should be possible to treat \0 as invalid. See dotnet/aspnetcore#45962 for further info.
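The difference between the two ranges can be illustrated with a minimal scalar sketch. The class `AsciiRangeSketch` and its method `IsValid` are hypothetical names used only for illustration; they are not part of System.Buffers.Text or ASP.NET Core, and the real implementations are vectorized.

```csharp
using System;

// Hypothetical helper for illustration only; not an existing .NET API.
public static class AsciiRangeSketch
{
    // Returns true when every byte lies in [0x00, 0x80).
    // With treatNullAsInvalid set, 0x00 is rejected as well,
    // which narrows the accepted range to (0x00, 0x80).
    public static bool IsValid(ReadOnlySpan<byte> source, bool treatNullAsInvalid)
    {
        foreach (byte b in source)
        {
            if (b >= 0x80)
            {
                return false; // outside the ASCII range in either mode
            }

            if (treatNullAsInvalid && b == 0)
            {
                return false; // \0 is rejected only in the stricter mode
            }
        }

        return true;
    }
}
```

For the input `{ (byte)'a', 0, (byte)'b' }` this sketch accepts the data in the default mode and rejects it when `treatNullAsInvalid` is `true`.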
API Proposal
namespace System.Buffers.Text
{
    public static class Ascii
    {
        // existing methods
+       public static OperationStatus ToUtf16(ReadOnlySpan<byte> source, Span<char> destination, out int bytesConsumed, out int charsWritten, bool treatNullAsInvalid = false);
    }
}

The new Ascii APIs will be added in .NET 8, so an optional argument could be added without a breaking change.
namespace System.Buffers.Text
{
    public static class Ascii
    {
        // existing methods
-       public static OperationStatus ToUtf16(ReadOnlySpan<byte> source, Span<char> destination, out int bytesConsumed, out int charsWritten);
+       public static OperationStatus ToUtf16(ReadOnlySpan<byte> source, Span<char> destination, out int bytesConsumed, out int charsWritten, bool treatNullAsInvalid = false);
    }
}

API Usage
private static unsafe void GetHeaderName(ReadOnlySpan<byte> source, Span<char> buffer)
{
    OperationStatus status = Ascii.ToUtf16(source, buffer, out _, out _, treatNullAsInvalid: true);
    if (status != OperationStatus.Done)
    {
        KestrelBadHttpRequestException.Throw(RequestRejectionReason.InvalidCharactersInHeaderName);
    }
}

Alternative Designs
No response
Risks
The value for treatNullAsInvalid will be passed as a constant, so the JIT should be able to eliminate as dead code whatever is needed only for the default case (the whole ASCII range including \0); no performance regression is expected.
Apart from \0, which can optionally be treated as invalid, I don't expect any other value to be considered special enough for optional exclusion.
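The constant-argument reasoning above can be sketched as follows. This is an illustration under assumptions, not the proposed implementation: `ConstantFlagSketch`, `CountValidPrefix`, `DefaultCase` and `StrictCase` are hypothetical names, and the branch is only expected to fold away if the JIT actually inlines the helper into a call site where the flag is a compile-time constant.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical sketch; names are illustrative and not part of any .NET API.
public static class ConstantFlagSketch
{
    // Counts the leading bytes that are valid under the selected rule.
    // AggressiveInlining is a hint: once the method is inlined at a call
    // site that passes a constant, the flag check can be folded by the JIT.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int CountValidPrefix(ReadOnlySpan<byte> source, bool treatNullAsInvalid)
    {
        int i = 0;
        for (; i < source.Length; i++)
        {
            byte b = source[i];
            if (b >= 0x80)
            {
                break; // non-ASCII byte ends the valid prefix in both modes
            }

            if (treatNullAsInvalid && b == 0)
            {
                break; // candidate for dead-code elimination when the flag is false
            }
        }

        return i;
    }

    // Both callers pass the flag as a constant, as the usage in this issue does.
    public static int DefaultCase(ReadOnlySpan<byte> source) =>
        CountValidPrefix(source, treatNullAsInvalid: false);

    public static int StrictCase(ReadOnlySpan<byte> source) =>
        CountValidPrefix(source, treatNullAsInvalid: true);
}
```

For the bytes `{ 0x41, 0x00, 0x42, 0x80 }`, `DefaultCase` returns 3 and `StrictCase` returns 1. Whether the generated code for `DefaultCase` really contains no `\0` check would need to be confirmed by inspecting the JIT output.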