Closed
Description
Converting GB18030-encoded data to a string throws an exception on .NET 9.
Reproduction Steps
The following code throws the exception on .NET 9 but not on .NET 8.
using System.Text;
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
var encoding = Encoding.GetEncoding("GB18030");
ReadOnlySpan<byte> encodedBytes = [0x95, 0x32, 0xB7, 0x37];
// This call throws; encoding.GetString and encoding.GetChars throw as well.
var actual = encoding.GetCharCount(encodedBytes);
Console.WriteLine($"CharCount: {actual}");
var bytes = encoding.GetBytes("𠈓");
Console.WriteLine($"EncodedBytes of character match encodedBytes span: {bytes.AsSpan().SequenceEqual(encodedBytes)}");
Expected behavior
The conversion from bytes to chars does not throw an exception.
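For reference, here is a minimal sketch of what the decoded result should look like. It assumes the standard GB18030 linear mapping for four-byte sequences in the supplementary range (starting at 0x90 0x30 0x81 0x30 for U+10000). `DecodeSupplementary` is a hypothetical helper written for this illustration, not part of System.Text, and it does no range validation.

```csharp
using System;

// Hypothetical helper (not a System.Text API): maps a GB18030 four-byte
// sequence in the supplementary range to its Unicode code point.
// Assumes the input is a valid sequence; no range checks are performed.
static int DecodeSupplementary(byte b1, byte b2, byte b3, byte b4)
{
    // Linear index of the sequence, counted from 0x90 0x30 0x81 0x30 (U+10000).
    int index = (b1 - 0x90) * 12600
              + (b2 - 0x30) * 1260
              + (b3 - 0x81) * 10
              + (b4 - 0x30);
    return 0x10000 + index;
}

int codePoint = DecodeSupplementary(0x95, 0x32, 0xB7, 0x37);
string text = char.ConvertFromUtf32(codePoint);

Console.WriteLine($"U+{codePoint:X}");              // prints "U+20213"
Console.WriteLine($"UTF-16 length: {text.Length}"); // prints "UTF-16 length: 2"
```

Under that assumption the four bytes map to a single supplementary code point, which UTF-16 represents as a surrogate pair, so the expected result of GetCharCount would be 2.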
Actual behavior
Unhandled exception. System.ArgumentException: The output char buffer is too small to contain the decoded characters, encoding 'Chinese Simplified (GB18030)' fallback 'System.Text.DecoderReplacementFallback'. (Parameter 'chars')
at System.Text.EncodingNLS.ThrowCharsOverflow()
at System.Text.EncodingNLS.ThrowCharsOverflow(DecoderNLS decoder, Boolean nothingDecoded)
at System.Text.EncodingCharBuffer.AddChar(Char ch1, Char ch2, Int32 numBytes)
at System.Text.GB18030Encoding.GetChars(Byte* bytes, Int32 byteCount, Char* chars, Int32 charCount, DecoderNLS baseDecoder)
at System.Text.GB18030Encoding.GetCharCount(Byte* bytes, Int32 count, DecoderNLS baseDecoder)
at System.Text.Encoding.GetCharCount(ReadOnlySpan`1 bytes)
at Program.<Main>$(String[] args) in C:\git\Receiver\SYC_Appliance\Test\GB18030DecodeFailure\Program.cs:line 12
Regression?
Yes
Known Workarounds
None
Configuration
.NET 9
Windows
x64
Other information
A change (861164c) modified the if statement in EncodingCharBuffer.AddChar. The logic that handles an uninitialized chars buffer in "counting" mode may have changed as a result.