Description
In a nutshell, there are certain characters where CharUnicodeInfo.GetUnicodeCategory returns the correct value, but Char.GetUnicodeCategory returns the wrong value. One such character is U+00B6 PILCROW SIGN, where CharUnicodeInfo returns OtherPunctuation (which is correct) and where Char returns OtherSymbol (which is incorrect). This also affects the behavior of dependent methods like Char.IsPunctuation and Char.IsLower.
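One way to see which characters are affected on a given runtime is to compare the two APIs directly. The following is a minimal sketch, not a definitive repro: the class name `CategoryComparison` and the method `FindMismatches` are made up for illustration, and the output depends on the runtime version, so a runtime where the two methods agree prints nothing.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper for illustration; not part of the framework.
public static class CategoryComparison
{
    // Returns every UTF-16 code unit for which Char.GetUnicodeCategory and
    // CharUnicodeInfo.GetUnicodeCategory report different categories.
    public static List<(char Ch, UnicodeCategory FromChar, UnicodeCategory FromInfo)> FindMismatches()
    {
        var mismatches = new List<(char Ch, UnicodeCategory FromChar, UnicodeCategory FromInfo)>();

        // Iterate with an int so the loop terminates after char.MaxValue.
        for (int codeUnit = char.MinValue; codeUnit <= char.MaxValue; codeUnit++)
        {
            char ch = (char)codeUnit;
            UnicodeCategory fromChar = char.GetUnicodeCategory(ch);
            UnicodeCategory fromInfo = CharUnicodeInfo.GetUnicodeCategory(ch);

            if (fromChar != fromInfo)
            {
                mismatches.Add((ch, fromChar, fromInfo));
            }
        }

        return mismatches;
    }

    public static void Main()
    {
        foreach (var (ch, fromChar, fromInfo) in FindMismatches())
        {
            Console.WriteLine($"U+{(int)ch:X4}: Char={fromChar}, CharUnicodeInfo={fromInfo}");
        }
    }
}
```

If U+00B6 shows up in the output with `Char=OtherSymbol` and `CharUnicodeInfo=OtherPunctuation`, the runtime exhibits the discrepancy described above.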
MSDN says this behavior is intentional to preserve back-compat, but it is extraordinarily confusing for two methods with the same name to behave differently.
One solution would be to update Char.GetUnicodeCategory to stay in sync with CharUnicodeInfo.GetUnicodeCategory. This is a breaking change, but it's the type of breaking change that is normally allowed in side-by-side major version updates.
An alternative is to mark Char.GetUnicodeCategory, Char.IsPunctuation, Char.IsLower, etc. as obsolete and to direct users to call into CharUnicodeInfo instead. This preserves existing behavior and provides a migration story to get developers on to the APIs which provide correct results.
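As a rough illustration of what the migration could look like for callers, the dependent checks can be rebuilt on top of CharUnicodeInfo.GetUnicodeCategory. This is only a sketch: the `UnicodeChecks` class and its methods are hypothetical names, not existing or proposed framework APIs.

```csharp
using System;
using System.Globalization;

// Hypothetical helpers for illustration; not part of the framework.
public static class UnicodeChecks
{
    // True when CharUnicodeInfo places the character in any punctuation category.
    public static bool IsPunctuation(char ch)
    {
        switch (CharUnicodeInfo.GetUnicodeCategory(ch))
        {
            case UnicodeCategory.ConnectorPunctuation:
            case UnicodeCategory.DashPunctuation:
            case UnicodeCategory.OpenPunctuation:
            case UnicodeCategory.ClosePunctuation:
            case UnicodeCategory.InitialQuotePunctuation:
            case UnicodeCategory.FinalQuotePunctuation:
            case UnicodeCategory.OtherPunctuation:
                return true;
            default:
                return false;
        }
    }

    // True when CharUnicodeInfo classifies the character as a lowercase letter.
    public static bool IsLower(char ch)
    {
        return CharUnicodeInfo.GetUnicodeCategory(ch) == UnicodeCategory.LowercaseLetter;
    }

    public static void Main()
    {
        // Compare the built-in result with the CharUnicodeInfo-based one for U+00B6.
        char pilcrow = '\u00B6';
        Console.WriteLine($"Char.IsPunctuation:          {char.IsPunctuation(pilcrow)}");
        Console.WriteLine($"UnicodeChecks.IsPunctuation: {UnicodeChecks.IsPunctuation(pilcrow)}");
    }
}
```

Callers that switch to helpers like these get results that track the CharUnicodeInfo data, while code that keeps calling the Char methods retains its existing behavior.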