Description
protobuf.js version: 6.10.1
The utf8.read function sometimes seems to insert extra Unicode characters.
Here's a repro (https://repl.it/@masfrost/pbjs-bad-decode) where I check utf8.read against the WHATWG TextEncoder. The repro file isn't very minimal, but this issue seems to be pretty common for us when decoding strings.
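For readers who can't open the repro, the kind of check it describes can be sketched roughly as follows. This is not the repro's actual code: `Utf8Read` and `findMismatches` are hypothetical names, and the `(buffer, start, end)` signature is an assumption about how utf8.read is called.

```typescript
// Hypothetical type: assumes the decoder under test is called as read(buffer, start, end).
type Utf8Read = (buffer: Uint8Array, start: number, end: number) => string;

// Returns every sample for which `read` disagrees with the platform TextDecoder.
function findMismatches(read: Utf8Read, samples: string[]): string[] {
  const encoder = new TextEncoder();
  // ignoreBOM: true keeps a leading U+FEFF instead of silently dropping it.
  const decoder = new TextDecoder("utf-8", { ignoreBOM: true });
  return samples.filter((sample) => {
    const bytes = encoder.encode(sample);
    return read(bytes, 0, bytes.length) !== decoder.decode(bytes);
  });
}
```

Passing the library's utf8.read as `read`, together with a list of strings that decode badly, should return the offending inputs; an empty array means the two decoders agree on every sample.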
FYI for future readers: we monkey-patched the library and forced it to use TextDecoder/TextEncoder here: https://github.com/replit/crosis/blob/v5.0.3/src/fixUtf8.ts
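The general shape of such a patch is sketched below. It is not a copy of the linked file: `Utf8Helper` and `patchUtf8` are hypothetical names, and the `length`/`read`/`write` signatures are assumptions about the library's UTF-8 helper object.

```typescript
// Hypothetical interface: assumed shape of the library's UTF-8 helper object.
interface Utf8Helper {
  length(str: string): number;
  read(buffer: Uint8Array, start: number, end: number): string;
  write(str: string, buffer: Uint8Array, offset: number): number;
}

// Replaces the helper's methods with ones backed by TextEncoder/TextDecoder.
function patchUtf8(utf8: Utf8Helper): void {
  const encoder = new TextEncoder();
  // ignoreBOM: true keeps a leading U+FEFF as part of the decoded string.
  const decoder = new TextDecoder("utf-8", { ignoreBOM: true });

  // Byte length of the string once encoded as UTF-8.
  utf8.length = (str) => encoder.encode(str).length;

  // Decode only the [start, end) window; subarray does not copy.
  utf8.read = (buffer, start, end) => decoder.decode(buffer.subarray(start, end));

  // Encode, copy into the target at `offset`, and report the bytes written.
  utf8.write = (str, buffer, offset) => {
    const bytes = encoder.encode(str);
    buffer.set(bytes, offset);
    return bytes.length;
  };
}
```

With protobuf.js this would presumably be called once, as `patchUtf8(protobuf.util.utf8)`, before any messages are encoded or decoded. Note that `length` and `write` each encode the whole string, so this version favors simplicity over avoiding the extra allocation.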
I think using the standard TextEncoder/TextDecoder might be the best thing to do here. Encoding is just too complicated, and I'm sure these standard libraries are faster. Happy to put up a PR if that's an option; otherwise, I don't really have enough time to go spelunking into UTF-8 land.