Correctly map UTF-8 positions to UTF-16  #202

@aochagavia

Description

The LSP specification says that positions are UTF-16 based (the character offset within a line counts UTF-16 code units), but the positions we use are UTF-8 based (byte offsets). We should correctly map from UTF-8 positions to UTF-16 positions, and back.

See the code in https://github.com/rust-analyzer/rust-analyzer/blob/d1b242262a6617b22140bddd0bed23115c260e74/crates/ra_lsp_server/src/conv.rs#L52

@matklad proposed the following approach on Discord:

We need to add a thing, like LineIndex, but which tracks non-ASCII characters.
I think such an index could be a HashSet of lines which contain non-ASCII chars. Then, for such lines, we'll do a linear scan to switch between UTF-16/UTF-8 offsets.
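
A rough sketch of how that idea could look, not the actual rust-analyzer code. The type `NonAsciiLines` and the methods `utf8_to_utf16_col` / `utf16_to_utf8_col` are hypothetical names invented for illustration, and the sketch assumes the caller already has the text of the line in hand (for example, sliced out via the existing line index):

```rust
use std::collections::HashSet;

/// Hypothetical index: the set of (0-based) lines containing non-ASCII chars.
struct NonAsciiLines {
    lines: HashSet<usize>,
}

impl NonAsciiLines {
    fn new(text: &str) -> Self {
        let lines = text
            .split('\n')
            .enumerate()
            .filter(|(_, line)| !line.is_ascii())
            .map(|(idx, _)| idx)
            .collect();
        NonAsciiLines { lines }
    }

    /// Map a UTF-8 byte column to a UTF-16 code unit column.
    /// Assumes `utf8_col` lies on a char boundary of `line_text`.
    fn utf8_to_utf16_col(&self, line: usize, line_text: &str, utf8_col: usize) -> usize {
        if !self.lines.contains(&line) {
            // ASCII-only line: one byte is exactly one UTF-16 code unit.
            return utf8_col;
        }
        // Linear scan: sum the UTF-16 length of every char before the column.
        line_text
            .char_indices()
            .take_while(|(byte_idx, _)| *byte_idx < utf8_col)
            .map(|(_, c)| c.len_utf16())
            .sum()
    }

    /// Map a UTF-16 code unit column back to a UTF-8 byte column.
    fn utf16_to_utf8_col(&self, line: usize, line_text: &str, utf16_col: usize) -> usize {
        if !self.lines.contains(&line) {
            return utf16_col;
        }
        let mut units = 0;
        for (byte_idx, c) in line_text.char_indices() {
            if units >= utf16_col {
                return byte_idx;
            }
            units += c.len_utf16();
        }
        // Column at or past the end of the line.
        line_text.len()
    }
}
```

The HashSet keeps the common case cheap: ASCII-only lines need no conversion at all, and only lines with non-ASCII characters pay for the linear scan.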

Labels

E-has-instructions: Issue has some instructions and pointers to code to get started