-
Notifications
You must be signed in to change notification settings - Fork 14k
Add is_ascii function optimized for LoongArch64 for [u8]
#146699
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Similar to x86_64, on LoongArch64 we use the `vmskltz.b` instruction to test the high bit in a lane. For longer input cases, the performance improvement is significant. For unaligned cases close to 32 bytes in length, there's some regression, but it seems acceptable. | core benches (MB/s) | Before | After | % | |--------------------------------------------------------|--------|--------|---------| | ascii::is_ascii::short::case00_libcore | 1000 | 1000 | 0.00 | | ascii::is_ascii::medium::case00_libcore | 8000 | 8000 | 0.00 | | ascii::is_ascii::long::case00_libcore | 183947 | 436875 | +137.50 | | ascii::is_ascii::unaligned_head_medium::case00_libcore | 7750 | 2818 | -63.64 | | ascii::is_ascii::unaligned_head_long::case00_libcore | 317681 | 436812 | +37.50 | | ascii::is_ascii::unaligned_tail_medium::case00_libcore | 7750 | 3444 | -55.56 | | ascii::is_ascii::unaligned_tail_long::case00_libcore | 155311 | 436812 | +181.25 | | ascii::is_ascii::unaligned_both_medium::case00_libcore | 7500 | 3333 | -55.56 | | ascii::is_ascii::unaligned_both_long::case00_libcore | 174700 | 436750 | +150.00 |
Collaborator
Member
|
r? libs |
Member
|
@bors r+ |
Collaborator
matthiaskrgr
added a commit
to matthiaskrgr/rust
that referenced
this pull request
Nov 2, 2025
…crum Add `is_ascii` function optimized for LoongArch64 for [u8] Similar to x86_64, on LoongArch64 we use the `vmskltz.b` instruction to test the high bit in a lane. For longer input cases, the performance improvement is significant. For unaligned cases close to 32 bytes in length, there's some regression, but it seems acceptable. | core benches (MB/s) | Before | After | % | |--------------------------------------------------------|--------|--------|---------| | ascii::is_ascii::short::case00_libcore | 1000 | 1000 | 0.00 | | ascii::is_ascii::medium::case00_libcore | 8000 | 8000 | 0.00 | | ascii::is_ascii::long::case00_libcore | 183947 | 436875 | +137.50 | | ascii::is_ascii::unaligned_head_medium::case00_libcore | 7750 | 2818 | -63.64 | | ascii::is_ascii::unaligned_head_long::case00_libcore | 317681 | 436812 | +37.50 | | ascii::is_ascii::unaligned_tail_medium::case00_libcore | 7750 | 3444 | -55.56 | | ascii::is_ascii::unaligned_tail_long::case00_libcore | 155311 | 436812 | +181.25 | | ascii::is_ascii::unaligned_both_medium::case00_libcore | 7500 | 3333 | -55.56 | | ascii::is_ascii::unaligned_both_long::case00_libcore | 174700 | 436750 | +150.00 |
This was referenced Nov 2, 2025
bors
added a commit
that referenced
this pull request
Nov 2, 2025
Rollup of 7 pull requests Successful merges: - #146573 (Constify Range functions) - #146699 (Add `is_ascii` function optimized for LoongArch64 for [u8]) - #148026 (std: don't leak the thread closure if destroying the thread attributes fails) - #148135 (Ignore unix socket related tests for VxWorks) - #148211 (clippy fixes and code simplification) - #148395 (Generalize branch references) - #148405 (Fix suggestion when there were a colon already in generics) r? `@ghost` `@rustbot` modify labels: rollup
rust-timer
added a commit
that referenced
this pull request
Nov 3, 2025
Rollup merge of #146699 - heiher:is-ascii-lsx, r=Mark-Simulacrum Add `is_ascii` function optimized for LoongArch64 for [u8] Similar to x86_64, on LoongArch64 we use the `vmskltz.b` instruction to test the high bit in a lane. For longer input cases, the performance improvement is significant. For unaligned cases close to 32 bytes in length, there's some regression, but it seems acceptable. | core benches (MB/s) | Before | After | % | |--------------------------------------------------------|--------|--------|---------| | ascii::is_ascii::short::case00_libcore | 1000 | 1000 | 0.00 | | ascii::is_ascii::medium::case00_libcore | 8000 | 8000 | 0.00 | | ascii::is_ascii::long::case00_libcore | 183947 | 436875 | +137.50 | | ascii::is_ascii::unaligned_head_medium::case00_libcore | 7750 | 2818 | -63.64 | | ascii::is_ascii::unaligned_head_long::case00_libcore | 317681 | 436812 | +37.50 | | ascii::is_ascii::unaligned_tail_medium::case00_libcore | 7750 | 3444 | -55.56 | | ascii::is_ascii::unaligned_tail_long::case00_libcore | 155311 | 436812 | +181.25 | | ascii::is_ascii::unaligned_both_medium::case00_libcore | 7500 | 3333 | -55.56 | | ascii::is_ascii::unaligned_both_long::case00_libcore | 174700 | 436750 | +150.00 |
github-actions bot
pushed a commit
to rust-lang/rust-analyzer
that referenced
this pull request
Nov 3, 2025
Rollup of 7 pull requests Successful merges: - rust-lang/rust#146573 (Constify Range functions) - rust-lang/rust#146699 (Add `is_ascii` function optimized for LoongArch64 for [u8]) - rust-lang/rust#148026 (std: don't leak the thread closure if destroying the thread attributes fails) - rust-lang/rust#148135 (Ignore unix socket related tests for VxWorks) - rust-lang/rust#148211 (clippy fixes and code simplification) - rust-lang/rust#148395 (Generalize branch references) - rust-lang/rust#148405 (Fix suggestion when there were a colon already in generics) r? `@ghost` `@rustbot` modify labels: rollup
github-actions bot
pushed a commit
to rust-lang/rustc-dev-guide
that referenced
this pull request
Nov 3, 2025
Rollup of 7 pull requests Successful merges: - rust-lang/rust#146573 (Constify Range functions) - rust-lang/rust#146699 (Add `is_ascii` function optimized for LoongArch64 for [u8]) - rust-lang/rust#148026 (std: don't leak the thread closure if destroying the thread attributes fails) - rust-lang/rust#148135 (Ignore unix socket related tests for VxWorks) - rust-lang/rust#148211 (clippy fixes and code simplification) - rust-lang/rust#148395 (Generalize branch references) - rust-lang/rust#148405 (Fix suggestion when there were a colon already in generics) r? `@ghost` `@rustbot` modify labels: rollup
github-actions bot
pushed a commit
to rust-lang/miri
that referenced
this pull request
Nov 3, 2025
Rollup of 7 pull requests Successful merges: - rust-lang/rust#146573 (Constify Range functions) - rust-lang/rust#146699 (Add `is_ascii` function optimized for LoongArch64 for [u8]) - rust-lang/rust#148026 (std: don't leak the thread closure if destroying the thread attributes fails) - rust-lang/rust#148135 (Ignore unix socket related tests for VxWorks) - rust-lang/rust#148211 (clippy fixes and code simplification) - rust-lang/rust#148395 (Generalize branch references) - rust-lang/rust#148405 (Fix suggestion when there were a colon already in generics) r? `@ghost` `@rustbot` modify labels: rollup
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
S-waiting-on-bors
Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
T-libs
Relevant to the library team, which will review and decide on the PR/issue.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Similar to x86_64, on LoongArch64 we use the
vmskltz.binstruction to test the high bit in a lane.For longer input cases, the performance improvement is significant. For unaligned cases close to 32 bytes in length, there's some regression, but it seems acceptable.