Modified the loop that counts non-ASCII characters in _string_from_ge… #14
When scanning lots of data, the existing code can spend significant amounts of time in the loop that checks for non-ASCII characters.
Unfortunately, all compilers I tested failed to vectorize the loop properly. On the other hand, it is easy to use a bitmask plus a popcount instruction to check 8 characters per loop iteration. This should also be nice for speculative execution, as there are no longer any branches inside the loop to mispredict. In my benchmarks, the new version is about 6-8x faster.
This may lead to a performance regression on pre-2010 Intel CPUs, but I am not sure that still matters.
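
For readers who want to see the shape of the bitmask + popcount idea, here is a minimal sketch in C. It is not the code from this PR: the function names are hypothetical, "non-ASCII" is assumed to mean "byte with the high bit set", and `__builtin_popcountll` is a GCC/Clang builtin, so other compilers would need their own equivalent.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reference version (hypothetical name): one byte and one branch per iteration. */
static size_t count_non_ascii_scalar(const unsigned char *s, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (s[i] & 0x80) {
            count++;
        }
    }
    return count;
}

/* Sketch of the bitmask + popcount approach (hypothetical name).
 * Assumes a GCC/Clang-compatible compiler for __builtin_popcountll. */
static size_t count_non_ascii_popcount(const unsigned char *s, size_t n) {
    /* The high bit of each of the 8 bytes in a 64-bit word. */
    const uint64_t high_bits = UINT64_C(0x8080808080808080);
    size_t count = 0;
    size_t i = 0;

    /* Main loop: 8 bytes per iteration, no data-dependent branch. */
    for (; i + 8 <= n; i += 8) {
        uint64_t word;
        memcpy(&word, s + i, sizeof word); /* safe unaligned load */
        count += (size_t)__builtin_popcountll(word & high_bits);
    }

    /* Tail: the remaining 0..7 bytes, also branch-free. */
    for (; i < n; i++) {
        count += (size_t)(s[i] >> 7);
    }
    return count;
}
```

Because only the number of set high bits matters, the result does not depend on byte order. Whether the popcount compiles to a single instruction depends on the target flags (for example `-mpopcnt` or `-march=native` on x86), which is presumably where the concern about older CPUs comes from.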