vector_algorithms.cpp: Remove the distinction between SSE2 and SSE4.2 #4536

@StephanTLavavej

During code review, I've prevented two bugs where usage of post-SSE2 instructions was being incorrectly guarded with _Use_sse2() - see #4384 (comment) and #4495 (comment). This is extremely hazardous, and the correctness of the STL shouldn't depend on whether I've had 270 mg of caffeine every single time I've reviewed a vectorization PR.

At this time, we still need to support the tiny fraction (~0.7%, I've heard) of processors that have SSE2 but not SSE4.2. However, we don't need to extend novel optimizations to them - they were perfectly happy running classic STL algorithms up to 2019.

We should prevent this class of mistakes by removing the distinction between SSE2 and SSE4.2 in vector_algorithms.cpp. That is, we should test for the presence of SSE4.2 only, before attempting to use anything up to and including SSE4.2. (This will supersede the error-prone _Traits::_Sse_available().)
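As a rough illustration of why a single guard is safer, here is a minimal sketch. It is not the actual `vector_algorithms.cpp` code: `CpuFeatures`, `use_sse2`, `use_sse42`, and `check_guards` are hypothetical names that only model the idea. An SSE2-only processor passes an SSE2 guard, so any SSE4.1 or SSE4.2 instruction placed behind that guard by mistake would fault on it. With only an SSE4.2 guard, the same processor takes the classic scalar path.

```cpp
#include <cassert>

// Hypothetical model of the CPU features relevant here; not an STL type.
struct CpuFeatures {
    bool sse2;
    bool sse42;
};

// Old-style guard (hypothetical stand-in): proves only that SSE2 is present.
constexpr bool use_sse2(CpuFeatures cpu) {
    return cpu.sse2;
}

// Proposed single guard (hypothetical stand-in): true only when SSE4.2 is
// present, so guarded code may use anything up to and including SSE4.2.
constexpr bool use_sse42(CpuFeatures cpu) {
    return cpu.sse42;
}

// Illustrative check: an SSE2-only processor passes the old guard, which is
// unsafe for a path containing post-SSE2 instructions, but fails the new
// guard and therefore falls back to the scalar algorithm.
inline void check_guards() {
    constexpr CpuFeatures sse2_only{true, false};
    constexpr CpuFeatures modern{true, true};
    assert(use_sse2(sse2_only));
    assert(!use_sse42(sse2_only));
    assert(use_sse42(modern));
}
```

The design point is that the reviewer no longer has to know which SSE generation each intrinsic belongs to: every SSE path sits behind the same check.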

We'll still need a distinction between "SSE4.2 is available" and "AVX2 is available", but I consider this to be much less dangerous, because AVX/AVX2 intrinsics and types are very distinctive.
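The remaining distinction could be sketched as a three-way selection. This is an assumption about the shape of the dispatch, not the STL's implementation: `Tier`, `select_tier`, and `check_tiers` are hypothetical names, and the feature flags are passed in rather than detected from the hardware.

```cpp
#include <cassert>

// Hypothetical feature flags; in real code these would come from a runtime
// CPU feature check rather than being supplied by the caller.
struct CpuFeatures {
    bool sse42;
    bool avx2;
};

// The only tiers left once the SSE2-only tier is removed.
enum class Tier { Scalar, Sse42, Avx2 };

// Pick the widest available tier. AVX2 is checked first; a processor
// without SSE4.2 gets the classic scalar algorithm.
inline Tier select_tier(CpuFeatures cpu) {
    if (cpu.avx2) {
        return Tier::Avx2;
    }
    if (cpu.sse42) {
        return Tier::Sse42;
    }
    return Tier::Scalar;
}

// Illustrative check of all three outcomes.
inline void check_tiers() {
    assert(select_tier(CpuFeatures{false, false}) == Tier::Scalar);
    assert(select_tier(CpuFeatures{true, false}) == Tier::Sse42);
    assert(select_tier(CpuFeatures{true, true}) == Tier::Avx2);
}
```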

Labels: enhancement, fixed