Apply hardware intrinsics to `BitArray.*Shift` #113299

kzrnm · 2025-03-09T05:23:59Z

Benchmark

BenchmarkDotNet v0.14.0, Windows 11 (10.0.26100.3194)
13th Gen Intel Core i5-13500, 1 CPU, 20 logical and 14 physical cores
.NET SDK 10.0.100-preview.3.25125.5
  [Host]     : .NET 10.0.0 (10.0.25.12411), X64 RyuJIT AVX2
  Job-UQJRRD : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX2
  Job-IYICYK : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX2

Method	Toolchain	N	Mean	Ratio	Allocated	Alloc Ratio
LeftShift	\main\corerun.exe	256	45.553 ns	1.00	-	NA
LeftShift	\pr\corerun.exe	256	10.037 ns	0.22	-	NA

RightShift	\main\corerun.exe	256	56.281 ns	1.00	-	NA
RightShift	\pr\corerun.exe	256	9.565 ns	0.17	-	NA

LeftShift	\main\corerun.exe	65536	10,493.384 ns	1.00	-	NA
LeftShift	\pr\corerun.exe	65536	1,699.862 ns	0.16	-	NA

RightShift	\main\corerun.exe	65536	11,729.467 ns	1.00	-	NA
RightShift	\pr\corerun.exe	65536	1,860.553 ns	0.16	-	NA

LeftShift	\main\corerun.exe	16777216	3,054,162.161 ns	1.00	-	NA
LeftShift	\pr\corerun.exe	16777216	1,051,349.065 ns	0.34	-	NA

RightShift	\main\corerun.exe	16777216	3,341,553.776 ns	1.00	-	NA
RightShift	\pr\corerun.exe	16777216	639,720.483 ns	0.19	-	NA
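
As a rough cross-check of the table, the speedup for each pair of rows is the main mean divided by the PR mean. The sketch below recomputes it from the means listed above; the class and member names are illustrative only and are not part of the PR.

```csharp
using System;
using System.Linq;

public static class SpeedupCheck
{
    // Mean times in nanoseconds, copied from the benchmark table (main vs. pr).
    public static readonly (string Name, int N, double Main, double Pr)[] Rows =
    {
        ("LeftShift",  256,      45.553,        10.037),
        ("RightShift", 256,      56.281,        9.565),
        ("LeftShift",  65536,    10_493.384,    1_699.862),
        ("RightShift", 65536,    11_729.467,    1_860.553),
        ("LeftShift",  16777216, 3_054_162.161, 1_051_349.065),
        ("RightShift", 16777216, 3_341_553.776, 639_720.483),
    };

    // Speedup is the reciprocal of the Ratio column.
    public static double Speedup(double main, double pr) => main / pr;

    public static double MinSpeedup() => Rows.Min(r => Speedup(r.Main, r.Pr));
    public static double MaxSpeedup() => Rows.Max(r => Speedup(r.Main, r.Pr));

    public static void Print()
    {
        foreach (var r in Rows)
        {
            Console.WriteLine($"{r.Name} N={r.N}: {Speedup(r.Main, r.Pr):F1}x");
        }
    }
}
```

With these inputs the speedup ranges from about 2.9x (LeftShift at N = 16777216) to about 6.3x (RightShift at N = 65536).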

Benchmark code

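using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using System.Collections;

[MemoryDiagnoser(false)]
[HideColumns("Job", "Error", "StdDev", "Median", "RatioSD")]
public class ShiftTest
{
    // N is the size of the backing buffer in bytes, so each BitArray holds 8 * N bits.
    [Params([
        1 << 8,
        1 << 16,
        1 << 24,
    ])]
    public int N;
    // 13 bits is one whole byte plus 5 bits.
    const int shiftSize = 13;
    BitArray b;

    [GlobalSetup]
    public void Setup()
    {
        // A fixed seed keeps the input identical for both toolchains.
        var bytes = new byte[N];
        new Random(227).NextBytes(bytes);
        b = new BitArray(bytes);
    }

    // LeftShift and RightShift modify b in place and return the same instance.
    [Benchmark] public BitArray LeftShift() => b.LeftShift(shiftSize);
    [Benchmark] public BitArray RightShift() => b.RightShift(shiftSize);
}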

eiriktsarpalis · 2025-07-01T12:11:40Z

Hi @kzrnm, apologies for the delay in reviewing this. Is there a chance you could rebase your changes on top of the latest main? I don't have write permissions in your PR branch to do it myself.

kzrnm · 2025-07-03T17:19:23Z

I updated the PR to match the latest implementation.
In commit e96a2b8, I changed the code path that shifts data as a 32-bit integer so that it applies a mask within the byte range, which aligns it with the behavior of the Apply method.
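
One way to read the masking described here: when four bytes are shifted together as a single `uint`, bits spill across byte boundaries, so a mask replicated into every byte lane makes the result behave like four independent byte shifts, with the carry supplied separately from the window that starts one byte earlier. The following is a minimal sketch of that idea and not the PR's code. `ShiftLanes` and its parameters are hypothetical names, and the little-endian layout is an assumption made so the example is deterministic.

```csharp
using System;
using System.Buffers.Binary;

public static class SwarShiftSketch
{
    // Hypothetical helper. `source` holds bytes j..j+3 and `previous` holds
    // bytes j-1..j+2, both read as little-endian uints. Returns the four
    // bytes j..j+3 after a left shift by shiftCount bits (1..7), including
    // the carry coming from the byte below each lane.
    public static uint ShiftLanes(uint source, uint previous, int shiftCount)
    {
        if (shiftCount is < 1 or > 7)
            throw new ArgumentOutOfRangeException(nameof(shiftCount));

        int carryCount = 8 - shiftCount;

        // Replicate a single-byte mask into all four lanes.
        uint shiftMask = 0x01010101u * (uint)((0xFF << shiftCount) & 0xFF);
        uint carryMask = 0x01010101u * (uint)(0xFF >> carryCount);

        // Keep only the bits that stay inside their own byte after the shift.
        uint lo = (source << shiftCount) & shiftMask;
        // Keep only the bits carried up from the neighbouring lower byte.
        uint hi = (previous >>> carryCount) & carryMask;
        return hi | lo;
    }

    // Convenience wrapper: shifts bytes[index..index+4), reading the carry
    // from bytes[index-1]. Requires index >= 1 and index + 4 <= bytes.Length.
    public static uint ShiftWindow(ReadOnlySpan<byte> bytes, int index, int shiftCount)
    {
        uint source = BinaryPrimitives.ReadUInt32LittleEndian(bytes.Slice(index, 4));
        uint previous = BinaryPrimitives.ReadUInt32LittleEndian(bytes.Slice(index - 1, 4));
        return ShiftLanes(source, previous, shiftCount);
    }
}
```

Because both masks are per-lane, each byte of the result equals `(byte)((b[k] << shiftCount) | (b[k - 1] >> carryCount))`, which is the same expression a byte-at-a-time loop would compute.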

Copilot

Pull Request Overview

This PR optimizes the BitArray.LeftShift and BitArray.RightShift methods by using hardware intrinsics (Vector512, Vector256, Vector128). The benchmark results show speedups of roughly 3x to 6x across the tested array sizes.

Key Changes:

Replaced int-based shifting with byte-based operations using SIMD vectors
Introduced vectorized processing for bulk shift operations with fallback to scalar code
Added comprehensive test coverage for various bit array sizes and shift amounts
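
To illustrate the second point, here is a minimal sketch of a byte-wise left shift that handles full vectors first and then falls back to scalar code for the remainder. It is not the PR's implementation: it handles only sub-byte shifts (1 to 7 bits), uses only `Vector128<byte>`, and relies on safe span-based loads and stores instead of `Unsafe`. The class and method names are assumptions chosen for the example.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class VectorShiftSketch
{
    // Shifts the bit sequence stored in `bytes` (bit i lives in byte i / 8,
    // bit i % 8) toward higher indices by shiftCount bits, in place.
    public static void ShiftLeftInPlace(Span<byte> bytes, int shiftCount)
    {
        if (shiftCount is < 1 or > 7)
            throw new ArgumentOutOfRangeException(nameof(shiftCount));

        int carryCount = 8 - shiftCount;
        int end = bytes.Length; // exclusive upper bound of unprocessed bytes

        if (Vector128.IsHardwareAccelerated)
        {
            int width = Vector128<byte>.Count;

            // Walk from the top down so every read happens before the bytes
            // are overwritten. Each window needs one extra byte below it.
            while (end >= width + 1)
            {
                end -= width;
                Vector128<byte> hi = Vector128.Create<byte>(bytes.Slice(end, width)) << shiftCount;
                Vector128<byte> lo = Vector128.Create<byte>(bytes.Slice(end - 1, width)) >>> carryCount;
                (hi | lo).CopyTo(bytes.Slice(end, width));
            }
        }

        // Scalar fallback for the remaining low bytes.
        for (int i = end - 1; i > 0; i--)
        {
            bytes[i] = (byte)((bytes[i] << shiftCount) | (bytes[i - 1] >> carryCount));
        }

        if (end > 0)
        {
            bytes[0] = (byte)(bytes[0] << shiftCount); // lowest byte has no carry
        }
    }
}
```

The vector shift operators on `Vector128<byte>` act on each element independently, which is why the carry has to come from a second load offset by one byte.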

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File	Description
BitArray.cs	Refactored `RightShift` and `LeftShift` to use byte-level operations with SIMD intrinsics (Vector512/256/128), replacing the previous int-based implementation
BitArray_OperatorsTests.cs	Enhanced test coverage with additional test cases covering various sizes (1023-1025) and shift amounts, plus validation of high-order bit clearing
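
For context on what such tests check, a shift can be validated against a simple bit-by-bit reference model. The sketch below is an illustration under that assumption and is not taken from the PR's test file; `ShiftReferenceCheck` and its members are hypothetical names.

```csharp
using System;
using System.Collections;

public static class ShiftReferenceCheck
{
    // Reference model: LeftShift moves bit i to index i + count.
    public static bool[] ReferenceLeftShift(bool[] bits, int count)
    {
        var result = new bool[bits.Length];
        for (int i = count; i < bits.Length; i++)
        {
            result[i] = bits[i - count];
        }
        return result;
    }

    // Reference model: RightShift moves bit i + count to index i.
    public static bool[] ReferenceRightShift(bool[] bits, int count)
    {
        var result = new bool[bits.Length];
        for (int i = 0; i + count < bits.Length; i++)
        {
            result[i] = bits[i + count];
        }
        return result;
    }

    // Compares BitArray.LeftShift/RightShift with the reference model
    // for pseudo-random input of the given length.
    public static bool Matches(int length, int count, int seed)
    {
        var rng = new Random(seed);
        var bits = new bool[length];
        for (int i = 0; i < length; i++)
        {
            bits[i] = rng.Next(2) == 1;
        }

        var left = new bool[length];
        new BitArray(bits).LeftShift(count).CopyTo(left, 0);
        var right = new bool[length];
        new BitArray(bits).RightShift(count).CopyTo(right, 0);

        return SameBits(left, ReferenceLeftShift(bits, count))
            && SameBits(right, ReferenceRightShift(bits, count));
    }

    private static bool SameBits(bool[] a, bool[] b)
    {
        if (a.Length != b.Length) return false;
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i]) return false;
        }
        return true;
    }
}
```

Lengths just below, at, and just above a vector or word boundary (such as 1023, 1024, and 1025) are useful inputs because they exercise the remainder handling.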

Copilot · 2025-10-24T16:08:37Z

src/libraries/System.Private.CoreLib/src/System/Collections/BitArray.cs

+
+                    while (fromIndex >= 5)
+                    {
+                        uint lo = (Unsafe.ReadUnaligned<uint>(ref Unsafe.AddByteOffset(ref p, (uint)(fromIndex -= 4))) << shiftCount) & shiftMask;


[nitpick] The decrement operation fromIndex -= 4 is embedded within the offset calculation, which reduces readability. Consider separating this into two statements: first decrement fromIndex, then perform the read operation.

Suggested change

-                        uint lo = (Unsafe.ReadUnaligned<uint>(ref Unsafe.AddByteOffset(ref p, (uint)(fromIndex -= 4))) << shiftCount) & shiftMask;
+                        fromIndex -= 4;
+                        uint lo = (Unsafe.ReadUnaligned<uint>(ref Unsafe.AddByteOffset(ref p, (uint)fromIndex)) << shiftCount) & shiftMask;

Copilot · 2025-10-24T16:08:37Z

src/libraries/System.Private.CoreLib/src/System/Collections/BitArray.cs

+                        uint lo = (Unsafe.ReadUnaligned<uint>(ref Unsafe.AddByteOffset(ref p, (uint)(fromIndex -= 4))) << shiftCount) & shiftMask;
+                        uint hi = (Unsafe.ReadUnaligned<uint>(ref Unsafe.AddByteOffset(ref p, (uint)(fromIndex - 1))) >>> carryCount) & carryMask;
+                        uint result = hi | lo;
+                        Unsafe.WriteUnaligned(ref Unsafe.AddByteOffset(ref p, toIndex -= 4), result);


[nitpick] The decrement operation toIndex -= 4 is embedded within the offset calculation, which reduces readability. Consider separating this into two statements: first decrement toIndex, then perform the write operation.

Suggested change

-                        Unsafe.WriteUnaligned(ref Unsafe.AddByteOffset(ref p, toIndex -= 4), result);
+                        toIndex -= 4;
+                        Unsafe.WriteUnaligned(ref Unsafe.AddByteOffset(ref p, toIndex), result);

Copilot · 2025-10-24T16:08:37Z

src/libraries/System.Private.CoreLib/src/System/Collections/BitArray.cs

+
+                while (fromIndex >= TVector.ElementCount + 1)
+                {
+                    TVector hi = TVector.LoadUnsafe(ref p, (nuint)(fromIndex -= TVector.ElementCount)) << shiftCount;


[nitpick] The decrement operation fromIndex -= TVector.ElementCount is embedded within the offset calculation, which reduces readability. Consider separating this into two statements: first decrement fromIndex, then perform the load operation.

Suggested change

-                    TVector hi = TVector.LoadUnsafe(ref p, (nuint)(fromIndex -= TVector.ElementCount)) << shiftCount;
+                    fromIndex -= TVector.ElementCount;
+                    TVector hi = TVector.LoadUnsafe(ref p, (nuint)fromIndex) << shiftCount;

Copilot · 2025-10-24T16:08:38Z

src/libraries/System.Private.CoreLib/src/System/Collections/BitArray.cs

+                    TVector hi = TVector.LoadUnsafe(ref p, (nuint)(fromIndex -= TVector.ElementCount)) << shiftCount;
+                    TVector lo = TVector.LoadUnsafe(ref p, (nuint)(fromIndex - 1)) >>> carryCount;
+                    TVector result = hi | lo;
+                    result.StoreUnsafe(ref p, (nuint)(toIndex -= TVector.ElementCount));


[nitpick] The decrement operation toIndex -= TVector.ElementCount is embedded within the offset calculation, which reduces readability. Consider separating this into two statements: first decrement toIndex, then perform the store operation.

Suggested change

-                    result.StoreUnsafe(ref p, (nuint)(toIndex -= TVector.ElementCount));
+                    toIndex -= TVector.ElementCount;
+                    result.StoreUnsafe(ref p, (nuint)toIndex);

Copilot · 2025-10-24T16:08:38Z

src/libraries/System.Collections/tests/BitArray/BitArray_OperatorsTests.cs

+            (int byteIndex, int bitOffeset) = Math.DivRem(length, BitsPerByte);
+            if (bitOffeset != 0)
+            {
+                Span<byte> bs = CollectionsMarshal.AsBytes(ba);
+                Assert.Equal(byteIndex + 1, bs.Length);
+                Assert.Equal(0, bs[byteIndex] >> bitOffeset);


Corrected spelling of 'bitOffeset' to 'bitOffset'.

Suggested change

-            (int byteIndex, int bitOffeset) = Math.DivRem(length, BitsPerByte);
-            if (bitOffeset != 0)
-            {
-                Span<byte> bs = CollectionsMarshal.AsBytes(ba);
-                Assert.Equal(byteIndex + 1, bs.Length);
-                Assert.Equal(0, bs[byteIndex] >> bitOffeset);
+            (int byteIndex, int bitOffset) = Math.DivRem(length, BitsPerByte);
+            if (bitOffset != 0)
+            {
+                Span<byte> bs = CollectionsMarshal.AsBytes(ba);
+                Assert.Equal(byteIndex + 1, bs.Length);
+                Assert.Equal(0, bs[byteIndex] >> bitOffset);

ghost added the area-System.Collections label Mar 9, 2025

dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Mar 9, 2025

eiriktsarpalis added the needs-author-action An issue or pull request that requires more info or actions from the author. label Jul 1, 2025

kzrnm added 2 commits July 3, 2025 02:19

Add test caces for BitArray.*Shift

e29dedd

hardware intrinsic in BitArray.*Shift

a01656a

kzrnm force-pushed the BitArrayShift branch from aae4e4d to a01656a Compare July 2, 2025 17:54

dotnet-policy-service bot removed the needs-author-action An issue or pull request that requires more info or actions from the author. label Jul 2, 2025

ClearHighExtraBits

a482701

eiriktsarpalis requested a review from tannergooding July 3, 2025 13:13

This was referenced Jul 3, 2025

WASM - Dev certificate error on Windows #116695

Closed

SendAsync_OperationCanceledException_RecordsActivitiesWithCorrectErrorInfo test failed with "Exception type was not an exact match" #117161

Closed

Update logic of Int32

e96a2b8

Merge branch 'main' into BitArrayShift

7681ebc

jeffhandley requested a review from PranavSenthilnathan September 1, 2025 21:18

jeffhandley assigned PranavSenthilnathan Sep 1, 2025

build-analysis bot mentioned this pull request Sep 2, 2025

AF: *(_UNCHECKED_OBJECTREF *)handle == NULL (HndCreateHandle called by getJitHandleForObject) #117138

Open

Merge branch 'main' into BitArrayShift

1c77a92

tannergooding requested a review from Copilot October 24, 2025 16:07

Copilot AI reviewed Oct 24, 2025

build-analysis bot mentioned this pull request Oct 24, 2025

Test failure: System.IO.Tests.FileInfo_SymbolicLinks.ResolveLinkTarget_Succeeds #120380

Open
