JIT: Investigate what to do about store forwarding stalls due to block copies #100769

@jakobbotsch

When struct copies are transformed into block copies, the result can be store forwarding stalls if the block copy reads padding that was never written. #96524 and #100750 (comment) are examples that show some of the potential cost of these stalls. #99835 (comment) has some discussion as well.

It's possible to generate these struct copies without accessing any padding, but that comes at the expense of larger code, which is probably slower if the source was also written as a block op, so it is not clear what the right trade-off is.
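To make the two copy shapes concrete, here is a minimal C sketch. `SpanLike` and the helper names are hypothetical stand-ins (not JIT code), loosely modelled on a 64-bit `Span<T>` with an 8-byte reference and a 4-byte length. A C compiler is free to lower either function differently, so this only illustrates which bytes each strategy touches, not the exact instructions emitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct with trailing padding: on typical 64-bit ABIs
   sizeof(SpanLike) is 16, but the fields only occupy 12 bytes. */
typedef struct {
    uint64_t reference;
    int32_t  length;
} SpanLike;

/* Bytes after `length` that belong to the struct but to no field. */
static size_t trailing_padding(void) {
    return sizeof(SpanLike) - (offsetof(SpanLike, length) + sizeof(int32_t));
}

/* Field-wise initialization: only the field bytes are stored,
   so any padding is left unwritten. */
static void init_fields(SpanLike *s, uint64_t reference, int32_t length) {
    assert(s != NULL);
    s->reference = reference;
    s->length = length;
}

/* Block copy: reads and writes every byte of the struct, padding
   included. A wide load here can span bytes no earlier store wrote. */
static void copy_block(SpanLike *dst, const SpanLike *src) {
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *src);
}

/* Field-wise copy: never reads or writes padding, at the cost of
   one load/store pair per field. */
static void copy_fields(SpanLike *dst, const SpanLike *src) {
    assert(dst != NULL && src != NULL);
    dst->reference = src->reference;
    dst->length = src->length;
}
```

Both copies produce the same field values; the difference is only whether the padding bytes take part in the loads, which is the case where the stall can show up.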

I am also not sure how good CPUs are at reconstructing the source of a load from several earlier stores. For example, if we wrote both Span<T>._reference and Span<T>._length as 8 bytes, would a 16-byte SIMD read still stall? If it doesn't, then perhaps we could alleviate some issues by cheaply extending some stores to cover padding as well.
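As a rough illustration of the "extend the store to cover padding" idea, the C sketch below writes a 16-byte span-like buffer in two ways. The layout constants and function names are hypothetical, and the code assumes a little-endian target (as on x64 and arm64) so that the low 4 bytes of the widened store land where the length lives. It only shows which bytes end up written; whether a following 16-byte load would then forward without stalling is exactly the open question above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: 8-byte reference at offset 0, 4-byte length at
   offset 8, 4 bytes of padding at offsets 12..15. */
enum { SPAN_BYTES = 16, LENGTH_OFFSET = 8 };

/* Narrow init: an 8-byte store plus a 4-byte store.
   Bytes 12..15 (the padding) are never written. */
static void init_narrow(unsigned char *buf, uint64_t reference, int32_t length) {
    assert(buf != NULL);
    memcpy(buf, &reference, sizeof reference);
    memcpy(buf + LENGTH_OFFSET, &length, sizeof length);
}

/* Widened init: two 8-byte stores that together cover all 16 bytes.
   The length is zero-extended, so the padding is written as zeros. */
static void init_widened(unsigned char *buf, uint64_t reference, int32_t length) {
    assert(buf != NULL);
    uint64_t wide = (uint32_t)length; /* upper 4 bytes become the padding */
    memcpy(buf, &reference, sizeof reference);
    memcpy(buf + LENGTH_OFFSET, &wide, sizeof wide);
}

/* Read the 4-byte length back from its usual offset. */
static int32_t read_length(const unsigned char *buf) {
    assert(buf != NULL);
    int32_t length;
    memcpy(&length, buf + LENGTH_OFFSET, sizeof length);
    return length;
}
```

The widened variant costs no extra stores, only a wider second one, which is why it might be a cheap mitigation if the hardware can forward from two adjacent 8-byte stores.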

cc @dotnet/jit-contrib @stephentoub

Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI)
