
Benchmarks

Dave edited this page Feb 7, 2024 · 3 revisions

Feb 2024 Update

  • Added the OpenMP extension to parallelize loops and enable SIMD instructions
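This page does not show the code behind the update, so the following is only a minimal sketch of how OpenMP is commonly used to parallelize a loop over a 2D matrix and request SIMD vectorization. The function name `scale_add_2d`, the row-major layout, and the operation itself are assumptions made for illustration, not the project's actual kernel.

```c
#include <stddef.h>

/* Hypothetical kernel (not from the project):
 * out[i][j] = a[i][j] * scale + b[i][j]
 * for row-major matrices with `rows` x `cols` elements. */
void scale_add_2d(const float *a, const float *b, float *out,
                  size_t rows, size_t cols, float scale)
{
    /* Distribute rows across threads. Without OpenMP enabled at
     * compile time the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for
    for (long i = 0; i < (long)rows; ++i) {
        const float *arow = a + (size_t)i * cols;
        const float *brow = b + (size_t)i * cols;
        float *orow = out + (size_t)i * cols;

        /* Ask the compiler to vectorize the inner loop. */
        #pragma omp simd
        for (size_t j = 0; j < cols; ++j) {
            orow[j] = arow[j] * scale + brow[j];
        }
    }
}
```

With GCC or Clang, the pragmas take effect when compiling with `-fopenmp`. For small matrices the cost of starting threads can outweigh the gain, which may be why the results below are split by matrix size.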

Comparison for 2D matrices of size $x < 256\times 256$:

image

Relative performance for 2D matrices of size $x > 256\times 256$:

image
