@Sergio0694 Sergio0694 commented Dec 15, 2020

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following the existing coding patterns and practices as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

This PR does a few things:

  • Speed optimizations to the 2-pass convolution processor (powering Gaussian blur, sharpen, etc.)
  • Speed optimizations to the bokeh blur
  • Some general codegen optimizations that should apply to all common pixel conversions, etc.

Benchmarks

Here's a preview of the current improvements for the gaussian blur benchmark:

image

And here are some more bokeh blur optimizations compared to master, after #1475 got merged:

image

@Sergio0694 Sergio0694 added this to the 1.1.0 milestone Dec 15, 2020
codecov bot commented Dec 15, 2020

Codecov Report

Merging #1477 (5601559) into master (a8cae3f) will decrease coverage by 0.07%.
The diff coverage is 78.37%.

@@            Coverage Diff             @@
##           master    #1477      +/-   ##
==========================================
- Coverage   83.55%   83.48%   -0.08%     
==========================================
  Files         741      740       -1     
  Lines       32462    32559      +97     
  Branches     3648     3652       +4     
==========================================
+ Hits        27125    27181      +56     
- Misses       4625     4665      +40     
- Partials      712      713       +1     
Flag Coverage Δ
unittests 83.48% <78.37%> (-0.08%) ⬇️

Impacted Files Coverage Δ
...s/Convolution/Convolution2PassProcessor{TPixel}.cs 60.86% <58.85%> (-39.14%) ⬇️
...mageSharp/ColorSpaces/Companding/SRgbCompanding.cs 100.00% <100.00%> (ø)
src/ImageSharp/Common/Helpers/Numerics.cs 97.80% <100.00%> (+0.15%) ⬆️
...rp/PixelFormats/Utils/Vector4Converters.Default.cs 100.00% <100.00%> (ø)
...ssing/Processors/Convolution/BokehBlurProcessor.cs 100.00% <100.00%> (ø)
...ocessors/Convolution/BokehBlurProcessor{TPixel}.cs 99.35% <100.00%> (+0.01%) ⬆️
...Processors/Convolution/BoxBlurProcessor{TPixel}.cs 100.00% <100.00%> (ø)
...cessors/Convolution/ConvolutionProcessorHelpers.cs 100.00% <100.00%> (ø)
...ssors/Convolution/GaussianBlurProcessor{TPixel}.cs 100.00% <100.00%> (ø)
...rs/Convolution/GaussianSharpenProcessor{TPixel}.cs 100.00% <100.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f1a0fb6...5601559.

@Sergio0694 Sergio0694 marked this pull request as ready for review December 15, 2020 22:59
/// <param name="channel">The channel value.</param>
/// <returns>The <see cref="float"/> representing the nonlinear channel value.</returns>
[MethodImpl(InliningOptions.ShortMethod)]
public static float Compress(float channel) => channel <= 0.0031308F ? 12.92F * channel : (1.055F * MathF.Pow(channel, 0.416666666666667F)) - 0.055F;
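
For reference, the expression in Compress above matches the standard piecewise sRGB transfer function, where the exponent 0.416666666666667 is 1/2.4. Written out (the symbols c and C are introduced here only for illustration, they are not names from the code):

```latex
C(c) =
\begin{cases}
  12.92\,c, & c \le 0.0031308 \\
  1.055\,c^{1/2.4} - 0.055, & c > 0.0031308
\end{cases}
\qquad\text{with}\qquad
c^{1/2.4} = \exp\!\left(\tfrac{1}{2.4}\,\ln c\right) \quad (c > 0)
```

The second identity is why a vectorized version would need per-lane log and exp, and why the linear segment still has to be handled separately.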

If we ever figure out how to do an accurate, SIMD-enabled approximation of this, we would be laughing.

pow(channel, 0.416666666666667F) => exp(0.416666666666667F * log(channel))

So both a log and an exp are needed per lane.

So...

public static void Compress(ref Vector4 vector)
{
    // Reinterpret the Vector4 as a 128-bit SIMD register (this copies the value).
    var channels = Unsafe.As<Vector4, Vector128<float>>(ref vector);
    var exponent = Vector128.Create(0.416666666666667F);

    // pow(x, e) == exp(e * log(x)) for x > 0.
    channels = Log(channels); // Isn't a SIMD intrinsic
    channels = Sse.Multiply(channels, exponent);
    channels = Exp(channels); // Isn't a SIMD intrinsic

    if (Fma.IsSupported)
    {
        // Fused (1.055 * channels) + (-0.055) in a single instruction.
        channels = Fma.MultiplyAdd(Vector128.Create(1.055F), channels, Vector128.Create(-0.055F));
    }
    else
    {
        channels = Sse.Add(Sse.Multiply(Vector128.Create(1.055F), channels), Vector128.Create(-0.055F));
    }

    // Note: the linear segment (channel <= 0.0031308F) of the scalar version is not handled here.
    Unsafe.As<Vector4, Vector128<float>>(ref vector) = channels;
}

But Log and Exp aren't SIMD intrinsics; however, you could approximate them with the sequences from sse_mathfun or avx_mathfun?
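
A quick scalar sanity check of the standard identity pow(x, e) = exp(e * log(x)) against the Compress expression under review may be useful before attempting a SIMD port. This is only a sketch: the class name SrgbCompressCheck and the method CompressViaExpLog are hypothetical names made up for illustration, and MathF.Log/MathF.Exp stand in for whatever vectorized approximations would eventually be used.

```csharp
using System;

public static class SrgbCompressCheck
{
    // Scalar reference: the same expression as the Compress method under review.
    public static float Compress(float channel)
        => channel <= 0.0031308F
            ? 12.92F * channel
            : (1.055F * MathF.Pow(channel, 0.416666666666667F)) - 0.055F;

    // Hypothetical variant: rewrites pow(x, e) as exp(e * log(x)).
    // A SIMD version would swap MathF.Log and MathF.Exp for per-lane approximations.
    public static float CompressViaExpLog(float channel)
        => channel <= 0.0031308F
            ? 12.92F * channel
            : (1.055F * MathF.Exp(0.416666666666667F * MathF.Log(channel))) - 0.055F;

    public static void Main()
    {
        // Sweep [0, 1] and track the largest disagreement between the two forms.
        float maxError = 0F;
        for (int i = 0; i <= 1000; i++)
        {
            float c = i / 1000F;
            float error = MathF.Abs(Compress(c) - CompressViaExpLog(c));
            maxError = MathF.Max(maxError, error);
        }

        Console.WriteLine($"max abs difference over [0, 1]: {maxError}");
    }
}
```

With the exact library functions the two forms should agree to within float rounding; the interesting measurement is how far maxError grows once Log and Exp are replaced by approximations.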

@JimBobSquarePants JimBobSquarePants left a comment

Very, very nice! 🚀

3 participants