author | Frank Barchard <fbarchard@google.com> | 2016-10-24 15:37:08 -0700 |
---|---|---|
committer | Frank Barchard <fbarchard@google.com> | 2016-10-24 15:37:08 -0700 |
commit | f5d5bd88d660232038fe06ed735fe95d2b9f9b61 (patch) | |
tree | 261e96410fe8dc820135178029b0e488b3f2bda6 /source/convert_from.cc | |
parent | 451af5e922e026c266d25abc92e7519acfc9a4c5 (diff) | |
download | libyuv-f5d5bd88d660232038fe06ed735fe95d2b9f9b61.tar.gz |
Add MSA optimized I422ToARGBRow_MSA and I422ToRGBARow_MSA functions
R=fbarchard@google.com
BUG=libyuv:634
Performance Gains :- (vs C vectorized)
I422ToARGBRow_MSA : ~1.6x
I422ToRGBARow_MSA : ~1.6x
I422ToARGBRow_Any_MSA : ~1.58x
I422ToRGBARow_Any_MSA : ~1.6x
Performance Gains :- (vs C non-vectorized)
I422ToARGBRow_MSA : ~7x
I422ToRGBARow_MSA : ~7x
I422ToARGBRow_Any_MSA : ~6.9x
I422ToRGBARow_Any_MSA : ~6.8x
Performance was measured with standalone tests that pass row data from a filled 1920x1080 buffer to both the C and MSA functions. Each test runs N iterations to get more accurate C vs MSA timings.
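The standalone tests themselves are not part of this change. As a rough illustration only, a timing harness of the kind described might look like the sketch below. The names `TimedRowFunc`, `CopyRow_Reference`, and `TimeRowFunc` are hypothetical, the row signature is simplified to one source and one destination plane, and the real I422 row functions take separate Y, U, and V pointers plus a `yuvconstants` argument.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical simplified row signature (not libyuv's real one).
using TimedRowFunc = void (*)(const uint8_t* src, uint8_t* dst, int width);

// Placeholder "row function" so the sketch is runnable; it just copies bytes.
static void CopyRow_Reference(const uint8_t* src, uint8_t* dst, int width) {
  for (int i = 0; i < width; ++i) {
    dst[i] = src[i];
  }
}

// Feeds every row of a filled width x height buffer to `row`, repeats that
// `iterations` times, and returns the elapsed wall-clock time in seconds.
double TimeRowFunc(TimedRowFunc row, int width, int height, int iterations) {
  const size_t size = static_cast<size_t>(width) * static_cast<size_t>(height);
  std::vector<uint8_t> src(size, 0x80);  // filled source buffer
  std::vector<uint8_t> dst(size, 0);

  const auto start = std::chrono::steady_clock::now();
  for (int n = 0; n < iterations; ++n) {
    for (int y = 0; y < height; ++y) {
      const size_t offset = static_cast<size_t>(y) * width;
      row(&src[offset], &dst[offset], width);
    }
  }
  const auto stop = std::chrono::steady_clock::now();

  // Read the output through a volatile so the work is not optimized away.
  volatile uint8_t sink = dst[size - 1];
  (void)sink;
  return std::chrono::duration<double>(stop - start).count();
}
```

Under these assumptions, a speedup figure such as the ones quoted above would be the ratio of two such timings, for example `TimeRowFunc(c_row, 1920, 1080, N) / TimeRowFunc(msa_row, 1920, 1080, N)`.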
Review URL: https://codereview.chromium.org/2430313005 .
Diffstat (limited to 'source/convert_from.cc')
-rw-r--r-- | source/convert_from.cc | 16 |
1 files changed, 16 insertions, 0 deletions
diff --git a/source/convert_from.cc b/source/convert_from.cc
index 89d24f47..7847622c 100644
--- a/source/convert_from.cc
+++ b/source/convert_from.cc
@@ -459,6 +459,14 @@ static int I420ToRGBAMatrix(const uint8* src_y, int src_stride_y,
     I422ToRGBARow = I422ToRGBARow_DSPR2;
   }
 #endif
+#if defined(HAS_I422TORGBAROW_MSA)
+  if (TestCpuFlag(kCpuHasMSA)) {
+    I422ToRGBARow = I422ToRGBARow_Any_MSA;
+    if (IS_ALIGNED(width, 8)) {
+      I422ToRGBARow = I422ToRGBARow_MSA;
+    }
+  }
+#endif

   for (y = 0; y < height; ++y) {
     I422ToRGBARow(src_y, src_u, src_v, dst_rgba, yuvconstants, width);
@@ -848,6 +856,14 @@ int I420ToRGB565Dither(const uint8* src_y, int src_stride_y,
     I422ToARGBRow = I422ToARGBRow_DSPR2;
   }
 #endif
+#if defined(HAS_I422TOARGBROW_MSA)
+  if (TestCpuFlag(kCpuHasMSA)) {
+    I422ToARGBRow = I422ToARGBRow_Any_MSA;
+    if (IS_ALIGNED(width, 8)) {
+      I422ToARGBRow = I422ToARGBRow_MSA;
+    }
+  }
+#endif
 #if defined(HAS_ARGBTORGB565DITHERROW_SSE2)
   if (TestCpuFlag(kCpuHasSSE2)) {
     ARGBToRGB565DitherRow = ARGBToRGB565DitherRow_Any_SSE2;
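
Both hunks add the same selection pattern: start from a default row function, switch to the `_Any_` variant when the CPU flag is set, and switch again to the full-width variant when `width` is a multiple of 8. The following is a minimal self-contained sketch of that pattern, not libyuv code. `Row_C`, `Row_SIMD`, `Row_Any_SIMD`, `ChooseRow`, and the `has_simd` parameter are hypothetical stand-ins, the kernels only copy bytes, and the way `Row_Any_SIMD` splits the row into a bulk part and a remainder is an assumption about how such wrappers typically behave.

```cpp
#include <cstdint>

// Tags returned by each kernel so a caller can see which path was selected.
enum RowKind { kRowC = 0, kRowAnySimd = 1, kRowSimd = 2 };

// Hypothetical simplified row signature (not libyuv's real one).
using RowFunc = int (*)(const uint8_t* src, uint8_t* dst, int width);

// Portable fallback: handles any width.
static int Row_C(const uint8_t* src, uint8_t* dst, int width) {
  for (int i = 0; i < width; ++i) dst[i] = src[i];
  return kRowC;
}

// Stand-in for a SIMD kernel that requires width to be a multiple of 8.
static int Row_SIMD(const uint8_t* src, uint8_t* dst, int width) {
  for (int i = 0; i < width; i += 8) {
    for (int j = 0; j < 8; ++j) dst[i + j] = src[i + j];
  }
  return kRowSimd;
}

// Stand-in for an "Any" wrapper: bulk through the SIMD kernel, tail in C.
static int Row_Any_SIMD(const uint8_t* src, uint8_t* dst, int width) {
  const int bulk = width & ~7;  // largest multiple of 8 not above width
  if (bulk > 0) Row_SIMD(src, dst, bulk);
  Row_C(src + bulk, dst + bulk, width - bulk);
  return kRowAnySimd;
}

// True when value is a multiple of alignment (alignment is a power of two).
inline bool IsAligned(int value, int alignment) {
  return (value & (alignment - 1)) == 0;
}

// Mirrors the selection order in the diff: default, then Any, then full SIMD.
RowFunc ChooseRow(bool has_simd, int width) {
  RowFunc row = Row_C;
  if (has_simd) {
    row = Row_Any_SIMD;
    if (IsAligned(width, 8)) {
      row = Row_SIMD;
    }
  }
  return row;
}
```

With these stand-ins, `ChooseRow(true, 16)` selects `Row_SIMD`, `ChooseRow(true, 13)` selects `Row_Any_SIMD`, and `ChooseRow(false, 16)` keeps `Row_C`. Selecting the function pointer once, outside the per-row loop, keeps the CPU and alignment checks out of the hot path.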