that run out of memory compiling unittests.
Update build files to include the new tests and source code.
Bug: libyuv:956
Change-Id: I6ec0beb6dc9570f0597d7df1835d616489dbaece
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5103585
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
HAS_SCALEARGBROWDOWNEVEN_RVV wasn't defined,
so we cannot use ScaleARGBRowDownEven_RVV & ScaleARGBRowDownEvenBox_RVV.
- Separate into two conditional statements when selecting DownEven or DownEvenBox.
- Also, add HAS_SCALEARGBROWDOWNEVEN_RVV and disable it by default.
Bug: libyuv:965
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Change-Id: Ic7ec40520b64131a456c6f3eea0639b3620f11ae
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4882441
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
|
|
- MSAN fails on most inline assembly, because it is unaware of what the load and store instructions do.
- MSAN also fails on row_any functions, which memcpy the correct number of pixels into a buffer that is SIMD vector sized, apply SIMD to the full vector, and then memcpy the exact number of resulting pixels to the output buffer. MSAN wants the temporary buffer to be initialized, which is generally done with a memset(buf, 0, sizeof(buf)); to satisfy MSAN.
- RVV may not require disabling MSAN, since its row functions all handle 'any' number of elements and the implementation uses intrinsics.
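The row_any pattern described above can be sketched in plain C. This is an illustration only: InvertRow_Vec16, InvertRow_Any and the 16-byte vector width are hypothetical stand-ins rather than libyuv functions, and the "SIMD" kernel is a scalar loop that merely insists on whole vectors.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { kVectorBytes = 16 }; /* assumed SIMD vector width, for illustration */

/* Stand-in for a SIMD row function: only accepts whole 16-byte vectors. */
static void InvertRow_Vec16(const uint8_t* src, uint8_t* dst, int width) {
  assert(width % kVectorBytes == 0);
  for (int i = 0; i < width; ++i) {
    dst[i] = (uint8_t)(255 - src[i]);
  }
}

/* "Any" wrapper: accepts any width by routing the tail through a temp buffer. */
static void InvertRow_Any(const uint8_t* src, uint8_t* dst, int width) {
  uint8_t temp[kVectorBytes * 2]; /* first half: input, second half: output */
  int remainder = width % kVectorBytes;
  int body = width - remainder;
  if (body > 0) {
    InvertRow_Vec16(src, dst, body);
  }
  if (remainder > 0) {
    /* Zero the buffer so the full-vector kernel never reads
       uninitialized bytes; this is the memset that satisfies MSAN. */
    memset(temp, 0, sizeof(temp));
    memcpy(temp, src + body, (size_t)remainder);
    InvertRow_Vec16(temp, temp + kVectorBytes, kVectorBytes);
    memcpy(dst + body, temp + kVectorBytes, (size_t)remainder);
  }
}
```

Calling InvertRow_Any with a width of 21 runs the kernel directly on the first 16 bytes and routes the last 5 through the zeroed temporary buffer.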
Bug: b/297979878
Change-Id: Ic21200689c0c7d2c85bb1de3eef38570137d3d8b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4832740
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
|
|
Bug: libyuv:965
Change-Id: I9b02abd13ab3345288655fa7a16383f59cf66bb8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4750230
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
|
|
* Run on SiFive internal FPGA:
Test Case Speedup
I420ScaleDownBy3by8_None 4.2
I420ScaleDownBy3by8_Linear 1.7
I420ScaleDownBy3by8_Bilinear 1.7
I420ScaleDownBy3by8_Box 1.7
I444ScaleDownBy3by8_None 4.2
I444ScaleDownBy3by8_Linear 1.8
I444ScaleDownBy3by8_Bilinear 1.8
I444ScaleDownBy3by8_Box 1.8
Change-Id: Ic2e98de2494d9e7b25f5db115a7f21c618eaefed
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4711857
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
|
|
The ScaleUVRowUp2_(Bi)linear_RVV function is equivalent to other platforms' ScaleRowUp2_(Bi)linear_Any_XXX.
We process the entire row in this function.
Other platforms only implement the non-edge part of the image with SIMD and process the edge with scalar code.
ScaleRowUp2_(Bi)linear_Any_XXX: Combine ScaleRowUp2_(Bi)linear_XXX (non-edge) + ScaleRowUp2_(Bi)linear_C (edge) by SBUH2LANY/SU2BLANY.
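A rough sketch of the split described above, in plain C. The function names, the block size of 8 pairs and the scalar stand-in for the SIMD kernel are assumptions made for illustration; the 3:1 weighting with rounding is the usual 2x linear upsampling filter and is not copied from the libyuv macros.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar kernel: each adjacent source pair produces two output pixels. */
static void Up2Linear_C(const uint8_t* src, uint8_t* dst, int pairs) {
  for (int x = 0; x < pairs; ++x) {
    dst[2 * x + 0] = (uint8_t)((3 * src[x] + src[x + 1] + 2) >> 2);
    dst[2 * x + 1] = (uint8_t)((src[x] + 3 * src[x + 1] + 2) >> 2);
  }
}

/* Stand-in for a SIMD kernel that only handles multiples of 8 pairs. */
static void Up2Linear_Vec(const uint8_t* src, uint8_t* dst, int pairs) {
  assert(pairs % 8 == 0);
  Up2Linear_C(src, dst, pairs);
}

/* Whole row, src_width -> 2 * src_width: edge pixels are copied,
   the bulk goes to the vector kernel, the leftover pairs to scalar code. */
static void Up2Linear_AnySketch(const uint8_t* src, uint8_t* dst,
                                int src_width) {
  assert(src_width >= 1);
  int pairs = src_width - 1;
  int n = pairs & ~7; /* pairs handled by the vector kernel */
  dst[0] = src[0];
  if (n > 0) {
    Up2Linear_Vec(src, dst + 1, n);
  }
  if (pairs - n > 0) {
    Up2Linear_C(src + n, dst + 1 + 2 * n, pairs - n);
  }
  dst[2 * src_width - 1] = src[src_width - 1];
}
```

An implementation that processes the entire row in one function, as this commit does for RVV, needs no such wrapper.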
* Run on SiFive internal FPGA:
Test case RVV function Speedup
I444ScaleFrom640x360_Bilinear ScaleRowUp2_Bilinear_RVV 8.21
I444ScaleFrom640x360_Linear ScaleRowUp2_Linear_RVV 8.08
UVScaleFrom640x360_Bilinear ScaleUVRowUp2_Bilinear_RVV 7.80
UVScaleFrom640x360_Linear ScaleUVRowUp2_Linear_RVV 7.03
Change-Id: I539245ce51858f077506a78f0e7e82377ac6a95d
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4666062
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
Run on SiFive internal FPGA:
Test case RVV function Speedup
I444ScaleDownBy3by4_None ScaleRowDown34_RVV 5.8
I444ScaleDownBy3by4_Linear ScaleRowDown34_0/1_Box_RVV 6.5
I444ScaleDownBy3by4_Bilinear ScaleRowDown34_0/1_Box_RVV 6.3
Bug: libyuv:956
Change-Id: I8ef221ab14d631e14f1ba1aaa25d2b30d4e710db
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4607777
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
Run on SiFive internal FPGA:
Test case RVV function Speedup
I444ScaleDownBy3_Box ScaleAddRow_RVV+ScaleAddCols(scalar) 2.8
ARGBScaleDownBy2_None ScaleARGBRowDown2_RVV 2.2
ARGBScaleDownBy2_Linear ScaleARGBRowDown2Linear_RVV 5.0
ARGBScaleDownBy2_Box ScaleARGBRowDown2Box_RVV 4.3
ARGBScaleDownBy4_None ScaleARGBRowDownEven_RVV 1.2
ARGBScaleDownBy8_Box ScaleARGBRowDownEvenBox_RVV 3.2
ARGBScaleDownBy4_Box ScaleARGBRowDown2Box_RVV 4.5
I444ScaleDownBy2_None ScaleRowDown2_RVV 5.8
I444ScaleDownBy2_Linear ScaleRowDown2Linear_RVV 6.1
I444ScaleDownBy2_Box ScaleRowDown2Box_RVV 5.0
I444ScaleDownBy4_None ScaleRowDown4_RVV 3.6
I444ScaleDownBy4_Box ScaleRowDown4Box_RVV 3.5
UVScaleDownBy2_None ScaleUVRowDown2_RVV 5.8
UVScaleDownBy2_Linear ScaleUVRowDown2Linear_RVV 5.6
UVScaleDownBy2_Box ScaleUVRowDown2Box_RVV 4.1
UVScaleDownBy4_None ScaleUVRowDown4_RVV 1.7
UVScaleDownBy4_Box ScaleUVRowDown2Box_RVV 4.5
avg-speedup: 4
Note: Specialize ScaleUVRowDown with step_size=4 by ScaleUVRowDown4_RVV.
Bug: libyuv:956
Change-Id: If9604a6aadf681193f282507602c57c726332202
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4601684
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
- update cpu_id to use "re" for fopen to avoid leaking handles if a thread is started while the file is open.
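As a minimal sketch of what the "e" mode character does: in glibc (and several other C libraries) it asks fopen to open the descriptor with O_CLOEXEC, so the descriptor is not inherited across exec. The helper name below is hypothetical, and the check assumes a POSIX system where fileno and fcntl are available.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>

/* Opens path with mode "re" and reports whether the descriptor carries
   FD_CLOEXEC. Returns 1 if set, 0 if not set, -1 if the file could not
   be opened or queried. */
static int OpenedWithCloexec(const char* path) {
  assert(path != NULL);
  FILE* f = fopen(path, "re");
  if (!f) {
    return -1;
  }
  int flags = fcntl(fileno(f), F_GETFD);
  fclose(f);
  if (flags < 0) {
    return -1;
  }
  return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

On Linux, OpenedWithCloexec("/proc/cpuinfo") is expected to report that the flag is set.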
Bug: libyuv:958
Change-Id: I1af9de68fce12e440e1226fc8070634ccb1bf090
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4417176
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
Bug: libyuv:950
Change-Id: Ic9a094463af875aefd927023f730b5f35f8551de
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4154630
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
|
|
Bug: libyuv:950
Change-Id: I5a77bca9a0230fe00abd810939e217833a14683f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4134524
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
- fix crash when width is not a multiple of 16
- apply clang format
- bump version
Bug: libyuv:940, b/240094327
Change-Id: Ic18e5b7b64f78f26e8b7d8440bf490a679bda200
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3812594
Reviewed-by: Wan-Teh Chang <wtc@google.com>
|
|
- Define HAS_SCALEROWUP2_BILINEAR_16_SSE2: it's now fixed.
- Correct function name to ScaleRowUp2_Bilinear_16_Any_SSE2:
this row function uses only SSE2 instructions.
Bug: libyuv:882
Change-Id: Ib1c7ac5b09997cb5b32bc54109d8c566af762433
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3800842
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
|
|
- Undefine HAS_SCALEROWUP2_BILINEAR_16_SSE2
- Save XMM7 in ScaleRowUp2_Bilinear_16_SSE2().
- Rename HAS_SCALEROWUP2LINEAR_xxx to HAS_SCALEROWUP2_LINEAR_xxx
- DetileSplitUVRow_C() is implemented using SplitUVRow_C().
- Changes to unit_test/planar_test.cc.
Bug: libyuv:882
Change-Id: I0a8e8e5fb43bdf58ded87244e802343eacb789f2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3795063
Reviewed-by: Wan-Teh Chang <wtc@google.com>
|
|
Bug: libyuv:931, b/228605787, b/233233302, b/233634772, b/234558395, b/234340482
Change-Id: Ib135d0b4ff17665f6a4ab60edb782a7b314219a4
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3696042
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
|
|
This reverts commit 60254a1d846a93a4d7559009004cdd91bcc04d82.
Reason for revert: breaks PaintCanvasVideoRendererTest.HighBitDepth
Original change's description:
> I210ToI420, InterpolatePlane_16, and ScalePlane Vertical-only asan fix
>
> - Add I210ToI420 to convert 10 bit 4:2:2 YUV to 4:2:0 8 bit
> - Add NEON InterpolateRow_16 for fast 10 bit scaling
> - When scaling up, set step to interpolate toward height - 1 to avoid buffer overread
> - When scaling down, center the 2 rows used for source to achieve filtering.
> - CopyPlane check for 0 size and return
>
> Bug: libyuv:931, b/228605787, b/233233302, b/233634772, b/234558395, b/234340482
> Change-Id: I63e8580710a57812b683c2fe40583ac5a179c4f1
> Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3687552
> Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
> Reviewed-by: richard winterton <rrwinterton@gmail.com>
Bug: libyuv:931, b/228605787, b/233233302, b/233634772, b/234558395, b/234340482
Change-Id: Icc05bb340db0e7fe864061fb501d0a861c764116
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3692886
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Mirko Bonadei <mbonadei@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
|
|
- Add I210ToI420 to convert 10 bit 4:2:2 YUV to 4:2:0 8 bit
- Add NEON InterpolateRow_16 for fast 10 bit scaling
- When scaling up, set step to interpolate toward height - 1 to avoid buffer overread
- When scaling down, center the 2 rows used for source to achieve filtering.
- CopyPlane check for 0 size and return
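The "interpolate toward height - 1" idea can be sketched as a fixed-point step computation. This is an assumption-laden illustration, not the libyuv code: the helper names are hypothetical, 16.16 fixed point is assumed, and a real implementation would also have to keep the second row of each interpolation pair in range.

```c
#include <assert.h>
#include <stdint.h>

/* 16.16 fixed-point vertical step chosen so that the last destination row
   maps to source row (src_height - 1) or just before it, never past it.
   Intended for scaling up, where the step is at most 1.0 (65536). */
static int VerticalStepUp16(int src_height, int dst_height) {
  assert(src_height >= 1 && dst_height >= 1);
  if (dst_height == 1) {
    return 0;
  }
  return (int)(((int64_t)(src_height - 1) << 16) / (dst_height - 1));
}

/* Integer source row used for a given destination row. */
static int SourceRow(int dst_y, int step) {
  return (int)(((int64_t)dst_y * step) >> 16);
}

/* Early-out test in the spirit of "check for 0 size and return". */
static int PlaneIsEmpty(int width, int height) {
  return width <= 0 || height == 0;
}
```

With src_height = 4 and dst_height = 8, every destination row maps to a source row no greater than 3.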
Bug: libyuv:931, b/228605787, b/233233302, b/233634772, b/234558395, b/234340482
Change-Id: I63e8580710a57812b683c2fe40583ac5a179c4f1
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3687552
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
|
|
Bug: libyuv:928
xed -i scale_gcc.o:
SYM ScaleUVRowUp2_Linear_16_SSE2:
XDIS 0: LOGICAL SSE2 660FEFED pxor xmm5, xmm5
XDIS 4: SSE SSE2 660F76E4 pcmpeqd xmm4, xmm4
XDIS 8: SSE SSE2 660F72D41F psrld xmm4, 0x1f
XDIS d: SSE SSE2 660F72F401 pslld xmm4, 0x1
XDIS 12: DATAXFER SSE2 F30F7E07 movq xmm0, qword ptr [rdi]
XDIS 16: DATAXFER SSE2 F30F7E4F04 movq xmm1, qword ptr [rdi+0x4]
XDIS 1b: SSE SSE2 660F61C5 punpcklwd xmm0, xmm5
XDIS 1f: SSE SSE2 660F61CD punpcklwd xmm1, xmm5
XDIS 23: DATAXFER SSE2 660F6FD0 movdqa xmm2, xmm0
XDIS 27: DATAXFER SSE2 660F6FD9 movdqa xmm3, xmm1
XDIS 2b: SSE SSE2 660F70D24E pshufd xmm2, xmm2, 0x4e
XDIS 30: SSE SSE2 660F70DB4E pshufd xmm3, xmm3, 0x4e
XDIS 35: SSE SSE2 660FFED4 paddd xmm2, xmm4
XDIS 39: SSE SSE2 660FFEDC paddd xmm3, xmm4
XDIS 3d: SSE SSE2 660FFED0 paddd xmm2, xmm0
XDIS 41: SSE SSE2 660FFED9 paddd xmm3, xmm1
XDIS 45: SSE SSE2 660FFEC0 paddd xmm0, xmm0
XDIS 49: SSE SSE2 660FFEC9 paddd xmm1, xmm1
XDIS 4d: SSE SSE2 660FFEC2 paddd xmm0, xmm2
XDIS 51: SSE SSE2 660FFECB paddd xmm1, xmm3
XDIS 55: SSE SSE2 660F72D002 psrld xmm0, 0x2
XDIS 5a: SSE SSE2 660F72D102 psrld xmm1, 0x2
XDIS 5f: SSE SSE4 660F382BC1 packusdw xmm0, xmm1
XDIS 64: DATAXFER SSE2 F30F7F06 movdqu xmmword ptr [rsi], xmm0
XDIS 68: MISC BASE 488D7F08 lea rdi, ptr [rdi+0x8]
XDIS 6c: MISC BASE 488D7610 lea rsi, ptr [rsi+0x10]
XDIS 70: BINARY BASE 83EA04 sub edx, 0x4
XDIS 73: COND_BR BASE 7F9D jnle 0x12 <ScaleUVRowUp2_Linear_16_SSE2+0x12>
XDIS 75: RET BASE C3 ret
SYM ScaleUVRowUp2_Bilinear_16_SSE2:
XDIS 0: LOGICAL SSE2 660FEFFF pxor xmm7, xmm7
XDIS 4: SSE SSE2 660F76F6 pcmpeqd xmm6, xmm6
XDIS 8: SSE SSE2 660F72D61F psrld xmm6, 0x1f
XDIS d: SSE SSE2 660F72F603 pslld xmm6, 0x3
XDIS 12: DATAXFER SSE2 F30F7E07 movq xmm0, qword ptr [rdi]
XDIS 16: DATAXFER SSE2 F30F7E4F04 movq xmm1, qword ptr [rdi+0x4]
XDIS 1b: SSE SSE2 660F61C7 punpcklwd xmm0, xmm7
XDIS 1f: SSE SSE2 660F61CF punpcklwd xmm1, xmm7
XDIS 23: DATAXFER SSE2 660F6FD0 movdqa xmm2, xmm0
XDIS 27: DATAXFER SSE2 660F6FD9 movdqa xmm3, xmm1
XDIS 2b: SSE SSE2 660F70D24E pshufd xmm2, xmm2, 0x4e
XDIS 30: SSE SSE2 660F70DB4E pshufd xmm3, xmm3, 0x4e
XDIS 35: SSE SSE2 660FFED0 paddd xmm2, xmm0
XDIS 39: SSE SSE2 660FFED9 paddd xmm3, xmm1
XDIS 3d: SSE SSE2 660FFEC0 paddd xmm0, xmm0
XDIS 41: SSE SSE2 660FFEC9 paddd xmm1, xmm1
XDIS 45: SSE SSE2 660FFEC2 paddd xmm0, xmm2
XDIS 49: SSE SSE2 660FFECB paddd xmm1, xmm3
XDIS 4d: DATAXFER SSE2 F30F7E1477 movq xmm2, qword ptr [rdi+rsi*2]
XDIS 52: DATAXFER SSE2 F30F7E5C7704 movq xmm3, qword ptr [rdi+rsi*2+0x4]
XDIS 58: SSE SSE2 660F61D7 punpcklwd xmm2, xmm7
XDIS 5c: SSE SSE2 660F61DF punpcklwd xmm3, xmm7
XDIS 60: DATAXFER SSE2 660F6FE2 movdqa xmm4, xmm2
XDIS 64: DATAXFER SSE2 660F6FEB movdqa xmm5, xmm3
XDIS 68: SSE SSE2 660F70E44E pshufd xmm4, xmm4, 0x4e
XDIS 6d: SSE SSE2 660F70ED4E pshufd xmm5, xmm5, 0x4e
XDIS 72: SSE SSE2 660FFEE2 paddd xmm4, xmm2
XDIS 76: SSE SSE2 660FFEEB paddd xmm5, xmm3
XDIS 7a: SSE SSE2 660FFED2 paddd xmm2, xmm2
XDIS 7e: SSE SSE2 660FFEDB paddd xmm3, xmm3
XDIS 82: SSE SSE2 660FFED4 paddd xmm2, xmm4
XDIS 86: SSE SSE2 660FFEDD paddd xmm3, xmm5
XDIS 8a: DATAXFER SSE2 660F6FE0 movdqa xmm4, xmm0
XDIS 8e: DATAXFER SSE2 660F6FEA movdqa xmm5, xmm2
XDIS 92: SSE SSE2 660FFEE0 paddd xmm4, xmm0
XDIS 96: SSE SSE2 660FFEEE paddd xmm5, xmm6
XDIS 9a: SSE SSE2 660FFEE0 paddd xmm4, xmm0
XDIS 9e: SSE SSE2 660FFEE5 paddd xmm4, xmm5
XDIS a2: SSE SSE2 660F72D404 psrld xmm4, 0x4
XDIS a7: DATAXFER SSE2 660F6FEA movdqa xmm5, xmm2
XDIS ab: SSE SSE2 660FFEEA paddd xmm5, xmm2
XDIS af: SSE SSE2 660FFEC6 paddd xmm0, xmm6
XDIS b3: SSE SSE2 660FFEEA paddd xmm5, xmm2
XDIS b7: SSE SSE2 660FFEE8 paddd xmm5, xmm0
XDIS bb: SSE SSE2 660F72D504 psrld xmm5, 0x4
XDIS c0: DATAXFER SSE2 660F6FC1 movdqa xmm0, xmm1
XDIS c4: DATAXFER SSE2 660F6FD3 movdqa xmm2, xmm3
XDIS c8: SSE SSE2 660FFEC1 paddd xmm0, xmm1
XDIS cc: SSE SSE2 660FFED6 paddd xmm2, xmm6
XDIS d0: SSE SSE2 660FFEC1 paddd xmm0, xmm1
XDIS d4: SSE SSE2 660FFEC2 paddd xmm0, xmm2
XDIS d8: SSE SSE2 660F72D004 psrld xmm0, 0x4
XDIS dd: DATAXFER SSE2 660F6FD3 movdqa xmm2, xmm3
XDIS e1: SSE SSE2 660FFED3 paddd xmm2, xmm3
XDIS e5: SSE SSE2 660FFECE paddd xmm1, xmm6
XDIS e9: SSE SSE2 660FFED3 paddd xmm2, xmm3
XDIS ed: SSE SSE2 660FFED1 paddd xmm2, xmm1
XDIS f1: SSE SSE2 660F72D204 psrld xmm2, 0x4
XDIS f6: SSE SSE4 660F382BE0 packusdw xmm4, xmm0
XDIS fb: DATAXFER SSE2 F30F7F22 movdqu xmmword ptr [rdx], xmm4
XDIS ff: SSE SSE4 660F382BEA packusdw xmm5, xmm2
XDIS 104: DATAXFER SSE2 F30F7F2C4A movdqu xmmword ptr [rdx+rcx*2], xmm5
XDIS 109: MISC BASE 488D7F08 lea rdi, ptr [rdi+0x8]
XDIS 10d: MISC BASE 488D5210 lea rdx, ptr [rdx+0x10]
XDIS 111: BINARY BASE 4183E804 sub r8d, 0x4
XDIS 115: COND_BR BASE 0F8FF7FEFFFF jnle 0x12 <ScaleUVRowUp2_Bilinear_16_SSE2+0x12>
XDIS 11b: RET BASE C3 ret
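Read as arithmetic, the two listings above compute a 3:1 weighted average per output sample: the linear loop adds a rounding constant of 2 and shifts right by 2, and the bilinear loop combines two such unrounded row sums 3:1, adds 8 and shifts right by 4. A scalar sketch of that arithmetic follows; the helper names and argument order are hypothetical and describe one output sample only.

```c
#include <assert.h>
#include <stdint.h>

/* One linear output sample: (3 * a + b + 2) >> 2,
   where a is the nearer source sample and b the farther one. */
static uint16_t Up2Linear16(uint16_t a, uint16_t b) {
  return (uint16_t)((3u * a + b + 2u) >> 2);
}

/* One bilinear output sample from a 2x2 neighborhood:
   a = near row, near column    b = near row, far column
   c = far row, near column     d = far row, far column
   Result is (9a + 3b + 3c + d + 8) >> 4. */
static uint16_t Up2Bilinear16(uint16_t a, uint16_t b, uint16_t c, uint16_t d) {
  uint32_t near_row = 3u * a + b; /* horizontal pass, no rounding yet */
  uint32_t far_row = 3u * c + d;
  assert(near_row <= 4u * 65535u && far_row <= 4u * 65535u);
  return (uint16_t)((3u * near_row + far_row + 8u) >> 4);
}
```

A flat area stays flat under both filters, since the weights sum to 4 and 16 respectively.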
Change-Id: Ia20860e9c3c45368822cfd8877167ff0bf973dcc
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3587602
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
Bug: libyuv:915, b/215425056
Change-Id: Iccab1ed3f6d385f02895d44faa94d198ad79d693
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3424820
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
Bug: libyuv:916
Change-Id: I345b7e271ceb4b32fe91e292915e66be40812810
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3415817
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
Optimize 20 functions in source/scale_lsx.cc file.
All test cases passed on loongarch platform.
Bug: libyuv:913
Change-Id: I85bcb3b0bfd9461bb6f93202546507352cbd624a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3351469
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
|
|
- reenable Intel SIMD unaffected by BIT_EXACT
- add bit exact version of ARGBAttenuate, which uses ARM version of formula.
- add bit exact version of ARGBUnattenuate, which mimics the AVX code.
Apply clang format to clean up code.
Bug: libyuv:908, b/202888439
Change-Id: Ie842b1b3956b48f4190858e61c02998caedc2897
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3224702
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
|
|
- C code uses ARM path, so NEON and C match
- C used on Intel platforms, disabling AVX.
Bug: libyuv:908, b/202888439
Change-Id: Ie035a150a60d3cf4ee7c849a96819d43640cf020
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3223507
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
|
|
Add tests of all macros used by libyuv public headers
When a 1 step conversion is added, a 2 step test can compare
the old 2 step method to the 1 step. A 1 step unittest is
also added which compares C to SIMD. Making the 2 step
conversions measure performance of the 2 steps allows the
old 2 step performance to be compared to 1 step.
All macros used in public headers are added to an ifdef test.
Showing them in a unittest allows some diagnostics when
a test is failing.
Bug: libyuv:901
Change-Id: I7ffa6ed0cb3b506fa1b7fd4b7b1b729658c3c266
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2857916
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
Bug: libyuv:900, libyuv:848, b/178283356, b/185922513
Change-Id: I7697953753391c555a778198db36412c853fb29e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2844962
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
|
|
Bug: libyuv:900, libyuv:848, b/178283356, b/185922513
Change-Id: Iee7d9970c7991856c8f51158cd12ec72ee9c57eb
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2844779
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
|
|
Bug: libyuv:843
Change-Id: I0104c8fcaeed09e83d2fd654c6a5e7d41bcb74cf
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2727775
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
|
|
R=fbarchard@chromium.org
Change-Id: I4a869aefdc16e34357a615727711594c5d8e3a80
Bug: libyuv:882
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2719842
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
|
|
miscellaneous cleanup of other code/comments
Bug: libyuv:873, libyuv:877
Change-Id: I0d8caf9a65908ff8898b25494f7c724775f84fa3
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2692930
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
|
|
These are 16 bit bi-planar convert functions to scale UV plane to
Y plane's size using (bi)linear filter.
libyuv_unittest --gtest_filter=*ToP41*
R=fbarchard@chromium.org
Bug: libyuv:872
Change-Id: I3cb4fafe2b2c9eedd0d91cf4c619abb9ee107bc1
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2690102
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
|
|
These are bi-planar convert functions to scale UV plane to Y plane's size using (bi)linear filter.
libyuv_unittest --gtest_filter=*ToNV24*
R=fbarchard@chromium.org
Change-Id: I3d98f833feeef00af3c903ac9ad0e41bdcbcb51f
Bug: libyuv:872
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2682152
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
|
|
These functions use (bi)linear filter, to scale U and V planes to the size of Y plane.
This will help enhance the quality of YUV to RGB conversion.
Also added 10bit and 12bit version:
I010ToI410
I210ToI410
I012ToI412
I212ToI412
libyuv_unittest --gtest_filter=LibYUVConvertTest.I42*ToI444*:LibYUVConvertTest.I*1*ToI41*
R=fbarchard@chromium.org
Change-Id: Ie4a711a5ba28f2ff1f44c021f7a5c149022264c5
Bug: libyuv:872
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2658097
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
|
|
Bug: b/171884264
Change-Id: I6a94bde0aa05e681bb4590ea8beec33a61ddbfc9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2518361
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
Intel SkylakeX
UVTest3x (1925 ms)
UVTest4x (2915 ms)
PlaneTest3x (2040 ms)
PlaneTest4x (4292 ms)
ARGBTest3x (2079 ms)
ARGBTest4x (1854 ms)
Pixel 2
ARGBTest3x (3602 ms)
ARGBTest4x (4064 ms)
PlaneTest3x (3331 ms)
PlaneTest4x (8977 ms)
UVTest3x (3473 ms)
UVTest4x (6970 ms)
Bug: b/171798872, b/171884264
Change-Id: Iebc70fed907857b6cb71a9baf2aba9861ef1e3f7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2505601
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
Intel SkylakeX
Was SSSE3 UVScaleDownBy4_Box (2496 ms)
Now AVX2 UVScaleDownBy4_Box (1983 ms)
Was SSSE3 UVScaleDownBy2_Box (380 ms)
Now AVX2 UVScaleDownBy2_Box (360 ms)
Pixel 4 aarch32
Was UVScaleDownBy4_Box (4295 ms)
Now UVScaleDownBy4_Box (3307 ms)
Was UVScaleDownBy2_Box (1022 ms)
Now UVScaleDownBy2_Box (778 ms)
Bug: libyuv:838
Change-Id: Ic823fa15e5761c1b9a897da27341adbf1ed39883
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2470196
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
Bug: libyuv:838
Change-Id: Id9fb3282a3e86143d76b5e0cb557f0523a88b3c8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2465578
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
Bug: libyuv:718, libyuv:838, b/168918847
Change-Id: I3300c1e7d51407b9c3201cf52b68e2e11346ff5f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2427868
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
|
|
Improves playback performance for 1080p video on www.youku.com
BUG=libyuv:841
Change-Id: Iabe7693fba276162af0290863f46e214ab86fb6c
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1790959
Reviewed-by: Miguel Casas <mcasas@chromium.org>
|
|
Bug: libyuv:821
Change-Id: I4a6b9bee2c2fae199c73c9ec7ecb32bde37c1852
Tested: out/Release/libyuv_unittest --gtest_filter=*ScaleFrom1920x1080_Box --libyuv_width=160 --libyuv_height=90 --libyuv_repeat=1000
Reviewed-on: https://chromium-review.googlesource.com/c/1298598
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
|
|
When loading or storing data, unaligned addresses greatly degrade
the performance of the optimized code, so unaligned access instructions
are required on the loongson platform.
Also delete the optimized function ScaleARGBFilterCols_MMI,
because it degraded performance.
BUG=libyuv:804
R=fbarchard@chromium.org
Change-Id: If4c15886a21cdcbac7ae8b336292e4549acf1e47
Reviewed-on: https://chromium-review.googlesource.com/1164627
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
This was changed in 21be9122aadf7824efe3fc19b2a09ff253a688e1.
Change-Id: I6c04dc92f673557e10c231bd090ec8aa88b6bee4
Reviewed-on: https://chromium-review.googlesource.com/1146183
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
Currently, libyuv supports MIPS SIMD Arch (MSA),
but it does not support MultiMedia Instruction (MMI), as used on platforms such as loongson3a.
In order to improve the performance of libyuv on the loongson3a platform,
this change optimizes 98 functions with MMI.
BUG=libyuv:804
Change-Id: I8947626009efad769b3103a867363ece25d79629
Reviewed-on: https://chromium-review.googlesource.com/1122064
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
|
|
Bug: libyuv:750
Test: builds and runs and passes more tidy tests
Change-Id: I023699a7aa61ea3f5e4a21647112691ea5739281
Reviewed-on: https://chromium-review.googlesource.com/902170
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
|
|
TBR=braveyao@chromium.org
Bug: libyuv:774
Test: git cl lint
Change-Id: I51cf8107a8db17fbc9952d610f3e4d7aac5aa743
Reviewed-on: https://chromium-review.googlesource.com/882217
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
|
|
Append _t to all sized types.
uint64 becomes uint64_t etc
Bug: libyuv:774
Test: try bots build on all platforms
Change-Id: Ide273d7f8012313d6610415d514a956d6f3a8cac
Reviewed-on: https://chromium-review.googlesource.com/879922
Reviewed-by: Miguel Casas <mcasas@chromium.org>
|
|
Bug: libyuv:702
Test: try bots pass
Change-Id: I76d74b5f02fe9843418108b84742e2f714d1ab0a
Reviewed-on: https://chromium-review.googlesource.com/855656
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
|
|
Bug: libyuv:765
Test: build for mips still passes
Change-Id: I99105ad3951d2210c0793e3b9241c178442fdc37
Reviewed-on: https://chromium-review.googlesource.com/826404
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
|
|
clang does not require -msse2 or -msse for inline assembly, except
for the "x" constraint. So change this to "m" for 32 bit. 64 bit
requires sse2, so use "x" for 64 bit.
gcc requires -msse for xmm registers in the clobber list.
Reduce the compiler requirement from -msse2 to -msse for enabling
assembly.
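A small sketch of the constraint choice described above, using GCC/clang extended asm. The macro and function names are hypothetical and not taken from libyuv; the asm path is compiled only on x86 targets where SSE is known to be available, with a plain C fallback elsewhere.

```c
#include <assert.h>

/* Constraint for the constant operand: an SSE register ("x") on 64 bit,
   a memory operand ("m") on 32 bit. */
#if defined(__x86_64__)
#define SKETCH_CONST_CONSTRAINT "x"
#else
#define SKETCH_CONST_CONSTRAINT "m"
#endif

/* Adds a constant 1.0f to v. */
static float AddOneSketch(float v) {
  static const float kOne = 1.0f;
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || (defined(__i386__) && defined(__SSE__)))
  float result;
  __asm__(
      "movss %1, %%xmm0 \n" /* load v */
      "addss %2, %%xmm0 \n" /* add the constant (register or memory) */
      "movss %%xmm0, %0 \n" /* store the sum */
      : "=m"(result)
      : "m"(v), SKETCH_CONST_CONSTRAINT(kOne)
      : "xmm0"); /* xmm0 is named in the clobber list */
  return result;
#else
  return v + kOne; /* portable fallback on other targets */
#endif
}
```

Either constraint is a valid source operand for addss, so the asm body does not change when the constraint does.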
Bug: libyuv:754, libyuv:757
Test: CC=clang CXX=clang++ CFLAGS="-m32" CXXFLAGS="-m32 -mno-sse -O2" make -f linux.mk
Change-Id: I86df72cfee80b7d349561c1fd7c97ad360767255
Reviewed-on: https://chromium-review.googlesource.com/759303
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
|
|
cleanup to remove ifdefs around functions affected by
a clang bug.
gn gen out/Release "--args=is_debug=false target_os=\"android\" target_cpu=\"mips64el\" mips_arch_variant=\"r6\" mips_use_msa=true is_component_build=true is_clang=true"
ninja -v -C out/Release libyuv_unittest
Bug: libyuv:634
Test: build for mips with clang
Change-Id: I278b368dbb2fe89082240e280267d0a27a214c78
Reviewed-on: https://chromium-review.googlesource.com/757980
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
|
|
This reverts commit 01e994d74e4e3937ee1a3efdc048320a1e51f818.
Change-Id: Ie76710d0f4e641e071889c5125fd3be23cdcdb59
Reviewed-on: https://chromium-review.googlesource.com/758499
Reviewed-by: Frank Barchard <fbarchard@google.com>
|