This optimizes OpOver for RGBAF32 by using vc

Desktop / KDE / Calligra - Thorsten Zachmann [zagge.de] - 3 September 2015 22:55 UTC

New tests and benchmarks for the code have been added.

QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverLegacy() Testing Composite Op: "normal" ( "RGBF32 Legacy" )
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverLegacy() "Aligned Mask SrcRand DstRand" RESULT: 143 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverLegacy() "DstUnalig Mask SrcRand DstRand" RESULT: 142 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverLegacy() "SrcUnalig Mask SrcRand DstRand" RESULT: 143 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverLegacy() "Unaligned Mask SrcRand DstRand" RESULT: 144 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverLegacy() "Aligned NoMask SrcRand DstRand" RESULT: 59 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverLegacy() "Aligned NoMask SrcZero DstRand" RESULT: 9 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverLegacy() "Aligned NoMask SrcUnit DstRand" RESULT: 21 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverLegacy() "Aligned NoMask SrcRand DstZero" RESULT: 48 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverLegacy() "Aligned NoMask SrcZero DstZero" RESULT: 9 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverLegacy() "Aligned NoMask SrcUnit DstZero" RESULT: 18 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverLegacy() "Aligned NoMask SrcRand DstUnit" RESULT: 22 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverLegacy() "Aligned NoMask SrcZero DstUnit" RESULT: 9 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverLegacy() "Aligned NoMask SrcUnit DstUnit" RESULT: 16 msec
PASS : KisCompositionBenchmark::testRgbF32CompositeOverLegacy()
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverOptimized() Testing Composite Op: "normal" ( "RGBF32 Optimized" )
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverOptimized() "Aligned Mask SrcRand DstRand" RESULT: 17 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverOptimized() "DstUnalig Mask SrcRand DstRand" RESULT: 17 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverOptimized() "SrcUnalig Mask SrcRand DstRand" RESULT: 28 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverOptimized() "Unaligned Mask SrcRand DstRand" RESULT: 27 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverOptimized() "Aligned NoMask SrcRand DstRand" RESULT: 17 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverOptimized() "Aligned NoMask SrcZero DstRand" RESULT: 4 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverOptimized() "Aligned NoMask SrcUnit DstRand" RESULT: 13 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverOptimized() "Aligned NoMask SrcRand DstZero" RESULT: 16 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverOptimized() "Aligned NoMask SrcZero DstZero" RESULT: 4 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverOptimized() "Aligned NoMask SrcUnit DstZero" RESULT: 12 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverOptimized() "Aligned NoMask SrcRand DstUnit" RESULT: 13 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverOptimized() "Aligned NoMask SrcZero DstUnit" RESULT: 4 msec
QDEBUG : KisCompositionBenchmark::testRgbF32CompositeOverOptimized() "Aligned NoMask SrcUnit DstUnit" RESULT: 12 msec
PASS : KisCompositionBenchmark::testRgbF32CompositeOverOptimized()
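
For readers unfamiliar with the operation being benchmarked, the "normal" (over) composite op blends a source pixel onto a destination pixel according to the source alpha, optionally scaled by a per-pixel mask and a global opacity. The following is only a minimal scalar sketch of that arithmetic for non-premultiplied RGBA float32 pixels. It is not the code from this commit and does not use Vc. The names RgbaF32, mix, compositeOverPixel and compositeOverRow, the 8-bit mask type and the early exits are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical pixel layout for illustration: four 32-bit floats,
// non-premultiplied colour, alpha last.
struct RgbaF32 {
    float r, g, b, a;
};

// Linear interpolation from a to b by t.
inline float mix(float a, float b, float t) { return a + (b - a) * t; }

// Composite one source pixel over one destination pixel.
// coverage is mask * opacity and is expected to lie in [0, 1].
inline void compositeOverPixel(const RgbaF32 &src, RgbaF32 &dst, float coverage)
{
    const float srcAlpha = src.a * coverage;

    // Fully transparent source: the destination is left untouched.
    if (srcAlpha <= 0.0f) return;

    // Empty destination or fully opaque source: the source simply replaces it.
    if (dst.a <= 0.0f || srcAlpha >= 1.0f) {
        dst = {src.r, src.g, src.b, srcAlpha};
        return;
    }

    // General case: alpha accumulates, colours are interpolated by the
    // share of the resulting alpha that the source contributes.
    const float outAlpha = srcAlpha + dst.a * (1.0f - srcAlpha);
    const float blend = srcAlpha / outAlpha;
    dst.r = mix(dst.r, src.r, blend);
    dst.g = mix(dst.g, src.g, blend);
    dst.b = mix(dst.b, src.b, blend);
    dst.a = outAlpha;
}

// Composite a row of pixels. mask may be null (the "NoMask" case);
// an 8-bit mask is assumed here.
inline void compositeOverRow(const RgbaF32 *src, RgbaF32 *dst,
                             const std::uint8_t *mask,
                             std::size_t count, float opacity)
{
    assert(opacity >= 0.0f && opacity <= 1.0f);
    for (std::size_t i = 0; i < count; ++i) {
        const float m = mask ? mask[i] / 255.0f : 1.0f;
        compositeOverPixel(src[i], dst[i], m * opacity);
    }
}
```

A vectorized version would process several pixels per iteration instead of one. In this sketch a fully transparent source skips all arithmetic, which may be one reason the "SrcZero" cases are the fastest in both the legacy and optimized runs.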

(cherry picked from commit 3be07eee35505bf754f12617e5d9059c03c44d7c)

85fad50 This optimizes OpOver for RGBAF32 by using vc
krita/benchmarks/kis_composition_benchmark.cpp | 348 +++++++++++++++-----
krita/benchmarks/kis_composition_benchmark.h | 5 +
libs/pigment/compositeops/KoCompositeOps.h | 11 +
.../compositeops/KoOptimizedCompositeOpFactory.cpp | 5 +
.../compositeops/KoOptimizedCompositeOpFactory.h | 1 +
.../KoOptimizedCompositeOpFactoryPerArch.cpp | 9 +
.../KoOptimizedCompositeOpFactoryPerArch.h | 3 +
...KoOptimizedCompositeOpFactoryPerArch_Scalar.cpp | 8 +
.../compositeops/KoOptimizedCompositeOpOver128.h | 290 ++++++++++++++++
libs/pigment/compositeops/KoStreamedMath.h | 161 ++++++++-
10 files changed, 745 insertions(+), 96 deletions(-)

Upstream: quickgit.kde.org

