powerpc: Remove power4 mpa optimization

System Internals / glibc - Adhemerval Zanella [linaro.org] - 29 April 2019 11:43 EDT

This patch removes the POWER4-optimized mpa implementation currently used on all powerpc targets. For newer chips, GCC in fact generates *worse* code for it than for the generic implementation, as shown below. One possibility would be to add ifunc variants for the mpa routines (as x86_64 does), but that would add complexity only for the sake of older chips (and one would need to check whether power5, power5+, and power6 actually benefit from this optimization), and only for specific implementations (since the most used ones, such as sin, cos, exp, and pow, were optimized to avoid calling the slow multiprecision path).

- POWER9 patched
  $ ./testrun.sh benchtests/bench-atan
  "atan": {
    "": { "duration": 5.12565e+09, "iterations": 1.552e+08, "max": 100.552, "min": 7.799, "mean": 33.0261 },
    "144bits": { "duration": 5.12745e+09, "iterations": 825000, "max": 7517.17, "min": 6186.3, "mean": 6215.09 }
  }
  $ ./testrun.sh benchtests/bench-acos
  "acos": {
    "": { "duration": 5.21741e+09, "iterations": 1.269e+08, "max": 191.738, "min": 7.931, "mean": 41.1144 },
    "slow": { "duration": 5.25999e+09, "iterations": 198000, "max": 26681.7, "min": 26463.6, "mean": 26565.6 }
  }

- POWER9 master
  $ ./testrun.sh benchtests/bench-atan
  "atan": {
    "": { "duration": 5.12815e+09, "iterations": 1.552e+08, "max": 134.788, "min": 7.803, "mean": 33.0422 },
    "144bits": { "duration": 5.1209e+09, "iterations": 447000, "max": 11615.8, "min": 11301.8, "mean": 11456.2 }
  }
  $ ./testrun.sh benchtests/bench-acos
  "acos": {
    "": { "duration": 5.22272e+09, "iterations": 1.269e+08, "max": 115.981, "min": 7.931, "mean": 41.1562 },
    "slow": { "duration": 5.28723e+09, "iterations": 96000, "max": 55434.1, "min": 54820.6, "mean": 55075.3 }
  }

- POWER8 patched
  $ taskset -c 16 ./testrun.sh benchtests/bench-acos
  "acos": {
    "": { "duration": 5.16398e+09, "iterations": 9.99e+07, "max": 174.408, "min": 8.645, "mean": 51.6915 },
    "slow": { "duration": 5.16982e+09, "iterations": 96000, "max": 54830.5, "min": 53703.8, "mean": 53852.3 }
  }

- POWER8 master
  $ taskset -c 16 ./testrun.sh benchtests/bench-acos
  "acos": {
    "": { "duration": 5.17019e+09, "iterations": 9.99e+07, "max": 186.127, "min": 8.633, "mean": 51.7537 },
    "slow": { "duration": 5.34225e+09, "iterations": 90000, "max": 60353.2, "min": 59155.3, "mean": 59358.4 }
  }

- POWER7 patched
  $ taskset -c 16 benchtests/bench-asin
  "asin": {
    "": { "duration": 5.15559e+09, "iterations": 6.5e+07, "max": 193.335, "min": 12.227, "mean": 79.3168 },
    "slow": { "duration": 5.20538e+09, "iterations": 80000, "max": 65705.2, "min": 64299.4, "mean": 65067.3 }
  }

- POWER7 master
  $ taskset -c 16 benchtests/bench-asin
  "asin": {
    "": { "duration": 5.15446e+09, "iterations": 6.5e+07, "max": 184.575, "min": 12.226, "mean": 79.2994 },
    "slow": { "duration": 5.20616e+09, "iterations": 80000, "max": 65705.1, "min": 64336.6, "mean": 65076.9 }
  }
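
To make the comparison easier to read, the "mean" values of the slow multiprecision variants above can be reduced to master/patched ratios. The following is only a minimal sketch in Python: the numbers are copied from the benchmark output in this message, while the `slow_path_means` table and the `speedup` helper are illustrative names and not part of glibc or its benchtests.

```python
# "mean" values of the slow multiprecision variants, copied from the
# benchmark output in this message, stored as (master, patched).
slow_path_means = {
    ("POWER9", "atan 144bits"): (11456.2, 6215.09),
    ("POWER9", "acos slow"): (55075.3, 26565.6),
    ("POWER8", "acos slow"): (59358.4, 53852.3),
    ("POWER7", "asin slow"): (65076.9, 65067.3),
}


def speedup(master_mean, patched_mean):
    """Hypothetical helper: ratio of master mean to patched mean.

    A value above 1.0 means the patched build (generic mpa code) is faster.
    """
    return master_mean / patched_mean


for (cpu, bench), (master, patched) in slow_path_means.items():
    print(f"{cpu} {bench}: {speedup(master, patched):.2f}x")
```

With these inputs the ratios come out at roughly 1.84x and 2.07x on POWER9, 1.10x on POWER8, and 1.00x on POWER7, which matches the claim that the generic code is no slower on the tested chips and clearly faster on the newest one.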

Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4, and with --with-cpu=power5+ and --disable-multi-arch) and powerpc64-linux-gnu (built without --with-cpu, and with --with-cpu=power5+ and --disable-multi-arch).

- sysdeps/powerpc/power4/fpu/Makefile: Remove file.
- sysdeps/powerpc/power4/fpu/mpa-arch.h: Likewise.
- sysdeps/powerpc/power4/fpu/mpa.c: Likewise.

c4c0848bbb powerpc: Remove power4 mpa optimization
ChangeLog | 6 +
sysdeps/powerpc/power4/fpu/Makefile | 5 -
sysdeps/powerpc/power4/fpu/mpa-arch.h | 56 ---------
sysdeps/powerpc/power4/fpu/mpa.c | 214 ----------------------------------
4 files changed, 6 insertions(+), 275 deletions(-)

Upstream: sourceware.org

