Support fused multiply-adds in fully-masked reductions

rsandifo [138bc75d-0d04-0410-961f-82ee72b054a4] - 12 July 2018 13:01 EDT

This patch adds support for fusing a conditional add or subtract with a multiplication, so that we can use fused multiply-add and multiply-subtract operations for fully-masked reductions. E.g. for SVE we vectorise:

  double res = 0.0;
  for (int i = 0; i < n; ++i)
    res += x[i] * y[i];

using a fully-masked loop in which the loop body has the form:

  res_1 = PHI<0(preheader), res_2(latch)>;
  avec = .MASK_LOAD (loop_mask, a)
  bvec = .MASK_LOAD (loop_mask, b)
  prod = avec * bvec;
  res_2 = .COND_ADD (loop_mask, res_1, prod, res_1);

where the last statement does the equivalent of:

res_2 = loop_mask ? res_1 + prod : res_1;

(operating elementwise). The point of the patch is to convert the last two statements into:

res_2 = .COND_FMA (loop_mask, avec, bvec, res_1, res_1);

which is equivalent to:

res_2 = loop_mask ? fma (avec, bvec, res_1) : res_1;

(again operating elementwise).

2018-07-12  Richard Sandiford
            Alan Hayward
            David Sherwood

gcc/
- internal-fn.h (can_interpret_as_conditional_op_p): Declare.
- internal-fn.c (can_interpret_as_conditional_op_p): New function.
- tree-ssa-math-opts.c (convert_mult_to_fma_1): Handle conditional plus and minus and convert them into IFN_COND_FMA-based sequences.
  (convert_mult_to_fma): Handle conditional plus and minus.

gcc/testsuite/
- gcc.dg/vect/vect-fma-2.c: New test.
- gcc.target/aarch64/sve/reduc_4.c: Likewise.
- gcc.target/aarch64/sve/reduc_6.c: Likewise.
- gcc.target/aarch64/sve/reduc_7.c: Likewise.

e3798ed9f88 Support fused multiply-adds in fully-masked reductions
gcc/ChangeLog | 10 +++
gcc/internal-fn.c | 56 ++++++++++++
gcc/internal-fn.h | 3 +
gcc/testsuite/ChangeLog | 9 ++
gcc/testsuite/gcc.dg/vect/vect-fma-2.c | 17 ++++
gcc/testsuite/gcc.target/aarch64/sve/reduc_4.c | 18 ++++
gcc/testsuite/gcc.target/aarch64/sve/reduc_6.c | 17 ++++
gcc/testsuite/gcc.target/aarch64/sve/reduc_7.c | 17 ++++
gcc/tree-ssa-math-opts.c | 118 +++++++++++++------------
9 files changed, 209 insertions(+), 56 deletions(-)

Upstream: gcc.gnu.org

