Add support for reductions in fully-masked loops

rsandifo [138bc75d-0d04-0410-961f-82ee72b054a4] - 13 January 2018 17:59 EST

This patch removes the restriction that fully-masked loops cannot have reductions. The key requirement is that the reduction accumulator must not include any values associated with inactive lanes; the patch adds a set of conditional binary operations to ensure that.
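As a rough illustration of the inactive-lane requirement, the following is a scalar model of a conditional add inside a fully-masked sum reduction. It is only a sketch and not GCC code: the lane count `VL`, the helper names `cond_add` and `masked_sum`, and the garbage value placed in inactive lanes are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

#define VL 4 /* hypothetical number of lanes per vector */

/* Scalar model of a conditional add: active lanes compute a + b,
   inactive lanes pass the accumulator value a through unchanged. */
static void cond_add(const bool mask[VL], const int a[VL],
                     const int b[VL], int out[VL])
{
    for (size_t i = 0; i < VL; i++)
        out[i] = mask[i] ? a[i] + b[i] : a[i];
}

/* Sum n elements with a fully-masked loop: the final, partial
   iteration is handled by the mask rather than a scalar epilogue. */
static int masked_sum(const int *x, size_t n)
{
    int acc[VL] = {0};

    for (size_t base = 0; base < n; base += VL) {
        bool mask[VL];
        int vec[VL];

        for (size_t i = 0; i < VL; i++) {
            mask[i] = base + i < n;
            /* Inactive lanes hold an arbitrary value to show that
               it never reaches the accumulator. */
            vec[i] = mask[i] ? x[base + i] : -12345;
        }
        cond_add(mask, acc, vec, acc);
    }

    /* Final reduction across the lanes of the accumulator. */
    int total = 0;
    for (size_t i = 0; i < VL; i++)
        total += acc[i];
    return total;
}
```

With the seven inputs 1 to 7, the second iteration has one inactive lane, and `masked_sum` still returns 28 because that lane keeps its previous accumulator value.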

2018-01-13 Richard Sandiford, Alan Hayward, David Sherwood

gcc/
- doc/md.texi (cond_add@var{mode}, cond_sub@var{mode}) (cond_and@var{mode}, cond_ior@var{mode}, cond_xor@var{mode}) (cond_smin@var{mode}, cond_smax@var{mode}, cond_umin@var{mode}) (cond_umax@var{mode}): Document.
- optabs.def (cond_add_optab, cond_sub_optab, cond_and_optab) (cond_ior_optab, cond_xor_optab, cond_smin_optab, cond_smax_optab) (cond_umin_optab, cond_umax_optab): New optabs.
- internal-fn.def (COND_ADD, COND_SUB, COND_MIN, COND_MAX, COND_AND) (COND_IOR, COND_XOR): New internal functions.
- internal-fn.h (get_conditional_internal_fn): Declare.
- internal-fn.c (cond_binary_direct): New macro. (expand_cond_binary_optab_fn): Likewise. (direct_cond_binary_optab_supported_p): Likewise. (get_conditional_internal_fn): New function.
- tree-vect-loop.c (vectorizable_reduction): Handle fully-masked loops. Cope with reduction statements that are vectorized as calls rather than assignments.
- config/aarch64/aarch64-sve.md (cond_<optab><mode>): New insns.
- config/aarch64/iterators.md (UNSPEC_COND_ADD, UNSPEC_COND_SUB) (UNSPEC_COND_SMAX, UNSPEC_COND_UMAX, UNSPEC_COND_SMIN) (UNSPEC_COND_UMIN, UNSPEC_COND_AND, UNSPEC_COND_ORR) (UNSPEC_COND_EOR): New unspecs. (optab): Add mappings for them. (SVE_COND_INT_OP, SVE_COND_FP_OP): New int iterators. (sve_int_op, sve_fp_op): New int attributes.

gcc/testsuite/
- gcc.dg/vect/pr60482.c: Remove XFAIL for variable-length vectors.
- gcc.target/aarch64/sve/reduc_1.c: Expect the loop operations to be predicated.
- gcc.target/aarch64/sve/slp_5.c: Check for a fully-masked loop.
- gcc.target/aarch64/sve/slp_7.c: Likewise.
- gcc.target/aarch64/sve/reduc_5.c: New test.
- gcc.target/aarch64/sve/slp_13.c: Likewise.
- gcc.target/aarch64/sve/slp_13_run.c: Likewise.

88fefa8f868 Add support for reductions in fully-masked loops
gcc/ChangeLog | 30 ++++++++
gcc/config/aarch64/aarch64-sve.md | 24 +++++++
gcc/config/aarch64/iterators.md | 42 ++++++++++-
gcc/doc/md.texi | 36 ++++++++++
gcc/internal-fn.c | 36 ++++++++++
gcc/internal-fn.def | 18 +++++
gcc/internal-fn.h | 2 +
gcc/optabs.def | 9 +++
gcc/testsuite/ChangeLog | 13 ++++
gcc/testsuite/gcc.dg/vect/pr60482.c | 4 +-
gcc/testsuite/gcc.target/aarch64/sve/reduc_1.c | 53 ++++++++------
gcc/testsuite/gcc.target/aarch64/sve/reduc_5.c | 38 ++++++++++
gcc/testsuite/gcc.target/aarch64/sve/slp_13.c | 52 ++++++++++++++
gcc/testsuite/gcc.target/aarch64/sve/slp_13_run.c | 28 ++++++++
gcc/testsuite/gcc.target/aarch64/sve/slp_5.c | 9 +++
gcc/testsuite/gcc.target/aarch64/sve/slp_7.c | 9 +++
gcc/tree-vect-loop.c | 88 +++++++++++++++++------
17 files changed, 442 insertions(+), 49 deletions(-)

Upstream: gcc.gnu.org

