This patch adds SVE patterns that combine a PTRUE-predicated comparison with a separate AND. The main benefit is for optimising ANDs with the loop predicate, as in the testcase. One potential drawback is that the patterns also trigger when two naturally-parallel comparisons are ANDed together. Whether that's a win or a loss will depend on the schedule, but it has the potential to be a win more often than a loss.
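As a rough illustration of why the fold is valid, here is a scalar C model of the predicate arithmetic. It is only a sketch and is not taken from the patch or from vcond_7.c: LANES, model_cmplt, model_and and fold_is_equivalent are hypothetical names, and the model assumes only that a predicated SVE comparison sets inactive lanes to false. Under that assumption, comparing under an all-true predicate and then ANDing with p gives the same result as comparing under p directly.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lane count, chosen only for illustration.  */
#define LANES 8

/* Model of a predicated less-than comparison: lanes that are inactive
   in PG produce false, active lanes produce A[i] < B[i].  */
static void
model_cmplt (bool *out, const bool *pg, const int *a, const int *b)
{
  for (size_t i = 0; i < LANES; ++i)
    out[i] = pg[i] && a[i] < b[i];
}

/* Model of a lane-wise AND of two predicates.  */
static void
model_and (bool *out, const bool *p, const bool *q)
{
  for (size_t i = 0; i < LANES; ++i)
    out[i] = p[i] && q[i];
}

/* Return true if "compare under all-true, then AND with P" matches
   "compare under P" for the given inputs.  */
static bool
fold_is_equivalent (const bool *p, const int *a, const int *b)
{
  bool ptrue[LANES], tmp[LANES], unfused[LANES], fused[LANES];

  for (size_t i = 0; i < LANES; ++i)
    ptrue[i] = true;

  /* Unfused form: two operations.  */
  model_cmplt (tmp, ptrue, a, b);
  model_and (unfused, tmp, p);

  /* Fused form: one comparison predicated on P.  */
  model_cmplt (fused, p, a, b);

  for (size_t i = 0; i < LANES; ++i)
    if (unfused[i] != fused[i])
      return false;
  return true;
}
```

The same model covers the drawback mentioned above: if p is itself the result of another comparison, the fused form makes the second comparison depend on the first, whereas the unfused form lets the two comparisons run in parallel before the AND.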
The combine patterns are undeniably ugly. One way of getting around them would be to allow 1->1 "splits" when combining 2 instructions, as well as 1->2 splits when combining more than 2 instructions (although that wouldn't really be a split). Another would be to have a way of defining target-specific rtx simplifications. branches/ARM/sve-branch has a prototype implementation of that, but it would need some clean-up before being ready to submit. It would also be good to make it closer to the match.pd style.
Until then, I think what the combine patterns are doing is the "correct" implementation given the current infrastructure.
2018-05-08  Richard Sandiford, Alan Hayward, David Sherwood
- config/aarch64/aarch64-sve.md (*pred_cmp
- gcc.target/aarch64/sve/vcond_6.c: Do not expect any ANDs. XFAIL the BIC test.
- gcc.target/aarch64/sve/vcond_7.c: New test.
- gcc.target/aarch64/sve/vcond_7_run.c: Likewise.
30dd727b610 [AArch64] Predicated SVE comparison folds
gcc/ChangeLog | 9 +
gcc/config/aarch64/aarch64-sve.md | 120 ++++++++++++
gcc/testsuite/ChangeLog | 9 +
gcc/testsuite/gcc.target/aarch64/sve/vcond_6.c | 10 +-
gcc/testsuite/gcc.target/aarch64/sve/vcond_7.c | 216 +++++++++++++++++++++
gcc/testsuite/gcc.target/aarch64/sve/vcond_7_run.c | 40 ++++
6 files changed, 402 insertions(+), 2 deletions(-)