As Jakub says in the PR, the problem here was that the x86/built-in version of the scatter support was using a bogus scatter_src_dt when calling vect_get_vec_def_for_stmt_copy (and had done so since it was added). The patch uses the vect_def_type from the original call to vect_is_simple_use instead.
However, Jakub also pointed out that other parts of the load and store code passed the vector operand rather than the scalar operand to
vect_is_simple_use. That probably works most of the time since a constant scalar operand should give a constant vector operand, and likewise for external and internal definitions. But it definitely seems more robust to pass the scalar operand.
The patch avoids the issue for gather and scatter offsets by using the cached gs_info.offset_dt. This is safe because gathers and scatters are never grouped, so there's only one statement operand to consider. The patch also caches the vect_def_type for mask operands, which is safe because grouped masked operations share the same mask.
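To illustrate the caching idea in isolation, here is a minimal standalone sketch. It is not code from tree-vect-stmts.c: every toy_* name is hypothetical, and the enum is only a stand-in for the real vect_def_type. The point is that the operand is classified once, from the scalar operand, and every later vector copy reuses the stored result.

```c
#include <assert.h>

/* Toy stand-in for vect_def_type.  All toy_* names are hypothetical
   and exist only for this sketch.  */
enum toy_def_type { TOY_CONSTANT_DEF, TOY_EXTERNAL_DEF, TOY_INTERNAL_DEF };

/* Toy scalar operand: it only records how the value is defined.  */
struct toy_operand { enum toy_def_type dt; };

/* Counts how many times an operand has been classified.  */
static int toy_classify_calls;

/* Classify a scalar operand (loosely playing the role of the
   analysis-time call to vect_is_simple_use).  */
static enum toy_def_type
toy_classify (const struct toy_operand *scalar_op)
{
  toy_classify_calls++;
  return scalar_op->dt;
}

/* Analysis-time record, loosely modelled on a cached offset_dt.  */
struct toy_gs_info { enum toy_def_type offset_dt; };

/* Analysis phase: classify the single offset operand once and cache it.  */
static void
toy_analyze (struct toy_gs_info *info, const struct toy_operand *offset)
{
  info->offset_dt = toy_classify (offset);
}

/* Transform phase: each of the NCOPIES vector copies reads the cached
   type instead of reclassifying anything.  Returns the number of
   copies processed.  */
static int
toy_transform (const struct toy_gs_info *info, int ncopies,
	       enum toy_def_type *seen)
{
  assert (ncopies >= 0);
  for (int i = 0; i < ncopies; i++)
    seen[i] = info->offset_dt;
  return ncopies;
}
```

In this model, calling toy_analyze once and then toy_transform with four copies leaves toy_classify_calls at 1, which mirrors why a single cached value is enough when there is only one statement operand to consider.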
That just leaves the store rhs. We still need to recalculate the
vect_def_type there since different store values in the group can have different definition types. But since we still have access to the original scalar operand, it seems better to use that instead.
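The grouped-store case can be sketched the same way. Again this is only a toy model under stated assumptions: the grp_* names are hypothetical and do not appear in GCC. It shows why one cached value would not do here: each store in the group has its own scalar rhs, so the definition type is recomputed per store, and it is recomputed from the scalar operand.

```c
#include <assert.h>

/* Toy stand-in for vect_def_type.  All grp_* names are hypothetical
   and exist only for this sketch.  */
enum grp_def_type { GRP_CONSTANT_DEF, GRP_EXTERNAL_DEF, GRP_INTERNAL_DEF };

/* One store in a group; RHS_DT records how its scalar rhs is defined.  */
struct grp_store { enum grp_def_type rhs_dt; };

/* Classify the scalar rhs of a single store.  */
static enum grp_def_type
grp_classify_scalar (const struct grp_store *store)
{
  return store->rhs_dt;
}

/* Fill DTS[i] with the definition type of store I in GROUP.  The type
   is recomputed for every store because the values in a group need
   not share one definition type.  */
static void
grp_collect_dts (const struct grp_store *group, int group_size,
		 enum grp_def_type *dts)
{
  assert (group_size > 0);
  for (int i = 0; i < group_size; i++)
    dts[i] = grp_classify_scalar (&group[i]);
}
```

For a two-element group whose first store writes a constant and whose second writes a value computed in the loop, grp_collect_dts yields two different types, so reusing the first store's type for the second would be wrong.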
2018-01-20 Richard Sandiford
gcc/
	PR tree-optimization/83940
	- tree-vect-stmts.c (vect_truncate_gather_scatter_offset): Set
	offset_dt to vect_constant_def rather than vect_unknown_def_type.
	(vect_check_load_store_mask): Add a mask_dt_out parameter and
	use it to pass back the definition type.
	(vect_check_store_rhs): Likewise rhs_dt_out.
	(vect_build_gather_load_calls): Add a mask_dt argument and use
	it instead of a call to vect_is_simple_use.
	(vectorizable_store): Update calls to vect_check_load_store_mask
	and vect_check_store_rhs.  Use the dt returned by the latter
	instead of scatter_src_dt.  Use the cached mask_dt and
	gs_info.offset_dt instead of calls to vect_is_simple_use.
	Pass the scalar rather than the vector operand to
	vect_is_simple_use when handling second and subsequent copies
	of an rhs value.
	(vectorizable_load): Update calls to vect_check_load_store_mask
	and vect_build_gather_load_calls.  Use the cached mask_dt and
	gs_info.offset_dt instead of calls to vect_is_simple_use.
gcc/testsuite/
	PR tree-optimization/83940
	- gcc.dg/torture/pr83940.c: New test.
2c528f7041f Fix vect_def_type handling in x86 scatter support (PR 83940)
gcc/ChangeLog | 20 +++++++
gcc/testsuite/ChangeLog | 5 ++
gcc/testsuite/gcc.dg/torture/pr83940.c | 9 ++++
gcc/tree-vect-stmts.c | 99 +++++++++++++++-------------------
4 files changed, 78 insertions(+), 55 deletions(-)