tree-optimization/116818 - try VMAT_GATHER_SCATTER also for SLP

When not doing SLP and we end up with VMAT_ELEMENTWISE we consider
using strided loads, aka VMAT_GATHER_SCATTER.  The following moves
this logic down to also apply to SLP where we now can end up
using VMAT_ELEMENTWISE as well.

	PR tree-optimization/116818
	* tree-vect-stmts.cc (get_group_load_store_type): Consider
	VMAT_GATHER_SCATTER instead of VMAT_ELEMENTWISE also for SLP.
	(vectorizable_load): For single-lane VMAT_GATHER_SCATTER also
	ignore permutations.
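
The effect of the two changes can be sketched as a small standalone model. This is only an illustration under simplifying assumptions, not the vectorizer's code: `Access`, `Query`, `choose_before`, `choose_after`, `needs_perm_before` and `needs_perm_after` are hypothetical names, and the boolean fields stand in for `single_element_p`, `loop_vinfo` and the result of `vect_use_strided_gather_scatters_p`.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; GCC has no types with these names.
enum class Access { Contiguous, Elementwise, GatherScatter };

struct Query {
  bool slp;             // the access belongs to an SLP node
  bool single_element;  // models single_element_p
  bool in_loop;         // models loop_vinfo being non-null
  bool strided_gs_ok;   // models vect_use_strided_gather_scatters_p (...)
};

// Shared condition for the "last resort" gather/scatter fallback.
constexpr bool fallback_applies(Access t, const Query &q) {
  return t == Access::Elementwise && q.single_element && q.in_loop
         && q.strided_gs_ok;
}

// Before the change: the fallback is only considered when not doing SLP.
constexpr Access choose_before(Access t, const Query &q) {
  return (!q.slp && fallback_applies(t, q)) ? Access::GatherScatter : t;
}

// After the change: the same fallback is considered for SLP as well.
constexpr Access choose_after(Access t, const Query &q) {
  return fallback_applies(t, q) ? Access::GatherScatter : t;
}

// Before: only single-lane elementwise accesses skip the load permutation.
constexpr bool needs_perm_before(bool slp, bool has_load_perm, Access t,
                                 unsigned lanes) {
  return slp && has_load_perm && !(t == Access::Elementwise && lanes == 1);
}

// After: single-lane gather/scatter accesses skip it too.
constexpr bool needs_perm_after(bool slp, bool has_load_perm, Access t,
                                unsigned lanes) {
  return slp && has_load_perm
         && !((t == Access::Elementwise || t == Access::GatherScatter)
              && lanes == 1);
}

// Small self-check of the modelled behaviour for a single-lane SLP access.
inline void demo() {
  const Query q{true, true, true, true};
  assert(choose_before(Access::Elementwise, q) == Access::Elementwise);
  assert(choose_after(Access::Elementwise, q) == Access::GatherScatter);
  assert(needs_perm_before(true, true, Access::GatherScatter, 1));
  assert(!needs_perm_after(true, true, Access::GatherScatter, 1));
}
```

In this model, a single-lane SLP access that previously stayed elementwise is switched to gather/scatter, and the second predicate shows why `vectorizable_load` has to be adjusted at the same time: without it, such an access would still be flagged as needing a permutation.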
Richard Biener 2024-09-23 15:24:01 +02:00 committed by Richard Biener
parent 3db9e99165
commit b1c7095a1d

@@ -2260,22 +2260,22 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
                 }
             }
         }
-      /* As a last resort, trying using a gather load or scatter store.
-         ??? Although the code can handle all group sizes correctly,
-         it probably isn't a win to use separate strided accesses based
-         on nearby locations. Or, even if it's a win over scalar code,
-         it might not be a win over vectorizing at a lower VF, if that
-         allows us to use contiguous accesses. */
-      if (*memory_access_type == VMAT_ELEMENTWISE
-          && single_element_p
-          && loop_vinfo
-          && vect_use_strided_gather_scatters_p (stmt_info, loop_vinfo,
-                                                 masked_p, gs_info))
-        *memory_access_type = VMAT_GATHER_SCATTER;
     }
+  /* As a last resort, trying using a gather load or scatter store.
+     ??? Although the code can handle all group sizes correctly,
+     it probably isn't a win to use separate strided accesses based
+     on nearby locations. Or, even if it's a win over scalar code,
+     it might not be a win over vectorizing at a lower VF, if that
+     allows us to use contiguous accesses. */
+  if (*memory_access_type == VMAT_ELEMENTWISE
+      && single_element_p
+      && loop_vinfo
+      && vect_use_strided_gather_scatters_p (stmt_info, loop_vinfo,
+                                             masked_p, gs_info))
+    *memory_access_type = VMAT_GATHER_SCATTER;
   if (*memory_access_type == VMAT_GATHER_SCATTER
       || *memory_access_type == VMAT_ELEMENTWISE)
     {
@@ -10063,7 +10063,8 @@ vectorizable_load (vec_info *vinfo,
      get_group_load_store_type. */
   if (slp
       && SLP_TREE_LOAD_PERMUTATION (slp_node).exists ()
-      && !(memory_access_type == VMAT_ELEMENTWISE
+      && !((memory_access_type == VMAT_ELEMENTWISE
+            || memory_access_type == VMAT_GATHER_SCATTER)
            && SLP_TREE_LANES (slp_node) == 1))
     {
       slp_perm = true;