tree-optimization/115372 - failed store-lanes in some cases
The gcc.target/riscv/rvv/autovec/struct/struct_vect-4.c testcase shows that we sometimes fail to use store-lanes even though it should be profitable. We're currently relying on vect_slp_prefer_store_lanes_p at the point we run into the first SLP discovery mismatch, with obviously limited information. For the case at hand we have 3, 5 or 7 lanes of VnDImode [2, 2] vectors with the first mismatch at lane 2, so the new group size is 1. The heuristic says that might be an OK split given the rest is a multiple of the vector lanes. Now we continue discovery, but in the end mismatches result in uniformly single-lane SLP instances, which we can handle via interleaving but which of course are prime candidates for store-lanes. The following patch re-assesses with the extra knowledge, now just relying on whether the target supports store-lanes for the given group size.

	PR tree-optimization/115372
	* tree-vect-slp.cc (vect_build_slp_instance): Compute the
	uniform, if any, number of lanes of the RHS sub-graphs feeding
	the store and if uniformly one, use store-lanes if the
	target supports that.
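For orientation, the kind of access pattern store-lanes targets can be sketched as a loop that writes a group of three adjacent 64-bit elements per iteration. This is a hypothetical kernel (the name store_group3 and its operations are assumptions for illustration), not the contents of struct_vect-4.c, and it is not claimed to take exactly the discovery path described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical kernel, not the actual testcase: each iteration stores a
// group of three adjacent 64-bit elements.  Every lane is computed by a
// different operation, so the lanes do not match one another.
void store_group3 (int64_t *out, const int64_t *a, const int64_t *b,
                   const int64_t *c, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    {
      out[3 * i + 0] = a[i] + 1;   // lane 0
      out[3 * i + 1] = b[i] * 2;   // lane 1
      out[3 * i + 2] = c[i] - 3;   // lane 2
    }
}
```

A store-lanes instruction would take one vector per lane and write them interleaved in a single operation, instead of permuting the vectors first and issuing ordinary contiguous stores.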
parent 618871ff09
commit f594008dcc
1 changed file with 18 additions and 0 deletions
@@ -3957,6 +3957,7 @@ vect_build_slp_instance (vec_info *vinfo,
           /* Calculate the unrolling factor based on the smallest type. */
           poly_uint64 unrolling_factor = 1;
 
+          unsigned int rhs_common_nlanes = 0;
           unsigned int start = 0, end = i;
           while (start < group_size)
             {
@@ -3978,6 +3979,10 @@ vect_build_slp_instance (vec_info *vinfo,
                   calculate_unrolling_factor
                     (max_nunits, end - start));
               rhs_nodes.safe_push (node);
+              if (start == 0)
+                rhs_common_nlanes = SLP_TREE_LANES (node);
+              else if (rhs_common_nlanes != SLP_TREE_LANES (node))
+                rhs_common_nlanes = 0;
               start = end;
               if (want_store_lanes || force_single_lane)
                 end = start + 1;
@@ -4015,6 +4020,19 @@ vect_build_slp_instance (vec_info *vinfo,
                 }
             }
 
+          /* Now re-assess whether we want store lanes in case the
+             discovery ended up producing all single-lane RHSs. */
+          if (rhs_common_nlanes == 1
+              && ! STMT_VINFO_GATHER_SCATTER_P (stmt_info)
+              && ! STMT_VINFO_STRIDED_P (stmt_info)
+              && compare_step_with_zero (vinfo, stmt_info) > 0
+              && (vect_store_lanes_supported (SLP_TREE_VECTYPE (rhs_nodes[0]),
+                                              group_size,
+                                              SLP_TREE_CHILDREN
+                                                (rhs_nodes[0]).length () != 1)
+                  != IFN_LAST))
+            want_store_lanes = true;
+
           /* Now we assume we can build the root SLP node from all stores. */
           if (want_store_lanes)
             {
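
The logic the three hunks add can be sketched standalone as follows. This is a simplified model under stated assumptions, not the GCC code: RhsNode, common_lanes and reassess_store_lanes are hypothetical names, and the boolean parameters stand in for the STMT_VINFO_GATHER_SCATTER_P, STMT_VINFO_STRIDED_P, compare_step_with_zero and vect_store_lanes_supported checks in the patch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an SLP sub-graph feeding the store; only the
// lane count matters for this sketch.
struct RhsNode { unsigned lanes; };

// Returns the lane count shared by all nodes, or 0 when the nodes
// disagree or the list is empty.  Mirrors the rhs_common_nlanes tracking:
// the first node sets the value, any later mismatch resets it to 0.
unsigned common_lanes (const std::vector<RhsNode> &nodes)
{
  unsigned common = 0;
  for (std::size_t i = 0; i < nodes.size (); ++i)
    {
      if (i == 0)
        common = nodes[i].lanes;
      else if (common != nodes[i].lanes)
        common = 0;
    }
  return common;
}

// Re-assess the earlier decision once discovery is complete.  The flags
// are assumptions modelling the conditions tested in the patch.
bool reassess_store_lanes (bool want_store_lanes,
                           const std::vector<RhsNode> &nodes,
                           bool is_gather_scatter, bool is_strided,
                           int step_sign, bool target_supports_store_lanes)
{
  if (common_lanes (nodes) == 1
      && !is_gather_scatter
      && !is_strided
      && step_sign > 0
      && target_supports_store_lanes)
    return true;
  return want_store_lanes;
}
```

Note that the re-assessment can only turn store-lanes on; an earlier decision in favour of store-lanes is left unchanged.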