MATCH: Sink convert for vec_cond
Convert be sinked into a vec_cond if both sides fold. Unlike other unary operations, we need to check that we still can handle this vec_cond's first operand is the same as the new truth type. I tried a few different versions of this patch: view_convert to the new truth_type but that does not work as we always support all vec_cond afterwards. using expand_vec_cond_expr_p; but that would allow too much. I also tried to see if view_convert can be handled here but we end up with: _3 = VEC_COND_EXPR <_2, { Nan(-1), Nan(-1), Nan(-1), Nan(-1) }, { 0.0, 0.0, 0.0, 0.0 }>; Which isel does not know how to handle as just being a view_convert from `vector(4) <signed-boolean:32>` to `vector(4) float` and causes a regression with `g++.target/i386/pr88152.C` Note, in the case of the SVE testcase, we will sink negate after the convert and be able to remove a few extra instructions in the end. Also with this change gcc.target/aarch64/sve/cond_unary_5.c will now pass. Committed as approved after a bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-gnu. gcc/ChangeLog: PR tree-optimization/111006 PR tree-optimization/110986 * match.pd: (op(vec_cond(a,b,c))): Handle convert for op. gcc/testsuite/ChangeLog: PR tree-optimization/111006 * gcc.target/aarch64/sve/cond_convert_7.c: New test.
This commit is contained in:
parent
1e3003ca08
commit
70c50c8727
2 changed files with 31 additions and 0 deletions
|
@ -4704,6 +4704,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
|
|||
(op (vec_cond:s @0 @1 @2))
|
||||
(vec_cond @0 (op! @1) (op! @2))))
|
||||
|
||||
/* Sink unary conversions to branches, but only if we do fold both
|
||||
and the target's truth type is the same as we already have. */
|
||||
(simplify
|
||||
(convert (vec_cond:s @0 @1 @2))
|
||||
(if (VECTOR_TYPE_P (type)
|
||||
&& types_match (TREE_TYPE (@0), truth_type_for (type)))
|
||||
(vec_cond @0 (convert! @1) (convert! @2))))
|
||||
|
||||
/* Sink binary operation to branches, but only if we can fold it. */
|
||||
(for op (tcc_comparison plus minus mult bit_and bit_ior bit_xor
|
||||
lshift rshift rdiv trunc_div ceil_div floor_div round_div
|
||||
|
|
23
gcc/testsuite/gcc.target/aarch64/sve/cond_convert_7.c
Normal file
23
gcc/testsuite/gcc.target/aarch64/sve/cond_convert_7.c
Normal file
|
@ -0,0 +1,23 @@
|
|||
/* { dg-do compile } */
|
||||
/* { dg-options "-O2 -ftree-vectorize -moverride=sve_width=256 -fdump-tree-optimized" } */
|
||||
|
||||
/* This is a modified reduced version of cond_unary_5.c */
|
||||
|
||||
void __attribute__ ((noipa))
|
||||
f0 (unsigned short *__restrict r,
|
||||
int *__restrict a,
|
||||
int *__restrict pred)
|
||||
{
|
||||
for (int i = 0; i < 1024; ++i)
|
||||
{
|
||||
int p = pred[i]?-1:0;
|
||||
r[i] = p ;
|
||||
}
|
||||
}
|
||||
|
||||
/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, p[0-7]+/z, #-1} 1 } } */
|
||||
/* { dg-final { scan-assembler-not {\tmov\tz[0-9]+\.[hs], p[0-7]+/z, #1} } } */
|
||||
|
||||
/* { dg-final { scan-tree-dump-not "VIEW_CONVERT_EXPR " "optimized" } } */
|
||||
/* { dg-final { scan-tree-dump-not " = -" "optimized" } } */
|
||||
/* { dg-final { scan-tree-dump-not " = \\\(vector" "optimized" } } */
|
Loading…
Add table
Reference in a new issue