internal-fn,vect: Refactor widen_plus as internal_fn

DEF_INTERNAL_WIDENING_OPTAB_FN and DEF_INTERNAL_NARROWING_OPTAB_FN are like
DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively, except
that they provide convenience wrappers for a single vector to vector
conversion, a hi/lo split or an even/odd split.  Each definition for <NAME>
requires either signed and unsigned optabs named <SOPTAB> and <UOPTAB> (for
widening) or a single <OPTAB> (for narrowing) for each of the five functions
it creates.

For example, for widening addition DEF_INTERNAL_WIDENING_OPTAB_FN creates
five internal functions: IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI,
IFN_VEC_WIDEN_PLUS_LO, IFN_VEC_WIDEN_PLUS_EVEN and IFN_VEC_WIDEN_PLUS_ODD,
each requiring two optabs, one for signed and one for unsigned.

Aarch64 implements the hi/lo split optabs:
IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> (u/s)addl2
IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> -> (u/s)addl

This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS
tree codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.

2023-06-05  Andre Vieira  <andre.simoesdiasvieira@arm.com>
            Joel Hutton  <joel.hutton@arm.com>
            Tamar Christina  <tamar.christina@arm.com>

gcc/ChangeLog:

        * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>): Rename
        this ...
        (vec_widen_<su>add_lo_<mode>): ... to this.
        (vec_widen_<su>addl_hi_<mode>): Rename this ...
        (vec_widen_<su>add_hi_<mode>): ... to this.
        (vec_widen_<su>subl_lo_<mode>): Rename this ...
        (vec_widen_<su>sub_lo_<mode>): ... to this.
        (vec_widen_<su>subl_hi_<mode>): Rename this ...
        (vec_widen_<su>sub_hi_<mode>): ... to this.
        * doc/generic.texi: Document new IFN codes.
        * internal-fn.cc (lookup_hilo_internal_fn): Add lookup function.
        (commutative_binary_fn_p): Add widen_plus fn's.
        (widening_fn_p): New function.
        (narrowing_fn_p): New function.
        (direct_internal_fn_optab): Change visibility.
        * internal-fn.def (DEF_INTERNAL_WIDENING_OPTAB_FN): Macro to define an
        internal_fn that expands into multiple internal_fns for widening.
        (IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO,
        IFN_VEC_WIDEN_PLUS_EVEN, IFN_VEC_WIDEN_PLUS_ODD,
        IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_HI,
        IFN_VEC_WIDEN_MINUS_LO, IFN_VEC_WIDEN_MINUS_ODD,
        IFN_VEC_WIDEN_MINUS_EVEN): Define widening plus, minus functions.
        * internal-fn.h (direct_internal_fn_optab): Declare new prototype.
        (lookup_hilo_internal_fn): Likewise.
        (widening_fn_p): Likewise.
        (narrowing_fn_p): Likewise.
        * optabs.cc (commutative_optab_p): Add widening plus optabs.
        * optabs.def (OPTAB_D): Define widen add, sub optabs.
        * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
        patterns with a hi/lo or even/odd split.
        (vect_recog_sad_pattern): Refactor to use new IFN codes.
        (vect_recog_widen_plus_pattern): Likewise.
        (vect_recog_widen_minus_pattern): Likewise.
        (vect_recog_average_pattern): Likewise.
        * tree-vect-stmts.cc (vectorizable_conversion): Add support for
        _HILO IFNs.
        (supportable_widening_operation): Likewise.
        * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.

gcc/testsuite/ChangeLog:

        * gcc.target/aarch64/vect-widen-add.c: Test that new
        IFN_VEC_WIDEN_PLUS is being used.
        * gcc.target/aarch64/vect-widen-sub.c: Test that new
        IFN_VEC_WIDEN_MINUS is being used.
parent fe29963d40
commit 2f482a0736
12 changed files with 375 additions and 47 deletions
|
@ -4698,7 +4698,7 @@
|
|||
[(set_attr "type" "neon_<ADDSUB:optab>_long")]
|
||||
)
|
||||
|
||||
(define_expand "vec_widen_<su>addl_lo_<mode>"
|
||||
(define_expand "vec_widen_<su>add_lo_<mode>"
|
||||
[(match_operand:<VWIDE> 0 "register_operand")
|
||||
(ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
|
||||
(ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
|
||||
|
@ -4710,7 +4710,7 @@
|
|||
DONE;
|
||||
})
|
||||
|
||||
(define_expand "vec_widen_<su>addl_hi_<mode>"
|
||||
(define_expand "vec_widen_<su>add_hi_<mode>"
|
||||
[(match_operand:<VWIDE> 0 "register_operand")
|
||||
(ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
|
||||
(ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
|
||||
|
@ -4722,7 +4722,7 @@
|
|||
DONE;
|
||||
})
|
||||
|
||||
(define_expand "vec_widen_<su>subl_lo_<mode>"
|
||||
(define_expand "vec_widen_<su>sub_lo_<mode>"
|
||||
[(match_operand:<VWIDE> 0 "register_operand")
|
||||
(ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
|
||||
(ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
|
||||
|
@ -4734,7 +4734,7 @@
|
|||
DONE;
|
||||
})
|
||||
|
||||
(define_expand "vec_widen_<su>subl_hi_<mode>"
|
||||
(define_expand "vec_widen_<su>sub_hi_<mode>"
|
||||
[(match_operand:<VWIDE> 0 "register_operand")
|
||||
(ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
|
||||
(ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
|
||||
|
|
|
@ -1811,10 +1811,16 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
|
|||
@tindex VEC_RSHIFT_EXPR
|
||||
@tindex VEC_WIDEN_MULT_HI_EXPR
|
||||
@tindex VEC_WIDEN_MULT_LO_EXPR
|
||||
@tindex VEC_WIDEN_PLUS_HI_EXPR
|
||||
@tindex VEC_WIDEN_PLUS_LO_EXPR
|
||||
@tindex VEC_WIDEN_MINUS_HI_EXPR
|
||||
@tindex VEC_WIDEN_MINUS_LO_EXPR
|
||||
@tindex IFN_VEC_WIDEN_PLUS
|
||||
@tindex IFN_VEC_WIDEN_PLUS_HI
|
||||
@tindex IFN_VEC_WIDEN_PLUS_LO
|
||||
@tindex IFN_VEC_WIDEN_PLUS_EVEN
|
||||
@tindex IFN_VEC_WIDEN_PLUS_ODD
|
||||
@tindex IFN_VEC_WIDEN_MINUS
|
||||
@tindex IFN_VEC_WIDEN_MINUS_HI
|
||||
@tindex IFN_VEC_WIDEN_MINUS_LO
|
||||
@tindex IFN_VEC_WIDEN_MINUS_EVEN
|
||||
@tindex IFN_VEC_WIDEN_MINUS_ODD
|
||||
@tindex VEC_UNPACK_HI_EXPR
|
||||
@tindex VEC_UNPACK_LO_EXPR
|
||||
@tindex VEC_UNPACK_FLOAT_HI_EXPR
|
||||
|
@ -1861,6 +1867,82 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the
|
|||
low @code{N/2} elements of the two vectors are multiplied to produce the
|
||||
vector of @code{N/2} products.
|
||||
|
||||
@item IFN_VEC_WIDEN_PLUS
|
||||
This internal function represents widening vector addition of two input
|
||||
vectors. Its operands are vectors that contain the same number of elements
|
||||
(@code{N}) of the same integral type. The result is a vector that contains
|
||||
the same number (@code{N}) of elements, of an integral type whose size is twice
|
||||
as wide, as the input vectors. If the current target does not implement the
|
||||
corresponding optabs the vectorizer may choose to split it into either a pair
|
||||
of @code{IFN_VEC_WIDEN_PLUS_HI} and @code{IFN_VEC_WIDEN_PLUS_LO} or
|
||||
@code{IFN_VEC_WIDEN_PLUS_EVEN} and @code{IFN_VEC_WIDEN_PLUS_ODD}, depending
|
||||
on what optabs the target implements.
|
||||
|
||||
@item IFN_VEC_WIDEN_PLUS_HI
|
||||
@itemx IFN_VEC_WIDEN_PLUS_LO
|
||||
These internal functions represent widening vector addition of the high and low
|
||||
parts of the two input vectors, respectively. Their operands are vectors that
|
||||
contain the same number of elements (@code{N}) of the same integral type. The
|
||||
result is a vector that contains half as many elements, of an integral type
|
||||
whose size is twice as wide. In the case of @code{IFN_VEC_WIDEN_PLUS_HI} the
|
||||
high @code{N/2} elements of the two vectors are added to produce the vector of
|
||||
@code{N/2} additions. In the case of @code{IFN_VEC_WIDEN_PLUS_LO} the low
|
||||
@code{N/2} elements of the two vectors are added to produce the vector of
|
||||
@code{N/2} additions.
|
||||
|
||||
@item IFN_VEC_WIDEN_PLUS_EVEN
|
||||
@itemx IFN_VEC_WIDEN_PLUS_ODD
|
||||
These internal functions represent widening vector addition of the even and odd
|
||||
elements of the two input vectors, respectively. Their operands are vectors
|
||||
that contain the same number of elements (@code{N}) of the same integral type.
|
||||
The result is a vector that contains half as many elements, of an integral type
|
||||
whose size is twice as wide. In the case of @code{IFN_VEC_WIDEN_PLUS_EVEN} the
|
||||
even @code{N/2} elements of the two vectors are added to produce the vector of
|
||||
@code{N/2} additions. In the case of @code{IFN_VEC_WIDEN_PLUS_ODD} the odd
|
||||
@code{N/2} elements of the two vectors are added to produce the vector of
|
||||
@code{N/2} additions.
|
||||
|
||||
@item IFN_VEC_WIDEN_MINUS
|
||||
This internal function represents widening vector subtraction of two input
|
||||
vectors. Its operands are vectors that contain the same number of elements
|
||||
(@code{N}) of the same integral type. The result is a vector that contains
|
||||
the same number (@code{N}) of elements, of an integral type whose size is twice
|
||||
as wide, as the input vectors. If the current target does not implement the
|
||||
corresponding optabs the vectorizer may choose to split it into either a pair
|
||||
of @code{IFN_VEC_WIDEN_MINUS_HI} and @code{IFN_VEC_WIDEN_MINUS_LO} or
|
||||
@code{IFN_VEC_WIDEN_MINUS_EVEN} and @code{IFN_VEC_WIDEN_MINUS_ODD}, depending
|
||||
on what optabs the target implements.
|
||||
|
||||
@item IFN_VEC_WIDEN_MINUS_HI
|
||||
@itemx IFN_VEC_WIDEN_MINUS_LO
|
||||
These internal functions represent widening vector subtraction of the high and
|
||||
low parts of the two input vectors, respectively. Their operands are vectors
|
||||
that contain the same number of elements (@code{N}) of the same integral type.
|
||||
The high/low elements of the second vector are subtracted from the high/low
|
||||
elements of the first. The result is a vector that contains half as many
|
||||
elements, of an integral type whose size is twice as wide. In the case of
|
||||
@code{IFN_VEC_WIDEN_MINUS_HI} the high @code{N/2} elements of the second
|
||||
vector are subtracted from the high @code{N/2} of the first to produce the
|
||||
vector of @code{N/2} subtractions. In the case of
|
||||
@code{IFN_VEC_WIDEN_MINUS_LO} the low @code{N/2} elements of the second
|
||||
vector are subtracted from the low @code{N/2} of the first to produce the
|
||||
vector of @code{N/2} subtractions.
|
||||
|
||||
@item IFN_VEC_WIDEN_MINUS_EVEN
|
||||
@itemx IFN_VEC_WIDEN_MINUS_ODD
|
||||
These internal functions represent widening vector subtraction of the even and
|
||||
odd parts of the two input vectors, respectively. Their operands are vectors
|
||||
that contain the same number of elements (@code{N}) of the same integral type.
|
||||
The even/odd elements of the second vector are subtracted from the even/odd
|
||||
elements of the first. The result is a vector that contains half as many
|
||||
elements, of an integral type whose size is twice as wide. In the case of
|
||||
@code{IFN_VEC_WIDEN_MINUS_EVEN} the even @code{N/2} elements of the second
|
||||
vector are subtracted from the even @code{N/2} of the first to produce the
|
||||
vector of @code{N/2} subtractions. In the case of
|
||||
@code{IFN_VEC_WIDEN_MINUS_ODD} the odd @code{N/2} elements of the second
|
||||
vector are subtracted from the odd @code{N/2} of the first to produce the
|
||||
vector of @code{N/2} subtractions.
|
||||
|
||||
@item VEC_WIDEN_PLUS_HI_EXPR
|
||||
@itemx VEC_WIDEN_PLUS_LO_EXPR
|
||||
These nodes represent widening vector addition of the high and low parts of
|
||||
|
|
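To make the lane selection described in the generic.texi text concrete, here is a scalar reference model of the five widening-add forms. It is only a sketch: the type aliases and function names are hypothetical and not GCC APIs, the element types are fixed to 8-bit inputs and 16-bit results for illustration, and it assumes that "low" means lanes [0, N/2) and "high" means lanes [N/2, N), which on a real target depends on how lanes are numbered.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar reference model. All names here are hypothetical, not GCC APIs.
using narrow_vec = std::vector<std::uint8_t>;
using wide_vec = std::vector<std::uint16_t>;

// Widen `count` lanes of a and b, starting at lane `first` and advancing by
// `step`, and add them in the wider type so that no carry is lost.
wide_vec widen_plus_lanes(const narrow_vec &a, const narrow_vec &b,
                          std::size_t first, std::size_t step,
                          std::size_t count) {
  assert(a.size() == b.size());
  wide_vec result;
  result.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    std::size_t lane = first + i * step;
    assert(lane < a.size());
    result.push_back(static_cast<std::uint16_t>(a[lane] + b[lane]));
  }
  return result;
}

// Full form: N inputs give N widened results.
wide_vec widen_plus(const narrow_vec &a, const narrow_vec &b) {
  return widen_plus_lanes(a, b, 0, 1, a.size());
}

// Assumption: "low" is lanes [0, N/2) and "high" is lanes [N/2, N).
wide_vec widen_plus_lo(const narrow_vec &a, const narrow_vec &b) {
  return widen_plus_lanes(a, b, 0, 1, a.size() / 2);
}
wide_vec widen_plus_hi(const narrow_vec &a, const narrow_vec &b) {
  return widen_plus_lanes(a, b, a.size() / 2, 1, a.size() / 2);
}

// Even lanes are 0, 2, 4, ...; odd lanes are 1, 3, 5, ...
wide_vec widen_plus_even(const narrow_vec &a, const narrow_vec &b) {
  return widen_plus_lanes(a, b, 0, 2, a.size() / 2);
}
wide_vec widen_plus_odd(const narrow_vec &a, const narrow_vec &b) {
  return widen_plus_lanes(a, b, 1, 2, a.size() / 2);
}
```

Under these assumptions, concatenating the `_lo` and `_hi` results, or interleaving the `_even` and `_odd` results, reproduces the output of the full form, which is why the vectorizer can use either pair as a replacement.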
|
@ -90,6 +90,60 @@ lookup_internal_fn (const char *name)
|
|||
return entry ? *entry : IFN_LAST;
|
||||
}
|
||||
|
||||
/* Given an internal_fn IFN that is a widening function, return its
|
||||
corresponding LO and HI internal_fns. */
|
||||
|
||||
extern void
|
||||
lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi)
|
||||
{
|
||||
gcc_assert (widening_fn_p (ifn));
|
||||
|
||||
switch (ifn)
|
||||
{
|
||||
default:
|
||||
gcc_unreachable ();
|
||||
#undef DEF_INTERNAL_FN
|
||||
#undef DEF_INTERNAL_WIDENING_OPTAB_FN
|
||||
#define DEF_INTERNAL_FN(NAME, FLAGS, TYPE)
|
||||
#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T) \
|
||||
case IFN_##NAME: \
|
||||
*lo = internal_fn (IFN_##NAME##_LO); \
|
||||
*hi = internal_fn (IFN_##NAME##_HI); \
|
||||
break;
|
||||
#include "internal-fn.def"
|
||||
#undef DEF_INTERNAL_FN
|
||||
#undef DEF_INTERNAL_WIDENING_OPTAB_FN
|
||||
}
|
||||
}
|
||||
|
||||
/* Given an internal_fn IFN that is a widening function, return its
|
||||
corresponding _EVEN and _ODD internal_fns in *EVEN and *ODD. */
|
||||
|
||||
extern void
|
||||
lookup_evenodd_internal_fn (internal_fn ifn, internal_fn *even,
|
||||
internal_fn *odd)
|
||||
{
|
||||
gcc_assert (widening_fn_p (ifn));
|
||||
|
||||
switch (ifn)
|
||||
{
|
||||
default:
|
||||
gcc_unreachable ();
|
||||
#undef DEF_INTERNAL_FN
|
||||
#undef DEF_INTERNAL_WIDENING_OPTAB_FN
|
||||
#define DEF_INTERNAL_FN(NAME, FLAGS, TYPE)
|
||||
#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T) \
|
||||
case IFN_##NAME: \
|
||||
*even = internal_fn (IFN_##NAME##_EVEN); \
|
||||
*odd = internal_fn (IFN_##NAME##_ODD); \
|
||||
break;
|
||||
#include "internal-fn.def"
|
||||
#undef DEF_INTERNAL_FN
|
||||
#undef DEF_INTERNAL_WIDENING_OPTAB_FN
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/* Fnspec of each internal function, indexed by function number. */
|
||||
const_tree internal_fn_fnspec_array[IFN_LAST + 1];
|
||||
|
||||
|
@ -3852,7 +3906,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
|
|||
|
||||
/* Return the optab used by internal function FN. */
|
||||
|
||||
static optab
|
||||
optab
|
||||
direct_internal_fn_optab (internal_fn fn, tree_pair types)
|
||||
{
|
||||
switch (fn)
|
||||
|
@ -3971,6 +4025,11 @@ commutative_binary_fn_p (internal_fn fn)
|
|||
case IFN_UBSAN_CHECK_MUL:
|
||||
case IFN_ADD_OVERFLOW:
|
||||
case IFN_MUL_OVERFLOW:
|
||||
case IFN_VEC_WIDEN_PLUS:
|
||||
case IFN_VEC_WIDEN_PLUS_LO:
|
||||
case IFN_VEC_WIDEN_PLUS_HI:
|
||||
case IFN_VEC_WIDEN_PLUS_EVEN:
|
||||
case IFN_VEC_WIDEN_PLUS_ODD:
|
||||
return true;
|
||||
|
||||
default:
|
||||
|
@ -4044,6 +4103,37 @@ first_commutative_argument (internal_fn fn)
|
|||
}
|
||||
}
|
||||
|
||||
/* Return true if this CODE describes an internal_fn that returns a vector with
|
||||
elements twice as wide as the element size of the input vectors. */
|
||||
|
||||
bool
|
||||
widening_fn_p (code_helper code)
|
||||
{
|
||||
if (!code.is_fn_code ())
|
||||
return false;
|
||||
|
||||
if (!internal_fn_p ((combined_fn) code))
|
||||
return false;
|
||||
|
||||
internal_fn fn = as_internal_fn ((combined_fn) code);
|
||||
switch (fn)
|
||||
{
|
||||
#undef DEF_INTERNAL_WIDENING_OPTAB_FN
|
||||
#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T) \
|
||||
case IFN_##NAME: \
|
||||
case IFN_##NAME##_HI: \
|
||||
case IFN_##NAME##_LO: \
|
||||
case IFN_##NAME##_EVEN: \
|
||||
case IFN_##NAME##_ODD: \
|
||||
return true;
|
||||
#include "internal-fn.def"
|
||||
#undef DEF_INTERNAL_WIDENING_OPTAB_FN
|
||||
|
||||
default:
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
/* Return true if IFN_SET_EDOM is supported. */
|
||||
|
||||
bool
|
||||
|
@ -4072,6 +4162,8 @@ set_edom_supported_p (void)
|
|||
expand_##TYPE##_optab_fn (fn, stmt, which_optab); \
|
||||
}
|
||||
#include "internal-fn.def"
|
||||
#undef DEF_INTERNAL_OPTAB_FN
|
||||
#undef DEF_INTERNAL_SIGNED_OPTAB_FN
|
||||
|
||||
/* Routines to expand each internal function, indexed by function number.
|
||||
Each routine has the prototype:
|
||||
|
@ -4080,6 +4172,7 @@ set_edom_supported_p (void)
|
|||
|
||||
where STMT is the statement that performs the call. */
|
||||
static void (*const internal_fn_expanders[]) (internal_fn, gcall *) = {
|
||||
|
||||
#define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) expand_##CODE,
|
||||
#include "internal-fn.def"
|
||||
0
|
||||
|
|
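The lookup and predicate functions in internal-fn.cc above are generated by redefining a macro and re-including internal-fn.def. The sketch below reproduces that technique in a single file, assuming a hypothetical list macro `TOY_WIDENING_FNS` in place of the `.def` include; every `toy_` name is invented for illustration. Unlike `lookup_hilo_internal_fn`, which asserts, the toy lookup returns `false` when given something that is not a base widening function.

```cpp
#include <cassert>

// Hypothetical stand-in for the entries that internal-fn.def would provide.
#define TOY_WIDENING_FNS(F) \
  F(WIDEN_PLUS)             \
  F(WIDEN_MINUS)

// First expansion: each entry contributes five enumerators.
#define TOY_DEF_ENUM(NAME) \
  NAME, NAME##_LO, NAME##_HI, NAME##_EVEN, NAME##_ODD,
enum toy_fn {
  TOY_WIDENING_FNS(TOY_DEF_ENUM)
  TOY_OTHER, // some non-widening function
  TOY_LAST
};
#undef TOY_DEF_ENUM

// Second expansion: one group of case labels per entry.
bool toy_widening_fn_p(toy_fn fn) {
  switch (fn) {
#define TOY_DEF_CASES(NAME) \
  case NAME:                \
  case NAME##_LO:           \
  case NAME##_HI:           \
  case NAME##_EVEN:         \
  case NAME##_ODD:          \
    return true;
    TOY_WIDENING_FNS(TOY_DEF_CASES)
#undef TOY_DEF_CASES
  default:
    return false;
  }
}

// Third expansion: map a base function to its _LO and _HI variants.
// Returns false (leaving *lo and *hi untouched) for anything else.
bool toy_lookup_hilo(toy_fn fn, toy_fn *lo, toy_fn *hi) {
  switch (fn) {
#define TOY_DEF_LOOKUP(NAME) \
  case NAME:                 \
    *lo = NAME##_LO;         \
    *hi = NAME##_HI;         \
    return true;
    TOY_WIDENING_FNS(TOY_DEF_LOOKUP)
#undef TOY_DEF_LOOKUP
  default:
    return false;
  }
}
```

The benefit of this pattern is that adding a new widening operation only requires one new entry in the list; the enumerators, the predicate and the lookup all pick it up on the next build.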
|
@ -85,6 +85,21 @@ along with GCC; see the file COPYING3. If not see
|
|||
says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX}
|
||||
group of functions to any integral mode (including vector modes).
|
||||
|
||||
DEF_INTERNAL_WIDENING_OPTAB_FN is a wrapper that defines five internal
|
||||
functions with DEF_INTERNAL_SIGNED_OPTAB_FN:
|
||||
- one that describes a widening operation with the same number of elements
|
||||
in the output and input vectors,
|
||||
- two that describe a pair of high-low widening operations where the output
|
||||
vectors each have half the number of elements of the input vectors,
|
||||
corresponding to the result of the widening operation on the top half and
|
||||
bottom half, these have the suffixes _HI and _LO,
|
||||
- and two that describe a pair of even-odd widening operations where the
|
||||
output vectors each have half the number of elements of the input vectors,
|
||||
corresponding to the result of the widening operation on the even and odd
|
||||
elements, these have the suffixes _EVEN and _ODD.
|
||||
These five internal functions will require two optabs each, a SIGNED_OPTAB
|
||||
and an UNSIGNED_OPTAB.
|
||||
|
||||
Each entry must have a corresponding expander of the form:
|
||||
|
||||
void expand_NAME (gimple_call stmt)
|
||||
|
@ -123,6 +138,15 @@ along with GCC; see the file COPYING3. If not see
|
|||
DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
|
||||
#endif
|
||||
|
||||
#ifndef DEF_INTERNAL_WIDENING_OPTAB_FN
|
||||
#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE) \
|
||||
DEF_INTERNAL_SIGNED_OPTAB_FN (NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE) \
|
||||
DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _LO, FLAGS, SELECTOR, SOPTAB##_lo, UOPTAB##_lo, TYPE) \
|
||||
DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _HI, FLAGS, SELECTOR, SOPTAB##_hi, UOPTAB##_hi, TYPE) \
|
||||
DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _EVEN, FLAGS, SELECTOR, SOPTAB##_even, UOPTAB##_even, TYPE) \
|
||||
DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _ODD, FLAGS, SELECTOR, SOPTAB##_odd, UOPTAB##_odd, TYPE)
|
||||
#endif
|
||||
|
||||
DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
|
||||
DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
|
||||
DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
|
||||
|
@ -315,6 +339,16 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
|
|||
DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
|
||||
DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
|
||||
DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
|
||||
DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_PLUS,
|
||||
ECF_CONST | ECF_NOTHROW,
|
||||
first,
|
||||
vec_widen_sadd, vec_widen_uadd,
|
||||
binary)
|
||||
DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_MINUS,
|
||||
ECF_CONST | ECF_NOTHROW,
|
||||
first,
|
||||
vec_widen_ssub, vec_widen_usub,
|
||||
binary)
|
||||
DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
|
||||
DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
|
||||
|
||||
|
|
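The `DEF_INTERNAL_WIDENING_OPTAB_FN` wrapper above relies on token pasting to derive five function names and ten optab names from one entry. The following sketch shows that expansion by stringifying the pasted names into a table. The `toy_` macros and the `toy_entry` struct are hypothetical; only the suffix scheme (`_LO`/`_lo`, `_HI`/`_hi`, `_EVEN`/`_even`, `_ODD`/`_odd`) and the `VEC_WIDEN_PLUS` arguments are taken from the diff.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One row per generated internal function (hypothetical structure).
struct toy_entry {
  std::string ifn;    // internal function name
  std::string soptab; // signed optab name
  std::string uoptab; // unsigned optab name
};

// Stand-in for DEF_INTERNAL_SIGNED_OPTAB_FN: record the names as strings.
#define TOY_SIGNED_FN(NAME, SOPTAB, UOPTAB) {"IFN_" #NAME, #SOPTAB, #UOPTAB},

// Stand-in for DEF_INTERNAL_WIDENING_OPTAB_FN: paste the suffixes and
// forward each result to the signed-optab macro.
#define TOY_WIDENING_FN(NAME, SOPTAB, UOPTAB)                  \
  TOY_SIGNED_FN(NAME, SOPTAB, UOPTAB)                          \
  TOY_SIGNED_FN(NAME##_LO, SOPTAB##_lo, UOPTAB##_lo)           \
  TOY_SIGNED_FN(NAME##_HI, SOPTAB##_hi, UOPTAB##_hi)           \
  TOY_SIGNED_FN(NAME##_EVEN, SOPTAB##_even, UOPTAB##_even)     \
  TOY_SIGNED_FN(NAME##_ODD, SOPTAB##_odd, UOPTAB##_odd)

// Build the table for the widening-plus entry from the diff.
std::vector<toy_entry> toy_table() {
  return {TOY_WIDENING_FN(VEC_WIDEN_PLUS, vec_widen_sadd, vec_widen_uadd)};
}
```

Calling `toy_table()` yields five rows, from `IFN_VEC_WIDEN_PLUS` to `IFN_VEC_WIDEN_PLUS_ODD`, whose optab names line up with the `vec_widen_sadd_*` and `vec_widen_uadd_*` entries added to optabs.def once the `_optab` suffix is appended.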
|
@ -20,6 +20,10 @@ along with GCC; see the file COPYING3. If not see
|
|||
#ifndef GCC_INTERNAL_FN_H
|
||||
#define GCC_INTERNAL_FN_H
|
||||
|
||||
#include "insn-codes.h"
|
||||
#include "insn-opinit.h"
|
||||
|
||||
|
||||
/* INTEGER_CST values for IFN_UNIQUE function arg-0.
|
||||
|
||||
UNSPEC: Undifferentiated UNIQUE.
|
||||
|
@ -112,6 +116,10 @@ internal_fn_name (enum internal_fn fn)
|
|||
}
|
||||
|
||||
extern internal_fn lookup_internal_fn (const char *);
|
||||
extern void lookup_hilo_internal_fn (internal_fn, internal_fn *, internal_fn *);
|
||||
extern void lookup_evenodd_internal_fn (internal_fn, internal_fn *,
|
||||
internal_fn *);
|
||||
extern optab direct_internal_fn_optab (internal_fn, tree_pair);
|
||||
|
||||
/* Return the ECF_* flags for function FN. */
|
||||
|
||||
|
@ -210,6 +218,7 @@ extern bool commutative_binary_fn_p (internal_fn);
|
|||
extern bool commutative_ternary_fn_p (internal_fn);
|
||||
extern int first_commutative_argument (internal_fn);
|
||||
extern bool associative_binary_fn_p (internal_fn);
|
||||
extern bool widening_fn_p (code_helper);
|
||||
|
||||
extern bool set_edom_supported_p (void);
|
||||
|
||||
|
|
|
@ -1314,7 +1314,17 @@ commutative_optab_p (optab binoptab)
|
|||
|| binoptab == smul_widen_optab
|
||||
|| binoptab == umul_widen_optab
|
||||
|| binoptab == smul_highpart_optab
|
||||
|| binoptab == umul_highpart_optab);
|
||||
|| binoptab == umul_highpart_optab
|
||||
|| binoptab == vec_widen_sadd_optab
|
||||
|| binoptab == vec_widen_uadd_optab
|
||||
|| binoptab == vec_widen_sadd_hi_optab
|
||||
|| binoptab == vec_widen_sadd_lo_optab
|
||||
|| binoptab == vec_widen_uadd_hi_optab
|
||||
|| binoptab == vec_widen_uadd_lo_optab
|
||||
|| binoptab == vec_widen_sadd_even_optab
|
||||
|| binoptab == vec_widen_sadd_odd_optab
|
||||
|| binoptab == vec_widen_uadd_even_optab
|
||||
|| binoptab == vec_widen_uadd_odd_optab);
|
||||
}
|
||||
|
||||
/* X is to be used in mode MODE as operand OPN to BINOPTAB. If we're
|
||||
|
|
|
@ -410,6 +410,16 @@ OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a")
|
|||
OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a")
|
||||
OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a")
|
||||
OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a")
|
||||
OPTAB_D (vec_widen_ssub_optab, "vec_widen_ssub_$a")
|
||||
OPTAB_D (vec_widen_ssub_hi_optab, "vec_widen_ssub_hi_$a")
|
||||
OPTAB_D (vec_widen_ssub_lo_optab, "vec_widen_ssub_lo_$a")
|
||||
OPTAB_D (vec_widen_ssub_odd_optab, "vec_widen_ssub_odd_$a")
|
||||
OPTAB_D (vec_widen_ssub_even_optab, "vec_widen_ssub_even_$a")
|
||||
OPTAB_D (vec_widen_sadd_optab, "vec_widen_sadd_$a")
|
||||
OPTAB_D (vec_widen_sadd_hi_optab, "vec_widen_sadd_hi_$a")
|
||||
OPTAB_D (vec_widen_sadd_lo_optab, "vec_widen_sadd_lo_$a")
|
||||
OPTAB_D (vec_widen_sadd_odd_optab, "vec_widen_sadd_odd_$a")
|
||||
OPTAB_D (vec_widen_sadd_even_optab, "vec_widen_sadd_even_$a")
|
||||
OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a")
|
||||
OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a")
|
||||
OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a")
|
||||
|
@ -422,6 +432,16 @@ OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a")
|
|||
OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a")
|
||||
OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a")
|
||||
OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a")
|
||||
OPTAB_D (vec_widen_usub_optab, "vec_widen_usub_$a")
|
||||
OPTAB_D (vec_widen_usub_hi_optab, "vec_widen_usub_hi_$a")
|
||||
OPTAB_D (vec_widen_usub_lo_optab, "vec_widen_usub_lo_$a")
|
||||
OPTAB_D (vec_widen_usub_odd_optab, "vec_widen_usub_odd_$a")
|
||||
OPTAB_D (vec_widen_usub_even_optab, "vec_widen_usub_even_$a")
|
||||
OPTAB_D (vec_widen_uadd_optab, "vec_widen_uadd_$a")
|
||||
OPTAB_D (vec_widen_uadd_hi_optab, "vec_widen_uadd_hi_$a")
|
||||
OPTAB_D (vec_widen_uadd_lo_optab, "vec_widen_uadd_lo_$a")
|
||||
OPTAB_D (vec_widen_uadd_odd_optab, "vec_widen_uadd_odd_$a")
|
||||
OPTAB_D (vec_widen_uadd_even_optab, "vec_widen_uadd_even_$a")
|
||||
OPTAB_D (vec_addsub_optab, "vec_addsub$a3")
|
||||
OPTAB_D (vec_fmaddsub_optab, "vec_fmaddsub$a4")
|
||||
OPTAB_D (vec_fmsubadd_optab, "vec_fmsubadd$a4")
|
||||
|
|
|
@ -1,5 +1,5 @@
|
|||
/* { dg-do run } */
|
||||
/* { dg-options "-O3 -save-temps" } */
|
||||
/* { dg-options "-O3 -save-temps -fdump-tree-vect-details" } */
|
||||
#include <stdint.h>
|
||||
#include <string.h>
|
||||
|
||||
|
@ -86,6 +86,8 @@ main()
|
|||
return 0;
|
||||
}
|
||||
|
||||
/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect" } } */
|
||||
/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect" } } */
|
||||
/* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */
|
||||
/* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */
|
||||
/* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */
|
||||
|
|
|
@ -1,5 +1,5 @@
|
|||
/* { dg-do run } */
|
||||
/* { dg-options "-O3 -save-temps" } */
|
||||
/* { dg-options "-O3 -save-temps -fdump-tree-vect-details" } */
|
||||
#include <stdint.h>
|
||||
#include <string.h>
|
||||
|
||||
|
@ -86,6 +86,8 @@ main()
|
|||
return 0;
|
||||
}
|
||||
|
||||
/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect" } } */
|
||||
/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect" } } */
|
||||
/* { dg-final { scan-assembler-times {\tusubl\t} 1} } */
|
||||
/* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */
|
||||
/* { dg-final { scan-assembler-times {\tssubl\t} 1} } */
|
||||
|
|
|
@ -562,21 +562,30 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
|
|||
|
||||
static unsigned int
|
||||
vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
|
||||
tree_code widened_code, bool shift_p,
|
||||
code_helper widened_code, bool shift_p,
|
||||
unsigned int max_nops,
|
||||
vect_unpromoted_value *unprom, tree *common_type,
|
||||
enum optab_subtype *subtype = NULL)
|
||||
{
|
||||
/* Check for an integer operation with the right code. */
|
||||
gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
|
||||
if (!assign)
|
||||
gimple* stmt = stmt_info->stmt;
|
||||
if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
|
||||
return 0;
|
||||
|
||||
tree_code rhs_code = gimple_assign_rhs_code (assign);
|
||||
if (rhs_code != code && rhs_code != widened_code)
|
||||
code_helper rhs_code;
|
||||
if (is_gimple_assign (stmt))
|
||||
rhs_code = gimple_assign_rhs_code (stmt);
|
||||
else if (is_gimple_call (stmt))
|
||||
rhs_code = gimple_call_combined_fn (stmt);
|
||||
else
|
||||
return 0;
|
||||
|
||||
tree type = TREE_TYPE (gimple_assign_lhs (assign));
|
||||
if (rhs_code != code
|
||||
&& rhs_code != widened_code)
|
||||
return 0;
|
||||
|
||||
tree lhs = gimple_get_lhs (stmt);
|
||||
tree type = TREE_TYPE (lhs);
|
||||
if (!INTEGRAL_TYPE_P (type))
|
||||
return 0;
|
||||
|
||||
|
@ -589,7 +598,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
|
|||
{
|
||||
vect_unpromoted_value *this_unprom = &unprom[next_op];
|
||||
unsigned int nops = 1;
|
||||
tree op = gimple_op (assign, i + 1);
|
||||
tree op = gimple_arg (stmt, i);
|
||||
if (i == 1 && TREE_CODE (op) == INTEGER_CST)
|
||||
{
|
||||
/* We already have a common type from earlier operands.
|
||||
|
@ -1343,7 +1352,8 @@ vect_recog_sad_pattern (vec_info *vinfo,
|
|||
/* FORNOW. Can continue analyzing the def-use chain when this stmt in a phi
|
||||
inside the loop (in case we are analyzing an outer-loop). */
|
||||
vect_unpromoted_value unprom[2];
|
||||
if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR,
|
||||
if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
|
||||
IFN_VEC_WIDEN_MINUS,
|
||||
false, 2, unprom, &half_type))
|
||||
return NULL;
|
||||
|
||||
|
@ -1395,14 +1405,16 @@ static gimple *
|
|||
vect_recog_widen_op_pattern (vec_info *vinfo,
|
||||
stmt_vec_info last_stmt_info, tree *type_out,
|
||||
tree_code orig_code, code_helper wide_code,
|
||||
bool shift_p, const char *name)
|
||||
bool shift_p, const char *name,
|
||||
optab_subtype *subtype = NULL)
|
||||
{
|
||||
gimple *last_stmt = last_stmt_info->stmt;
|
||||
|
||||
vect_unpromoted_value unprom[2];
|
||||
tree half_type;
|
||||
if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code,
|
||||
shift_p, 2, unprom, &half_type))
|
||||
shift_p, 2, unprom, &half_type, subtype))
|
||||
|
||||
return NULL;
|
||||
|
||||
/* Pattern detected. */
|
||||
|
@ -1468,6 +1480,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
|
|||
type, pattern_stmt, vecctype);
|
||||
}
|
||||
|
||||
static gimple *
|
||||
vect_recog_widen_op_pattern (vec_info *vinfo,
|
||||
stmt_vec_info last_stmt_info, tree *type_out,
|
||||
tree_code orig_code, internal_fn wide_ifn,
|
||||
bool shift_p, const char *name,
|
||||
optab_subtype *subtype = NULL)
|
||||
{
|
||||
combined_fn ifn = as_combined_fn (wide_ifn);
|
||||
return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
|
||||
orig_code, ifn, shift_p, name,
|
||||
subtype);
|
||||
}
|
||||
|
||||
|
||||
/* Try to detect multiplication on widened inputs, converting MULT_EXPR
|
||||
to WIDEN_MULT_EXPR. See vect_recog_widen_op_pattern for details. */
|
||||
|
||||
|
@ -1481,26 +1507,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
|
|||
}
|
||||
|
||||
/* Try to detect addition on widened inputs, converting PLUS_EXPR
|
||||
to WIDEN_PLUS_EXPR. See vect_recog_widen_op_pattern for details. */
|
||||
to IFN_VEC_WIDEN_PLUS. See vect_recog_widen_op_pattern for details. */
|
||||
|
||||
static gimple *
|
||||
vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
|
||||
tree *type_out)
|
||||
{
|
||||
optab_subtype subtype;
|
||||
return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
|
||||
PLUS_EXPR, WIDEN_PLUS_EXPR, false,
|
||||
"vect_recog_widen_plus_pattern");
|
||||
PLUS_EXPR, IFN_VEC_WIDEN_PLUS,
|
||||
false, "vect_recog_widen_plus_pattern",
|
||||
&subtype);
|
||||
}
|
||||
|
||||
/* Try to detect subtraction on widened inputs, converting MINUS_EXPR
|
||||
to WIDEN_MINUS_EXPR. See vect_recog_widen_op_pattern for details. */
|
||||
to IFN_VEC_WIDEN_MINUS. See vect_recog_widen_op_pattern for details. */
|
||||
static gimple *
|
||||
vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
|
||||
tree *type_out)
|
||||
{
|
||||
optab_subtype subtype;
|
||||
return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
|
||||
MINUS_EXPR, WIDEN_MINUS_EXPR, false,
|
||||
"vect_recog_widen_minus_pattern");
|
||||
MINUS_EXPR, IFN_VEC_WIDEN_MINUS,
|
||||
false, "vect_recog_widen_minus_pattern",
|
||||
&subtype);
|
||||
}
|
||||
|
||||
/* Function vect_recog_ctz_ffs_pattern
|
||||
|
@ -3078,7 +3108,7 @@ vect_recog_average_pattern (vec_info *vinfo,
|
|||
vect_unpromoted_value unprom[3];
|
||||
tree new_type;
|
||||
unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR,
|
||||
WIDEN_PLUS_EXPR, false, 3,
|
||||
IFN_VEC_WIDEN_PLUS, false, 3,
|
||||
unprom, &new_type);
|
||||
if (nops == 0)
|
||||
return NULL;
|
||||
|
@ -6469,6 +6499,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
|
|||
{ vect_recog_mask_conversion_pattern, "mask_conversion" },
|
||||
{ vect_recog_widen_plus_pattern, "widen_plus" },
|
||||
{ vect_recog_widen_minus_pattern, "widen_minus" },
|
||||
/* These must come after the double widening ones. */
|
||||
};
|
||||
|
||||
const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
|
||||
|
|
|
@ -5051,7 +5051,8 @@ vectorizable_conversion (vec_info *vinfo,
|
|||
bool widen_arith = (code == WIDEN_PLUS_EXPR
|
||||
|| code == WIDEN_MINUS_EXPR
|
||||
|| code == WIDEN_MULT_EXPR
|
||||
|| code == WIDEN_LSHIFT_EXPR);
|
||||
|| code == WIDEN_LSHIFT_EXPR
|
||||
|| widening_fn_p (code));
|
||||
|
||||
if (!widen_arith
|
||||
&& !CONVERT_EXPR_CODE_P (code)
|
||||
|
@ -5101,8 +5102,8 @@ vectorizable_conversion (vec_info *vinfo,
|
|||
gcc_assert (code == WIDEN_MULT_EXPR
|
||||
|| code == WIDEN_LSHIFT_EXPR
|
||||
|| code == WIDEN_PLUS_EXPR
|
||||
|| code == WIDEN_MINUS_EXPR);
|
||||
|
||||
|| code == WIDEN_MINUS_EXPR
|
||||
|| widening_fn_p (code));
|
||||
|
||||
op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
|
||||
gimple_call_arg (stmt, 0);
|
||||
|
@ -12574,26 +12575,69 @@ supportable_widening_operation (vec_info *vinfo,
|
|||
optab1 = vec_unpacks_sbool_lo_optab;
|
||||
optab2 = vec_unpacks_sbool_hi_optab;
|
||||
}
|
||||
else
|
||||
|
||||
vec_mode = TYPE_MODE (vectype);
|
||||
if (widening_fn_p (code))
|
||||
{
|
||||
/* If this is an internal fn then we must check whether the target
|
||||
supports either a low-high split or an even-odd split. */
|
||||
internal_fn ifn = as_internal_fn ((combined_fn) code);
|
||||
|
||||
internal_fn lo, hi, even, odd;
|
||||
lookup_hilo_internal_fn (ifn, &lo, &hi);
|
||||
*code1 = as_combined_fn (lo);
|
||||
*code2 = as_combined_fn (hi);
|
||||
optab1 = direct_internal_fn_optab (lo, {vectype, vectype});
|
||||
optab2 = direct_internal_fn_optab (hi, {vectype, vectype});
|
||||
|
||||
/* If we don't support low-high, then check for even-odd. */
|
||||
if (!optab1
|
||||
|| (icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing
|
||||
|| !optab2
|
||||
|| (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
|
||||
{
|
||||
lookup_evenodd_internal_fn (ifn, &even, &odd);
|
||||
*code1 = as_combined_fn (even);
|
||||
*code2 = as_combined_fn (odd);
|
||||
optab1 = direct_internal_fn_optab (even, {vectype, vectype});
|
||||
optab2 = direct_internal_fn_optab (odd, {vectype, vectype});
|
||||
}
|
||||
}
|
||||
else if (code.is_tree_code ())
|
||||
{
|
||||
optab1 = optab_for_tree_code (c1, vectype, optab_default);
|
||||
optab2 = optab_for_tree_code (c2, vectype, optab_default);
|
||||
if (code == FIX_TRUNC_EXPR)
|
||||
{
|
||||
/* The signedness is determined from output operand. */
|
||||
optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
|
||||
optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
|
||||
}
|
||||
else if (CONVERT_EXPR_CODE_P ((tree_code) code.safe_as_tree_code ())
|
||||
&& VECTOR_BOOLEAN_TYPE_P (wide_vectype)
|
||||
&& VECTOR_BOOLEAN_TYPE_P (vectype)
|
||||
&& TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
|
||||
&& SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
|
||||
{
|
||||
/* If the input and result modes are the same, a different optab
|
||||
is needed where we pass in the number of units in vectype. */
|
||||
optab1 = vec_unpacks_sbool_lo_optab;
|
||||
optab2 = vec_unpacks_sbool_hi_optab;
|
||||
}
|
||||
else
|
||||
{
|
||||
optab1 = optab_for_tree_code (c1, vectype, optab_default);
|
||||
optab2 = optab_for_tree_code (c2, vectype, optab_default);
|
||||
}
|
||||
*code1 = c1;
|
||||
*code2 = c2;
|
||||
}
|
||||
|
||||
if (!optab1 || !optab2)
|
||||
return false;
|
||||
|
||||
vec_mode = TYPE_MODE (vectype);
|
||||
if ((icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing
|
||||
|| (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
|
||||
return false;
|
||||
|
||||
if (code.is_tree_code ())
|
||||
{
|
||||
*code1 = c1;
|
||||
*code2 = c2;
|
||||
}
|
||||
|
||||
|
||||
if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
|
||||
&& insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
|
||||
|
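The new branch in `supportable_widening_operation` prefers a low-high split and falls back to an even-odd split when the target lacks either half of the former. A minimal model of that decision, assuming a target can be described as a plain set of implemented optab names, might look as follows. `toy_target`, `toy_split` and `toy_choose_split` are hypothetical names; the real code queries `optab_handler` for a specific vector mode instead.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical model: a target is the set of optab names it implements.
using toy_target = std::set<std::string>;

enum class toy_split { none, hilo, evenodd };

// Prefer the lo/hi pair; fall back to even/odd; otherwise report no support.
// Both halves of a pair must be present for that pair to be usable.
toy_split toy_choose_split(const toy_target &target, const std::string &base) {
  if (target.count(base + "_lo") && target.count(base + "_hi"))
    return toy_split::hilo;
  if (target.count(base + "_even") && target.count(base + "_odd"))
    return toy_split::evenodd;
  return toy_split::none;
}
```

For a target that, like the aarch64 patterns in this commit, provides `vec_widen_uadd_lo` and `vec_widen_uadd_hi`, the model picks `toy_split::hilo`; a target with only one half of a pair gets `toy_split::none`, mirroring the `return false` paths in the real function.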
|
11
gcc/tree.def
|
@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3)
|
|||
DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2)
|
||||
|
||||
/* Widening sad (sum of absolute differences).
|
||||
The first two arguments are of type t1 which should be integer.
|
||||
The third argument and the result are of type t2, such that t2 is at least
|
||||
twice the size of t1. Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
|
||||
The first two arguments are of type t1 which should be a vector of integers.
|
||||
The third argument and the result are of type t2, such that the size of
|
||||
the elements of t2 is at least twice the size of the elements of t1.
|
||||
Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
|
||||
equivalent to:
|
||||
tmp = WIDEN_MINUS_EXPR (arg1, arg2)
|
||||
tmp = IFN_VEC_WIDEN_MINUS (arg1, arg2)
|
||||
tmp2 = ABS_EXPR (tmp)
|
||||
arg3 = PLUS_EXPR (tmp2, arg3)
|
||||
or:
|
||||
tmp = WIDEN_MINUS_EXPR (arg1, arg2)
|
||||
tmp = IFN_VEC_WIDEN_MINUS (arg1, arg2)
|
||||
tmp2 = ABS_EXPR (tmp)
|
||||
arg3 = WIDEN_SUM_EXPR (tmp2, arg3)
|
||||
*/
|
||||
|
|