internal-fn,vect: Refactor widen_plus as internal_fn

DEF_INTERNAL_WIDENING_OPTAB_FN and DEF_INTERNAL_NARROWING_OPTAB_FN
are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN
respectively, except that they provide convenience wrappers for a
single vector-to-vector conversion, a hi/lo split or an even/odd
split.  Each definition for <NAME> requires either signed optabs
named <UOPTAB> and <SOPTAB> (for widening) or a single <OPTAB> (for
narrowing) for each of the five functions it creates.

For example, for widening addition the
DEF_INTERNAL_WIDENING_OPTAB_FN will create five internal functions:
IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO,
IFN_VEC_WIDEN_PLUS_EVEN and IFN_VEC_WIDEN_PLUS_ODD, each requiring two
optabs, one for signed and one for unsigned.
AArch64 implements the hi/lo split optabs:
      IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>add_hi_<mode> -> (u/s)addl2
      IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>add_lo_<mode> -> (u/s)addl

This gives the same functionality as the previous
WIDEN_PLUS/WIDEN_MINUS tree codes which are expanded into
VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.

2023-06-05  Andre Vieira  <andre.simoesdiasvieira@arm.com>
	    Joel Hutton  <joel.hutton@arm.com>
	    Tamar Christina  <tamar.christina@arm.com>

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>): Rename
	this ...
	(vec_widen_<su>add_lo_<mode>): ... to this.
	(vec_widen_<su>addl_hi_<mode>): Rename this ...
	(vec_widen_<su>add_hi_<mode>): ... to this.
	(vec_widen_<su>subl_lo_<mode>): Rename this ...
	(vec_widen_<su>sub_lo_<mode>): ... to this.
	(vec_widen_<su>subl_hi_<mode>): Rename this ...
	(vec_widen_<su>sub_hi_<mode>): ... to this.
	* doc/generic.texi: Document new IFN codes.
	* internal-fn.cc (lookup_hilo_internal_fn): Add lookup function.
	(commutative_binary_fn_p): Add widen_plus fn's.
	(widening_fn_p): New function.
	(narrowing_fn_p): New function.
	(direct_internal_fn_optab): Change visibility.
	* internal-fn.def (DEF_INTERNAL_WIDENING_OPTAB_FN): Macro to define an
	internal_fn that expands into multiple internal_fns for widening.
	(IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO,
	IFN_VEC_WIDEN_PLUS_EVEN, IFN_VEC_WIDEN_PLUS_ODD,
	IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_HI,
	IFN_VEC_WIDEN_MINUS_LO, IFN_VEC_WIDEN_MINUS_ODD,
	IFN_VEC_WIDEN_MINUS_EVEN): Define widening plus, minus functions.
	* internal-fn.h (direct_internal_fn_optab): Declare new prototype.
	(lookup_hilo_internal_fn): Likewise.
	(widening_fn_p): Likewise.
	(narrowing_fn_p): Likewise.
	* optabs.cc (commutative_optab_p): Add widening plus optabs.
	* optabs.def (OPTAB_D): Define widen add, sub optabs.
	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
	patterns with a hi/lo or even/odd split.
	(vect_recog_sad_pattern): Refactor to use new IFN codes.
	(vect_recog_widen_plus_pattern): Likewise.
	(vect_recog_widen_minus_pattern): Likewise.
	(vect_recog_average_pattern): Likewise.
	* tree-vect-stmts.cc (vectorizable_conversion): Add support for
	_HILO IFNs.
	(supportable_widening_operation): Likewise.
	* tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vect-widen-add.c: Test that new
	IFN_VEC_WIDEN_PLUS is being used.
	* gcc.target/aarch64/vect-widen-sub.c: Test that new
	IFN_VEC_WIDEN_MINUS is being used.
Andre Vieira 2023-06-05 17:53:10 +01:00
parent fe29963d40
commit 2f482a0736
12 changed files with 375 additions and 47 deletions


@ -4698,7 +4698,7 @@
[(set_attr "type" "neon_<ADDSUB:optab>_long")]
)
(define_expand "vec_widen_<su>addl_lo_<mode>"
(define_expand "vec_widen_<su>add_lo_<mode>"
[(match_operand:<VWIDE> 0 "register_operand")
(ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
(ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@ -4710,7 +4710,7 @@
DONE;
})
(define_expand "vec_widen_<su>addl_hi_<mode>"
(define_expand "vec_widen_<su>add_hi_<mode>"
[(match_operand:<VWIDE> 0 "register_operand")
(ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
(ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@ -4722,7 +4722,7 @@
DONE;
})
(define_expand "vec_widen_<su>subl_lo_<mode>"
(define_expand "vec_widen_<su>sub_lo_<mode>"
[(match_operand:<VWIDE> 0 "register_operand")
(ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
(ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@ -4734,7 +4734,7 @@
DONE;
})
(define_expand "vec_widen_<su>subl_hi_<mode>"
(define_expand "vec_widen_<su>sub_hi_<mode>"
[(match_operand:<VWIDE> 0 "register_operand")
(ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
(ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]


@ -1811,10 +1811,16 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
@tindex VEC_RSHIFT_EXPR
@tindex VEC_WIDEN_MULT_HI_EXPR
@tindex VEC_WIDEN_MULT_LO_EXPR
@tindex VEC_WIDEN_PLUS_HI_EXPR
@tindex VEC_WIDEN_PLUS_LO_EXPR
@tindex VEC_WIDEN_MINUS_HI_EXPR
@tindex VEC_WIDEN_MINUS_LO_EXPR
@tindex IFN_VEC_WIDEN_PLUS
@tindex IFN_VEC_WIDEN_PLUS_HI
@tindex IFN_VEC_WIDEN_PLUS_LO
@tindex IFN_VEC_WIDEN_PLUS_EVEN
@tindex IFN_VEC_WIDEN_PLUS_ODD
@tindex IFN_VEC_WIDEN_MINUS
@tindex IFN_VEC_WIDEN_MINUS_HI
@tindex IFN_VEC_WIDEN_MINUS_LO
@tindex IFN_VEC_WIDEN_MINUS_EVEN
@tindex IFN_VEC_WIDEN_MINUS_ODD
@tindex VEC_UNPACK_HI_EXPR
@tindex VEC_UNPACK_LO_EXPR
@tindex VEC_UNPACK_FLOAT_HI_EXPR
@ -1861,6 +1867,82 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the
low @code{N/2} elements of the two vectors are multiplied to produce the
vector of @code{N/2} products.
@item IFN_VEC_WIDEN_PLUS
This internal function represents widening vector addition of two input
vectors. Its operands are vectors that contain the same number of elements
(@code{N}) of the same integral type. The result is a vector that contains
the same number (@code{N}) of elements, of an integral type twice as wide as
that of the input vectors. If the current target does not implement the
corresponding optabs, the vectorizer may choose to split it into either a pair
of @code{IFN_VEC_WIDEN_PLUS_HI} and @code{IFN_VEC_WIDEN_PLUS_LO} or
@code{IFN_VEC_WIDEN_PLUS_EVEN} and @code{IFN_VEC_WIDEN_PLUS_ODD}, depending
on which optabs the target implements.
@item IFN_VEC_WIDEN_PLUS_HI
@itemx IFN_VEC_WIDEN_PLUS_LO
These internal functions represent widening vector addition of the high and low
parts of the two input vectors, respectively. Their operands are vectors that
contain the same number of elements (@code{N}) of the same integral type. The
result is a vector that contains half as many elements, of an integral type
whose size is twice as wide. In the case of @code{IFN_VEC_WIDEN_PLUS_HI} the
high @code{N/2} elements of the two vectors are added to produce the vector of
@code{N/2} additions. In the case of @code{IFN_VEC_WIDEN_PLUS_LO} the low
@code{N/2} elements of the two vectors are added to produce the vector of
@code{N/2} additions.
@item IFN_VEC_WIDEN_PLUS_EVEN
@itemx IFN_VEC_WIDEN_PLUS_ODD
These internal functions represent widening vector addition of the even and odd
elements of the two input vectors, respectively. Their operands are vectors
that contain the same number of elements (@code{N}) of the same integral type.
The result is a vector that contains half as many elements, of an integral type
whose size is twice as wide. In the case of @code{IFN_VEC_WIDEN_PLUS_EVEN} the
even @code{N/2} elements of the two vectors are added to produce the vector of
@code{N/2} additions. In the case of @code{IFN_VEC_WIDEN_PLUS_ODD} the odd
@code{N/2} elements of the two vectors are added to produce the vector of
@code{N/2} additions.
@item IFN_VEC_WIDEN_MINUS
This internal function represents widening vector subtraction of two input
vectors. Its operands are vectors that contain the same number of elements
(@code{N}) of the same integral type. The result is a vector that contains
the same number (@code{N}) of elements, of an integral type twice as wide as
that of the input vectors. If the current target does not implement the
corresponding optabs, the vectorizer may choose to split it into either a pair
of @code{IFN_VEC_WIDEN_MINUS_HI} and @code{IFN_VEC_WIDEN_MINUS_LO} or
@code{IFN_VEC_WIDEN_MINUS_EVEN} and @code{IFN_VEC_WIDEN_MINUS_ODD}, depending
on which optabs the target implements.
@item IFN_VEC_WIDEN_MINUS_HI
@itemx IFN_VEC_WIDEN_MINUS_LO
These internal functions represent widening vector subtraction of the high and
low parts of the two input vectors, respectively. Their operands are vectors
that contain the same number of elements (@code{N}) of the same integral type.
The high/low elements of the second vector are subtracted from the high/low
elements of the first. The result is a vector that contains half as many
elements, of an integral type whose size is twice as wide. In the case of
@code{IFN_VEC_WIDEN_MINUS_HI} the high @code{N/2} elements of the second
vector are subtracted from the high @code{N/2} of the first to produce the
vector of @code{N/2} subtractions. In the case of
@code{IFN_VEC_WIDEN_MINUS_LO} the low @code{N/2} elements of the second
vector are subtracted from the low @code{N/2} of the first to produce the
vector of @code{N/2} subtractions.
@item IFN_VEC_WIDEN_MINUS_EVEN
@itemx IFN_VEC_WIDEN_MINUS_ODD
These internal functions represent widening vector subtraction of the even and
odd elements of the two input vectors, respectively. Their operands are vectors
that contain the same number of elements (@code{N}) of the same integral type.
The even/odd elements of the second vector are subtracted from the even/odd
elements of the first. The result is a vector that contains half as many
elements, of an integral type whose size is twice as wide. In the case of
@code{IFN_VEC_WIDEN_MINUS_EVEN} the even @code{N/2} elements of the second
vector are subtracted from the even @code{N/2} of the first to produce the
vector of @code{N/2} subtractions. In the case of
@code{IFN_VEC_WIDEN_MINUS_ODD} the odd @code{N/2} elements of the second
vector are subtracted from the odd @code{N/2} of the first to produce the
vector of @code{N/2} subtractions.
@item VEC_WIDEN_PLUS_HI_EXPR
@itemx VEC_WIDEN_PLUS_LO_EXPR
These nodes represent widening vector addition of the high and low parts of
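
The lane selection described for the _HI/_LO and _EVEN/_ODD variants can be illustrated with a scalar reference model. This is only a sketch under stated assumptions, not GCC code: the names `widen_plus_split`, `narrow_vec`, `half_wide_vec` and `split` are hypothetical, N is fixed at 8, the elements are unsigned 8-bit values, and the "low" part is assumed to be lanes 0 to N/2-1 (the real lane numbering depends on the target).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy model: N = 8 input lanes of uint8_t, widened to uint16_t.
constexpr std::size_t N = 8;
using narrow_vec = std::array<std::uint8_t, N>;
using half_wide_vec = std::array<std::uint16_t, N / 2>;

// Which N/2 input lanes a split operation reads.
enum class split { lo, hi, even, odd };

// Hypothetical helper: widen the selected lanes of A and B, then add them.
half_wide_vec
widen_plus_split (const narrow_vec &a, const narrow_vec &b, split s)
{
  half_wide_vec r{};
  for (std::size_t i = 0; i < N / 2; ++i)
    {
      std::size_t lane = 0;
      switch (s)
        {
        case split::lo:   lane = i; break;             // assumed: lanes 0..N/2-1
        case split::hi:   lane = i + N / 2; break;     // assumed: lanes N/2..N-1
        case split::even: lane = 2 * i; break;         // lanes 0, 2, 4, ...
        case split::odd:  lane = 2 * i + 1; break;     // lanes 1, 3, 5, ...
        }
      // Widen before adding so the sum cannot wrap at 8 bits.
      r[i] = static_cast<std::uint16_t> (std::uint16_t (a[lane])
                                         + std::uint16_t (b[lane]));
    }
  return r;
}
```

In this model each split variant yields N/2 results of twice the element width, and either pair (lo/hi or even/odd) together covers every input lane exactly once.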


@ -90,6 +90,60 @@ lookup_internal_fn (const char *name)
return entry ? *entry : IFN_LAST;
}
/* Given an internal_fn IFN that is a widening function, return its
corresponding LO and HI internal_fns. */
extern void
lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi)
{
gcc_assert (widening_fn_p (ifn));
switch (ifn)
{
default:
gcc_unreachable ();
#undef DEF_INTERNAL_FN
#undef DEF_INTERNAL_WIDENING_OPTAB_FN
#define DEF_INTERNAL_FN(NAME, FLAGS, TYPE)
#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T) \
case IFN_##NAME: \
*lo = internal_fn (IFN_##NAME##_LO); \
*hi = internal_fn (IFN_##NAME##_HI); \
break;
#include "internal-fn.def"
#undef DEF_INTERNAL_FN
#undef DEF_INTERNAL_WIDENING_OPTAB_FN
}
}
/* Given an internal_fn IFN that is a widening function, return its
corresponding _EVEN and _ODD internal_fns in *EVEN and *ODD. */
extern void
lookup_evenodd_internal_fn (internal_fn ifn, internal_fn *even,
internal_fn *odd)
{
gcc_assert (widening_fn_p (ifn));
switch (ifn)
{
default:
gcc_unreachable ();
#undef DEF_INTERNAL_FN
#undef DEF_INTERNAL_WIDENING_OPTAB_FN
#define DEF_INTERNAL_FN(NAME, FLAGS, TYPE)
#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T) \
case IFN_##NAME: \
*even = internal_fn (IFN_##NAME##_EVEN); \
*odd = internal_fn (IFN_##NAME##_ODD); \
break;
#include "internal-fn.def"
#undef DEF_INTERNAL_FN
#undef DEF_INTERNAL_WIDENING_OPTAB_FN
}
}
/* Fnspec of each internal function, indexed by function number. */
const_tree internal_fn_fnspec_array[IFN_LAST + 1];
@ -3852,7 +3906,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
/* Return the optab used by internal function FN. */
static optab
optab
direct_internal_fn_optab (internal_fn fn, tree_pair types)
{
switch (fn)
@ -3971,6 +4025,11 @@ commutative_binary_fn_p (internal_fn fn)
case IFN_UBSAN_CHECK_MUL:
case IFN_ADD_OVERFLOW:
case IFN_MUL_OVERFLOW:
case IFN_VEC_WIDEN_PLUS:
case IFN_VEC_WIDEN_PLUS_LO:
case IFN_VEC_WIDEN_PLUS_HI:
case IFN_VEC_WIDEN_PLUS_EVEN:
case IFN_VEC_WIDEN_PLUS_ODD:
return true;
default:
@ -4044,6 +4103,37 @@ first_commutative_argument (internal_fn fn)
}
}
/* Return true if this CODE describes an internal_fn that returns a vector with
elements twice as wide as the element size of the input vectors. */
bool
widening_fn_p (code_helper code)
{
if (!code.is_fn_code ())
return false;
if (!internal_fn_p ((combined_fn) code))
return false;
internal_fn fn = as_internal_fn ((combined_fn) code);
switch (fn)
{
#undef DEF_INTERNAL_WIDENING_OPTAB_FN
#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, F, S, SO, UO, T) \
case IFN_##NAME: \
case IFN_##NAME##_HI: \
case IFN_##NAME##_LO: \
case IFN_##NAME##_EVEN: \
case IFN_##NAME##_ODD: \
return true;
#include "internal-fn.def"
#undef DEF_INTERNAL_WIDENING_OPTAB_FN
default:
return false;
}
}
/* Return true if IFN_SET_EDOM is supported. */
bool
@ -4072,6 +4162,8 @@ set_edom_supported_p (void)
expand_##TYPE##_optab_fn (fn, stmt, which_optab); \
}
#include "internal-fn.def"
#undef DEF_INTERNAL_OPTAB_FN
#undef DEF_INTERNAL_SIGNED_OPTAB_FN
/* Routines to expand each internal function, indexed by function number.
Each routine has the prototype:
@ -4080,6 +4172,7 @@ set_edom_supported_p (void)
where STMT is the statement that performs the call. */
static void (*const internal_fn_expanders[]) (internal_fn, gcall *) = {
#define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) expand_##CODE,
#include "internal-fn.def"
0
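
The lookup and predicate functions in this file are generated by redefining a macro and re-including internal-fn.def inside a switch. The following standalone sketch shows the same technique with a local list macro in place of the include. All `toy_` and `TOY_` names are hypothetical, and unlike lookup_hilo_internal_fn (which asserts and hits gcc_unreachable) the toy lookup simply returns false for functions that have no _LO/_HI pair.

```cpp
#include <cassert>

// Hypothetical stand-in for the widening entries of internal-fn.def.
#define TOY_WIDENING_FNS(DEF) \
  DEF (VEC_WIDEN_PLUS)        \
  DEF (VEC_WIDEN_MINUS)

// Five enumerators per definition, plus one unrelated function.
#define TOY_ENUM(NAME) \
  IFN_##NAME, IFN_##NAME##_LO, IFN_##NAME##_HI, IFN_##NAME##_EVEN, IFN_##NAME##_ODD,
enum toy_internal_fn { TOY_WIDENING_FNS (TOY_ENUM) IFN_MASK_LOAD, IFN_LAST };
#undef TOY_ENUM

// True for a widening function or any of its four split variants.
bool
toy_widening_fn_p (toy_internal_fn fn)
{
  switch (fn)
    {
#define TOY_CASE(NAME) \
    case IFN_##NAME: case IFN_##NAME##_LO: case IFN_##NAME##_HI: \
    case IFN_##NAME##_EVEN: case IFN_##NAME##_ODD: return true;
    TOY_WIDENING_FNS (TOY_CASE)
#undef TOY_CASE
    default:
      return false;
    }
}

// Map a base widening function to its _LO/_HI pair; false if FN has none.
bool
toy_lookup_hilo (toy_internal_fn fn, toy_internal_fn *lo, toy_internal_fn *hi)
{
  switch (fn)
    {
#define TOY_CASE(NAME) \
    case IFN_##NAME: *lo = IFN_##NAME##_LO; *hi = IFN_##NAME##_HI; return true;
    TOY_WIDENING_FNS (TOY_CASE)
#undef TOY_CASE
    default:
      return false;
    }
}
```

Because every switch is generated from the same list, adding a new widening definition to the list would extend the enum, the predicate and the lookup together.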


@ -85,6 +85,21 @@ along with GCC; see the file COPYING3. If not see
says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX}
group of functions to any integral mode (including vector modes).
DEF_INTERNAL_WIDENING_OPTAB_FN is a wrapper that defines five internal
functions with DEF_INTERNAL_SIGNED_OPTAB_FN:
- one that describes a widening operation with the same number of elements
in the output and input vectors,
- two that describe a pair of high-low widening operations where the output
vectors each have half the number of elements of the input vectors,
corresponding to the result of the widening operation on the top half and
bottom half, these have the suffixes _HI and _LO,
- and two that describe a pair of even-odd widening operations where the
output vectors each have half the number of elements of the input vectors,
corresponding to the result of the widening operation on the even and odd
elements, these have the suffixes _EVEN and _ODD.
These five internal functions will require two optabs each, a SIGNED_OPTAB
and an UNSIGNED_OPTAB.
Each entry must have a corresponding expander of the form:
void expand_NAME (gimple_call stmt)
@ -123,6 +138,15 @@ along with GCC; see the file COPYING3. If not see
DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
#endif
#ifndef DEF_INTERNAL_WIDENING_OPTAB_FN
#define DEF_INTERNAL_WIDENING_OPTAB_FN(NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE) \
DEF_INTERNAL_SIGNED_OPTAB_FN (NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE) \
DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _LO, FLAGS, SELECTOR, SOPTAB##_lo, UOPTAB##_lo, TYPE) \
DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _HI, FLAGS, SELECTOR, SOPTAB##_hi, UOPTAB##_hi, TYPE) \
DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _EVEN, FLAGS, SELECTOR, SOPTAB##_even, UOPTAB##_even, TYPE) \
DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _ODD, FLAGS, SELECTOR, SOPTAB##_odd, UOPTAB##_odd, TYPE)
#endif
DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
@ -315,6 +339,16 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_PLUS,
ECF_CONST | ECF_NOTHROW,
first,
vec_widen_sadd, vec_widen_uadd,
binary)
DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_MINUS,
ECF_CONST | ECF_NOTHROW,
first,
vec_widen_ssub, vec_widen_usub,
binary)
DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
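
How one DEF_INTERNAL_WIDENING_OPTAB_FN entry fans out into five definitions, each with suffixed optab names, can be reproduced in isolation. The sketch below is an assumption-laden model rather than the GCC macros themselves: `toy_fn_entry`, `TOY_SIGNED_OPTAB_FN` and `TOY_WIDENING_OPTAB_FN` are hypothetical, and the FLAGS, SELECTOR and TYPE arguments are dropped to keep it short.

```cpp
#include <cassert>
#include <cstring>

// One row per generated function: its name and its signed/unsigned optabs.
struct toy_fn_entry
{
  const char *name;
  const char *soptab;
  const char *uoptab;
};

// Hypothetical inner macro: emits a single table row.
#define TOY_SIGNED_OPTAB_FN(NAME, SOPTAB, UOPTAB) { #NAME, #SOPTAB, #UOPTAB },

// Wrapper modelled on DEF_INTERNAL_WIDENING_OPTAB_FN: token pasting derives
// the four split variants and their optab names from the base names.
#define TOY_WIDENING_OPTAB_FN(NAME, SOPTAB, UOPTAB) \
  TOY_SIGNED_OPTAB_FN (NAME, SOPTAB, UOPTAB) \
  TOY_SIGNED_OPTAB_FN (NAME##_LO, SOPTAB##_lo, UOPTAB##_lo) \
  TOY_SIGNED_OPTAB_FN (NAME##_HI, SOPTAB##_hi, UOPTAB##_hi) \
  TOY_SIGNED_OPTAB_FN (NAME##_EVEN, SOPTAB##_even, UOPTAB##_even) \
  TOY_SIGNED_OPTAB_FN (NAME##_ODD, SOPTAB##_odd, UOPTAB##_odd)

static const toy_fn_entry toy_fns[] = {
  TOY_WIDENING_OPTAB_FN (VEC_WIDEN_PLUS, vec_widen_sadd, vec_widen_uadd)
};

constexpr unsigned toy_fn_count = sizeof (toy_fns) / sizeof (toy_fns[0]);
```

The row order (base, _LO, _HI, _EVEN, _ODD) follows the order of the DEF_INTERNAL_SIGNED_OPTAB_FN invocations in the macro above.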


@ -20,6 +20,10 @@ along with GCC; see the file COPYING3. If not see
#ifndef GCC_INTERNAL_FN_H
#define GCC_INTERNAL_FN_H
#include "insn-codes.h"
#include "insn-opinit.h"
/* INTEGER_CST values for IFN_UNIQUE function arg-0.
UNSPEC: Undifferentiated UNIQUE.
@ -112,6 +116,10 @@ internal_fn_name (enum internal_fn fn)
}
extern internal_fn lookup_internal_fn (const char *);
extern void lookup_hilo_internal_fn (internal_fn, internal_fn *, internal_fn *);
extern void lookup_evenodd_internal_fn (internal_fn, internal_fn *,
internal_fn *);
extern optab direct_internal_fn_optab (internal_fn, tree_pair);
/* Return the ECF_* flags for function FN. */
@ -210,6 +218,7 @@ extern bool commutative_binary_fn_p (internal_fn);
extern bool commutative_ternary_fn_p (internal_fn);
extern int first_commutative_argument (internal_fn);
extern bool associative_binary_fn_p (internal_fn);
extern bool widening_fn_p (code_helper);
extern bool set_edom_supported_p (void);


@ -1314,7 +1314,17 @@ commutative_optab_p (optab binoptab)
|| binoptab == smul_widen_optab
|| binoptab == umul_widen_optab
|| binoptab == smul_highpart_optab
|| binoptab == umul_highpart_optab);
|| binoptab == umul_highpart_optab
|| binoptab == vec_widen_sadd_optab
|| binoptab == vec_widen_uadd_optab
|| binoptab == vec_widen_sadd_hi_optab
|| binoptab == vec_widen_sadd_lo_optab
|| binoptab == vec_widen_uadd_hi_optab
|| binoptab == vec_widen_uadd_lo_optab
|| binoptab == vec_widen_sadd_even_optab
|| binoptab == vec_widen_sadd_odd_optab
|| binoptab == vec_widen_uadd_even_optab
|| binoptab == vec_widen_uadd_odd_optab);
}
/* X is to be used in mode MODE as operand OPN to BINOPTAB. If we're


@ -410,6 +410,16 @@ OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a")
OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a")
OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a")
OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a")
OPTAB_D (vec_widen_ssub_optab, "vec_widen_ssub_$a")
OPTAB_D (vec_widen_ssub_hi_optab, "vec_widen_ssub_hi_$a")
OPTAB_D (vec_widen_ssub_lo_optab, "vec_widen_ssub_lo_$a")
OPTAB_D (vec_widen_ssub_odd_optab, "vec_widen_ssub_odd_$a")
OPTAB_D (vec_widen_ssub_even_optab, "vec_widen_ssub_even_$a")
OPTAB_D (vec_widen_sadd_optab, "vec_widen_sadd_$a")
OPTAB_D (vec_widen_sadd_hi_optab, "vec_widen_sadd_hi_$a")
OPTAB_D (vec_widen_sadd_lo_optab, "vec_widen_sadd_lo_$a")
OPTAB_D (vec_widen_sadd_odd_optab, "vec_widen_sadd_odd_$a")
OPTAB_D (vec_widen_sadd_even_optab, "vec_widen_sadd_even_$a")
OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a")
OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a")
OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a")
@ -422,6 +432,16 @@ OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a")
OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a")
OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a")
OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a")
OPTAB_D (vec_widen_usub_optab, "vec_widen_usub_$a")
OPTAB_D (vec_widen_usub_hi_optab, "vec_widen_usub_hi_$a")
OPTAB_D (vec_widen_usub_lo_optab, "vec_widen_usub_lo_$a")
OPTAB_D (vec_widen_usub_odd_optab, "vec_widen_usub_odd_$a")
OPTAB_D (vec_widen_usub_even_optab, "vec_widen_usub_even_$a")
OPTAB_D (vec_widen_uadd_optab, "vec_widen_uadd_$a")
OPTAB_D (vec_widen_uadd_hi_optab, "vec_widen_uadd_hi_$a")
OPTAB_D (vec_widen_uadd_lo_optab, "vec_widen_uadd_lo_$a")
OPTAB_D (vec_widen_uadd_odd_optab, "vec_widen_uadd_odd_$a")
OPTAB_D (vec_widen_uadd_even_optab, "vec_widen_uadd_even_$a")
OPTAB_D (vec_addsub_optab, "vec_addsub$a3")
OPTAB_D (vec_fmaddsub_optab, "vec_fmaddsub$a4")
OPTAB_D (vec_fmsubadd_optab, "vec_fmsubadd$a4")


@ -1,5 +1,5 @@
/* { dg-do run } */
/* { dg-options "-O3 -save-temps" } */
/* { dg-options "-O3 -save-temps -fdump-tree-vect-details" } */
#include <stdint.h>
#include <string.h>
@ -86,6 +86,8 @@ main()
return 0;
}
/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect" } } */
/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect" } } */
/* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */
/* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */
/* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */


@ -1,5 +1,5 @@
/* { dg-do run } */
/* { dg-options "-O3 -save-temps" } */
/* { dg-options "-O3 -save-temps -fdump-tree-vect-details" } */
#include <stdint.h>
#include <string.h>
@ -86,6 +86,8 @@ main()
return 0;
}
/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect" } } */
/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect" } } */
/* { dg-final { scan-assembler-times {\tusubl\t} 1} } */
/* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */
/* { dg-final { scan-assembler-times {\tssubl\t} 1} } */


@ -562,21 +562,30 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
static unsigned int
vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
tree_code widened_code, bool shift_p,
code_helper widened_code, bool shift_p,
unsigned int max_nops,
vect_unpromoted_value *unprom, tree *common_type,
enum optab_subtype *subtype = NULL)
{
/* Check for an integer operation with the right code. */
gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
if (!assign)
gimple* stmt = stmt_info->stmt;
if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
return 0;
tree_code rhs_code = gimple_assign_rhs_code (assign);
if (rhs_code != code && rhs_code != widened_code)
code_helper rhs_code;
if (is_gimple_assign (stmt))
rhs_code = gimple_assign_rhs_code (stmt);
else if (is_gimple_call (stmt))
rhs_code = gimple_call_combined_fn (stmt);
else
return 0;
tree type = TREE_TYPE (gimple_assign_lhs (assign));
if (rhs_code != code
&& rhs_code != widened_code)
return 0;
tree lhs = gimple_get_lhs (stmt);
tree type = TREE_TYPE (lhs);
if (!INTEGRAL_TYPE_P (type))
return 0;
@ -589,7 +598,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
{
vect_unpromoted_value *this_unprom = &unprom[next_op];
unsigned int nops = 1;
tree op = gimple_op (assign, i + 1);
tree op = gimple_arg (stmt, i);
if (i == 1 && TREE_CODE (op) == INTEGER_CST)
{
/* We already have a common type from earlier operands.
@ -1343,7 +1352,8 @@ vect_recog_sad_pattern (vec_info *vinfo,
/* FORNOW. Can continue analyzing the def-use chain when this stmt in a phi
inside the loop (in case we are analyzing an outer-loop). */
vect_unpromoted_value unprom[2];
if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR,
if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
IFN_VEC_WIDEN_MINUS,
false, 2, unprom, &half_type))
return NULL;
@ -1395,14 +1405,16 @@ static gimple *
vect_recog_widen_op_pattern (vec_info *vinfo,
stmt_vec_info last_stmt_info, tree *type_out,
tree_code orig_code, code_helper wide_code,
bool shift_p, const char *name)
bool shift_p, const char *name,
optab_subtype *subtype = NULL)
{
gimple *last_stmt = last_stmt_info->stmt;
vect_unpromoted_value unprom[2];
tree half_type;
if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code,
shift_p, 2, unprom, &half_type))
shift_p, 2, unprom, &half_type, subtype))
return NULL;
/* Pattern detected. */
@ -1468,6 +1480,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
type, pattern_stmt, vecctype);
}
static gimple *
vect_recog_widen_op_pattern (vec_info *vinfo,
stmt_vec_info last_stmt_info, tree *type_out,
tree_code orig_code, internal_fn wide_ifn,
bool shift_p, const char *name,
optab_subtype *subtype = NULL)
{
combined_fn ifn = as_combined_fn (wide_ifn);
return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
orig_code, ifn, shift_p, name,
subtype);
}
/* Try to detect multiplication on widened inputs, converting MULT_EXPR
to WIDEN_MULT_EXPR. See vect_recog_widen_op_pattern for details. */
@ -1481,26 +1507,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
}
/* Try to detect addition on widened inputs, converting PLUS_EXPR
to WIDEN_PLUS_EXPR. See vect_recog_widen_op_pattern for details. */
to IFN_VEC_WIDEN_PLUS. See vect_recog_widen_op_pattern for details. */
static gimple *
vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
tree *type_out)
{
optab_subtype subtype;
return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
PLUS_EXPR, WIDEN_PLUS_EXPR, false,
"vect_recog_widen_plus_pattern");
PLUS_EXPR, IFN_VEC_WIDEN_PLUS,
false, "vect_recog_widen_plus_pattern",
&subtype);
}
/* Try to detect subtraction on widened inputs, converting MINUS_EXPR
to WIDEN_MINUS_EXPR. See vect_recog_widen_op_pattern for details. */
to IFN_VEC_WIDEN_MINUS. See vect_recog_widen_op_pattern for details. */
static gimple *
vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
tree *type_out)
{
optab_subtype subtype;
return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
MINUS_EXPR, WIDEN_MINUS_EXPR, false,
"vect_recog_widen_minus_pattern");
MINUS_EXPR, IFN_VEC_WIDEN_MINUS,
false, "vect_recog_widen_minus_pattern",
&subtype);
}
/* Function vect_recog_ctz_ffs_pattern
@ -3078,7 +3108,7 @@ vect_recog_average_pattern (vec_info *vinfo,
vect_unpromoted_value unprom[3];
tree new_type;
unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR,
WIDEN_PLUS_EXPR, false, 3,
IFN_VEC_WIDEN_PLUS, false, 3,
unprom, &new_type);
if (nops == 0)
return NULL;
@ -6469,6 +6499,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
{ vect_recog_mask_conversion_pattern, "mask_conversion" },
{ vect_recog_widen_plus_pattern, "widen_plus" },
{ vect_recog_widen_minus_pattern, "widen_minus" },
/* These must come after the double widening ones. */
};
const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);


@ -5051,7 +5051,8 @@ vectorizable_conversion (vec_info *vinfo,
bool widen_arith = (code == WIDEN_PLUS_EXPR
|| code == WIDEN_MINUS_EXPR
|| code == WIDEN_MULT_EXPR
|| code == WIDEN_LSHIFT_EXPR);
|| code == WIDEN_LSHIFT_EXPR
|| widening_fn_p (code));
if (!widen_arith
&& !CONVERT_EXPR_CODE_P (code)
@ -5101,8 +5102,8 @@ vectorizable_conversion (vec_info *vinfo,
gcc_assert (code == WIDEN_MULT_EXPR
|| code == WIDEN_LSHIFT_EXPR
|| code == WIDEN_PLUS_EXPR
|| code == WIDEN_MINUS_EXPR);
|| code == WIDEN_MINUS_EXPR
|| widening_fn_p (code));
op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
gimple_call_arg (stmt, 0);
@ -12574,26 +12575,69 @@ supportable_widening_operation (vec_info *vinfo,
optab1 = vec_unpacks_sbool_lo_optab;
optab2 = vec_unpacks_sbool_hi_optab;
}
else
vec_mode = TYPE_MODE (vectype);
if (widening_fn_p (code))
{
/* If this is an internal fn then we must check whether the target
supports either a low-high split or an even-odd split. */
internal_fn ifn = as_internal_fn ((combined_fn) code);
internal_fn lo, hi, even, odd;
lookup_hilo_internal_fn (ifn, &lo, &hi);
*code1 = as_combined_fn (lo);
*code2 = as_combined_fn (hi);
optab1 = direct_internal_fn_optab (lo, {vectype, vectype});
optab2 = direct_internal_fn_optab (hi, {vectype, vectype});
/* If we don't support low-high, then check for even-odd. */
if (!optab1
|| (icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing
|| !optab2
|| (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
{
lookup_evenodd_internal_fn (ifn, &even, &odd);
*code1 = as_combined_fn (even);
*code2 = as_combined_fn (odd);
optab1 = direct_internal_fn_optab (even, {vectype, vectype});
optab2 = direct_internal_fn_optab (odd, {vectype, vectype});
}
}
else if (code.is_tree_code ())
{
optab1 = optab_for_tree_code (c1, vectype, optab_default);
optab2 = optab_for_tree_code (c2, vectype, optab_default);
if (code == FIX_TRUNC_EXPR)
{
/* The signedness is determined from output operand. */
optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
}
else if (CONVERT_EXPR_CODE_P ((tree_code) code.safe_as_tree_code ())
&& VECTOR_BOOLEAN_TYPE_P (wide_vectype)
&& VECTOR_BOOLEAN_TYPE_P (vectype)
&& TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
&& SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
{
/* If the input and result modes are the same, a different optab
is needed where we pass in the number of units in vectype. */
optab1 = vec_unpacks_sbool_lo_optab;
optab2 = vec_unpacks_sbool_hi_optab;
}
else
{
optab1 = optab_for_tree_code (c1, vectype, optab_default);
optab2 = optab_for_tree_code (c2, vectype, optab_default);
}
*code1 = c1;
*code2 = c2;
}
if (!optab1 || !optab2)
return false;
vec_mode = TYPE_MODE (vectype);
if ((icode1 = optab_handler (optab1, vec_mode)) == CODE_FOR_nothing
|| (icode2 = optab_handler (optab2, vec_mode)) == CODE_FOR_nothing)
return false;
if (code.is_tree_code ())
{
*code1 = c1;
*code2 = c2;
}
if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
&& insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
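
The selection order used for internal functions in supportable_widening_operation (try the lo/hi pair first, fall back to even/odd, then fail if neither pair is available) can be sketched without any GCC types. This is a simplified model, assuming a target is described only by the set of optab names it implements. `toy_target` and `toy_pick_split` are hypothetical names, and the real code additionally checks instruction modes.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical target description: the optab names the target implements.
using toy_target = std::set<std::string>;

// Choose the split pair for BASE (for example "vec_widen_uadd").
// Prefers lo/hi, falls back to even/odd, and returns false if neither
// pair is fully implemented.  On success FIRST and SECOND name the pair.
bool
toy_pick_split (const toy_target &target, const std::string &base,
                std::string &first, std::string &second)
{
  static const char *const candidates[][2] = {
    { "_lo", "_hi" },
    { "_even", "_odd" },
  };
  for (const auto &c : candidates)
    {
      const std::string a = base + c[0];
      const std::string b = base + c[1];
      // Both halves of a pair must exist; one alone is not usable.
      if (target.count (a) != 0 && target.count (b) != 0)
        {
          first = a;
          second = b;
          return true;
        }
    }
  return false;
}
```

A target that implements only one half of the lo/hi pair is treated the same as one that implements neither, which mirrors the combined condition on optab1 and optab2 in the hunk above.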


@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3)
DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2)
/* Widening sad (sum of absolute differences).
The first two arguments are of type t1 which should be integer.
The third argument and the result are of type t2, such that t2 is at least
twice the size of t1. Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
The first two arguments are of type t1 which should be a vector of integers.
The third argument and the result are of type t2, such that the size of
the elements of t2 is at least twice the size of the elements of t1.
Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
equivalent to:
tmp = WIDEN_MINUS_EXPR (arg1, arg2)
tmp = IFN_VEC_WIDEN_MINUS (arg1, arg2)
tmp2 = ABS_EXPR (tmp)
arg3 = PLUS_EXPR (tmp2, arg3)
or:
tmp = WIDEN_MINUS_EXPR (arg1, arg2)
tmp = IFN_VEC_WIDEN_MINUS (arg1, arg2)
tmp2 = ABS_EXPR (tmp)
arg3 = WIDEN_SUM_EXPR (tmp2, arg3)
*/