vect: while_ult for integer masks

Add a vector length parameter needed by amdgcn without breaking aarch64.

All amdgcn vector masks are DImode, regardless of vector length, so we can't
tell what length is implied simply from the operator mode.  (Even if we used
different integer modes there's no mode small enough to differentiate a 2- or
4-lane mask.)  Without knowing the intended length we end up using a mask with
too many lanes enabled, which leads to undefined behaviour.

The extra operand is not added for vector mask types so AArch64 does not need
to be adjusted.
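
As an illustration of the problem described above, here is a minimal sketch in
plain C.  It is a hypothetical reference model, not GCC code: the names
while_ult_mask and lane_limit_mask are invented for this example.  It assumes
SImode operands modelled as uint32_t (with wraparound) and a 64-bit integer
mask, and it follows the loop documented for while_ult, stopping at an
explicit lane count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a mask with only the low NLANES bits set.
   Mirrors the idea of ~((unsigned HOST_WIDE_INT)-1 << length), but
   avoids shifting a 64-bit value by 64, which is undefined in C.  */
uint64_t
lane_limit_mask (unsigned nlanes)
{
  assert (nlanes <= 64);
  return nlanes < 64 ? ~(UINT64_MAX << nlanes) : UINT64_MAX;
}

/* Hypothetical reference model of while_ult for an integer mask:
   bit I is set while START + I < LIMIT, for at most NLANES lanes.
   Once one lane fails, all later lanes stay clear.  */
uint64_t
while_ult_mask (uint32_t start, uint32_t limit, unsigned nlanes)
{
  uint64_t mask = 0;

  assert (nlanes <= 64);
  for (unsigned i = 0; i < nlanes; i++)
    {
      /* The addition wraps in 32 bits, as an SImode addition would.  */
      if (!((uint32_t) (start + i) < limit))
	break;
      mask |= UINT64_C (1) << i;
    }
  return mask;
}
```

For example, with start 0 and limit 100, while_ult_mask (0, 100, 64) sets all
64 bits, whereas passing 4 as the lane count (or ANDing the result with
lane_limit_mask (4)) leaves only the low 4 bits set, which is what a 4-lane
vector needs.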

gcc/ChangeLog:

	* config/gcn/gcn-valu.md (while_ultsidi): Limit mask length using
	operand 3.
	* doc/md.texi (while_ult): Document new operand 3 usage.
	* internal-fn.cc (expand_while_optab_fn): Set operand 3 when lhs_type
	maps to a non-vector mode.
Andrew Stubbs 2020-10-02 15:12:50 +01:00
parent f41d1b39a6
commit 48960b6897
3 changed files with 35 additions and 6 deletions


@@ -3052,7 +3052,8 @@
(define_expand "while_ultsidi"
[(match_operand:DI 0 "register_operand")
(match_operand:SI 1 "")
-(match_operand:SI 2 "")]
+(match_operand:SI 2 "")
+(match_operand:SI 3 "")]
""
{
if (GET_CODE (operands[1]) != CONST_INT
@@ -3077,6 +3078,11 @@
: ~((unsigned HOST_WIDE_INT)-1 << diff));
emit_move_insn (operands[0], gen_rtx_CONST_INT (VOIDmode, mask));
}
+if (INTVAL (operands[3]) < 64)
+emit_insn (gen_anddi3 (operands[0], operands[0],
+gen_rtx_CONST_INT (VOIDmode,
+~((unsigned HOST_WIDE_INT)-1
+<< INTVAL (operands[3])))));
DONE;
})


@@ -4950,9 +4950,10 @@ This pattern is not allowed to @code{FAIL}.
@cindex @code{while_ult@var{m}@var{n}} instruction pattern
@item @code{while_ult@var{m}@var{n}}
Set operand 0 to a mask that is true while incrementing operand 1
-gives a value that is less than operand 2. Operand 0 has mode @var{n}
-and operands 1 and 2 are scalar integers of mode @var{m}.
-The operation is equivalent to:
+gives a value that is less than operand 2, for a vector length up to operand 3.
+Operand 0 has mode @var{n} and operands 1 and 2 are scalar integers of mode
+@var{m}. Operand 3 should be omitted when @var{n} is a vector mode, and
+a @code{CONST_INT} otherwise. The operation for vector modes is equivalent to:
@smallexample
operand0[0] = operand1 < operand2;
@@ -4960,6 +4961,14 @@ for (i = 1; i < GET_MODE_NUNITS (@var{n}); i++)
operand0[i] = operand0[i - 1] && (operand1 + i < operand2);
@end smallexample
+And for non-vector modes the operation is equivalent to:
+@smallexample
+operand0[0] = operand1 < operand2;
+for (i = 1; i < operand3; i++)
+operand0[i] = operand0[i - 1] && (operand1 + i < operand2);
+@end smallexample
@cindex @code{check_raw_ptrs@var{m}} instruction pattern
@item @samp{check_raw_ptrs@var{m}}
Check whether, given two pointers @var{a} and @var{b} and a length @var{len},


@@ -3664,7 +3664,7 @@ expand_direct_optab_fn (internal_fn fn, gcall *stmt, direct_optab optab,
static void
expand_while_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
{
-expand_operand ops[3];
+expand_operand ops[4];
tree rhs_type[2];
tree lhs = gimple_call_lhs (stmt);
@@ -3680,10 +3680,24 @@ expand_while_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
create_input_operand (&ops[i + 1], rhs_rtx, TYPE_MODE (rhs_type[i]));
}
+int opcnt;
+if (!VECTOR_MODE_P (TYPE_MODE (lhs_type)))
+{
+/* When the mask is an integer mode the exact vector length may not
+be clear to the backend, so we pass it in operand[3].
+Use the vector in arg2 for the most reliable intended size. */
+tree type = TREE_TYPE (gimple_call_arg (stmt, 2));
+create_integer_operand (&ops[3], TYPE_VECTOR_SUBPARTS (type));
+opcnt = 4;
+}
+else
+/* The mask has a vector type so the length operand is unnecessary. */
+opcnt = 3;
insn_code icode = convert_optab_handler (optab, TYPE_MODE (rhs_type[0]),
TYPE_MODE (lhs_type));
-expand_insn (icode, 3, ops);
+expand_insn (icode, opcnt, ops);
if (!rtx_equal_p (lhs_rtx, ops[0].value))
emit_move_insn (lhs_rtx, ops[0].value);
}