Jennifer Schmitz 5289540ed5 SVE intrinsics: Fold calls with pfalse predicate.
If an SVE intrinsic has predicate pfalse, we can fold the call to
a simplified assignment statement: For _m predication, the LHS can be assigned
the operand that supplies the inactive values, and for _z, we can assign a
zero vector.
For _x, the returned values can be arbitrary and as suggested by
Richard Sandiford, we fold to a zero vector.

For example,
svint32_t foo (svint32_t op1, svint32_t op2)
{
  return svadd_s32_m (svpfalse_b (), op1, op2);
}
can be folded to lhs = op1, such that foo is compiled to just a RET.
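
The three predication rules can be illustrated with a scalar model in plain C.
This is only a sketch and not GCC code: the fixed lane count LANES, the types
vec32 and pred, and the functions model_add and fold_pfalse_add are
hypothetical stand-ins for svint32_t, svbool_t, svadd_s32 and the folded
result. For an all-false predicate, the lane-by-lane model and the folded
value agree for _m, _z and _x.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4 /* hypothetical fixed lane count; real SVE is length-agnostic */

typedef struct { int32_t lane[LANES]; } vec32; /* stand-in for svint32_t */
typedef struct { bool lane[LANES]; } pred;     /* stand-in for svbool_t */

typedef enum { PRED_M, PRED_Z, PRED_X } pred_kind;

/* Lane-by-lane model of a predicated add.  */
static vec32 model_add(pred pg, vec32 op1, vec32 op2, pred_kind kind)
{
    vec32 res;
    for (size_t i = 0; i < LANES; i++) {
        if (pg.lane[i])
            /* Active lane: wrapping 32-bit addition.  */
            res.lane[i] = (int32_t)((uint32_t)op1.lane[i]
                                    + (uint32_t)op2.lane[i]);
        else if (kind == PRED_M)
            /* _m: inactive lanes keep the value of the first operand.  */
            res.lane[i] = op1.lane[i];
        else
            /* _z requires zero; _x permits any value, zero is chosen here.  */
            res.lane[i] = 0;
    }
    return res;
}

/* Value the call can be replaced with when the predicate is all-false.  */
static vec32 fold_pfalse_add(vec32 op1, pred_kind kind)
{
    vec32 zero = {{0}};
    return kind == PRED_M ? op1 : zero;
}
```

Calling model_add with every predicate lane set to false gives the same
lanes as fold_pfalse_add, without reading op2 at all.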

For implicit predication, a case distinction is necessary:
Intrinsics that read from memory can be folded to a zero vector.
Intrinsics that write to memory or prefetch can be folded to a no-op.
Other intrinsics need case-by-case implementation, which we added in
the corresponding svxxx_impl::fold.
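
The load and store cases can be modelled the same way. The following scalar
sketch is an illustration under assumptions, not the GCC implementation:
N_LANES, ivec, ipred, model_load and model_store are hypothetical names for a
fixed-width stand-in of a predicated contiguous load and store. With an
all-false predicate the load never touches memory and yields zero, and the
store writes nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define N_LANES 4 /* hypothetical fixed lane count */

typedef struct { int32_t lane[N_LANES]; } ivec; /* stand-in for svint32_t */
typedef struct { bool lane[N_LANES]; } ipred;   /* stand-in for svbool_t */

/* Model of a predicated load: only active lanes read memory,
   inactive lanes of the result are zero.  */
static ivec model_load(ipred pg, const int32_t *base)
{
    ivec res = {{0}};
    for (size_t i = 0; i < N_LANES; i++)
        if (pg.lane[i])
            res.lane[i] = base[i];
    return res;
}

/* Model of a predicated store: only active lanes write memory.  */
static void model_store(ipred pg, int32_t *base, ivec data)
{
    for (size_t i = 0; i < N_LANES; i++)
        if (pg.lane[i])
            base[i] = data.lane[i];
}
```

In this model, a load under an all-false predicate is equivalent to
lhs = {0, ...} and a store under an all-false predicate is equivalent to
removing the call.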

We implemented this optimization during gimple folding by calling a new method
gimple_folder::fold_pfalse from gimple_folder::fold, which covers the generic
cases described above.

We tested the new behavior for each intrinsic with all supported predications
and data types and checked the produced assembly. There is a test file
for each shape subclass with scan-assembler-times tests that look for
the simplified instruction sequences, such as individual RET instructions
or zeroing moves. There is an additional directive counting the total number of
functions in the test, which must be the sum of counts of all other
directives. This is to check that all tested intrinsics were optimized.

A few intrinsics were not covered by this patch:
- svlasta and svlastb already have an implementation to cover a pfalse
predicate. No changes were made to them.
- svld1/2/3/4 return aggregate types and were excluded from the case
that folds calls with implicit predication to lhs = {0, ...}.
- svst1/2/3/4 already have an implementation in svstx_impl that precedes
our optimization, such that it is not triggered.

The patch was bootstrapped and regtested on aarch64-linux-gnu with no
regressions.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>

gcc/ChangeLog:

	PR target/106329
	* config/aarch64/aarch64-sve-builtins-base.cc
	(svac_impl::fold): Add folding if pfalse predicate.
	(svadda_impl::fold): Likewise.
	(class svaddv_impl): Likewise.
	(class svandv_impl): Likewise.
	(svclast_impl::fold): Likewise.
	(svcmp_impl::fold): Likewise.
	(svcmp_wide_impl::fold): Likewise.
	(svcmpuo_impl::fold): Likewise.
	(svcntp_impl::fold): Likewise.
	(class svcompact_impl): Likewise.
	(class svcvtnt_impl): Likewise.
	(class sveorv_impl): Likewise.
	(class svminv_impl): Likewise.
	(class svmaxnmv_impl): Likewise.
	(class svmaxv_impl): Likewise.
	(class svminnmv_impl): Likewise.
	(class svorv_impl): Likewise.
	(svpfirst_svpnext_impl::fold): Likewise.
	(svptest_impl::fold): Likewise.
	(class svsplice_impl): Likewise.
	* config/aarch64/aarch64-sve-builtins-sve2.cc
	(class svcvtxnt_impl): Likewise.
	(svmatch_svnmatch_impl::fold): Likewise.
	* config/aarch64/aarch64-sve-builtins.cc
	(is_pfalse): Return true if tree is pfalse.
	(gimple_folder::fold_pfalse): Fold calls with pfalse predicate.
	(gimple_folder::fold_call_to): Fold call to lhs = t for given tree t.
	(gimple_folder::fold_to_stmt_vops): Helper function that folds the
	call to given stmt and adjusts virtual operands.
	(gimple_folder::fold): Call fold_pfalse.
	* config/aarch64/aarch64-sve-builtins.h (is_pfalse): Declare is_pfalse.

gcc/testsuite/ChangeLog:

	PR target/106329
	* gcc.target/aarch64/pfalse-binary_0.h: New test.
	* gcc.target/aarch64/pfalse-unary_0.h: New test.
	* gcc.target/aarch64/sve/pfalse-binary.c: New test.
	* gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c: New test.
	* gcc.target/aarch64/sve/pfalse-binary_opt_n.c: New test.
	* gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c: New test.
	* gcc.target/aarch64/sve/pfalse-binary_rotate.c: New test.
	* gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c: New test.
	* gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c: New test.
	* gcc.target/aarch64/sve/pfalse-binaryxn.c: New test.
	* gcc.target/aarch64/sve/pfalse-clast.c: New test.
	* gcc.target/aarch64/sve/pfalse-compare_opt_n.c: New test.
	* gcc.target/aarch64/sve/pfalse-compare_wide_opt_n.c: New test.
	* gcc.target/aarch64/sve/pfalse-count_pred.c: New test.
	* gcc.target/aarch64/sve/pfalse-fold_left.c: New test.
	* gcc.target/aarch64/sve/pfalse-load.c: New test.
	* gcc.target/aarch64/sve/pfalse-load_ext.c: New test.
	* gcc.target/aarch64/sve/pfalse-load_ext_gather_index.c: New test.
	* gcc.target/aarch64/sve/pfalse-load_ext_gather_offset.c: New test.
	* gcc.target/aarch64/sve/pfalse-load_gather_sv.c: New test.
	* gcc.target/aarch64/sve/pfalse-load_gather_vs.c: New test.
	* gcc.target/aarch64/sve/pfalse-load_replicate.c: New test.
	* gcc.target/aarch64/sve/pfalse-prefetch.c: New test.
	* gcc.target/aarch64/sve/pfalse-prefetch_gather_index.c: New test.
	* gcc.target/aarch64/sve/pfalse-prefetch_gather_offset.c: New test.
	* gcc.target/aarch64/sve/pfalse-ptest.c: New test.
	* gcc.target/aarch64/sve/pfalse-rdffr.c: New test.
	* gcc.target/aarch64/sve/pfalse-reduction.c: New test.
	* gcc.target/aarch64/sve/pfalse-reduction_wide.c: New test.
	* gcc.target/aarch64/sve/pfalse-shift_right_imm.c: New test.
	* gcc.target/aarch64/sve/pfalse-store.c: New test.
	* gcc.target/aarch64/sve/pfalse-store_scatter_index.c: New test.
	* gcc.target/aarch64/sve/pfalse-store_scatter_offset.c: New test.
	* gcc.target/aarch64/sve/pfalse-storexn.c: New test.
	* gcc.target/aarch64/sve/pfalse-ternary_opt_n.c: New test.
	* gcc.target/aarch64/sve/pfalse-ternary_rotate.c: New test.
	* gcc.target/aarch64/sve/pfalse-unary.c: New test.
	* gcc.target/aarch64/sve/pfalse-unary_convert_narrowt.c: New test.
	* gcc.target/aarch64/sve/pfalse-unary_convertxn.c: New test.
	* gcc.target/aarch64/sve/pfalse-unary_n.c: New test.
	* gcc.target/aarch64/sve/pfalse-unary_pred.c: New test.
	* gcc.target/aarch64/sve/pfalse-unary_to_uint.c: New test.
	* gcc.target/aarch64/sve/pfalse-unaryxn.c: New test.
	* gcc.target/aarch64/sve2/pfalse-binary.c: New test.
	* gcc.target/aarch64/sve2/pfalse-binary_int_opt_n.c: New test.
	* gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c: New test.
	* gcc.target/aarch64/sve2/pfalse-binary_opt_n.c: New test.
	* gcc.target/aarch64/sve2/pfalse-binary_opt_single_n.c: New test.
	* gcc.target/aarch64/sve2/pfalse-binary_to_uint.c: New test.
	* gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c: New test.
	* gcc.target/aarch64/sve2/pfalse-binary_wide.c: New test.
	* gcc.target/aarch64/sve2/pfalse-compare.c: New test.
	* gcc.target/aarch64/sve2/pfalse-load_ext_gather_index_restricted.c:
	New test.
	* gcc.target/aarch64/sve2/pfalse-load_ext_gather_offset_restricted.c:
	New test.
	* gcc.target/aarch64/sve2/pfalse-load_gather_sv_restricted.c: New test.
	* gcc.target/aarch64/sve2/pfalse-load_gather_vs.c: New test.
	* gcc.target/aarch64/sve2/pfalse-shift_left_imm_to_uint.c: New test.
	* gcc.target/aarch64/sve2/pfalse-shift_right_imm.c: New test.
	* gcc.target/aarch64/sve2/pfalse-store_scatter_index_restricted.c:
	New test.
	* gcc.target/aarch64/sve2/pfalse-store_scatter_offset_restricted.c:
	New test.
	* gcc.target/aarch64/sve2/pfalse-unary.c: New test.
	* gcc.target/aarch64/sve2/pfalse-unary_convert.c: New test.
	* gcc.target/aarch64/sve2/pfalse-unary_convert_narrowt.c: New test.
	* gcc.target/aarch64/sve2/pfalse-unary_to_int.c: New test.
2024-12-06 08:35:13 +01:00
.forgejo top-level: Add pull request template for Forgejo 2024-10-23 19:45:09 +01:00
.github Minor formatting fix for newly-added file from previous commit 2023-11-01 19:28:56 -04:00
c++tools Daily bump. 2024-05-09 10:58:01 +00:00
config Daily bump. 2024-11-26 00:19:26 +00:00
contrib Daily bump. 2024-12-05 00:19:47 +00:00
fixincludes Daily bump. 2024-07-12 00:17:52 +00:00
gcc SVE intrinsics: Fold calls with pfalse predicate. 2024-12-06 08:35:13 +01:00
gnattools Daily bump. 2024-07-08 00:17:01 +00:00
gotools Daily bump. 2024-04-16 00:18:06 +00:00
include Daily bump. 2024-11-24 00:18:09 +00:00
INSTALL
libada Update copyright years. 2024-01-03 12:19:35 +01:00
libatomic Daily bump. 2024-11-19 00:19:52 +00:00
libbacktrace Daily bump. 2024-11-30 00:20:11 +00:00
libcc1 Daily bump. 2024-09-21 00:16:55 +00:00
libcody Update Copyright year in ChangeLog files 2024-01-03 11:35:18 +01:00
libcpp Daily bump. 2024-12-04 00:21:20 +00:00
libdecnumber Daily bump. 2024-04-03 00:17:29 +00:00
libffi Daily bump. 2024-10-26 00:19:39 +00:00
libgcc Daily bump. 2024-12-01 00:17:14 +00:00
libgfortran Daily bump. 2024-12-05 00:19:47 +00:00
libgm2 Daily bump. 2024-11-21 00:20:27 +00:00
libgo syscall: don't define syscall stub on Hurd 2024-10-30 11:33:07 -07:00
libgomp Daily bump. 2024-12-04 00:21:20 +00:00
libgrust Daily bump. 2024-08-02 00:18:55 +00:00
libiberty Daily bump. 2024-11-20 00:19:59 +00:00
libitm Daily bump. 2024-11-19 00:19:52 +00:00
libobjc Daily bump. 2024-09-24 00:18:14 +00:00
libphobos Daily bump. 2024-11-19 00:19:52 +00:00
libquadmath Daily bump. 2024-08-29 00:19:25 +00:00
libsanitizer Daily bump. 2024-11-26 00:19:26 +00:00
libssp Daily bump. 2024-05-09 10:58:01 +00:00
libstdc++-v3 Daily bump. 2024-12-06 00:19:28 +00:00
libvtv Daily bump. 2024-11-19 00:19:52 +00:00
lto-plugin Daily bump. 2024-08-24 00:18:13 +00:00
maintainer-scripts Daily bump. 2024-12-04 00:21:20 +00:00
zlib
.b4-config Add config file so b4 uses inbox.sourceware.org automatically 2024-07-28 11:13:16 +01:00
.dir-locals.el dir-locals: apply our C settings in C++ also 2024-07-31 20:38:27 +02:00
.gitattributes
.gitignore Git ignores .vscode 2024-09-12 22:51:00 +08:00
ABOUT-NLS
ar-lib
ChangeLog Daily bump. 2024-12-03 00:20:18 +00:00
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in config-ml.in: Fix multi-os-dir search 2024-05-06 12:08:28 +08:00
config.guess
config.rpath
config.sub
configure Rename "libdiagnostics" to "libgdiagnostics" 2024-11-29 18:13:22 -05:00
configure.ac Rename "libdiagnostics" to "libgdiagnostics" 2024-11-29 18:13:22 -05:00
COPYING
COPYING.LIB
COPYING.RUNTIME
COPYING3
COPYING3.LIB
depcomp
install-sh
libtool-ldflags
libtool.m4 Build: fix error in fixinclude configure 2023-11-22 11:54:33 +01:00
ltgcc.m4
ltmain.sh ltmain.sh: allow more flags at link-time 2024-09-25 19:05:24 +01:00
ltoptions.m4
ltsugar.m4
ltversion.m4
lt~obsolete.m4
MAINTAINERS MAINTAINERS: add myself to write after approval 2024-12-02 16:43:06 +00:00
Makefile.def gccrs: Fix missing build dependency 2024-01-16 16:23:02 +01:00
Makefile.in Makefile.tpl: fix whitespace in licence header 2024-08-22 03:41:12 +01:00
Makefile.tpl Makefile.tpl: fix whitespace in licence header 2024-08-22 03:41:12 +01:00
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
SECURITY.txt Remove Debian from SECURITY.txt 2024-11-19 12:27:33 +01:00
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.