Tamar Christina 405c99c172 perform affine fold to unsigned on non address expressions. [PR114932]
When the patch for PR114074 was applied we saw a good boost in exchange2.

This boost was partially caused by a simplification of the addressing modes.
With the patch applied, IVopts saw the following form for the base address:

  Base: (integer(kind=4) *) &block + ((sizetype) ((unsigned long) l0_19(D) *
324) + 36)

vs what we normally get:

  Base: (integer(kind=4) *) &block + ((sizetype) ((integer(kind=8)) l0_19(D)
* 81) + 9) * 4

This is because the patch promoted multiplies where one operand is a constant
from a signed multiply to an unsigned one, to attempt to fold away the constant.
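
As a minimal sketch of why the unsigned form can be folded (this is not code
from the patch, and the function names are hypothetical), the C below models
both offset expressions from the dumps above in 64-bit unsigned arithmetic.
Unsigned arithmetic wraps modulo 2^64, so distributing the multiply by 4 over
the sum is always valid.  In the signed form the compiler cannot make the same
rewrite freely, because it could change where an overflow occurs.

```c
#include <assert.h>
#include <stdint.h>

/* Unfolded shape: ((l0 * 81) + 9) * 4, evaluated here in unsigned
   arithmetic so that every step is well defined.  */
static uint64_t offset_unfolded(int64_t l0)
{
    uint64_t u = (uint64_t) l0;   /* conversion is modular, never UB */
    return (u * 81u + 9u) * 4u;
}

/* Folded shape: l0 * 324 + 36, the constant 4 distributed over the sum.  */
static uint64_t offset_folded(int64_t l0)
{
    uint64_t u = (uint64_t) l0;
    return u * 324u + 36u;
}

/* Illustrative helper: both shapes must agree for any input, including
   values where the multiply wraps around.  */
static void check_fold(int64_t l0)
{
    assert(offset_unfolded(l0) == offset_folded(l0));
}
```

Calling check_fold with values such as 0, -1 or INT64_MAX passes, because both
functions compute the same value modulo 2^64 even when the intermediate
multiply wraps.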

This patch attempts the same, but because SCEV and niters cannot analyze the
resulting forms (i.e. PR114322), we can't do it during SCEV or in the general
form in fold-const, as extract_muldiv attempts.

Instead, this applies the simplification during IVopts initialization when we
create the IV. This allows IVopts to see the simplified form without
influencing the rest of the compiler.

As mentioned in PR114074, it would be good to fix the missed optimization in the
other passes so we can perform this in general.

The reason this has a big impact on Fortran code is that Fortran doesn't seem to
have unsigned integer types.  As such, all its addresses are created with
signed types and folding does not happen on them due to the possible overflow.

Concretely, on AArch64 this changes the generated code from:

        mov     x27, -108
        mov     x24, -72
        mov     x23, -36
        add     x21, x1, x0, lsl 2
        add     x19, x20, x22
.L5:
        add     x0, x22, x19
        add     x19, x19, 324
        ldr     d1, [x0, x27]
        add     v1.2s, v1.2s, v15.2s
        str     d1, [x20, 216]
        ldr     d0, [x0, x24]
        add     v0.2s, v0.2s, v15.2s
        str     d0, [x20, 252]
        ldr     d31, [x0, x23]
        add     v31.2s, v31.2s, v15.2s
        str     d31, [x20, 288]
        bl      digits_20_
        cmp     x21, x19
        bne     .L5

into:

.L5:
        ldr     d1, [x19, -108]
        add     v1.2s, v1.2s, v15.2s
        str     d1, [x20, 216]
        ldr     d0, [x19, -72]
        add     v0.2s, v0.2s, v15.2s
        str     d0, [x20, 252]
        ldr     d31, [x19, -36]
        add     x19, x19, 324
        add     v31.2s, v31.2s, v15.2s
        str     d31, [x20, 288]
        bl      digits_20_
        cmp     x21, x19
        bne     .L5

The two patches together result in a 10% performance increase in exchange2 in
SPECCPU 2017, a 4% reduction in binary size and a 5% improvement in compile
time. There's also a 5% performance improvement in fotonik3d and a similar
reduction in binary size.

The patch folds every IV to unsigned to canonicalize them.  At the end of the
pass, match.pd will then remove unneeded conversions.

Note that we cannot force everything to unsigned: IVopts requires that array
address expressions remain as such.  Folding them results in them becoming
pointer expressions, for which some optimizations in IVopts do not run.

gcc/ChangeLog:

	PR tree-optimization/114932
	* tree-ssa-loop-ivopts.cc (alloc_iv): Perform affine unsigned fold.

gcc/testsuite/ChangeLog:

	PR tree-optimization/114932
	* gcc.dg/tree-ssa/pr64705.c: Update dump file scan.
	* gcc.target/i386/pr115462.c: The testcase shares 3 IVs which calculate
	the same thing but with a slightly different increment offset.  The test
	checks for 3 complex addressing loads, one for each IV.  But with this
	change they now all share one IV.  That is, the loop now only has one
	complex addressing load.  This is ultimately driven by the backend costing
	and the current costing says this is preferred, so update the testcase.
	* gfortran.dg/addressing-modes_1.f90: New test.
2025-01-07 14:51:15 +00:00
.forgejo top-level: Add pull request template for Forgejo 2024-10-23 19:45:09 +01:00
.github Minor formatting fix for newly-added file from previous commit 2023-11-01 19:28:56 -04:00
c++tools Update copyright years. 2025-01-02 11:59:57 +01:00
config Daily bump. 2024-11-26 00:19:26 +00:00
contrib Daily bump. 2025-01-03 00:17:15 +00:00
fixincludes Daily bump. 2024-07-12 00:17:52 +00:00
gcc perform affine fold to unsigned on non address expressions. [PR114932] 2025-01-07 14:51:15 +00:00
gnattools Daily bump. 2024-07-08 00:17:01 +00:00
gotools Daily bump. 2024-04-16 00:18:06 +00:00
include Update copyright years. 2025-01-02 11:59:57 +01:00
INSTALL
libada Update copyright years. 2025-01-02 11:59:57 +01:00
libatomic Update copyright years. 2025-01-02 11:59:57 +01:00
libbacktrace Update copyright years. 2025-01-02 12:17:04 +01:00
libcc1 Update copyright years. 2025-01-02 11:59:57 +01:00
libcody Update Copyright year in ChangeLog files 2025-01-02 11:13:18 +01:00
libcpp Update copyright years. 2025-01-02 12:17:04 +01:00
libdecnumber Update copyright years. 2025-01-02 11:59:57 +01:00
libffi Daily bump. 2024-10-26 00:19:39 +00:00
libgcc Daily bump. 2025-01-07 00:18:08 +00:00
libgfortran Update copyright years. 2025-01-02 11:59:57 +01:00
libgm2 Daily bump. 2024-11-21 00:20:27 +00:00
libgo crypto/tls: fix Config.Time in tests using expired certificates 2025-01-06 09:59:32 -08:00
libgomp Daily bump. 2025-01-04 00:17:49 +00:00
libgrust Update Copyright year in ChangeLog files 2025-01-02 11:13:18 +01:00
libiberty Update copyright years. 2025-01-02 11:59:57 +01:00
libitm Daily bump. 2025-01-03 00:17:15 +00:00
libobjc Update copyright years. 2025-01-02 11:59:57 +01:00
libphobos Daily bump. 2025-01-06 00:16:54 +00:00
libquadmath Daily bump. 2025-01-03 00:17:15 +00:00
libsanitizer Daily bump. 2025-01-07 00:18:08 +00:00
libssp Update copyright years. 2025-01-02 11:59:57 +01:00
libstdc++-v3 Update copyright years. 2025-01-02 12:17:04 +01:00
libvtv Update copyright years. 2025-01-02 11:59:57 +01:00
lto-plugin Update copyright years. 2025-01-02 11:59:57 +01:00
maintainer-scripts Daily bump. 2024-12-04 00:21:20 +00:00
zlib Daily bump. 2023-10-23 00:16:43 +00:00
.b4-config Add config file so b4 uses inbox.sourceware.org automatically 2024-07-28 11:13:16 +01:00
.dir-locals.el dir-locals: apply our C settings in C++ also 2024-07-31 20:38:27 +02:00
.gitattributes
.gitignore Git ignores .vscode 2024-09-12 22:51:00 +08:00
ABOUT-NLS
ar-lib
ChangeLog Daily bump. 2024-12-24 00:17:55 +00:00
ChangeLog.jit
ChangeLog.tree-ssa
compile
config-ml.in config-ml.in: Fix multi-os-dir search 2024-05-06 12:08:28 +08:00
config.guess
config.rpath
config.sub
configure Revert "PR81358: Enable automatic linking of libatomic." 2024-12-18 22:03:38 +05:30
configure.ac Revert "PR81358: Enable automatic linking of libatomic." 2024-12-18 22:03:38 +05:30
COPYING
COPYING.LIB
COPYING.RUNTIME
COPYING3
COPYING3.LIB
depcomp
install-sh
libtool-ldflags
libtool.m4 Build: fix error in fixinclude configure 2023-11-22 11:54:33 +01:00
ltgcc.m4
ltmain.sh ltmain.sh: allow more flags at link-time 2024-09-25 19:05:24 +01:00
ltoptions.m4
ltsugar.m4
ltversion.m4
lt~obsolete.m4
MAINTAINERS MAINTAINERS: add myself to write after approval 2024-12-23 04:12:07 +00:00
Makefile.def Revert "PR81358: Enable automatic linking of libatomic." 2024-12-18 22:03:38 +05:30
Makefile.in Revert "PR81358: Enable automatic linking of libatomic." 2024-12-18 22:03:38 +05:30
Makefile.tpl Revert "PR81358: Enable automatic linking of libatomic." 2024-12-18 22:03:38 +05:30
missing
mkdep
mkinstalldirs
move-if-change
multilib.am
README
SECURITY.txt Remove Debian from SECURITY.txt 2024-11-19 12:27:33 +01:00
symlink-tree
test-driver
ylwrap

This directory contains the GNU Compiler Collection (GCC).

The GNU Compiler Collection is free software.  See the files whose
names start with COPYING for copying permission.  The manuals, and
some of the runtime libraries, are under different terms; see the
individual source files for details.

The directory INSTALL contains copies of the installation information
as HTML and plain text.  The source of this information is
gcc/doc/install.texi.  The installation information includes details
of what is included in the GCC sources and what files GCC installs.

See the file gcc/doc/gcc.texi (together with other files that it
includes) for usage and porting information.  An online readable
version of the manual is in the files gcc/doc/gcc.info*.

See http://gcc.gnu.org/bugs/ for how to report bugs usefully.

Copyright years on GCC source files may be listed using range
notation, e.g., 1987-2012, indicating that every year in the range,
inclusive, is a copyrightable year that could otherwise be listed
individually.