[PR rtl-optimization/116039] Fix life computation for promoted subregs

So this turned out to be a neat little test and while the fuzzer found it on
RISC-V, I wouldn't be surprised if the underlying issue is also the root cause
of the loongarch issue with ext-dce.

The key issue is that if we have something like

(set (dest) (any_extend (subreg (source))))

If the subreg object is marked with SUBREG_PROMOTED and the sign/unsigned state
matches the any_extend opcode, then combine (and I guess anything using
simplify-rtx) may simplify that to

(set (dest) (source))

That implies that bits outside the mode of the subreg are actually live and
valid.  This needs to be accounted for during liveness computation.

We have to be careful here though. If we're too conservative about setting
additional bits live, then we'll inhibit the desired optimization in the
coremark examples.  To do a good job we need to know the extension opcode.

I'm extremely unhappy with how the use handling works in ext-dce.  It mixes
different conceptual steps and has horribly complex control flow.  It only
handles a subset of the unary/binary opcodes, etc etc.  It's just damn mess.
It's going to need some more noodling around.

In the mean time this is a bit hacky in that it depends on non-obvious behavior
to know it can get the extension opcode, but I don't want to leave the trunk in
a broken state while I figure out the refactoring problem.

Bootstrapped and regression tested on x86 and tested on the crosses.  Pushing to the trunk.

	PR rtl-optimization/116039
gcc/
	* ext-dce.cc (ext_dce_process_uses): Add some comments about concerns
	with current code.  Mark additional bit groups as live when we have
	an extension of a suitably promoted subreg.

gcc/testsuite
	* gcc.dg/torture/pr116039.c: New test.
This commit is contained in:
Jeff Law 2024-07-25 12:32:28 -06:00
parent 3aeb697a21
commit 34fb0feca7
2 changed files with 57 additions and 6 deletions

View file

@ -667,6 +667,12 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj,
if (modify && !skipped_dest && (dst_mask & ~src_mask) == 0)
ext_dce_try_optimize_insn (insn, x);
/* Stripping the extension here just seems wrong on multiple
levels. It's source side handling, so it seems like it
belongs in the loop below. Stripping here also makes it
harder than necessary to properly handle live bit groups
for (ANY_EXTEND (SUBREG)) where the SUBREG has
SUBREG_PROMOTED state. */
dst_mask &= src_mask;
src = XEXP (src, 0);
code = GET_CODE (src);
@ -674,8 +680,8 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj,
/* Optimization is done at this point. We just want to make
sure everything that should get marked as live is marked
from here onward. */
from here onward. Shouldn't the backpropagate step happen
before optimization? */
dst_mask = carry_backpropagate (dst_mask, code, src);
/* We will handle the other operand of a binary operator
@ -688,7 +694,11 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj,
/* We're inside a SET and want to process the source operands
making things live. Breaking from this loop will cause
the iterator to work on sub-rtxs, so it is safe to break
if we see something we don't know how to handle. */
if we see something we don't know how to handle.
This code is just hokey as it really just handles trivial
unary and binary cases. Otherwise the loop exits and we
continue iterating on sub-rtxs, but outside the set context. */
unsigned HOST_WIDE_INT save_mask = dst_mask;
for (;;)
{
@ -704,10 +714,26 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj,
y = XEXP (y, 0);
else if (SUBREG_P (y) && SUBREG_BYTE (y).is_constant ())
{
/* For anything but (subreg (reg)), break the inner loop
and process normally (conservatively). */
if (!REG_P (SUBREG_REG (y)))
/* We really want to know the outer code here, ie do we
have (ANY_EXTEND (SUBREG ...)) as we need to know if
the extension matches the SUBREG_PROMOTED state. In
that case optimizers can turn the extension into a
simple copy. Which means that bits outside the
SUBREG's mode are actually live.
We don't want to mark those bits live unnecessarily
as that inhibits extension elimination in important
cases such as those in Coremark. So we need that
outer code. */
if (!REG_P (SUBREG_REG (y))
|| (SUBREG_PROMOTED_VAR_P (y)
&& ((GET_CODE (SET_SRC (x)) == SIGN_EXTEND
&& SUBREG_PROMOTED_SIGNED_P (y))
|| (GET_CODE (SET_SRC (x)) == ZERO_EXTEND
&& SUBREG_PROMOTED_UNSIGNED_P (y)))))
break;
/* The SUBREG's mode determine the live width. */
bit = subreg_lsb (y).to_constant ();
if (dst_mask)
{
@ -785,6 +811,11 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj,
HOST_WIDE_INT size = GET_MODE_BITSIZE (GET_MODE (x)).to_constant ();
HOST_WIDE_INT rn = 4 * REGNO (SUBREG_REG (x));
/* If this is a promoted subreg, then more of it may be live than
is otherwise obvious. */
if (SUBREG_PROMOTED_VAR_P (x))
size = GET_MODE_BITSIZE (GET_MODE (SUBREG_REG (x))).to_constant ();
bitmap_set_bit (livenow, rn);
if (size > 8)
bitmap_set_bit (livenow, rn + 1);

View file

@ -0,0 +1,20 @@
/* { dg-do run } */
/* { dg-additional-options "-fsigned-char -fno-strict-aliasing -fwrapv" } */
extern void abort (void);
int c[12];
char d[12];
int *f = c;
int *z = (int *)1;
long long y;
int main() {
c[9] = 0xff;
for (int i = 0; i < 12; i += 3)
d[9] = z ? f[i] : 0;
for (long i = 0; i < 12; ++i)
y ^= d[i];
if (y != -1)
abort ();
return 0;
}