arm: don't vectorize fmaxf() unless unsafe math opts are enabled

The gcc.target/arm/fmaxmin.c test has presumably been failing since
vectorization was enabled at -O2.  I suspect part of the reason this
wasn't picked up sooner is that the test is a hybrid
execution/scan-assembler test and the execution part requires
appropriate hardware.

The problem is that we are vectorizing an expansion of fmaxf() when
the vector version of the instruction does not preserve denormal
values.  This means we should only apply this optimization when
-funsafe-math-optimizations is enabled.

This fix does a few things:

- Moves the expand pattern to vec-common.md.  Beyond fixing the bug, its
behaviour is unchanged.  It should really be enabled for MVE as well, but
that will need to wait for gcc-16 since the MVE code needs some
additional changes first.
- Adds support for HF mode vectors.
- Splits the test that was exposing the bug into two parts: an executable
test and a scan-assembler test.  The scan-assembler version is more
widely enabled, since it does not require a suitable execution environment.

gcc/ChangeLog:

	* config/arm/neon.md (<fmaxmin><mode>3): Move pattern from here...
	* config/arm/vec-common.md (<fmaxmin><mode>3): ... to here.  Convert
	to define_expand and disable the pattern when denormal values might
	get truncated to zero.  Iterate on VF to add V4HF and V8HF variants.

gcc/testsuite/ChangeLog:

	* gcc.target/arm/fmaxmin.c: Move scan-assembler checks to ...
	* gcc.target/arm/fmaxmin-2.c: ... here.  New test.
Richard Earnshaw 2025-03-26 15:56:18 +00:00
parent 271745bafa
commit b631ff45f2
4 changed files with 24 additions and 19 deletions

@@ -2738,17 +2738,6 @@
[(set_attr "type" "neon_fp_minmax_s<q>")]
)
;; Vector forms for the IEEE-754 fmax()/fmin() functions
(define_insn "<fmaxmin><mode>3"
[(set (match_operand:VCVTF 0 "s_register_operand" "=w")
(unspec:VCVTF [(match_operand:VCVTF 1 "s_register_operand" "w")
(match_operand:VCVTF 2 "s_register_operand" "w")]
VMAXMINFNM))]
"TARGET_NEON && TARGET_VFP5"
"<fmaxmin_op>.<V_s_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
[(set_attr "type" "neon_fp_minmax_s<q>")]
)
(define_expand "neon_vpadd<mode>"
[(match_operand:VD 0 "s_register_operand")
(match_operand:VD 1 "s_register_operand")

@@ -137,6 +137,17 @@
"ARM_HAVE_<MODE>_ARITH"
)
;; Vector forms for the IEEE-754 fmax()/fmin() functions
;; Fixme: Should be enabled for MVE as well, but currently that uses an
;; incompatible expansion.
(define_expand "<fmaxmin><mode>3"
[(set (match_operand:VF 0 "s_register_operand" "")
(unspec:VF [(match_operand:VF 1 "s_register_operand")
(match_operand:VF 2 "s_register_operand")]
VMAXMINFNM))]
"TARGET_NEON && TARGET_VFP5 && ARM_HAVE_<MODE>_ARITH"
)
(define_expand "vec_perm<mode>"
[(match_operand:VE 0 "s_register_operand")
(match_operand:VE 1 "s_register_operand")

@@ -0,0 +1,12 @@
/* { dg-do compile } */
/* { dg-require-effective-target arm_arch_v8a_hard_ok } */
/* { dg-options "-O2 -fno-inline" } */
/* { dg-add-options arm_arch_v8a_hard } */
#include "fmaxmin.x"
/* { dg-final { scan-assembler-times "vmaxnm.f32\ts\[0-9\]+, s\[0-9\]+, s\[0-9\]+" 1 } } */
/* { dg-final { scan-assembler-times "vminnm.f32\ts\[0-9\]+, s\[0-9\]+, s\[0-9\]+" 1 } } */
/* { dg-final { scan-assembler-times "vmaxnm.f64\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" 1 } } */
/* { dg-final { scan-assembler-times "vminnm.f64\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" 1 } } */

@@ -1,13 +1,6 @@
/* { dg-do run } */
/* { dg-require-effective-target arm_v8_neon_hw } */
/* { dg-options "-O2 -fno-inline -march=armv8-a -save-temps" } */
/* { dg-options "-O2 -fno-inline" } */
/* { dg-add-options arm_v8_neon } */
#include "fmaxmin.x"
/* { dg-final { scan-assembler-times "vmaxnm.f32\ts\[0-9\]+, s\[0-9\]+, s\[0-9\]+" 1 } } */
/* { dg-final { scan-assembler-times "vminnm.f32\ts\[0-9\]+, s\[0-9\]+, s\[0-9\]+" 1 } } */
/* { dg-final { scan-assembler-times "vmaxnm.f64\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" 1 } } */
/* { dg-final { scan-assembler-times "vminnm.f64\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" 1 } } */