Makefile.in (TEXI_GCC_FILES): Add arm-neon-intrinsics.texi.

gcc/
    * Makefile.in (TEXI_GCC_FILES): Add arm-neon-intrinsics.texi.
    * config.gcc (arm*-*-*): Add arm_neon.h to extra headers.
    (with_fpu): Allow --with-fpu=neon.
    * config/arm/aof.h (ADDITIONAL_REGISTER_NAMES): Add Q0-Q15.
    * config/arm/aout.h (ADDITIONAL_REGISTER_NAMES): Add Q0-Q15.
    * config/arm/arm-modes.def (EI, OI, CI, XI): New modes.
    * config/arm/arm-protos.h (neon_immediate_valid_for_move)
    (neon_immediate_valid_for_logic, neon_output_logic_immediate)
    (neon_pairwise_reduce, neon_expand_vector_init, neon_reinterpret)
    (neon_emit_pair_result_insn, neon_disambiguate_copy)
    (neon_vector_mem_operand, neon_struct_mem_operand, output_move_quad)
    (output_move_neon): Add prototypes.
    * config/arm/arm.c (FL_NEON): New flag for NEON processor capability.
    (all_fpus): Add FPUTYPE_NEON.
    (fp_model_for_fpu): Add NEON field.
    (arm_return_in_memory): Return vectors <= 16 bytes in ARM registers.
    (arm_arg_partial_bytes): Allow NEON vectors to be passed partially
    in registers.
    (arm_legitimate_address_p): Don't support fancy addressing for NEON
    structure moves.
    (thumb2_legitimate_address_p): Likewise.
    (neon_valid_immediate): Recognize and prepare constants suitable for
    NEON instructions.
    (neon_immediate_valid_for_move): New function. Recognize and prepare
    immediates for NEON move instructions.
    (neon_immediate_valid_for_logic): New function. Recognize and
    prepare immediates for NEON logic instructions.
    (neon_output_logic_immediate): New function. Create asm string
    suitable for outputting immediate logic instructions.
    (neon_pairwise_reduce): New function. Implement reduction using
    pairwise operations.
    (neon_expand_vector_init): New function. Expand a (possibly
    non-constant) vector initialization.
    (neon_vector_mem_operand): New function. Memory operands supported
    for quad-word loads/stores to/from ARM or NEON registers. Don't
    allow base+offset addressing for core regs.
    (neon_struct_mem_operand): New function. Valid mems for NEON
    structure moves.
    (coproc_secondary_reload_class): Enable NEON registers to be loaded
    from neon_vector_mem_operand addresses without a secondary register.
    (add_minipool_forward_ref): Handle >8-byte minipool entries.
    (add_minipool_backward_ref): Likewise.
    (dump_minipool): Likewise.
    (push_minipool_fix): Likewise.
    (output_move_quad): New function. Output quad-word moves, loads and
    stores using ARM registers.
    (output_move_vfp): Add support for vectors in VFP (NEON) D
    registers.
    (output_move_neon): Output a NEON load/store to/from a quadword
    register.
    (arm_print_operand): Implement new codes:
    - 'c' for unadorned integers (without a # sign).
    - 'J', 'K' for reg+2/reg+3, reg+3/reg+2 in little/big-endian
    mode.
    - 'e', 'f' for the low and high D parts of a NEON Q register.
    - 'q' outputs a NEON Q register.
    - 'h' outputs ranges of D registers for VLDM/VSTM etc.
    - 'T' prints NEON opcode features from a coded bitmask.
    - 'F' is similar to T, but signed/unsigned codes both print as
    'i'.
    - 't' is similar to T, but 'u' is printed instead of 'p'.
    - 'O' prints 'r' if NEON instruction should perform rounding (as
    specified by bitmask), else prints nothing.
    - '#' is a punctuation character to stop operand numbers from
    running together with following digits in the assembler
    strings for instructions (when using mode attributes).
    (arm_assemble_integer): Handle extra NEON vector modes. Permute
    constant vectors in big-endian mode, where necessary.
    (arm_hard_regno_mode_ok): Allow vectors in VFP/NEON registers.
    Handle EI, OI, CI, XI modes.
    (ashlv4hi3, ashlv2si3, lshrv4hi3, lshrv2si3, ashrv4hi3)
    (ashrv2si3): Rename IWMMXT2_BUILTINs to...
    (ashlv4hi3_iwmmxt, ashlv2si3_iwmmxt, lshrv4hi3_iwmmxt)
    (lshrv2si3_iwmmxt, ashrv4hi3_iwmmxt, ashrv2si3_iwmmxt): New names.
    (neon_builtin_type_bits): Add enumeration, one bit for each vector
    type.
    (v8qi_UP, v4hi_UP, v2si_UP, v2sf_UP, di_UP, v16qi_UP, v8hi_UP)
    (v4si_UP, v4sf_UP, v2di_UP, ti_UP, ei_UP, oi_UP, UP): Define macros
    to turn v8qi, etc. into bits defined above.
    (neon_itype): New enumeration. Classifications of NEON builtins.
    (neon_builtin_datum): Define struct. Contains information about
    a single builtin (with multiple modes).
    (CF): Define helper macro for...
    (VAR1...VAR10): Define builtins with a type, name and 1-10 different
    modes.
    (neon_builtin_data): New array. Define information about builtins
    for use during initialization/expansion.
    (arm_init_neon_builtins): New function.
    (arm_init_builtins): Call arm_init_neon_builtins if TARGET_NEON is
    true.
    (neon_builtin_compare): New function.
    (locate_neon_builtin_icode): New function. Find an insn code for a
    builtin given a function code for that builtin. Also return type of
    builtin (NEON_BINOP, NEON_UNOP etc.).
    (builtin_arg): New enumeration. Types of arguments for builtins.
    (arm_expand_neon_args): New function. Expand a generic NEON builtin.
    Takes a variable argument list of builtin_arg types, terminated by
    NEON_ARG_STOP.
    (arm_expand_neon_builtin): New function. Expand a NEON builtin.
    (neon_reinterpret): New function. Expand NEON reinterpret intrinsic.
    (neon_emit_pair_result_insn): New function. Support returning pairs
    of vectors via a pointer.
    (neon_disambiguate_copy): New function. Set up operands for a
    multi-word copy such that registers do not get clobbered.
    (arm_expand_builtin): Call arm_expand_neon_builtin if fcode >=
    ARM_BUILTIN_NEON_BASE.
    (arm_file_start): Set float-abi attribute for NEON.
    (arm_vector_mode_supported_p): Enable NEON vector modes.
    (arm_mangle_map_entry): New.
    (arm_mangle_map): New.
    (arm_mangle_vector_type): New.
    * config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Define __ARM_NEON__
    when appropriate.
    (TARGET_NEON): New macro. Target supports NEON.
    (fputype): Add FPUTYPE_NEON.
    (UNITS_PER_SIMD_WORD): Define. Allow quad-word registers to be used
    for vectorization based on command-line arg.
    (NEON_REGNO_OK_FOR_NREGS): Define.
    (VALID_NEON_DREG_MODE, VALID_NEON_QREG_MODE)
    (VALID_NEON_STRUCT_MODE): Define.
    (PRINT_OPERAND_PUNCT_VALID_P): '#' is valid punctuation.
    (arm_builtins): Add ARM_BUILTIN_NEON_BASE.
    * config/arm/arm.md (VUNSPEC_POOL_16): Insert constant for unspec.
    (consttable_16): Add pattern for outputting 16-byte minipool
    entries.
    (movv2si, movv4hi, movv8qi): Remove blank expanders (redefined in
    vec-common.md).
    (vec-common.md, neon.md): Include md files.
    * config/arm/arm.opt (mvectorize-with-neon-quad): Add option.
    * config/arm/constraints.md (constraint "Dn", "Dl", "DL"): Define.
    (memory_constraint "Ut", "Un", "Us"): Define.
    * config/arm/iwmmxt.md (VMMX, VSHFT): New mode macros.
    (MMX_char): New mode attribute.
    (addv8qi3, addv4hi3, addv2si3): Remove. Replace with...
    (*add<mode>3_iwmmxt): New insn pattern.
    (subv8qi3, subv4hi3, subv2si3): Remove. Replace with...
    (*sub<mode>3_iwmmxt): New insn pattern.
    (mulv4hi3): Rename to...
    (*mulv4hi3_iwmmxt): This.
    (smaxv8qi3, smaxv4hi3, smaxv2si3, umaxv8qi3, umaxv4hi3)
    (umaxv2si3, sminv8qi3, sminv4hi3, sminv2si3, uminv8qi3)
    (uminv4hi3, uminv2si3): Remove. Replace with...
    (*smax<mode>3_iwmmxt, *umax<mode>3_iwmmxt, *smin<mode>3_iwmmxt)
    (*umin<mode>3_iwmmxt): These.
    (ashrv4hi3, ashrv2si3, ashrdi3_iwmmxt): Replace with...
    (ashr<mode>3_iwmmxt): This new pattern.
    (lshrv4hi3, lshrv2si3, lshrdi3_iwmmxt): Replace with...
    (lshr<mode>3_iwmmxt): This new pattern.
    (ashlv4hi3, ashlv2si3, ashldi3_iwmmxt): Replace with...
    (ashl<mode>3_iwmmxt): This new pattern.
    * config/arm/neon-docgen.ml: New file. Generate documentation for
    intrinsics.
    * config/arm/neon-gen.ml: New file. Generate arm_neon.h header.
    * config/arm/arm_neon.h: New (autogenerated).
    * config/arm/neon-testgen.ml: New file. Generate NEON tests
    automatically.
    * config/arm/neon.md: New file. Define NEON instructions.
    * config/arm/neon.ml: New file. Abstract description of NEON
    instructions, used to generate arm_neon.h header, documentation and tests.
    * config/arm/t-arm (MD_INCLUDES): Add vec-common.md, neon.md.
    * vec-common.md: New file. Shared parts for iWMMXt and NEON vector
    support.
    * doc/extend.texi (ARM Built-in Functions): Rename and remove
    extraneous comma.
    (ARM NEON Intrinsics): New subsection.
    * doc/arm-neon-intrinsics.texi: New (autogenerated).

gcc/testsuite/
    * gcc.dg/vect/vect.exp: Check is-effective-target arm_neon_hw.
    * gcc.dg/vect/tree-vect.h: Check for NEON SIMD support.
    * lib/gcc-dg.exp (cleanup-saved-temps): Fix comment.
    * lib/target-supports.exp (check_effective_target_arm_neon_ok)
    (check_effective_target_arm_neon_hw): New.
    * gcc.target/arm/neon/neon.exp: New file.
    * gcc.target/arm/neon/polytypes.c: New file.
    * gcc.target/arm/neon/v*.c (1870 files): New (autogenerated).


Co-Authored-By: Joseph Myers <joseph@codesourcery.com>
Co-Authored-By: Mark Shinwell <shinwell@codesourcery.com>
Co-Authored-By: Paul Brook <paul@codesourcery.com>

From-SVN: r126911
Julian Brown 2007-07-25 12:28:31 +00:00 committed by Julian Brown
parent 15d92b36a1
commit 88f77cba02
1902 changed files with 69377 additions and 303 deletions

View file

@ -1,3 +1,176 @@
2007-07-25 Julian Brown <julian@codesourcery.com>
Paul Brook <paul@codesourcery.com>
Joseph Myers <joseph@codesourcery.com>
Mark Shinwell <shinwell@codesourcery.com>
* Makefile.in (TEXI_GCC_FILES): Add arm-neon-intrinsics.texi.
* config.gcc (arm*-*-*): Add arm_neon.h to extra headers.
(with_fpu): Allow --with-fpu=neon.
* config/arm/aof.h (ADDITIONAL_REGISTER_NAMES): Add Q0-Q15.
* config/arm/aout.h (ADDITIONAL_REGISTER_NAMES): Add Q0-Q15.
* config/arm/arm-modes.def (EI, OI, CI, XI): New modes.
* config/arm/arm-protos.h (neon_immediate_valid_for_move)
(neon_immediate_valid_for_logic, neon_output_logic_immediate)
(neon_pairwise_reduce, neon_expand_vector_init, neon_reinterpret)
(neon_emit_pair_result_insn, neon_disambiguate_copy)
(neon_vector_mem_operand, neon_struct_mem_operand, output_move_quad)
(output_move_neon): Add prototypes.
* config/arm/arm.c (FL_NEON): New flag for NEON processor capability.
(all_fpus): Add FPUTYPE_NEON.
(fp_model_for_fpu): Add NEON field.
(arm_return_in_memory): Return vectors <= 16 bytes in ARM registers.
(arm_arg_partial_bytes): Allow NEON vectors to be passed partially
in registers.
(arm_legitimate_address_p): Don't support fancy addressing for NEON
structure moves.
(thumb2_legitimate_address_p): Likewise.
(neon_valid_immediate): Recognize and prepare constants suitable for
NEON instructions.
(neon_immediate_valid_for_move): New function. Recognize and prepare
immediates for NEON move instructions.
(neon_immediate_valid_for_logic): New function. Recognize and
prepare immediates for NEON logic instructions.
(neon_output_logic_immediate): New function. Create asm string
suitable for outputting immediate logic instructions.
(neon_pairwise_reduce): New function. Implement reduction using
pairwise operations.
(neon_expand_vector_init): New function. Expand a (possibly
non-constant) vector initialization.
(neon_vector_mem_operand): New function. Memory operands supported
for quad-word loads/stores to/from ARM or NEON registers. Don't
allow base+offset addressing for core regs.
(neon_struct_mem_operand): New function. Valid mems for NEON
structure moves.
(coproc_secondary_reload_class): Enable NEON registers to be loaded
from neon_vector_mem_operand addresses without a secondary register.
(add_minipool_forward_ref): Handle >8-byte minipool entries.
(add_minipool_backward_ref): Likewise.
(dump_minipool): Likewise.
(push_minipool_fix): Likewise.
(output_move_quad): New function. Output quad-word moves, loads and
stores using ARM registers.
(output_move_vfp): Add support for vectors in VFP (NEON) D
registers.
(output_move_neon): Output a NEON load/store to/from a quadword
register.
(arm_print_operand): Implement new codes:
- 'c' for unadorned integers (without a # sign).
- 'J', 'K' for reg+2/reg+3, reg+3/reg+2 in little/big-endian
mode.
- 'e', 'f' for the low and high D parts of a NEON Q register.
- 'q' outputs a NEON Q register.
- 'h' outputs ranges of D registers for VLDM/VSTM etc.
- 'T' prints NEON opcode features from a coded bitmask.
- 'F' is similar to T, but signed/unsigned codes both print as
'i'.
- 't' is similar to T, but 'u' is printed instead of 'p'.
- 'O' prints 'r' if NEON instruction should perform rounding (as
specified by bitmask), else prints nothing.
- '#' is a punctuation character to stop operand numbers from
running together with following digits in the assembler
strings for instructions (when using mode attributes).
(arm_assemble_integer): Handle extra NEON vector modes. Permute
constant vectors in big-endian mode, where necessary.
(arm_hard_regno_mode_ok): Allow vectors in VFP/NEON registers.
Handle EI, OI, CI, XI modes.
(ashlv4hi3, ashlv2si3, lshrv4hi3, lshrv2si3, ashrv4hi3)
(ashrv2si3): Rename IWMMXT2_BUILTINs to...
(ashlv4hi3_iwmmxt, ashlv2si3_iwmmxt, lshrv4hi3_iwmmxt)
(lshrv2si3_iwmmxt, ashrv4hi3_iwmmxt, ashrv2si3_iwmmxt): New names.
(neon_builtin_type_bits): Add enumeration, one bit for each vector
type.
(v8qi_UP, v4hi_UP, v2si_UP, v2sf_UP, di_UP, v16qi_UP, v8hi_UP)
(v4si_UP, v4sf_UP, v2di_UP, ti_UP, ei_UP, oi_UP, UP): Define macros
to turn v8qi, etc. into bits defined above.
(neon_itype): New enumeration. Classifications of NEON builtins.
(neon_builtin_datum): Define struct. Contains information about
a single builtin (with multiple modes).
(CF): Define helper macro for...
(VAR1...VAR10): Define builtins with a type, name and 1-10 different
modes.
(neon_builtin_data): New array. Define information about builtins
for use during initialization/expansion.
(arm_init_neon_builtins): New function.
(arm_init_builtins): Call arm_init_neon_builtins if TARGET_NEON is
true.
(neon_builtin_compare): New function.
(locate_neon_builtin_icode): New function. Find an insn code for a
builtin given a function code for that builtin. Also return type of
builtin (NEON_BINOP, NEON_UNOP etc.).
(builtin_arg): New enumeration. Types of arguments for builtins.
(arm_expand_neon_args): New function. Expand a generic NEON builtin.
Takes a variable argument list of builtin_arg types, terminated by
NEON_ARG_STOP.
(arm_expand_neon_builtin): New function. Expand a NEON builtin.
(neon_reinterpret): New function. Expand NEON reinterpret intrinsic.
(neon_emit_pair_result_insn): New function. Support returning pairs
of vectors via a pointer.
(neon_disambiguate_copy): New function. Set up operands for a
multi-word copy such that registers do not get clobbered.
(arm_expand_builtin): Call arm_expand_neon_builtin if fcode >=
ARM_BUILTIN_NEON_BASE.
(arm_file_start): Set float-abi attribute for NEON.
(arm_vector_mode_supported_p): Enable NEON vector modes.
(arm_mangle_map_entry): New.
(arm_mangle_map): New.
(arm_mangle_vector_type): New.
* config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Define __ARM_NEON__
when appropriate.
(TARGET_NEON): New macro. Target supports NEON.
(fputype): Add FPUTYPE_NEON.
(UNITS_PER_SIMD_WORD): Define. Allow quad-word registers to be used
for vectorization based on command-line arg.
(NEON_REGNO_OK_FOR_NREGS): Define.
(VALID_NEON_DREG_MODE, VALID_NEON_QREG_MODE)
(VALID_NEON_STRUCT_MODE): Define.
(PRINT_OPERAND_PUNCT_VALID_P): '#' is valid punctuation.
(arm_builtins): Add ARM_BUILTIN_NEON_BASE.
* config/arm/arm.md (VUNSPEC_POOL_16): Insert constant for unspec.
(consttable_16): Add pattern for outputting 16-byte minipool
entries.
(movv2si, movv4hi, movv8qi): Remove blank expanders (redefined in
vec-common.md).
(vec-common.md, neon.md): Include md files.
* config/arm/arm.opt (mvectorize-with-neon-quad): Add option.
* config/arm/constraints.md (constraint "Dn", "Dl", "DL"): Define.
(memory_constraint "Ut", "Un", "Us"): Define.
* config/arm/iwmmxt.md (VMMX, VSHFT): New mode macros.
(MMX_char): New mode attribute.
(addv8qi3, addv4hi3, addv2si3): Remove. Replace with...
(*add<mode>3_iwmmxt): New insn pattern.
(subv8qi3, subv4hi3, subv2si3): Remove. Replace with...
(*sub<mode>3_iwmmxt): New insn pattern.
(mulv4hi3): Rename to...
(*mulv4hi3_iwmmxt): This.
(smaxv8qi3, smaxv4hi3, smaxv2si3, umaxv8qi3, umaxv4hi3)
(umaxv2si3, sminv8qi3, sminv4hi3, sminv2si3, uminv8qi3)
(uminv4hi3, uminv2si3): Remove. Replace with...
(*smax<mode>3_iwmmxt, *umax<mode>3_iwmmxt, *smin<mode>3_iwmmxt)
(*umin<mode>3_iwmmxt): These.
(ashrv4hi3, ashrv2si3, ashrdi3_iwmmxt): Replace with...
(ashr<mode>3_iwmmxt): This new pattern.
(lshrv4hi3, lshrv2si3, lshrdi3_iwmmxt): Replace with...
(lshr<mode>3_iwmmxt): This new pattern.
(ashlv4hi3, ashlv2si3, ashldi3_iwmmxt): Replace with...
(ashl<mode>3_iwmmxt): This new pattern.
* config/arm/neon-docgen.ml: New file. Generate documentation for
intrinsics.
* config/arm/neon-gen.ml: New file. Generate arm_neon.h header.
* config/arm/arm_neon.h: New (autogenerated).
* config/arm/neon-testgen.ml: New file. Generate NEON tests
automatically.
* config/arm/neon.md: New file. Define NEON instructions.
* config/arm/neon.ml: New file. Abstract description of NEON
instructions, used to generate arm_neon.h header, documentation and
tests.
* config/arm/t-arm (MD_INCLUDES): Add vec-common.md, neon.md.
* vec-common.md: New file. Shared parts for iWMMXt and NEON vector
support.
* doc/extend.texi (ARM Built-in Functions): Rename and remove
extraneous comma.
(ARM NEON Intrinsics): New subsection.
* doc/arm-neon-intrinsics.texi: New (autogenerated).
2007-07-25 Danny Smith <dannysmith@users.sourceforge.net>
* config/i386/i386-protos.h (i386_pe_asm_file_end): Remove

View file

@ -3581,7 +3581,7 @@ TEXI_GCC_FILES = gcc.texi gcc-common.texi gcc-vers.texi frontends.texi \
gcov.texi trouble.texi bugreport.texi service.texi \
contribute.texi compat.texi funding.texi gnu.texi gpl.texi \
fdl.texi contrib.texi cppenv.texi cppopts.texi \
-implement-c.texi
+implement-c.texi arm-neon-intrinsics.texi
TEXI_GCCINT_FILES = gccint.texi gcc-common.texi gcc-vers.texi \
contribute.texi makefile.texi configterms.texi options.texi \

View file

@ -259,7 +259,7 @@ strongarm*-*-*)
;;
arm*-*-*)
cpu_type=arm
-extra_headers="mmintrin.h"
+extra_headers="mmintrin.h arm_neon.h"
;;
bfin*-*)
cpu_type=bfin
@ -2841,7 +2841,7 @@ case "${target}" in
case "$with_fpu" in
"" \
-| fpa | fpe2 | fpe3 | maverick | vfp | vfp3 )
+| fpa | fpe2 | fpe3 | maverick | vfp | vfp3 | neon )
# OK
;;
*)

View file

@ -239,22 +239,30 @@ do { \
{"r13", 13}, {"sp", 13}, \
{"r14", 14}, {"lr", 14}, \
{"r15", 15}, {"pc", 15}, \
-{"d0", 63}, \
+{"d0", 63}, {"q0", 63}, \
{"d1", 65}, \
-{"d2", 67}, \
+{"d2", 67}, {"q1", 67}, \
{"d3", 69}, \
-{"d4", 71}, \
+{"d4", 71}, {"q2", 71}, \
{"d5", 73}, \
-{"d6", 75}, \
+{"d6", 75}, {"q3", 75}, \
{"d7", 77}, \
-{"d8", 79}, \
+{"d8", 79}, {"q4", 79}, \
{"d9", 81}, \
-{"d10", 83}, \
+{"d10", 83}, {"q5", 83}, \
{"d11", 85}, \
-{"d12", 87}, \
+{"d12", 87}, {"q6", 87}, \
{"d13", 89}, \
-{"d14", 91}, \
-{"d15", 93} \
+{"d14", 91}, {"q7", 91}, \
+{"d15", 93}, \
+{"q8", 95}, \
+{"q9", 99}, \
+{"q10", 103}, \
+{"q11", 107}, \
+{"q12", 111}, \
+{"q13", 115}, \
+{"q14", 119}, \
+{"q15", 123} \
}
#define REGISTER_PREFIX "__"

View file

@ -165,22 +165,30 @@
{"mvdx13", 40}, \
{"mvdx14", 41}, \
{"mvdx15", 42}, \
-{"d0", 63}, \
+{"d0", 63}, {"q0", 63}, \
{"d1", 65}, \
-{"d2", 67}, \
+{"d2", 67}, {"q1", 67}, \
{"d3", 69}, \
-{"d4", 71}, \
+{"d4", 71}, {"q2", 71}, \
{"d5", 73}, \
-{"d6", 75}, \
+{"d6", 75}, {"q3", 75}, \
{"d7", 77}, \
-{"d8", 79}, \
+{"d8", 79}, {"q4", 79}, \
{"d9", 81}, \
-{"d10", 83}, \
+{"d10", 83}, {"q5", 83}, \
{"d11", 85}, \
-{"d12", 87}, \
+{"d12", 87}, {"q6", 87}, \
{"d13", 89}, \
-{"d14", 91}, \
+{"d14", 91}, {"q7", 91}, \
{"d15", 93}, \
+{"q8", 95}, \
+{"q9", 99}, \
+{"q10", 103}, \
+{"q11", 107}, \
+{"q12", 111}, \
+{"q13", 115}, \
+{"q14", 119}, \
+{"q15", 123} \
}
#endif
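
The two ADDITIONAL_REGISTER_NAMES tables above follow a regular pattern: dN sits at hard register 63 + 2N and qN at 63 + 4N, so qN starts at the same register as d(2N). A minimal sketch of that arithmetic follows. It is read off the table entries only, and VFP_BASE and the helper names are illustrative, not part of the patch.

```c
/* Illustrative only: VFP_BASE and these helpers are hypothetical names,
   derived from the {"dN", ...} and {"qN", ...} entries in the tables.  */
enum { VFP_BASE = 63 };

/* Hard register number for double register dN (the tables list d0-d15).  */
int d_regno (int n)
{
  return VFP_BASE + 2 * n;
}

/* Hard register number for quad register qN (the tables list q0-q15).  */
int q_regno (int n)
{
  return VFP_BASE + 4 * n;
}

/* Index of the D register that begins at the same hard register as qN.  */
int q_first_d (int n)
{
  return 2 * n;
}
```

For example, q_regno (1) and d_regno (2) both give 67, matching the {"d2", 67}, {"q1", 67} entry.
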

View file

@ -58,3 +58,11 @@ VECTOR_MODES (INT, 16); /* V16QI V8HI V4SI V2DI */
VECTOR_MODES (FLOAT, 8); /* V4HF V2SF */
VECTOR_MODES (FLOAT, 16); /* V8HF V4SF V2DF */
/* Opaque integer modes for 3, 4, 6 or 8 Neon double registers (2 is
TImode). */
INT_MODE (EI, 24);
INT_MODE (OI, 32);
INT_MODE (CI, 48);
/* ??? This should actually have 512 bits but the precision only has 9
bits. */
FRACTIONAL_INT_MODE (XI, 511, 64);

View file

@ -68,6 +68,19 @@ extern rtx thumb_legitimize_reload_address (rtx *, enum machine_mode, int, int,
extern int arm_const_double_rtx (rtx);
extern int neg_const_double_rtx_ok_for_fpa (rtx);
extern int vfp3_const_double_rtx (rtx);
extern int neon_immediate_valid_for_move (rtx, enum machine_mode, rtx *, int *);
extern int neon_immediate_valid_for_logic (rtx, enum machine_mode, int, rtx *,
int *);
extern char *neon_output_logic_immediate (const char *, rtx *,
enum machine_mode, int, int);
extern void neon_pairwise_reduce (rtx, rtx, enum machine_mode,
rtx (*) (rtx, rtx, rtx));
extern void neon_expand_vector_init (rtx, rtx);
extern void neon_reinterpret (rtx, rtx);
extern void neon_emit_pair_result_insn (enum machine_mode,
rtx (*) (rtx, rtx, rtx, rtx),
rtx, rtx, rtx);
extern void neon_disambiguate_copy (rtx *, rtx *, rtx *, unsigned int);
extern enum reg_class coproc_secondary_reload_class (enum machine_mode, rtx,
bool);
extern bool arm_tls_referenced_p (rtx);
@ -75,6 +88,8 @@ extern bool arm_cannot_force_const_mem (rtx);
extern int cirrus_memory_offset (rtx);
extern int arm_coproc_mem_operand (rtx, bool);
extern int neon_vector_mem_operand (rtx, bool);
extern int neon_struct_mem_operand (rtx);
extern int arm_no_early_store_addr_dep (rtx, rtx);
extern int arm_no_early_alu_shift_dep (rtx, rtx);
extern int arm_no_early_alu_shift_value_dep (rtx, rtx);
@ -113,7 +128,9 @@ extern const char *output_mov_long_double_arm_from_arm (rtx *);
extern const char *output_mov_double_fpa_from_arm (rtx *);
extern const char *output_mov_double_arm_from_fpa (rtx *);
extern const char *output_move_double (rtx *);
extern const char *output_move_quad (rtx *);
extern const char *output_move_vfp (rtx *operands);
extern const char *output_move_neon (rtx *operands);
extern const char *output_add_immediate (rtx *);
extern const char *arithmetic_instr (rtx, int);
extern void output_ascii_pseudo_op (FILE *, const unsigned char *, int);

File diff suppressed because it is too large

View file

@ -65,6 +65,9 @@ extern char arm_arch_name[];
if (TARGET_VFP) \
builtin_define ("__VFP_FP__"); \
\
if (TARGET_NEON) \
builtin_define ("__ARM_NEON__"); \
\
/* Add a define for interworking. \
Needed when building libgcc.a. */ \
if (arm_cpp_interwork) \
@ -206,10 +209,23 @@ extern GTY(()) rtx aof_pic_label;
/* 32-bit Thumb-2 code. */
#define TARGET_THUMB2 (TARGET_THUMB && arm_arch_thumb2)
/* The following two macros concern the ability to execute coprocessor
instructions for VFPv3 or NEON. TARGET_VFP3 is currently only ever
tested when we know we are generating for VFP hardware; we need to
be more careful with TARGET_NEON as noted below. */
/* FPU is VFPv3 (with twice the number of D registers). Setting the FPU to
Neon automatically enables VFPv3 too. */
#define TARGET_VFP3 (arm_fp_model == ARM_FP_MODEL_VFP \
&& (arm_fpu_arch == FPUTYPE_VFP3))
&& (arm_fpu_arch == FPUTYPE_VFP3 \
|| arm_fpu_arch == FPUTYPE_NEON))
/* FPU supports Neon instructions. The setting of this macro gets
revealed via __ARM_NEON__ so we add extra guards upon TARGET_32BIT
and TARGET_HARD_FLOAT to ensure that NEON instructions are
available. */
#define TARGET_NEON (TARGET_32BIT && TARGET_HARD_FLOAT \
&& arm_fp_model == ARM_FP_MODEL_VFP \
&& arm_fpu_arch == FPUTYPE_NEON)
/* "DSP" multiply instructions, eg. SMULxy. */
#define TARGET_DSP_MULTIPLY \
@ -282,7 +298,9 @@ enum fputype
/* VFP. */
FPUTYPE_VFP,
/* VFPv3. */
-FPUTYPE_VFP3
+FPUTYPE_VFP3,
+/* Neon. */
+FPUTYPE_NEON
};
/* Recast the floating point class to be the floating point attribute. */
@ -483,6 +501,12 @@ extern int arm_arch_hwdiv;
#define UNITS_PER_WORD 4
/* Use the option -mvectorize-with-neon-quad to override the use of doubleword
registers when autovectorizing for Neon, at least until multiple vector
widths are supported properly by the middle-end. */
#define UNITS_PER_SIMD_WORD \
(TARGET_NEON ? (TARGET_NEON_VECTORIZE_QUAD ? 16 : 8) : UNITS_PER_WORD)
/* True if natural alignment is used for doubleword types. */
#define ARM_DOUBLEWORD_ALIGN TARGET_AAPCS_BASED
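
The UNITS_PER_SIMD_WORD definition in the hunk above picks the vector width the autovectorizer works with: 16 bytes with -mvectorize-with-neon-quad, 8 bytes for plain NEON, and the 4-byte word size otherwise. The same selection can be sketched as ordinary C. The function names are hypothetical, and the two flag parameters stand in for TARGET_NEON and TARGET_NEON_VECTORIZE_QUAD.

```c
/* Sketch only: plain-C stand-in for the UNITS_PER_SIMD_WORD macro.
   The function names are hypothetical; target_neon and vectorize_quad
   play the roles of TARGET_NEON and TARGET_NEON_VECTORIZE_QUAD.  */
enum { UNITS_PER_WORD = 4 };

int units_per_simd_word (int target_neon, int vectorize_quad)
{
  return target_neon ? (vectorize_quad ? 16 : 8) : UNITS_PER_WORD;
}

/* Number of elements of ELEM_BYTES bytes that fit in one SIMD word.  */
int lanes_per_simd_word (int target_neon, int vectorize_quad, int elem_bytes)
{
  return units_per_simd_word (target_neon, vectorize_quad) / elem_bytes;
}
```

With 32-bit elements this gives two lanes per vector by default on NEON and four lanes when the quad-word option is set.
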
@ -941,6 +965,18 @@ extern int arm_structure_size_boundary;
#define VFP_REGNO_OK_FOR_DOUBLE(REGNUM) \
((((REGNUM) - FIRST_VFP_REGNUM) & 1) == 0)
/* Neon Quad values must start at a multiple of four registers. */
#define NEON_REGNO_OK_FOR_QUAD(REGNUM) \
((((REGNUM) - FIRST_VFP_REGNUM) & 3) == 0)
/* Neon structures of vectors must be in even register pairs and there
must be enough registers available. Because of various patterns
requiring quad registers, we require them to start at a multiple of
four. */
#define NEON_REGNO_OK_FOR_NREGS(REGNUM, N) \
((((REGNUM) - FIRST_VFP_REGNUM) & 3) == 0 \
&& (LAST_VFP_REGNUM - (REGNUM) >= 2 * (N) - 1))
/* The number of hard registers is 16 ARM + 8 FPA + 1 CC + 1 SFP + 1 AFP. */
/* + 16 Cirrus registers take us up to 43. */
/* Intel Wireless MMX Technology registers add 16 + 4 more. */
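
The two register predicates added in the hunk above can be tried outside the compiler. The sketch below copies their arithmetic into functions. FIRST_VFP_REGNUM = 63 is taken from the d0/q0 entries in ADDITIONAL_REGISTER_NAMES, while LAST_VFP_REGNUM = 126 is an assumption (it makes q15 occupy registers 123-126) and is not shown in this diff.

```c
/* Sketch only.  FIRST_VFP_REGNUM matches the d0/q0 table entries;
   LAST_VFP_REGNUM = 126 is an assumption, not taken from this diff.  */
enum { FIRST_VFP_REGNUM = 63, LAST_VFP_REGNUM = 126 };

/* Quad values must start at a multiple of four registers.  */
int neon_regno_ok_for_quad (int regno)
{
  return ((regno - FIRST_VFP_REGNUM) & 3) == 0;
}

/* A structure of N double registers must start on a quad boundary and
   needs 2 * N single-word registers to be available from REGNO upwards.  */
int neon_regno_ok_for_nregs (int regno, int n)
{
  return ((regno - FIRST_VFP_REGNUM) & 3) == 0
         && (LAST_VFP_REGNUM - regno >= 2 * n - 1);
}
```

Under these assumptions, register 123 (q15) can hold a two-D-register value but not a three-D-register one, because only four single-word registers remain above it.
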
@ -994,6 +1030,21 @@ extern int arm_structure_size_boundary;
#define VALID_IWMMXT_REG_MODE(MODE) \
(arm_vector_mode_supported_p (MODE) || (MODE) == DImode)
/* Modes valid for Neon D registers. */
#define VALID_NEON_DREG_MODE(MODE) \
((MODE) == V2SImode || (MODE) == V4HImode || (MODE) == V8QImode \
|| (MODE) == V2SFmode || (MODE) == DImode)
/* Modes valid for Neon Q registers. */
#define VALID_NEON_QREG_MODE(MODE) \
((MODE) == V4SImode || (MODE) == V8HImode || (MODE) == V16QImode \
|| (MODE) == V4SFmode || (MODE) == V2DImode)
/* Structure modes valid for Neon registers. */
#define VALID_NEON_STRUCT_MODE(MODE) \
((MODE) == TImode || (MODE) == EImode || (MODE) == OImode \
|| (MODE) == CImode || (MODE) == XImode)
/* The order in which register should be allocated. It is good to use ip
since no saving is required (though calls clobber it) and it never contains
function parameters. It is quite good to use lr since other calls may
@ -2409,7 +2460,7 @@ extern int making_const_table;
#define PRINT_OPERAND_PUNCT_VALID_P(CODE) \
(CODE == '@' || CODE == '|' || CODE == '.' \
-|| CODE == '(' || CODE == ')' \
+|| CODE == '(' || CODE == ')' || CODE == '#' \
|| (TARGET_32BIT && (CODE == '?')) \
|| (TARGET_THUMB2 && (CODE == '!')) \
|| (TARGET_THUMB && (CODE == '_')))
@ -2581,6 +2632,9 @@ extern int making_const_table;
: arm_gen_return_addr_mask ())
/* Neon defines builtins from ARM_BUILTIN_MAX upwards, though they don't have
symbolic names defined here (which would require too much duplication).
FIXME? */
enum arm_builtins
{
ARM_BUILTIN_GETWCX,
@ -2745,7 +2799,9 @@ enum arm_builtins
ARM_BUILTIN_THREAD_POINTER,
-ARM_BUILTIN_MAX
+ARM_BUILTIN_NEON_BASE,
+ARM_BUILTIN_MAX = ARM_BUILTIN_NEON_BASE /* FIXME: Wrong! */
};
/* Do not emit .note.GNU-stack by default. */
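
Since TARGET_CPU_CPP_BUILTINS now defines __ARM_NEON__ whenever TARGET_NEON holds, user code can test that macro before including arm_neon.h. A minimal sketch follows, assuming a build with something like -mfpu=neon. The function add4_s32 is a hypothetical name; vld1q_s32, vaddq_s32 and vst1q_s32 are intrinsics declared by arm_neon.h. Without NEON the scalar fallback is compiled instead.

```c
#include <stdint.h>

#ifdef __ARM_NEON__
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for four 32-bit lanes.  */
void add4_s32 (int32_t *dst, const int32_t *a, const int32_t *b)
{
#ifdef __ARM_NEON__
  /* One quad-word load per operand, one vector add, one store.  */
  vst1q_s32 (dst, vaddq_s32 (vld1q_s32 (a), vld1q_s32 (b)));
#else
  /* Scalar fallback for targets where __ARM_NEON__ is not defined.  */
  int i;
  for (i = 0; i < 4; i++)
    dst[i] = a[i] + b[i];
#endif
}
```

Both branches compute the same result, so callers do not need to know which one was compiled.
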

View file

@ -51,6 +51,7 @@
;; UNSPEC Usage:
;; Note: sin and cos are no-longer used.
;; Unspec constants for Neon are defined in neon.md.
(define_constants
[(UNSPEC_SIN 0) ; `sin' operation (MODE_FLOAT):
@ -121,12 +122,14 @@
; a 32-bit object.
(VUNSPEC_POOL_8 7) ; `pool-entry(8)'. An entry in the constant pool for
; a 64-bit object.
-(VUNSPEC_TMRC 8) ; Used by the iWMMXt TMRC instruction.
-(VUNSPEC_TMCR 9) ; Used by the iWMMXt TMCR instruction.
-(VUNSPEC_ALIGN8 10) ; 8-byte alignment version of VUNSPEC_ALIGN
-(VUNSPEC_WCMP_EQ 11) ; Used by the iWMMXt WCMPEQ instructions
-(VUNSPEC_WCMP_GTU 12) ; Used by the iWMMXt WCMPGTU instructions
-(VUNSPEC_WCMP_GT 13) ; Used by the iwMMXT WCMPGT instructions
+(VUNSPEC_POOL_16 8) ; `pool-entry(16)'. An entry in the constant pool for
+; a 128-bit object.
+(VUNSPEC_TMRC 9) ; Used by the iWMMXt TMRC instruction.
+(VUNSPEC_TMCR 10) ; Used by the iWMMXt TMCR instruction.
+(VUNSPEC_ALIGN8 11) ; 8-byte alignment version of VUNSPEC_ALIGN
+(VUNSPEC_WCMP_EQ 12) ; Used by the iWMMXt WCMPEQ instructions
+(VUNSPEC_WCMP_GTU 13) ; Used by the iWMMXt WCMPGTU instructions
+(VUNSPEC_WCMP_GT 14) ; Used by the iwMMXT WCMPGT instructions
(VUNSPEC_EH_RETURN 20); Use to override the return address for exception
; handling.
]
@ -5768,27 +5771,6 @@
"
)
;; Vector Moves
(define_expand "movv2si"
[(set (match_operand:V2SI 0 "nonimmediate_operand" "")
(match_operand:V2SI 1 "general_operand" ""))]
"TARGET_REALLY_IWMMXT"
{
})
(define_expand "movv4hi"
[(set (match_operand:V4HI 0 "nonimmediate_operand" "")
(match_operand:V4HI 1 "general_operand" ""))]
"TARGET_REALLY_IWMMXT"
{
})
(define_expand "movv8qi"
[(set (match_operand:V8QI 0 "nonimmediate_operand" "")
(match_operand:V8QI 1 "general_operand" ""))]
"TARGET_REALLY_IWMMXT"
{
})
;; load- and store-multiple insns
@ -10731,6 +10713,30 @@
[(set_attr "length" "8")]
)
(define_insn "consttable_16"
[(unspec_volatile [(match_operand 0 "" "")] VUNSPEC_POOL_16)]
"TARGET_EITHER"
"*
{
making_const_table = TRUE;
switch (GET_MODE_CLASS (GET_MODE (operands[0])))
{
case MODE_FLOAT:
{
REAL_VALUE_TYPE r;
REAL_VALUE_FROM_CONST_DOUBLE (r, operands[0]);
assemble_real (r, GET_MODE (operands[0]), BITS_PER_WORD);
break;
}
default:
assemble_integer (operands[0], 16, BITS_PER_WORD, 1);
break;
}
return \"\";
}"
[(set_attr "length" "16")]
)
;; Miscellaneous Thumb patterns
(define_expand "tablejump"
@ -10906,10 +10912,14 @@
(include "fpa.md")
;; Load the Maverick co-processor patterns
(include "cirrus.md")
;; Vector bits common to IWMMXT and Neon
(include "vec-common.md")
;; Load the Intel Wireless Multimedia Extension patterns
(include "iwmmxt.md")
;; Load the VFP co-processor patterns
(include "vfp.md")
;; Thumb-2 patterns
(include "thumb2.md")
;; Neon patterns
(include "neon.md")

View file

@ -153,3 +153,7 @@ Tune code for the given processor
mwords-little-endian
Target Report RejectNegative Mask(LITTLE_WORDS)
Assume big endian bytes, little endian words
mvectorize-with-neon-quad
Target Report Mask(NEON_VECTORIZE_QUAD)
Use Neon quad-word (rather than double-word) registers for vectorization

gcc/config/arm/arm_neon.h (new file, 12179 lines)

File diff suppressed because it is too large

View file

@ -30,10 +30,10 @@
;; in Thumb-1 state: I, J, K, L, M, N, O
;; The following multi-letter normal constraints have been used:
-;; in ARM/Thumb-2 state: Da, Db, Dc, Dv
+;; in ARM/Thumb-2 state: Da, Db, Dc, Dn, Dl, DL, Dv
;; The following memory constraints have been used:
-;; in ARM/Thumb-2 state: Q, Uv, Uy
+;; in ARM/Thumb-2 state: Q, Ut, Uv, Uy, Un, Us
;; in ARM state: Uq
@ -164,6 +164,30 @@
(match_test "TARGET_32BIT && arm_const_double_inline_cost (op) == 4
&& !(optimize_size || arm_ld_sched)")))
(define_constraint "Dn"
"@internal
In ARM/Thumb-2 state a const_vector which can be loaded with a Neon vmov
immediate instruction."
(and (match_code "const_vector")
(match_test "TARGET_32BIT
&& imm_for_neon_mov_operand (op, GET_MODE (op))")))
(define_constraint "Dl"
"@internal
In ARM/Thumb-2 state a const_vector which can be used with a Neon vorr or
vbic instruction."
(and (match_code "const_vector")
(match_test "TARGET_32BIT
&& imm_for_neon_logic_operand (op, GET_MODE (op))")))
(define_constraint "DL"
"@internal
In ARM/Thumb-2 state a const_vector which can be used with a Neon vorn or
vand instruction."
(and (match_code "const_vector")
(match_test "TARGET_32BIT
&& imm_for_neon_inv_logic_operand (op, GET_MODE (op))")))
(define_constraint "Dv"
"@internal
In ARM/Thumb-2 state a const_double which can be used with a VFP fconsts
@ -171,6 +195,13 @@
(and (match_code "const_double")
(match_test "TARGET_32BIT && vfp3_const_double_rtx (op)")))
(define_memory_constraint "Ut"
"@internal
In ARM/Thumb-2 state an address valid for loading/storing opaque structure
types wider than TImode."
(and (match_code "mem")
(match_test "TARGET_32BIT && neon_struct_mem_operand (op)")))
(define_memory_constraint "Uv"
"@internal
In ARM/Thumb-2 state a valid VFP load/store address."
@@ -183,6 +214,20 @@
(and (match_code "mem")
(match_test "TARGET_32BIT && arm_coproc_mem_operand (op, TRUE)")))
(define_memory_constraint "Un"
"@internal
In ARM/Thumb-2 state a valid address for Neon element and structure
load/store instructions."
(and (match_code "mem")
(match_test "TARGET_32BIT && neon_vector_mem_operand (op, FALSE)")))
(define_memory_constraint "Us"
"@internal
In ARM/Thumb-2 state a valid address for non-offset loads/stores of
quad-word values in four ARM registers."
(and (match_code "mem")
(match_test "TARGET_32BIT && neon_vector_mem_operand (op, TRUE)")))
(define_memory_constraint "Uq"
"@internal
In ARM state an address valid in ldrsb instructions."

@@ -20,6 +20,15 @@
;; the Free Software Foundation, 51 Franklin Street, Fifth Floor,
;; Boston, MA 02110-1301, USA.
;; Integer element sizes implemented by IWMMXT.
(define_mode_macro VMMX [V2SI V4HI V8QI])
;; Integer element sizes for shifts.
(define_mode_macro VSHFT [V4HI V2SI DI])
;; Determine element size suffix from vector mode.
(define_mode_attr MMX_char [(V8QI "b") (V4HI "h") (V2SI "w") (DI "d")])
(define_insn "iwmmxt_iordi3"
[(set (match_operand:DI 0 "register_operand" "=y,?&r,?&r")
(ior:DI (match_operand:DI 1 "register_operand" "%y,0,r")
@@ -239,28 +248,12 @@
;; Vector add/subtract
(define_insn "addv8qi3"
[(set (match_operand:V8QI 0 "register_operand" "=y")
(plus:V8QI (match_operand:V8QI 1 "register_operand" "y")
(match_operand:V8QI 2 "register_operand" "y")))]
(define_insn "*add<mode>3_iwmmxt"
[(set (match_operand:VMMX 0 "register_operand" "=y")
(plus:VMMX (match_operand:VMMX 1 "register_operand" "y")
(match_operand:VMMX 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"waddb%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "addv4hi3"
[(set (match_operand:V4HI 0 "register_operand" "=y")
(plus:V4HI (match_operand:V4HI 1 "register_operand" "y")
(match_operand:V4HI 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"waddh%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "addv2si3"
[(set (match_operand:V2SI 0 "register_operand" "=y")
(plus:V2SI (match_operand:V2SI 1 "register_operand" "y")
(match_operand:V2SI 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"waddw%?\\t%0, %1, %2"
"wadd<MMX_char>%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "ssaddv8qi3"
@@ -311,28 +304,12 @@
"waddwus%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "subv8qi3"
[(set (match_operand:V8QI 0 "register_operand" "=y")
(minus:V8QI (match_operand:V8QI 1 "register_operand" "y")
(match_operand:V8QI 2 "register_operand" "y")))]
(define_insn "*sub<mode>3_iwmmxt"
[(set (match_operand:VMMX 0 "register_operand" "=y")
(minus:VMMX (match_operand:VMMX 1 "register_operand" "y")
(match_operand:VMMX 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"wsubb%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "subv4hi3"
[(set (match_operand:V4HI 0 "register_operand" "=y")
(minus:V4HI (match_operand:V4HI 1 "register_operand" "y")
(match_operand:V4HI 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"wsubh%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "subv2si3"
[(set (match_operand:V2SI 0 "register_operand" "=y")
(minus:V2SI (match_operand:V2SI 1 "register_operand" "y")
(match_operand:V2SI 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"wsubw%?\\t%0, %1, %2"
"wsub<MMX_char>%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "sssubv8qi3"
@@ -383,7 +360,7 @@
"wsubwus%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "mulv4hi3"
(define_insn "*mulv4hi3_iwmmxt"
[(set (match_operand:V4HI 0 "register_operand" "=y")
(mult:V4HI (match_operand:V4HI 1 "register_operand" "y")
(match_operand:V4HI 2 "register_operand" "y")))]
@@ -734,100 +711,36 @@
;; Max/min insns
(define_insn "smaxv8qi3"
[(set (match_operand:V8QI 0 "register_operand" "=y")
(smax:V8QI (match_operand:V8QI 1 "register_operand" "y")
(match_operand:V8QI 2 "register_operand" "y")))]
(define_insn "*smax<mode>3_iwmmxt"
[(set (match_operand:VMMX 0 "register_operand" "=y")
(smax:VMMX (match_operand:VMMX 1 "register_operand" "y")
(match_operand:VMMX 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"wmaxsb%?\\t%0, %1, %2"
"wmaxs<MMX_char>%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "umaxv8qi3"
[(set (match_operand:V8QI 0 "register_operand" "=y")
(umax:V8QI (match_operand:V8QI 1 "register_operand" "y")
(match_operand:V8QI 2 "register_operand" "y")))]
(define_insn "*umax<mode>3_iwmmxt"
[(set (match_operand:VMMX 0 "register_operand" "=y")
(umax:VMMX (match_operand:VMMX 1 "register_operand" "y")
(match_operand:VMMX 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"wmaxub%?\\t%0, %1, %2"
"wmaxu<MMX_char>%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "smaxv4hi3"
[(set (match_operand:V4HI 0 "register_operand" "=y")
(smax:V4HI (match_operand:V4HI 1 "register_operand" "y")
(match_operand:V4HI 2 "register_operand" "y")))]
(define_insn "*smin<mode>3_iwmmxt"
[(set (match_operand:VMMX 0 "register_operand" "=y")
(smin:VMMX (match_operand:VMMX 1 "register_operand" "y")
(match_operand:VMMX 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"wmaxsh%?\\t%0, %1, %2"
"wmins<MMX_char>%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "umaxv4hi3"
[(set (match_operand:V4HI 0 "register_operand" "=y")
(umax:V4HI (match_operand:V4HI 1 "register_operand" "y")
(match_operand:V4HI 2 "register_operand" "y")))]
(define_insn "*umin<mode>3_iwmmxt"
[(set (match_operand:VMMX 0 "register_operand" "=y")
(umin:VMMX (match_operand:VMMX 1 "register_operand" "y")
(match_operand:VMMX 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"wmaxuh%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "smaxv2si3"
[(set (match_operand:V2SI 0 "register_operand" "=y")
(smax:V2SI (match_operand:V2SI 1 "register_operand" "y")
(match_operand:V2SI 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"wmaxsw%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "umaxv2si3"
[(set (match_operand:V2SI 0 "register_operand" "=y")
(umax:V2SI (match_operand:V2SI 1 "register_operand" "y")
(match_operand:V2SI 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"wmaxuw%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "sminv8qi3"
[(set (match_operand:V8QI 0 "register_operand" "=y")
(smin:V8QI (match_operand:V8QI 1 "register_operand" "y")
(match_operand:V8QI 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"wminsb%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "uminv8qi3"
[(set (match_operand:V8QI 0 "register_operand" "=y")
(umin:V8QI (match_operand:V8QI 1 "register_operand" "y")
(match_operand:V8QI 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"wminub%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "sminv4hi3"
[(set (match_operand:V4HI 0 "register_operand" "=y")
(smin:V4HI (match_operand:V4HI 1 "register_operand" "y")
(match_operand:V4HI 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"wminsh%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "uminv4hi3"
[(set (match_operand:V4HI 0 "register_operand" "=y")
(umin:V4HI (match_operand:V4HI 1 "register_operand" "y")
(match_operand:V4HI 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"wminuh%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "sminv2si3"
[(set (match_operand:V2SI 0 "register_operand" "=y")
(smin:V2SI (match_operand:V2SI 1 "register_operand" "y")
(match_operand:V2SI 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"wminsw%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "uminv2si3"
[(set (match_operand:V2SI 0 "register_operand" "=y")
(umin:V2SI (match_operand:V2SI 1 "register_operand" "y")
(match_operand:V2SI 2 "register_operand" "y")))]
"TARGET_REALLY_IWMMXT"
"wminuw%?\\t%0, %1, %2"
"wminu<MMX_char>%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
;; Pack/unpack insns.
@@ -1141,76 +1054,28 @@
"wrordg%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "ashrv4hi3"
[(set (match_operand:V4HI 0 "register_operand" "=y")
(ashiftrt:V4HI (match_operand:V4HI 1 "register_operand" "y")
(match_operand:SI 2 "register_operand" "z")))]
(define_insn "ashr<mode>3_iwmmxt"
[(set (match_operand:VSHFT 0 "register_operand" "=y")
(ashiftrt:VSHFT (match_operand:VSHFT 1 "register_operand" "y")
(match_operand:SI 2 "register_operand" "z")))]
"TARGET_REALLY_IWMMXT"
"wsrahg%?\\t%0, %1, %2"
"wsra<MMX_char>g%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "ashrv2si3"
[(set (match_operand:V2SI 0 "register_operand" "=y")
(ashiftrt:V2SI (match_operand:V2SI 1 "register_operand" "y")
(match_operand:SI 2 "register_operand" "z")))]
(define_insn "lshr<mode>3_iwmmxt"
[(set (match_operand:VSHFT 0 "register_operand" "=y")
(lshiftrt:VSHFT (match_operand:VSHFT 1 "register_operand" "y")
(match_operand:SI 2 "register_operand" "z")))]
"TARGET_REALLY_IWMMXT"
"wsrawg%?\\t%0, %1, %2"
"wsrl<MMX_char>g%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "ashrdi3_iwmmxt"
[(set (match_operand:DI 0 "register_operand" "=y")
(ashiftrt:DI (match_operand:DI 1 "register_operand" "y")
(match_operand:SI 2 "register_operand" "z")))]
(define_insn "ashl<mode>3_iwmmxt"
[(set (match_operand:VSHFT 0 "register_operand" "=y")
(ashift:VSHFT (match_operand:VSHFT 1 "register_operand" "y")
(match_operand:SI 2 "register_operand" "z")))]
"TARGET_REALLY_IWMMXT"
"wsradg%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "lshrv4hi3"
[(set (match_operand:V4HI 0 "register_operand" "=y")
(lshiftrt:V4HI (match_operand:V4HI 1 "register_operand" "y")
(match_operand:SI 2 "register_operand" "z")))]
"TARGET_REALLY_IWMMXT"
"wsrlhg%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "lshrv2si3"
[(set (match_operand:V2SI 0 "register_operand" "=y")
(lshiftrt:V2SI (match_operand:V2SI 1 "register_operand" "y")
(match_operand:SI 2 "register_operand" "z")))]
"TARGET_REALLY_IWMMXT"
"wsrlwg%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "lshrdi3_iwmmxt"
[(set (match_operand:DI 0 "register_operand" "=y")
(lshiftrt:DI (match_operand:DI 1 "register_operand" "y")
(match_operand:SI 2 "register_operand" "z")))]
"TARGET_REALLY_IWMMXT"
"wsrldg%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "ashlv4hi3"
[(set (match_operand:V4HI 0 "register_operand" "=y")
(ashift:V4HI (match_operand:V4HI 1 "register_operand" "y")
(match_operand:SI 2 "register_operand" "z")))]
"TARGET_REALLY_IWMMXT"
"wsllhg%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "ashlv2si3"
[(set (match_operand:V2SI 0 "register_operand" "=y")
(ashift:V2SI (match_operand:V2SI 1 "register_operand" "y")
(match_operand:SI 2 "register_operand" "z")))]
"TARGET_REALLY_IWMMXT"
"wsllwg%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "ashldi3_iwmmxt"
[(set (match_operand:DI 0 "register_operand" "=y")
(ashift:DI (match_operand:DI 1 "register_operand" "y")
(match_operand:SI 2 "register_operand" "z")))]
"TARGET_REALLY_IWMMXT"
"wslldg%?\\t%0, %1, %2"
"wsll<MMX_char>g%?\\t%0, %1, %2"
[(set_attr "predicable" "yes")])
(define_insn "rorv4hi3_di"
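Note on the hunks above: each pattern written over VMMX or VSHFT stands for one insn per listed mode, with <mode> replaced by the lower-case mode name and <MMX_char> by the matching size letter. The following is a rough OCaml sketch of that expansion, not GCC code; the helper `expand` and the tables `mmx_char`, `vmmx` and `vshft` are hypothetical names that only mirror the definitions in this hunk.

```ocaml
(* Illustrative only: mimics how a pattern written once over a mode macro
   stands for one insn per mode.  Not part of GCC. *)

(* Mirrors the MMX_char mode attribute defined in the hunk. *)
let mmx_char = [ "v8qi", "b"; "v4hi", "h"; "v2si", "w"; "di", "d" ]

(* Mirror the VMMX and VSHFT mode macros. *)
let vmmx = [ "v2si"; "v4hi"; "v8qi" ]
let vshft = [ "v4hi"; "v2si"; "di" ]

(* Hypothetical helper: expand one templated pattern over a list of modes,
   returning (pattern name, assembler mnemonic) pairs. *)
let expand ~name_prefix ~name_suffix ~asm_prefix ~asm_suffix modes =
  List.map
    (fun mode ->
      let c = List.assoc mode mmx_char in
      (name_prefix ^ mode ^ name_suffix, asm_prefix ^ c ^ asm_suffix))
    modes

(* "*add<mode>3_iwmmxt" with template "wadd<MMX_char>". *)
let adds =
  expand ~name_prefix:"*add" ~name_suffix:"3_iwmmxt"
    ~asm_prefix:"wadd" ~asm_suffix:"" vmmx

(* "ashr<mode>3_iwmmxt" with template "wsra<MMX_char>g". *)
let shifts =
  expand ~name_prefix:"ashr" ~name_suffix:"3_iwmmxt"
    ~asm_prefix:"wsra" ~asm_suffix:"g" vshft

let () =
  List.iter (fun (n, a) -> Printf.printf "%s -> %s\n" n a) (adds @ shifts)
```

Under these assumptions the V8QI add expands to waddb and the DI shift to wsradg, the same mnemonics the deleted per-mode patterns spelled out by hand.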

@@ -0,0 +1,337 @@
(* ARM NEON documentation generator.
Copyright (C) 2006 Free Software Foundation, Inc.
Contributed by CodeSourcery.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING. If not, write to the Free
Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301, USA.
This is an O'Caml program. The O'Caml compiler is available from:
http://caml.inria.fr/
Or from your favourite OS's friendly packaging system. Tested with version
3.09.2, though other versions will probably work too.
Compile with:
ocamlc -c neon.ml
ocamlc -o neon-docgen neon.cmo neon-docgen.ml
Run with:
/path/to/neon-docgen /path/to/gcc/doc/arm-neon-intrinsics.texi
*)
open Neon
(* The combined "ops" and "reinterp" table. *)
let ops_reinterp = reinterp @ ops
(* Helper functions for extracting things from the "ops" table. *)
let single_opcode desired_opcode () =
List.fold_left (fun got_so_far ->
fun row ->
match row with
(opcode, _, _, _, _, _) ->
if opcode = desired_opcode then row :: got_so_far
else got_so_far
) [] ops_reinterp
let multiple_opcodes desired_opcodes () =
List.fold_left (fun got_so_far ->
fun desired_opcode ->
(single_opcode desired_opcode ()) @ got_so_far)
[] desired_opcodes
let ldx_opcode number () =
List.fold_left (fun got_so_far ->
fun row ->
match row with
(opcode, _, _, _, _, _) ->
match opcode with
Vldx n | Vldx_lane n | Vldx_dup n when n = number ->
row :: got_so_far
| _ -> got_so_far
) [] ops_reinterp
let stx_opcode number () =
List.fold_left (fun got_so_far ->
fun row ->
match row with
(opcode, _, _, _, _, _) ->
match opcode with
Vstx n | Vstx_lane n when n = number ->
row :: got_so_far
| _ -> got_so_far
) [] ops_reinterp
let tbl_opcode () =
List.fold_left (fun got_so_far ->
fun row ->
match row with
(opcode, _, _, _, _, _) ->
match opcode with
Vtbl _ -> row :: got_so_far
| _ -> got_so_far
) [] ops_reinterp
let tbx_opcode () =
List.fold_left (fun got_so_far ->
fun row ->
match row with
(opcode, _, _, _, _, _) ->
match opcode with
Vtbx _ -> row :: got_so_far
| _ -> got_so_far
) [] ops_reinterp
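Note on the helpers above: they all share one idiom, a left fold that conses each matching row onto an accumulator, so the matches come back in reverse table order. The trailing unit argument delays the scan until a group is actually documented. A minimal sketch of the idiom follows, assuming a cut-down two-field table in place of the six-field rows from neon.ml; `table` and `single_opcode'` are hypothetical names.

```ocaml
(* Illustrative only: a cut-down table of (opcode, name) rows stands in for
   the six-field rows of the real "ops" table, which lives in neon.ml. *)
type opcode = Vadd | Vsub | Vmul

let table =
  [ (Vadd, "vadd_a"); (Vsub, "vsub"); (Vadd, "vadd_b"); (Vmul, "vmul") ]

(* Same shape as single_opcode: fold left, consing each matching row. *)
let single_opcode desired () =
  List.fold_left
    (fun got_so_far row ->
      match row with
      | (opcode, _) ->
          if opcode = desired then row :: got_so_far else got_so_far)
    [] table

(* An equivalent formulation using library functions. *)
let single_opcode' desired () =
  List.rev (List.filter (fun (opcode, _) -> opcode = desired) table)

let () =
  (* Prints the later row first, because the fold reverses the order. *)
  List.iter (fun (_, name) -> print_endline name) (single_opcode Vadd ())
```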
(* The groups of intrinsics. *)
let intrinsic_groups =
[ "Addition", single_opcode Vadd;
"Multiplication", single_opcode Vmul;
"Multiply-accumulate", single_opcode Vmla;
"Multiply-subtract", single_opcode Vmls;
"Subtraction", single_opcode Vsub;
"Comparison (equal-to)", single_opcode Vceq;
"Comparison (greater-than-or-equal-to)", single_opcode Vcge;
"Comparison (less-than-or-equal-to)", single_opcode Vcle;
"Comparison (greater-than)", single_opcode Vcgt;
"Comparison (less-than)", single_opcode Vclt;
"Comparison (absolute greater-than-or-equal-to)", single_opcode Vcage;
"Comparison (absolute less-than-or-equal-to)", single_opcode Vcale;
"Comparison (absolute greater-than)", single_opcode Vcagt;
"Comparison (absolute less-than)", single_opcode Vcalt;
"Test bits", single_opcode Vtst;
"Absolute difference", single_opcode Vabd;
"Absolute difference and accumulate", single_opcode Vaba;
"Maximum", single_opcode Vmax;
"Minimum", single_opcode Vmin;
"Pairwise add", single_opcode Vpadd;
"Pairwise add, widen and accumulate", single_opcode Vpada;
"Folding maximum", single_opcode Vpmax;
"Folding minimum", single_opcode Vpmin;
"Reciprocal step", multiple_opcodes [Vrecps; Vrsqrts];
"Vector shift left", single_opcode Vshl;
"Vector shift left by constant", single_opcode Vshl_n;
"Vector shift right by constant", single_opcode Vshr_n;
"Vector shift right by constant and accumulate", single_opcode Vsra_n;
"Vector shift right and insert", single_opcode Vsri;
"Vector shift left and insert", single_opcode Vsli;
"Absolute value", single_opcode Vabs;
"Negation", single_opcode Vneg;
"Bitwise not", single_opcode Vmvn;
"Count leading sign bits", single_opcode Vcls;
"Count leading zeros", single_opcode Vclz;
"Count number of set bits", single_opcode Vcnt;
"Reciprocal estimate", single_opcode Vrecpe;
"Reciprocal square-root estimate", single_opcode Vrsqrte;
"Get lanes from a vector", single_opcode Vget_lane;
"Set lanes in a vector", single_opcode Vset_lane;
"Create vector from literal bit pattern", single_opcode Vcreate;
"Set all lanes to the same value",
multiple_opcodes [Vdup_n; Vmov_n; Vdup_lane];
"Combining vectors", single_opcode Vcombine;
"Splitting vectors", multiple_opcodes [Vget_high; Vget_low];
"Conversions", multiple_opcodes [Vcvt; Vcvt_n];
"Move, narrowing", single_opcode Vmovn;
"Move, long", single_opcode Vmovl;
"Table lookup", tbl_opcode;
"Extended table lookup", tbx_opcode;
"Multiply, lane", single_opcode Vmul_lane;
"Long multiply, lane", single_opcode Vmull_lane;
"Saturating doubling long multiply, lane", single_opcode Vqdmull_lane;
"Saturating doubling multiply high, lane", single_opcode Vqdmulh_lane;
"Multiply-accumulate, lane", single_opcode Vmla_lane;
"Multiply-subtract, lane", single_opcode Vmls_lane;
"Vector multiply by scalar", single_opcode Vmul_n;
"Vector long multiply by scalar", single_opcode Vmull_n;
"Vector saturating doubling long multiply by scalar",
single_opcode Vqdmull_n;
"Vector saturating doubling multiply high by scalar",
single_opcode Vqdmulh_n;
"Vector multiply-accumulate by scalar", single_opcode Vmla_n;
"Vector multiply-subtract by scalar", single_opcode Vmls_n;
"Vector extract", single_opcode Vext;
"Reverse elements", multiple_opcodes [Vrev64; Vrev32; Vrev16];
"Bit selection", single_opcode Vbsl;
"Transpose elements", single_opcode Vtrn;
"Zip elements", single_opcode Vzip;
"Unzip elements", single_opcode Vuzp;
"Element/structure loads, VLD1 variants", ldx_opcode 1;
"Element/structure stores, VST1 variants", stx_opcode 1;
"Element/structure loads, VLD2 variants", ldx_opcode 2;
"Element/structure stores, VST2 variants", stx_opcode 2;
"Element/structure loads, VLD3 variants", ldx_opcode 3;
"Element/structure stores, VST3 variants", stx_opcode 3;
"Element/structure loads, VLD4 variants", ldx_opcode 4;
"Element/structure stores, VST4 variants", stx_opcode 4;
"Logical operations (AND)", single_opcode Vand;
"Logical operations (OR)", single_opcode Vorr;
"Logical operations (exclusive OR)", single_opcode Veor;
"Logical operations (AND-NOT)", single_opcode Vbic;
"Logical operations (OR-NOT)", single_opcode Vorn;
"Reinterpret casts", single_opcode Vreinterp ]
(* Given an intrinsic shape, produce a string to document the corresponding
operand shapes. *)
let rec analyze_shape shape =
let rec n_things n thing =
match n with
0 -> []
| n -> thing :: (n_things (n - 1) thing)
in
let rec analyze_shape_elt reg_no elt =
match elt with
Dreg -> "@var{d" ^ (string_of_int reg_no) ^ "}"
| Qreg -> "@var{q" ^ (string_of_int reg_no) ^ "}"
| Corereg -> "@var{r" ^ (string_of_int reg_no) ^ "}"
| Immed -> "#@var{0}"
| VecArray (1, elt) ->
let elt_regexp = analyze_shape_elt 0 elt in
"@{" ^ elt_regexp ^ "@}"
| VecArray (n, elt) ->
let rec f m =
match m with
0 -> []
| m -> (analyze_shape_elt (m - 1) elt) :: (f (m - 1))
in
let ops = List.rev (f n) in
"@{" ^ (commas (fun x -> x) ops "") ^ "@}"
| (PtrTo elt | CstPtrTo elt) ->
"[" ^ (analyze_shape_elt reg_no elt) ^ "]"
| Element_of_dreg -> (analyze_shape_elt reg_no Dreg) ^ "[@var{0}]"
| Element_of_qreg -> (analyze_shape_elt reg_no Qreg) ^ "[@var{0}]"
| All_elements_of_dreg -> (analyze_shape_elt reg_no Dreg) ^ "[]"
in
match shape with
All (n, elt) -> commas (analyze_shape_elt 0) (n_things n elt) ""
| Long -> (analyze_shape_elt 0 Qreg) ^ ", " ^ (analyze_shape_elt 0 Dreg) ^
", " ^ (analyze_shape_elt 0 Dreg)
| Long_noreg elt -> (analyze_shape_elt 0 elt) ^ ", " ^
(analyze_shape_elt 0 elt)
| Wide -> (analyze_shape_elt 0 Qreg) ^ ", " ^ (analyze_shape_elt 0 Qreg) ^
", " ^ (analyze_shape_elt 0 Dreg)
| Wide_noreg elt -> analyze_shape (Long_noreg elt)
| Narrow -> (analyze_shape_elt 0 Dreg) ^ ", " ^ (analyze_shape_elt 0 Qreg) ^
", " ^ (analyze_shape_elt 0 Qreg)
| Use_operands elts -> commas (analyze_shape_elt 0) (Array.to_list elts) ""
| By_scalar Dreg ->
analyze_shape (Use_operands [| Dreg; Dreg; Element_of_dreg |])
| By_scalar Qreg ->
analyze_shape (Use_operands [| Qreg; Qreg; Element_of_dreg |])
| By_scalar _ -> assert false
| Wide_lane ->
analyze_shape (Use_operands [| Qreg; Dreg; Element_of_dreg |])
| Wide_scalar ->
analyze_shape (Use_operands [| Qreg; Dreg; Element_of_dreg |])
| Pair_result elt ->
let elt_regexp = analyze_shape_elt 0 elt in
let elt_regexp' = analyze_shape_elt 1 elt in
elt_regexp ^ ", " ^ elt_regexp'
| Unary_scalar _ -> "FIXME Unary_scalar"
| Binary_imm elt -> analyze_shape (Use_operands [| elt; elt; Immed |])
| Narrow_imm -> analyze_shape (Use_operands [| Dreg; Qreg; Immed |])
| Long_imm -> analyze_shape (Use_operands [| Qreg; Dreg; Immed |])
(* Document a single intrinsic. *)
let describe_intrinsic first chan
(elt_ty, (_, features, shape, name, munge, _)) =
let c_arity, new_elt_ty = munge shape elt_ty in
let c_types = strings_of_arity c_arity in
Printf.fprintf chan "@itemize @bullet\n";
let item_code = if first then "@item" else "@itemx" in
Printf.fprintf chan "%s %s %s_%s (" item_code (List.hd c_types)
(intrinsic_name name) (string_of_elt elt_ty);
Printf.fprintf chan "%s)\n" (commas (fun ty -> ty) (List.tl c_types) "");
if not (List.exists (fun feature -> feature = No_op) features) then
begin
let print_one_insn name =
Printf.fprintf chan "@code{";
let no_suffix = (new_elt_ty = NoElts) in
let name_with_suffix =
if no_suffix then name
else name ^ "." ^ (string_of_elt_dots new_elt_ty)
in
let possible_operands = analyze_all_shapes features shape
analyze_shape
in
let rec print_one_possible_operand op =
Printf.fprintf chan "%s %s}" name_with_suffix op
in
(* If the intrinsic expands to multiple instructions, we assume
they are all of the same form. *)
print_one_possible_operand (List.hd possible_operands)
in
let rec print_insns names =
match names with
[] -> ()
| [name] -> print_one_insn name
| name::names -> (print_one_insn name;
Printf.fprintf chan " @emph{or} ";
print_insns names)
in
let insn_names = get_insn_names features name in
Printf.fprintf chan "@*@emph{Form of expected instruction(s):} ";
print_insns insn_names;
Printf.fprintf chan "\n"
end;
Printf.fprintf chan "@end itemize\n";
Printf.fprintf chan "\n\n"
(* Document a group of intrinsics. *)
let document_group chan (group_title, group_extractor) =
(* Extract the rows in question from the ops table and then turn them
into a list of intrinsics. *)
let intrinsics =
List.fold_left (fun got_so_far ->
fun row ->
match row with
(_, _, _, _, _, elt_tys) ->
List.fold_left (fun got_so_far' ->
fun elt_ty ->
(elt_ty, row) :: got_so_far')
got_so_far elt_tys
) [] (group_extractor ())
in
(* Emit the title for this group. *)
Printf.fprintf chan "@subsubsection %s\n\n" group_title;
(* Emit a description of each intrinsic. *)
List.iter (describe_intrinsic true chan) intrinsics;
(* Close this group. *)
Printf.fprintf chan "\n\n"
let gnu_header chan =
List.iter (fun s -> Printf.fprintf chan "%s\n" s) [
"@c Copyright (C) 2006 Free Software Foundation, Inc.";
"@c This is part of the GCC manual.";
"@c For copying conditions, see the file gcc.texi.";
"";
"@c This file is generated automatically using gcc/config/arm/neon-docgen.ml";
"@c Please do not edit manually."]
(* Program entry point. *)
let _ =
if Array.length Sys.argv <> 2 then
failwith "Usage: neon-docgen <output filename>"
else
let file = Sys.argv.(1) in
try
let chan = open_out file in
gnu_header chan;
List.iter (document_group chan) intrinsic_groups;
close_out chan
with Sys_error sys ->
failwith ("Could not create output file " ^ file ^ ": " ^ sys)

419
gcc/config/arm/neon-gen.ml Normal file
@@ -0,0 +1,419 @@
(* Auto-generate ARM Neon intrinsics header file.
Copyright (C) 2006, 2007 Free Software Foundation, Inc.
Contributed by CodeSourcery.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING. If not, write to the Free
Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301, USA.
This is an O'Caml program. The O'Caml compiler is available from:
http://caml.inria.fr/
Or from your favourite OS's friendly packaging system. Tested with version
3.09.2, though other versions will probably work too.
Compile with:
ocamlc -c neon.ml
ocamlc -o neon-gen neon.cmo neon-gen.ml
Run with:
./neon-gen > arm_neon.h
*)
open Neon
(* The format codes used in the following functions are documented at:
http://caml.inria.fr/pub/docs/manual-ocaml/libref/Format.html\
#6_printflikefunctionsforprettyprinting
(one line, remove the backslash.)
*)
(* Following functions can be used to approximate GNU indentation style. *)
let start_function () =
Format.printf "@[<v 0>";
ref 0
let end_function nesting =
match !nesting with
0 -> Format.printf "@;@;@]"
| _ -> failwith ("Bad nesting (ending function at level "
^ (string_of_int !nesting) ^ ")")
let open_braceblock nesting =
begin match !nesting with
0 -> Format.printf "@,@<0>{@[<v 2>@,"
| _ -> Format.printf "@,@[<v 2> @<0>{@[<v 2>@,"
end;
incr nesting
let close_braceblock nesting =
decr nesting;
match !nesting with
0 -> Format.printf "@]@,@<0>}"
| _ -> Format.printf "@]@,@<0>}@]"
let print_function arity fnname body =
let ffmt = start_function () in
Format.printf "__extension__ static __inline ";
let inl = "__attribute__ ((__always_inline__))" in
begin match arity with
Arity0 ret ->
Format.printf "%s %s@,%s (void)" (string_of_vectype ret) inl fnname
| Arity1 (ret, arg0) ->
Format.printf "%s %s@,%s (%s __a)" (string_of_vectype ret) inl fnname
(string_of_vectype arg0)
| Arity2 (ret, arg0, arg1) ->
Format.printf "%s %s@,%s (%s __a, %s __b)"
(string_of_vectype ret) inl fnname (string_of_vectype arg0)
(string_of_vectype arg1)
| Arity3 (ret, arg0, arg1, arg2) ->
Format.printf "%s %s@,%s (%s __a, %s __b, %s __c)"
(string_of_vectype ret) inl fnname (string_of_vectype arg0)
(string_of_vectype arg1) (string_of_vectype arg2)
| Arity4 (ret, arg0, arg1, arg2, arg3) ->
Format.printf "%s %s@,%s (%s __a, %s __b, %s __c, %s __d)"
(string_of_vectype ret) inl fnname (string_of_vectype arg0)
(string_of_vectype arg1) (string_of_vectype arg2)
(string_of_vectype arg3)
end;
open_braceblock ffmt;
let rec print_lines = function
[] -> ()
| [line] -> Format.printf "%s" line
| line::lines -> Format.printf "%s@," line; print_lines lines in
print_lines body;
close_braceblock ffmt;
end_function ffmt
let return_by_ptr features = List.mem ReturnPtr features
let union_string num elts base =
let itype = inttype_for_array num elts in
let iname = string_of_inttype itype
and sname = string_of_vectype (T_arrayof (num, elts)) in
Printf.sprintf "union { %s __i; %s __o; } %s" sname iname base
let rec signed_ctype = function
T_uint8x8 | T_poly8x8 -> T_int8x8
| T_uint8x16 | T_poly8x16 -> T_int8x16
| T_uint16x4 | T_poly16x4 -> T_int16x4
| T_uint16x8 | T_poly16x8 -> T_int16x8
| T_uint32x2 -> T_int32x2
| T_uint32x4 -> T_int32x4
| T_uint64x1 -> T_int64x1
| T_uint64x2 -> T_int64x2
(* Cast to types defined by mode in arm.c, not random types pulled in from
the <stdint.h> header in use. This fixes incompatible pointer errors when
compiling with C++. *)
| T_uint8 | T_int8 -> T_intQI
| T_uint16 | T_int16 -> T_intHI
| T_uint32 | T_int32 -> T_intSI
| T_uint64 | T_int64 -> T_intDI
| T_poly8 -> T_intQI
| T_poly16 -> T_intHI
| T_arrayof (n, elt) -> T_arrayof (n, signed_ctype elt)
| T_ptrto elt -> T_ptrto (signed_ctype elt)
| T_const elt -> T_const (signed_ctype elt)
| x -> x
let add_cast ctype cval =
let stype = signed_ctype ctype in
if ctype <> stype then
Printf.sprintf "(%s) %s" (string_of_vectype stype) cval
else
cval
let cast_for_return to_ty = "(" ^ (string_of_vectype to_ty) ^ ")"
(* Return a tuple of a list of declarations to go at the start of the function,
and a list of statements needed to return THING. *)
let return arity return_by_ptr thing =
match arity with
Arity0 (ret) | Arity1 (ret, _) | Arity2 (ret, _, _) | Arity3 (ret, _, _, _)
| Arity4 (ret, _, _, _, _) ->
match ret with
T_arrayof (num, vec) ->
if return_by_ptr then
let sname = string_of_vectype ret in
[Printf.sprintf "%s __rv;" sname],
[thing ^ ";"; "return __rv;"]
else
let uname = union_string num vec "__rv" in
[uname ^ ";"], ["__rv.__o = " ^ thing ^ ";"; "return __rv.__i;"]
| T_void -> [], [thing ^ ";"]
| _ ->
[], ["return " ^ (cast_for_return ret) ^ thing ^ ";"]
let rec element_type ctype =
match ctype with
T_arrayof (_, v) -> element_type v
| _ -> ctype
let params return_by_ptr ps =
let pdecls = ref [] in
let ptype t p =
match t with
T_arrayof (num, elts) ->
let uname = union_string num elts (p ^ "u") in
let decl = Printf.sprintf "%s = { %s };" uname p in
pdecls := decl :: !pdecls;
p ^ "u.__o"
| _ -> add_cast t p in
let plist = match ps with
Arity0 _ -> []
| Arity1 (_, t1) -> [ptype t1 "__a"]
| Arity2 (_, t1, t2) -> [ptype t1 "__a"; ptype t2 "__b"]
| Arity3 (_, t1, t2, t3) -> [ptype t1 "__a"; ptype t2 "__b"; ptype t3 "__c"]
| Arity4 (_, t1, t2, t3, t4) ->
[ptype t1 "__a"; ptype t2 "__b"; ptype t3 "__c"; ptype t4 "__d"] in
match ps with
Arity0 ret | Arity1 (ret, _) | Arity2 (ret, _, _) | Arity3 (ret, _, _, _)
| Arity4 (ret, _, _, _, _) ->
if return_by_ptr then
!pdecls, add_cast (T_ptrto (element_type ret)) "&__rv.val[0]" :: plist
else
!pdecls, plist
let modify_params features plist =
let is_flipped =
List.exists (function Flipped _ -> true | _ -> false) features in
if is_flipped then
match plist with
[ a; b ] -> [ b; a ]
| _ ->
failwith ("Don't know how to flip args " ^ (String.concat ", " plist))
else
plist
(* !!! Decide whether to add an extra information word based on the shape
form. *)
let extra_word shape features paramlist bits =
let use_word =
match shape with
All _ | Long | Long_noreg _ | Wide | Wide_noreg _ | Narrow
| By_scalar _ | Wide_scalar | Wide_lane | Binary_imm _ | Long_imm
| Narrow_imm -> true
| _ -> List.mem InfoWord features
in
if use_word then
paramlist @ [string_of_int bits]
else
paramlist
(* Bit 0 represents signed (1) vs unsigned (0), or float (1) vs poly (0).
Bit 1 represents floats & polynomials (1), or ordinary integers (0).
Bit 2 represents rounding (1) vs none (0). *)
let infoword_value elttype features =
let bits01 =
match elt_class elttype with
Signed | ConvClass (Signed, _) | ConvClass (_, Signed) -> 0b001
| Poly -> 0b010
| Float -> 0b011
| _ -> 0b000
and rounding_bit = if List.mem Rounding features then 0b100 else 0b000 in
bits01 lor rounding_bit
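Note on infoword_value: the three-bit encoding described in the comment can be checked with a small stand-alone sketch. It assumes a simplified element class (`cls` is a hypothetical type); the real elt_class and the ConvClass cases come from neon.ml, which this diff does not show.

```ocaml
(* Illustrative only: a simplified element class; the real elt_class and
   ConvClass handling come from neon.ml. *)
type cls = Signed | Unsigned | Poly | Float

(* Bits 0-1 encode the element class, bit 2 the rounding flag, following
   the comment above infoword_value. *)
let infoword cls ~rounding =
  let bits01 =
    match cls with
    | Signed -> 0b001
    | Poly -> 0b010
    | Float -> 0b011
    | Unsigned -> 0b000
  and rounding_bit = if rounding then 0b100 else 0b000 in
  bits01 lor rounding_bit

let () =
  List.iter
    (fun (label, cls, rounding) ->
      Printf.printf "%-20s %d\n" label (infoword cls ~rounding))
    [ ("signed", Signed, false);
      ("unsigned, rounding", Unsigned, true);
      ("poly", Poly, false);
      ("float, rounding", Float, true) ]
```

With this encoding a signed, non-rounding variant passes 1 as the extra word and a rounding float variant passes 7.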
(* "Cast" type operations will throw an exception in mode_of_elt (actually in
elt_width, called from there). Deal with that here, and generate a suffix
with multiple modes (<to><from>). *)
let rec mode_suffix elttype shape =
try
let mode = mode_of_elt elttype shape in
string_of_mode mode
with MixedMode (dst, src) ->
let dstmode = mode_of_elt dst shape
and srcmode = mode_of_elt src shape in
string_of_mode dstmode ^ string_of_mode srcmode
let print_variant opcode features shape name (ctype, asmtype, elttype) =
let bits = infoword_value elttype features in
let modesuf = mode_suffix elttype shape in
let return_by_ptr = return_by_ptr features in
let pdecls, paramlist = params return_by_ptr ctype in
let paramlist' = modify_params features paramlist in
let paramlist'' = extra_word shape features paramlist' bits in
let parstr = String.concat ", " paramlist'' in
let builtin = Printf.sprintf "__builtin_neon_%s%s (%s)"
(builtin_name features name) modesuf parstr in
let rdecls, stmts = return ctype return_by_ptr builtin in
let body = pdecls @ rdecls @ stmts
and fnname = (intrinsic_name name) ^ "_" ^ (string_of_elt elttype) in
print_function ctype fnname body
(* When this function processes the element types in the ops table, it rewrites
them in a list of tuples (a,b,c):
a : C type as an "arity", e.g. Arity1 (T_poly8x8, T_poly8x8)
b : Asm type : a single, processed element type, e.g. P16. This is the
type which should be attached to the asm opcode.
c : Variant type : the unprocessed type for this variant (e.g. in add
instructions which don't care about the sign, b might be i16 and c
might be s16.)
*)
let print_op (opcode, features, shape, name, munge, types) =
let sorted_types = List.sort compare types in
let munged_types = List.map
(fun elt -> let c, asm = munge shape elt in c, asm, elt) sorted_types in
List.iter
(fun variant -> print_variant opcode features shape name variant)
munged_types
let print_ops ops =
List.iter print_op ops
(* Output type definitions. Table entries are:
cbase : "C" name for the type.
abase : "ARM" base name for the type (i.e. int in int8x8_t).
esize : element size.
enum : element count.
*)
let deftypes () =
let typeinfo = [
(* Doubleword vector types. *)
"__builtin_neon_qi", "int", 8, 8;
"__builtin_neon_hi", "int", 16, 4;
"__builtin_neon_si", "int", 32, 2;
"__builtin_neon_di", "int", 64, 1;
"__builtin_neon_sf", "float", 32, 2;
"__builtin_neon_poly8", "poly", 8, 8;
"__builtin_neon_poly16", "poly", 16, 4;
"__builtin_neon_uqi", "uint", 8, 8;
"__builtin_neon_uhi", "uint", 16, 4;
"__builtin_neon_usi", "uint", 32, 2;
"__builtin_neon_udi", "uint", 64, 1;
(* Quadword vector types. *)
"__builtin_neon_qi", "int", 8, 16;
"__builtin_neon_hi", "int", 16, 8;
"__builtin_neon_si", "int", 32, 4;
"__builtin_neon_di", "int", 64, 2;
"__builtin_neon_sf", "float", 32, 4;
"__builtin_neon_poly8", "poly", 8, 16;
"__builtin_neon_poly16", "poly", 16, 8;
"__builtin_neon_uqi", "uint", 8, 16;
"__builtin_neon_uhi", "uint", 16, 8;
"__builtin_neon_usi", "uint", 32, 4;
"__builtin_neon_udi", "uint", 64, 2
] in
List.iter
(fun (cbase, abase, esize, enum) ->
let attr =
match enum with
1 -> ""
| _ -> Printf.sprintf "\t__attribute__ ((__vector_size__ (%d)))"
(esize * enum / 8) in
Format.printf "typedef %s %s%dx%d_t%s;@\n" cbase abase esize enum attr)
typeinfo;
Format.print_newline ();
(* Extra types not in <stdint.h>. *)
Format.printf "typedef __builtin_neon_sf float32_t;\n";
Format.printf "typedef __builtin_neon_poly8 poly8_t;\n";
Format.printf "typedef __builtin_neon_poly16 poly16_t;\n"
(* Output structs containing arrays, for load & store instructions etc. *)
let arrtypes () =
let typeinfo = [
"int", 8; "int", 16;
"int", 32; "int", 64;
"uint", 8; "uint", 16;
"uint", 32; "uint", 64;
"float", 32; "poly", 8;
"poly", 16
] in
let writestruct elname elsize regsize arrsize =
let elnum = regsize / elsize in
let structname =
Printf.sprintf "%s%dx%dx%d_t" elname elsize elnum arrsize in
let sfmt = start_function () in
Format.printf "typedef struct %s" structname;
open_braceblock sfmt;
Format.printf "%s%dx%d_t val[%d];" elname elsize elnum arrsize;
close_braceblock sfmt;
Format.printf " %s;" structname;
end_function sfmt;
in
for n = 2 to 4 do
List.iter
(fun (elname, elsize) ->
writestruct elname elsize 64 n;
writestruct elname elsize 128 n)
typeinfo
done
let print_lines = List.iter (fun s -> Format.printf "%s@\n" s)
(* Do it. *)
let _ =
print_lines [
"/* ARM NEON intrinsics include file. This file is generated automatically";
" using neon-gen.ml. Please do not edit manually.";
"";
" Copyright (C) 2006, 2007 Free Software Foundation, Inc.";
" Contributed by CodeSourcery.";
"";
" This file is part of GCC.";
"";
" GCC is free software; you can redistribute it and/or modify it";
" under the terms of the GNU General Public License as published";
" by the Free Software Foundation; either version 2, or (at your";
" option) any later version.";
"";
" GCC is distributed in the hope that it will be useful, but WITHOUT";
" ANY WARRANTY; without even the implied warranty of MERCHANTABILITY";
" or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public";
" License for more details.";
"";
" You should have received a copy of the GNU General Public License";
" along with GCC; see the file COPYING. If not, write to the";
" Free Software Foundation, 51 Franklin Street, Fifth Floor, Boston,";
" MA 02110-1301, USA. */";
"";
"/* As a special exception, if you include this header file into source";
" files compiled by GCC, this header file does not by itself cause";
" the resulting executable to be covered by the GNU General Public";
" License. This exception does not however invalidate any other";
" reasons why the executable file might be covered by the GNU General";
" Public License. */";
"";
"#ifndef _GCC_ARM_NEON_H";
"#define _GCC_ARM_NEON_H 1";
"";
"#ifndef __ARM_NEON__";
"#error You must enable NEON instructions (e.g. -mfloat-abi=softfp -mfpu=neon) to use arm_neon.h";
"#else";
"";
"#ifdef __cplusplus";
"extern \"C\" {";
"#endif";
"";
"#include <stdint.h>";
""];
deftypes ();
arrtypes ();
Format.print_newline ();
print_ops ops;
Format.print_newline ();
print_ops reinterp;
print_lines [
"#ifdef __cplusplus";
"}";
"#endif";
"#endif";
"#endif"]

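Note on the typedefs that `deftypes` emits in neon-gen.ml above: each table row becomes one typedef named `<abase><esize>x<enum>_t`, and the vector size attribute is `esize * enum / 8` bytes, omitted when there is only one element. The C sketch below only mirrors that formatting so the output can be checked by hand; `format_neon_typedef` is a hypothetical helper and is not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper mirroring the format string used by deftypes.
   cbase: builtin element type name, abase: user-visible base name,
   esize: element size in bits, count: number of elements.
   Returns 0 on success, -1 if the buffer is too small. */
int format_neon_typedef(char *out, size_t outsz,
                        const char *cbase, const char *abase,
                        unsigned esize, unsigned count)
{
    int w;

    assert(out != NULL && esize % 8 == 0 && count > 0);

    if (count == 1) {
        /* Single-element types get no vector attribute. */
        w = snprintf(out, outsz, "typedef %s %s%ux%u_t;",
                     cbase, abase, esize, count);
    } else {
        /* Vector size is expressed in bytes: bits * elements / 8. */
        w = snprintf(out, outsz,
                     "typedef %s %s%ux%u_t\t__attribute__ ((__vector_size__ (%u)));",
                     cbase, abase, esize, count, esize * count / 8);
    }
    return (w < 0 || (size_t)w >= outsz) ? -1 : 0;
}
```

For the row `"__builtin_neon_sf", "float", 32, 4` this yields a `float32x4_t` typedef with a 16-byte vector size, matching the quadword types used by the generated tests.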
@ -0,0 +1,277 @@
(* Auto-generate ARM Neon intrinsics tests.
Copyright (C) 2006 Free Software Foundation, Inc.
Contributed by CodeSourcery.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2, or (at your option) any later
version.
GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING. If not, write to the Free
Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301, USA.
This is an O'Caml program. The O'Caml compiler is available from:
http://caml.inria.fr/
Or from your favourite OS's friendly packaging system. Tested with version
3.09.2, though other versions will probably work too.
Compile with:
ocamlc -c neon.ml
ocamlc -o neon-testgen neon.cmo neon-testgen.ml
Run with:
cd /path/to/gcc/testsuite/gcc.target/arm/neon
/path/to/neon-testgen
*)
open Neon
type c_type_flags = Pointer | Const
(* Open a test source file. *)
let open_test_file dir name =
try
open_out (dir ^ "/" ^ name ^ ".c")
with Sys_error str ->
failwith ("Could not create test source file " ^ name ^ ": " ^ str)
(* Emit prologue code to a test source file. *)
let emit_prologue chan test_name =
Printf.fprintf chan "/* Test the `%s' ARM Neon intrinsic. */\n" test_name;
Printf.fprintf chan "/* This file was autogenerated by neon-testgen. */\n\n";
Printf.fprintf chan "/* { dg-do assemble } */\n";
Printf.fprintf chan "/* { dg-require-effective-target arm_neon_ok } */\n";
Printf.fprintf chan
"/* { dg-options \"-save-temps -O0 -mfpu=neon -mfloat-abi=softfp\" } */\n";
Printf.fprintf chan "\n#include \"arm_neon.h\"\n\n";
Printf.fprintf chan "void test_%s (void)\n{\n" test_name
(* Emit declarations of local variables that are going to be passed
to an intrinsic, together with one to take a returned value if needed. *)
let emit_automatics chan c_types =
let emit () =
ignore (
List.fold_left (fun arg_number -> fun (flags, ty) ->
let pointer_bit =
if List.mem Pointer flags then "*" else ""
in
(* Const arguments to builtins are directly
written in as constants. *)
if not (List.mem Const flags) then
Printf.fprintf chan " %s %sarg%d_%s;\n"
ty pointer_bit arg_number ty;
arg_number + 1)
0 (List.tl c_types))
in
match c_types with
(_, return_ty) :: tys ->
if return_ty <> "void" then
(* The intrinsic returns a value. *)
(Printf.fprintf chan " %s out_%s;\n" return_ty return_ty;
emit ())
else
(* The intrinsic does not return a value. *)
emit ()
| _ -> assert false
(* Emit code to call an intrinsic. *)
let emit_call chan const_valuator c_types name elt_ty =
(if snd (List.hd c_types) <> "void" then
Printf.fprintf chan " out_%s = " (snd (List.hd c_types))
else
Printf.fprintf chan " ");
Printf.fprintf chan "%s_%s (" (intrinsic_name name) (string_of_elt elt_ty);
let print_arg chan arg_number (flags, ty) =
(* If the argument is of const type, then directly write in the
constant now. *)
if List.mem Const flags then
match const_valuator with
None ->
if List.mem Pointer flags then
Printf.fprintf chan "0"
else
Printf.fprintf chan "1"
| Some f -> Printf.fprintf chan "%s" (string_of_int (f arg_number))
else
Printf.fprintf chan "arg%d_%s" arg_number ty
in
let rec print_args arg_number tys =
match tys with
[] -> ()
| [ty] -> print_arg chan arg_number ty
| ty::tys ->
print_arg chan arg_number ty;
Printf.fprintf chan ", ";
print_args (arg_number + 1) tys
in
print_args 0 (List.tl c_types);
Printf.fprintf chan ");\n"
(* Emit epilogue code to a test source file. *)
let emit_epilogue chan features regexps =
let no_op = List.exists (fun feature -> feature = No_op) features in
Printf.fprintf chan "}\n\n";
(if not no_op then
List.iter (fun regexp ->
Printf.fprintf chan
"/* { dg-final { scan-assembler \"%s\" } } */\n" regexp)
regexps
else
()
);
Printf.fprintf chan "/* { dg-final { cleanup-saved-temps } } */\n"
(* Check a list of C types to determine which ones are pointers and which
ones are const. *)
let check_types tys =
let tys' =
List.map (fun ty ->
let len = String.length ty in
if len > 2 && String.get ty (len - 2) = ' '
&& String.get ty (len - 1) = '*'
then ([Pointer], String.sub ty 0 (len - 2))
else ([], ty)) tys
in
List.map (fun (flags, ty) ->
if String.length ty > 6 && String.sub ty 0 6 = "const "
then (Const :: flags, String.sub ty 6 ((String.length ty) - 6))
else (flags, ty)) tys'
(* Given an intrinsic shape, produce a regexp that will match
the right-hand sides of instructions generated by an intrinsic of
that shape. *)
let rec analyze_shape shape =
let rec n_things n thing =
match n with
0 -> []
| n -> thing :: (n_things (n - 1) thing)
in
let rec analyze_shape_elt elt =
match elt with
Dreg -> "\\[dD\\]\\[0-9\\]+"
| Qreg -> "\\[qQ\\]\\[0-9\\]+"
| Corereg -> "\\[rR\\]\\[0-9\\]+"
| Immed -> "#\\[0-9\\]+"
| VecArray (1, elt) ->
let elt_regexp = analyze_shape_elt elt in
"((\\\\\\{" ^ elt_regexp ^ "\\\\\\})|(" ^ elt_regexp ^ "))"
| VecArray (n, elt) ->
let elt_regexp = analyze_shape_elt elt in
let alt1 = elt_regexp ^ "-" ^ elt_regexp in
let alt2 = commas (fun x -> x) (n_things n elt_regexp) "" in
"\\\\\\{((" ^ alt1 ^ ")|(" ^ alt2 ^ "))\\\\\\}"
| (PtrTo elt | CstPtrTo elt) ->
"\\\\\\[" ^ (analyze_shape_elt elt) ^ "\\\\\\]"
| Element_of_dreg -> (analyze_shape_elt Dreg) ^ "\\\\\\[\\[0-9\\]+\\\\\\]"
| Element_of_qreg -> (analyze_shape_elt Qreg) ^ "\\\\\\[\\[0-9\\]+\\\\\\]"
| All_elements_of_dreg -> (analyze_shape_elt Dreg) ^ "\\\\\\[\\\\\\]"
in
match shape with
All (n, elt) -> commas analyze_shape_elt (n_things n elt) ""
| Long -> (analyze_shape_elt Qreg) ^ ", " ^ (analyze_shape_elt Dreg) ^
", " ^ (analyze_shape_elt Dreg)
| Long_noreg elt -> (analyze_shape_elt elt) ^ ", " ^ (analyze_shape_elt elt)
| Wide -> (analyze_shape_elt Qreg) ^ ", " ^ (analyze_shape_elt Qreg) ^
", " ^ (analyze_shape_elt Dreg)
| Wide_noreg elt -> analyze_shape (Long_noreg elt)
| Narrow -> (analyze_shape_elt Dreg) ^ ", " ^ (analyze_shape_elt Qreg) ^
", " ^ (analyze_shape_elt Qreg)
| Use_operands elts -> commas analyze_shape_elt (Array.to_list elts) ""
| By_scalar Dreg ->
analyze_shape (Use_operands [| Dreg; Dreg; Element_of_dreg |])
| By_scalar Qreg ->
analyze_shape (Use_operands [| Qreg; Qreg; Element_of_dreg |])
| By_scalar _ -> assert false
| Wide_lane ->
analyze_shape (Use_operands [| Qreg; Dreg; Element_of_dreg |])
| Wide_scalar ->
analyze_shape (Use_operands [| Qreg; Dreg; Element_of_dreg |])
| Pair_result elt ->
let elt_regexp = analyze_shape_elt elt in
elt_regexp ^ ", " ^ elt_regexp
| Unary_scalar _ -> "FIXME Unary_scalar"
| Binary_imm elt -> analyze_shape (Use_operands [| elt; elt; Immed |])
| Narrow_imm -> analyze_shape (Use_operands [| Dreg; Qreg; Immed |])
| Long_imm -> analyze_shape (Use_operands [| Qreg; Dreg; Immed |])
(* Generate tests for one intrinsic. *)
let test_intrinsic dir opcode features shape name munge elt_ty =
(* Open the test source file. *)
let test_name = name ^ (string_of_elt elt_ty) in
let chan = open_test_file dir test_name in
(* Work out what argument and return types the intrinsic has. *)
let c_arity, new_elt_ty = munge shape elt_ty in
let c_types = check_types (strings_of_arity c_arity) in
(* Extract any constant valuator (a function specifying what constant
values are to be written into the intrinsic call) from the features
list. *)
let const_valuator =
try
match (List.find (fun feature -> match feature with
Const_valuator _ -> true
| _ -> false) features) with
Const_valuator f -> Some f
| _ -> assert false
with Not_found -> None
in
(* Work out what instruction name(s) to expect. *)
let insns = get_insn_names features name in
let no_suffix = (new_elt_ty = NoElts) in
let insns =
if no_suffix then insns
else List.map (fun insn ->
let suffix = string_of_elt_dots new_elt_ty in
insn ^ "\\." ^ suffix) insns
in
(* Construct a regexp to match against the expected instruction name(s). *)
let insn_regexp =
match insns with
[] -> assert false
| [insn] -> insn
| _ ->
let rec calc_regexp insns cur_regexp =
match insns with
[] -> cur_regexp
| [insn] -> cur_regexp ^ "(" ^ insn ^ "))"
| insn::insns -> calc_regexp insns (cur_regexp ^ "(" ^ insn ^ ")|")
in calc_regexp insns "("
in
(* Construct regexps to match against the instructions that this
intrinsic expands to. Watch out for any writeback character and
comments after the instruction. *)
let regexps = List.map (fun regexp -> insn_regexp ^ "\\[ \t\\]+" ^ regexp ^
"!?\\(\\[ \t\\]+@\\[a-zA-Z0-9 \\]+\\)?\\n")
(analyze_all_shapes features shape analyze_shape)
in
(* Emit file and function prologues. *)
emit_prologue chan test_name;
(* Emit local variable declarations. *)
emit_automatics chan c_types;
Printf.fprintf chan "\n";
(* Emit the call to the intrinsic. *)
emit_call chan const_valuator c_types name elt_ty;
(* Emit the function epilogue and the DejaGNU scan-assembler directives. *)
emit_epilogue chan features regexps;
(* Close the test file. *)
close_out chan
(* Generate tests for one element of the "ops" table. *)
let test_intrinsic_group dir (opcode, features, shape, name, munge, types) =
List.iter (test_intrinsic dir opcode features shape name munge) types
(* Program entry point. *)
let _ =
let directory = if Array.length Sys.argv <> 1 then Sys.argv.(1) else "." in
List.iter (test_intrinsic_group directory) (reinterp @ ops)
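
Note on `insn_regexp` in `test_intrinsic` above: a single expected instruction name is used as-is, while several names are joined into one alternation of the form `((a)|(b)|...)`. The C sketch below reproduces only that joining step, as a way to see the resulting string. `build_insn_regexp` and the names `insn_a`/`insn_b` are placeholders for illustration; the real names come from `get_insn_names` and also carry an element-type suffix that is not modelled here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper mirroring calc_regexp: one name is copied unchanged,
   several names become "((a)|(b)|...)".
   Returns 0 on success, -1 if the buffer is too small. */
int build_insn_regexp(const char *const *insns, size_t n,
                      char *out, size_t outsz)
{
    size_t used = 0;
    size_t i;
    int w;

    assert(insns != NULL && out != NULL && n > 0 && outsz > 0);

    if (n == 1) {
        w = snprintf(out, outsz, "%s", insns[0]);
        return (w < 0 || (size_t)w >= outsz) ? -1 : 0;
    }

    for (i = 0; i < n; i++) {
        /* The first name opens the outer group; later names add the '|'. */
        w = snprintf(out + used, outsz - used, "%s(%s)",
                     i == 0 ? "(" : "|", insns[i]);
        if (w < 0 || (size_t)w >= outsz - used)
            return -1;
        used += (size_t)w;
    }

    /* Close the outer group. */
    if (used + 2 > outsz)
        return -1;
    out[used] = ')';
    out[used + 1] = '\0';
    return 0;
}
```

With the two placeholder names the result is `((insn_a)|(insn_b))`, which the generator then prefixes to the operand regexp produced by `analyze_shape`.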

3948
gcc/config/arm/neon.md Normal file

File diff suppressed because it is too large

1826
gcc/config/arm/neon.ml Normal file

File diff suppressed because it is too large

@ -470,3 +470,43 @@
(match_test "((unsigned HOST_WIDE_INT) INTVAL (op)) < 64")))
;; Neon predicates
(define_predicate "const_multiple_of_8_operand"
(match_code "const_int")
{
unsigned HOST_WIDE_INT val = INTVAL (op);
return (val & 7) == 0;
})
(define_predicate "imm_for_neon_mov_operand"
(match_code "const_vector")
{
return neon_immediate_valid_for_move (op, mode, NULL, NULL);
})
(define_predicate "imm_for_neon_logic_operand"
(match_code "const_vector")
{
return neon_immediate_valid_for_logic (op, mode, 0, NULL, NULL);
})
(define_predicate "imm_for_neon_inv_logic_operand"
(match_code "const_vector")
{
return neon_immediate_valid_for_logic (op, mode, 1, NULL, NULL);
})
(define_predicate "neon_logic_op2"
(ior (match_operand 0 "imm_for_neon_logic_operand")
(match_operand 0 "s_register_operand")))
(define_predicate "neon_inv_logic_op2"
(ior (match_operand 0 "imm_for_neon_inv_logic_operand")
(match_operand 0 "s_register_operand")))
;; TODO: We could check lane numbers more precisely based on the mode.
(define_predicate "neon_lane_number"
(and (match_code "const_int")
(match_test "INTVAL (op) >= 0 && INTVAL (op) <= 7")))
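
Note on the predicates above: `const_multiple_of_8_operand` tests the low three bits of the constant, and `neon_lane_number` accepts any lane from 0 to 7 regardless of mode, as its TODO acknowledges. The C sketch below restates the first test and shows one possible shape for the tighter check the TODO mentions, assuming the lane bound is simply vector width divided by element width. Both function names are hypothetical and nothing here is taken from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Same test as const_multiple_of_8_operand: the low three bits are clear. */
bool is_multiple_of_8(unsigned long long val)
{
    return (val & 7) == 0;
}

/* Hypothetical mode-aware lane check: a vector of vector_bits bits holding
   elem_bits-bit elements has vector_bits / elem_bits lanes, numbered from 0. */
bool lane_in_range(long lane, unsigned vector_bits, unsigned elem_bits)
{
    assert(elem_bits != 0 && vector_bits % elem_bits == 0);
    return lane >= 0 && (unsigned long)lane < vector_bits / elem_bits;
}
```

Under that assumption, lane 7 is valid for a 64-bit vector of 8-bit elements but lane 2 is not valid for a 64-bit vector of 32-bit elements, although the current predicate accepts both.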

@ -9,8 +9,10 @@ MD_INCLUDES= $(srcdir)/config/arm/arm-tune.md \
$(srcdir)/config/arm/arm926ejs.md \
$(srcdir)/config/arm/cirrus.md \
$(srcdir)/config/arm/fpa.md \
$(srcdir)/config/arm/vec-common.md \
$(srcdir)/config/arm/iwmmxt.md \
$(srcdir)/config/arm/vfp.md \
$(srcdir)/config/arm/neon.md \
$(srcdir)/config/arm/thumb2.md
s-config s-conditions s-flags s-codes s-constants s-emit s-recog s-preds \

@ -0,0 +1,107 @@
;; Machine Description for shared bits common to IWMMXT and Neon.
;; Copyright (C) 2006 Free Software Foundation, Inc.
;; Written by CodeSourcery.
;;
;; This file is part of GCC.
;;
;; GCC is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 2, or (at your option)
;; any later version.
;;
;; GCC is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;; General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with GCC; see the file COPYING. If not, write to the Free
;; Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
;; 02110-1301, USA.
;; Vector Moves
;; All integer and float modes supported by Neon and IWMMXT.
(define_mode_macro VALL [V2DI V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
;; All integer and float modes supported by Neon and IWMMXT, except V2DI.
(define_mode_macro VALLW [V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
;; All integer modes supported by Neon and IWMMXT
(define_mode_macro VINT [V2DI V2SI V4HI V8QI V4SI V8HI V16QI])
;; All integer modes supported by Neon and IWMMXT, except V2DI
(define_mode_macro VINTW [V2SI V4HI V8QI V4SI V8HI V16QI])
(define_expand "mov<mode>"
[(set (match_operand:VALL 0 "nonimmediate_operand" "")
(match_operand:VALL 1 "general_operand" ""))]
"TARGET_NEON
|| (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
{
})
;; Vector arithmetic. Expanders are blank, then unnamed insns implement
;; patterns separately for IWMMXT and Neon.
(define_expand "add<mode>3"
[(set (match_operand:VALL 0 "s_register_operand" "")
(plus:VALL (match_operand:VALL 1 "s_register_operand" "")
(match_operand:VALL 2 "s_register_operand" "")))]
"TARGET_NEON
|| (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
{
})
(define_expand "sub<mode>3"
[(set (match_operand:VALL 0 "s_register_operand" "")
(minus:VALL (match_operand:VALL 1 "s_register_operand" "")
(match_operand:VALL 2 "s_register_operand" "")))]
"TARGET_NEON
|| (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
{
})
(define_expand "mul<mode>3"
[(set (match_operand:VALLW 0 "s_register_operand" "")
(mult:VALLW (match_operand:VALLW 1 "s_register_operand" "")
(match_operand:VALLW 2 "s_register_operand" "")))]
"TARGET_NEON || (<MODE>mode == V4HImode && TARGET_REALLY_IWMMXT)"
{
})
(define_expand "smin<mode>3"
[(set (match_operand:VALLW 0 "s_register_operand" "")
(smin:VALLW (match_operand:VALLW 1 "s_register_operand" "")
(match_operand:VALLW 2 "s_register_operand" "")))]
"TARGET_NEON
|| (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
{
})
(define_expand "umin<mode>3"
[(set (match_operand:VINTW 0 "s_register_operand" "")
(umin:VINTW (match_operand:VINTW 1 "s_register_operand" "")
(match_operand:VINTW 2 "s_register_operand" "")))]
"TARGET_NEON
|| (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
{
})
(define_expand "smax<mode>3"
[(set (match_operand:VALLW 0 "s_register_operand" "")
(smax:VALLW (match_operand:VALLW 1 "s_register_operand" "")
(match_operand:VALLW 2 "s_register_operand" "")))]
"TARGET_NEON
|| (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
{
})
(define_expand "umax<mode>3"
[(set (match_operand:VINTW 0 "s_register_operand" "")
(umax:VINTW (match_operand:VINTW 1 "s_register_operand" "")
(match_operand:VINTW 2 "s_register_operand" "")))]
"TARGET_NEON
|| (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
{
})

File diff suppressed because it is too large

@ -6404,7 +6404,8 @@ instructions, but allow the compiler to schedule those calls.
@menu
* Alpha Built-in Functions::
* ARM Built-in Functions::
* ARM iWMMXt Built-in Functions::
* ARM NEON Intrinsics::
* Blackfin Built-in Functions::
* FR-V Built-in Functions::
* X86 Built-in Functions::
@ -6497,11 +6498,11 @@ void *__builtin_thread_pointer (void)
void __builtin_set_thread_pointer (void *)
@end smallexample
@node ARM Built-in Functions
@subsection ARM Built-in Functions
@node ARM iWMMXt Built-in Functions
@subsection ARM iWMMXt Built-in Functions
These built-in functions are available for the ARM family of
processors, when the @option{-mcpu=iwmmxt} switch is used:
processors when the @option{-mcpu=iwmmxt} switch is used:
@smallexample
typedef int v2si __attribute__ ((vector_size (8)));
@ -6644,6 +6645,14 @@ long long __builtin_arm_wxor (long long, long long)
long long __builtin_arm_wzero ()
@end smallexample
@node ARM NEON Intrinsics
@subsection ARM NEON Intrinsics
These built-in intrinsics for the ARM Advanced SIMD extension are available
when the @option{-mfpu=neon} switch is used:
@include arm-neon-intrinsics.texi
@node Blackfin Built-in Functions
@subsection Blackfin Built-in Functions

@ -1,3 +1,17 @@
2007-07-25 Julian Brown <julian@codesourcery.com>
Paul Brook <paul@codesourcery.com>
Joseph Myers <joseph@codesourcery.com>
Mark Shinwell <shinwell@codesourcery.com>
* gcc.dg/vect/vect.exp: Check is-effective-target arm_neon_hw.
* gcc.dg/vect/tree-vect.h: Check for NEON SIMD support.
* lib/gcc-dg.exp (cleanup-saved-temps): Fix comment.
* lib/target-supports.exp (check_effective_target_arm_neon_ok)
(check_effective_target_arm_neon_hw): New.
* gcc.target/arm/neon/neon.exp: New file.
* gcc.target/arm/neon/polytypes.c: New file.
* gcc.target/arm/neon/v*.c (1870 files): New (autogenerated).
2007-07-25 Janis Johnson <janis187@us.ibm.com>
* gcc.c-torture/unsorted/dump-noaddr.c: Reduce string length for

@ -0,0 +1,47 @@
// Test that ARM NEON vector types have their names mangled correctly.
// { dg-do compile }
// { dg-require-effective-target arm_neon_ok }
// { dg-options "-mfpu=neon -mfloat-abi=softfp" }
#include <arm_neon.h>
void f0 (int8x8_t a) {}
void f1 (int16x4_t a) {}
void f2 (int32x2_t a) {}
void f3 (uint8x8_t a) {}
void f4 (uint16x4_t a) {}
void f5 (uint32x2_t a) {}
void f6 (float32x2_t a) {}
void f7 (poly8x8_t a) {}
void f8 (poly16x4_t a) {}
void f9 (int8x16_t a) {}
void f10 (int16x8_t a) {}
void f11 (int32x4_t a) {}
void f12 (uint8x16_t a) {}
void f13 (uint16x8_t a) {}
void f14 (uint32x4_t a) {}
void f15 (float32x4_t a) {}
void f16 (poly8x16_t a) {}
void f17 (poly16x8_t a) {}
// { dg-final { scan-assembler "_Z2f015__simd64_int8_t:" } }
// { dg-final { scan-assembler "_Z2f116__simd64_int16_t:" } }
// { dg-final { scan-assembler "_Z2f216__simd64_int32_t:" } }
// { dg-final { scan-assembler "_Z2f316__simd64_uint8_t:" } }
// { dg-final { scan-assembler "_Z2f417__simd64_uint16_t:" } }
// { dg-final { scan-assembler "_Z2f517__simd64_uint32_t:" } }
// { dg-final { scan-assembler "_Z2f618__simd64_float32_t:" } }
// { dg-final { scan-assembler "_Z2f716__simd64_poly8_t:" } }
// { dg-final { scan-assembler "_Z2f817__simd64_poly16_t:" } }
// { dg-final { scan-assembler "_Z2f916__simd128_int8_t:" } }
// { dg-final { scan-assembler "_Z3f1017__simd128_int16_t:" } }
// { dg-final { scan-assembler "_Z3f1117__simd128_int32_t:" } }
// { dg-final { scan-assembler "_Z3f1217__simd128_uint8_t:" } }
// { dg-final { scan-assembler "_Z3f1318__simd128_uint16_t:" } }
// { dg-final { scan-assembler "_Z3f1418__simd128_uint32_t:" } }
// { dg-final { scan-assembler "_Z3f1519__simd128_float32_t:" } }
// { dg-final { scan-assembler "_Z3f1617__simd128_poly8_t:" } }
// { dg-final { scan-assembler "_Z3f1718__simd128_poly16_t:" } }
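
Note on the expected symbols above: each one follows the length-prefixed pattern `_Z<len><function><len><type>`, for example `_Z2f0` followed by `15__simd64_int8_t`. The C sketch below rebuilds such a string for the simple case of a non-member function with one parameter of a named type, which is all these tests exercise. `mangle_one_param` is a hypothetical helper for checking the length prefixes by hand; it is not how GCC implements mangling.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "_Z<len><fn><len><type_name>" for a function
   taking a single parameter of a named type. Only this simple,
   namespace-free case is covered.
   Returns 0 on success, -1 if the buffer is too small. */
int mangle_one_param(char *out, size_t outsz,
                     const char *fn, const char *type_name)
{
    int w;

    assert(out != NULL && fn != NULL && type_name != NULL);

    /* Each name is preceded by its length in decimal. */
    w = snprintf(out, outsz, "_Z%zu%s%zu%s",
                 strlen(fn), fn, strlen(type_name), type_name);
    return (w < 0 || (size_t)w >= outsz) ? -1 : 0;
}
```

Calling it with `"f15"` and `"__simd128_float32_t"` gives `_Z3f1519__simd128_float32_t`, the symbol the test scans for in `f15`.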

@ -21,6 +21,18 @@ void check_vect (void)
asm volatile (".byte 0xf2,0x0f,0x10,0xc0");
#elif defined(__sparc__)
asm volatile (".word\t0x81b007c0");
#elif defined(__arm__)
{
/* On some processors without NEON support, this instruction may
be a no-op, on others it may trap, so check that it executes
correctly. */
long long a = 0, b = 1;
asm ("vorr %P0, %P1, %P2"
: "=w" (a)
: "0" (a), "w" (b));
if (a != 1)
exit (0);
}
#endif
signal (SIGILL, SIG_DFL);
}

@ -83,6 +83,13 @@ if [istarget "powerpc*-*-*"] {
}
} elseif [istarget "ia64-*-*"] {
set dg-do-what-default run
} elseif [is-effective-target arm_neon_ok] {
lappend DEFAULT_VECTCFLAGS "-mfpu=neon" "-mfloat-abi=softfp"
if [is-effective-target arm_neon_hw] {
set dg-do-what-default run
} else {
set dg-do-what-default compile
}
} else {
return
}

@ -0,0 +1,35 @@
# Copyright (C) 1997, 2004, 2006 Free Software Foundation, Inc.
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
# GCC testsuite that uses the `dg.exp' driver.
# Exit immediately if this isn't an ARM target.
if ![istarget arm*-*-*] then {
return
}
# Load support procs.
load_lib gcc-dg.exp
# Initialize `dg'.
dg-init
# Main loop.
dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \
"" ""
# All done.
dg-finish

@ -0,0 +1,47 @@
/* Check that NEON polynomial vector types are suitably incompatible with
integer vector types of the same layout. */
/* { dg-do compile } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-mfpu=neon -mfloat-abi=softfp" } */
#include <arm_neon.h>
void s64_8 (int8x8_t a) {}
void u64_8 (uint8x8_t a) {}
void p64_8 (poly8x8_t a) {}
void s64_16 (int16x4_t a) {}
void u64_16 (uint16x4_t a) {}
void p64_16 (poly16x4_t a) {}
void s128_8 (int8x16_t a) {}
void u128_8 (uint8x16_t a) {}
void p128_8 (poly8x16_t a) {}
void s128_16 (int16x8_t a) {}
void u128_16 (uint16x8_t a) {}
void p128_16 (poly16x8_t a) {}
void foo ()
{
poly8x8_t v64_8;
poly16x4_t v64_16;
poly8x16_t v128_8;
poly16x8_t v128_16;
s64_8 (v64_8); /* { dg-error "use -flax-vector-conversions.*incompatible type for argument 1 of 's64_8'" } */
u64_8 (v64_8); /* { dg-error "incompatible type for argument 1 of 'u64_8'" } */
p64_8 (v64_8);
s64_16 (v64_16); /* { dg-error "incompatible type for argument 1 of 's64_16'" } */
u64_16 (v64_16); /* { dg-error "incompatible type for argument 1 of 'u64_16'" } */
p64_16 (v64_16);
s128_8 (v128_8); /* { dg-error "incompatible type for argument 1 of 's128_8'" } */
u128_8 (v128_8); /* { dg-error "incompatible type for argument 1 of 'u128_8'" } */
p128_8 (v128_8);
s128_16 (v128_16); /* { dg-error "incompatible type for argument 1 of 's128_16'" } */
u128_16 (v128_16); /* { dg-error "incompatible type for argument 1 of 'u128_16'" } */
p128_16 (v128_16);
}

@ -0,0 +1,20 @@
/* Test the `vRaddhns16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRaddhns16 (void)
{
int8x8_t out_int8x8_t;
int16x8_t arg0_int16x8_t;
int16x8_t arg1_int16x8_t;
out_int8x8_t = vraddhn_s16 (arg0_int16x8_t, arg1_int16x8_t);
}
/* { dg-final { scan-assembler "vraddhn\.i16\[ \]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRaddhns32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRaddhns32 (void)
{
int16x4_t out_int16x4_t;
int32x4_t arg0_int32x4_t;
int32x4_t arg1_int32x4_t;
out_int16x4_t = vraddhn_s32 (arg0_int32x4_t, arg1_int32x4_t);
}
/* { dg-final { scan-assembler "vraddhn\.i32\[ \]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRaddhns64' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRaddhns64 (void)
{
int32x2_t out_int32x2_t;
int64x2_t arg0_int64x2_t;
int64x2_t arg1_int64x2_t;
out_int32x2_t = vraddhn_s64 (arg0_int64x2_t, arg1_int64x2_t);
}
/* { dg-final { scan-assembler "vraddhn\.i64\[ \]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRaddhnu16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRaddhnu16 (void)
{
uint8x8_t out_uint8x8_t;
uint16x8_t arg0_uint16x8_t;
uint16x8_t arg1_uint16x8_t;
out_uint8x8_t = vraddhn_u16 (arg0_uint16x8_t, arg1_uint16x8_t);
}
/* { dg-final { scan-assembler "vraddhn\.i16\[ \]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRaddhnu32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRaddhnu32 (void)
{
uint16x4_t out_uint16x4_t;
uint32x4_t arg0_uint32x4_t;
uint32x4_t arg1_uint32x4_t;
out_uint16x4_t = vraddhn_u32 (arg0_uint32x4_t, arg1_uint32x4_t);
}
/* { dg-final { scan-assembler "vraddhn\.i32\[ \]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRaddhnu64' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRaddhnu64 (void)
{
uint32x2_t out_uint32x2_t;
uint64x2_t arg0_uint64x2_t;
uint64x2_t arg1_uint64x2_t;
out_uint32x2_t = vraddhn_u64 (arg0_uint64x2_t, arg1_uint64x2_t);
}
/* { dg-final { scan-assembler "vraddhn\.i64\[ \]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRhaddQs16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRhaddQs16 (void)
{
int16x8_t out_int16x8_t;
int16x8_t arg0_int16x8_t;
int16x8_t arg1_int16x8_t;
out_int16x8_t = vrhaddq_s16 (arg0_int16x8_t, arg1_int16x8_t);
}
/* { dg-final { scan-assembler "vrhadd\.s16\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRhaddQs32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRhaddQs32 (void)
{
int32x4_t out_int32x4_t;
int32x4_t arg0_int32x4_t;
int32x4_t arg1_int32x4_t;
out_int32x4_t = vrhaddq_s32 (arg0_int32x4_t, arg1_int32x4_t);
}
/* { dg-final { scan-assembler "vrhadd\.s32\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRhaddQs8' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRhaddQs8 (void)
{
int8x16_t out_int8x16_t;
int8x16_t arg0_int8x16_t;
int8x16_t arg1_int8x16_t;
out_int8x16_t = vrhaddq_s8 (arg0_int8x16_t, arg1_int8x16_t);
}
/* { dg-final { scan-assembler "vrhadd\.s8\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRhaddQu16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRhaddQu16 (void)
{
uint16x8_t out_uint16x8_t;
uint16x8_t arg0_uint16x8_t;
uint16x8_t arg1_uint16x8_t;
out_uint16x8_t = vrhaddq_u16 (arg0_uint16x8_t, arg1_uint16x8_t);
}
/* { dg-final { scan-assembler "vrhadd\.u16\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRhaddQu32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRhaddQu32 (void)
{
uint32x4_t out_uint32x4_t;
uint32x4_t arg0_uint32x4_t;
uint32x4_t arg1_uint32x4_t;
out_uint32x4_t = vrhaddq_u32 (arg0_uint32x4_t, arg1_uint32x4_t);
}
/* { dg-final { scan-assembler "vrhadd\.u32\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRhaddQu8' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRhaddQu8 (void)
{
uint8x16_t out_uint8x16_t;
uint8x16_t arg0_uint8x16_t;
uint8x16_t arg1_uint8x16_t;
out_uint8x16_t = vrhaddq_u8 (arg0_uint8x16_t, arg1_uint8x16_t);
}
/* { dg-final { scan-assembler "vrhadd\.u8\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRhadds16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRhadds16 (void)
{
int16x4_t out_int16x4_t;
int16x4_t arg0_int16x4_t;
int16x4_t arg1_int16x4_t;
out_int16x4_t = vrhadd_s16 (arg0_int16x4_t, arg1_int16x4_t);
}
/* { dg-final { scan-assembler "vrhadd\.s16\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRhadds32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRhadds32 (void)
{
int32x2_t out_int32x2_t;
int32x2_t arg0_int32x2_t;
int32x2_t arg1_int32x2_t;
out_int32x2_t = vrhadd_s32 (arg0_int32x2_t, arg1_int32x2_t);
}
/* { dg-final { scan-assembler "vrhadd\.s32\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRhadds8' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRhadds8 (void)
{
int8x8_t out_int8x8_t;
int8x8_t arg0_int8x8_t;
int8x8_t arg1_int8x8_t;
out_int8x8_t = vrhadd_s8 (arg0_int8x8_t, arg1_int8x8_t);
}
/* { dg-final { scan-assembler "vrhadd\.s8\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRhaddu16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRhaddu16 (void)
{
uint16x4_t out_uint16x4_t;
uint16x4_t arg0_uint16x4_t;
uint16x4_t arg1_uint16x4_t;
out_uint16x4_t = vrhadd_u16 (arg0_uint16x4_t, arg1_uint16x4_t);
}
/* { dg-final { scan-assembler "vrhadd\.u16\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRhaddu32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRhaddu32 (void)
{
uint32x2_t out_uint32x2_t;
uint32x2_t arg0_uint32x2_t;
uint32x2_t arg1_uint32x2_t;
out_uint32x2_t = vrhadd_u32 (arg0_uint32x2_t, arg1_uint32x2_t);
}
/* { dg-final { scan-assembler "vrhadd\.u32\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRhaddu8' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRhaddu8 (void)
{
uint8x8_t out_uint8x8_t;
uint8x8_t arg0_uint8x8_t;
uint8x8_t arg1_uint8x8_t;
out_uint8x8_t = vrhadd_u8 (arg0_uint8x8_t, arg1_uint8x8_t);
}
/* { dg-final { scan-assembler "vrhadd\.u8\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshlQs16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshlQs16 (void)
{
int16x8_t out_int16x8_t;
int16x8_t arg0_int16x8_t;
int16x8_t arg1_int16x8_t;
out_int16x8_t = vrshlq_s16 (arg0_int16x8_t, arg1_int16x8_t);
}
/* { dg-final { scan-assembler "vrshl\.s16\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshlQs32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshlQs32 (void)
{
int32x4_t out_int32x4_t;
int32x4_t arg0_int32x4_t;
int32x4_t arg1_int32x4_t;
out_int32x4_t = vrshlq_s32 (arg0_int32x4_t, arg1_int32x4_t);
}
/* { dg-final { scan-assembler "vrshl\.s32\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshlQs64' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshlQs64 (void)
{
int64x2_t out_int64x2_t;
int64x2_t arg0_int64x2_t;
int64x2_t arg1_int64x2_t;
out_int64x2_t = vrshlq_s64 (arg0_int64x2_t, arg1_int64x2_t);
}
/* { dg-final { scan-assembler "vrshl\.s64\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshlQs8' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshlQs8 (void)
{
int8x16_t out_int8x16_t;
int8x16_t arg0_int8x16_t;
int8x16_t arg1_int8x16_t;
out_int8x16_t = vrshlq_s8 (arg0_int8x16_t, arg1_int8x16_t);
}
/* { dg-final { scan-assembler "vrshl\.s8\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshlQu16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshlQu16 (void)
{
uint16x8_t out_uint16x8_t;
uint16x8_t arg0_uint16x8_t;
int16x8_t arg1_int16x8_t;
out_uint16x8_t = vrshlq_u16 (arg0_uint16x8_t, arg1_int16x8_t);
}
/* { dg-final { scan-assembler "vrshl\.u16\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshlQu32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshlQu32 (void)
{
uint32x4_t out_uint32x4_t;
uint32x4_t arg0_uint32x4_t;
int32x4_t arg1_int32x4_t;
out_uint32x4_t = vrshlq_u32 (arg0_uint32x4_t, arg1_int32x4_t);
}
/* { dg-final { scan-assembler "vrshl\.u32\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshlQu64' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshlQu64 (void)
{
uint64x2_t out_uint64x2_t;
uint64x2_t arg0_uint64x2_t;
int64x2_t arg1_int64x2_t;
out_uint64x2_t = vrshlq_u64 (arg0_uint64x2_t, arg1_int64x2_t);
}
/* { dg-final { scan-assembler "vrshl\.u64\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshlQu8' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshlQu8 (void)
{
uint8x16_t out_uint8x16_t;
uint8x16_t arg0_uint8x16_t;
int8x16_t arg1_int8x16_t;
out_uint8x16_t = vrshlq_u8 (arg0_uint8x16_t, arg1_int8x16_t);
}
/* { dg-final { scan-assembler "vrshl\.u8\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshls16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshls16 (void)
{
int16x4_t out_int16x4_t;
int16x4_t arg0_int16x4_t;
int16x4_t arg1_int16x4_t;
out_int16x4_t = vrshl_s16 (arg0_int16x4_t, arg1_int16x4_t);
}
/* { dg-final { scan-assembler "vrshl\.s16\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshls32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshls32 (void)
{
int32x2_t out_int32x2_t;
int32x2_t arg0_int32x2_t;
int32x2_t arg1_int32x2_t;
out_int32x2_t = vrshl_s32 (arg0_int32x2_t, arg1_int32x2_t);
}
/* { dg-final { scan-assembler "vrshl\.s32\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshls64' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshls64 (void)
{
int64x1_t out_int64x1_t;
int64x1_t arg0_int64x1_t;
int64x1_t arg1_int64x1_t;
out_int64x1_t = vrshl_s64 (arg0_int64x1_t, arg1_int64x1_t);
}
/* { dg-final { scan-assembler "vrshl\.s64\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshls8' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshls8 (void)
{
int8x8_t out_int8x8_t;
int8x8_t arg0_int8x8_t;
int8x8_t arg1_int8x8_t;
out_int8x8_t = vrshl_s8 (arg0_int8x8_t, arg1_int8x8_t);
}
/* { dg-final { scan-assembler "vrshl\.s8\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshlu16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshlu16 (void)
{
uint16x4_t out_uint16x4_t;
uint16x4_t arg0_uint16x4_t;
int16x4_t arg1_int16x4_t;
out_uint16x4_t = vrshl_u16 (arg0_uint16x4_t, arg1_int16x4_t);
}
/* { dg-final { scan-assembler "vrshl\.u16\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshlu32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshlu32 (void)
{
uint32x2_t out_uint32x2_t;
uint32x2_t arg0_uint32x2_t;
int32x2_t arg1_int32x2_t;
out_uint32x2_t = vrshl_u32 (arg0_uint32x2_t, arg1_int32x2_t);
}
/* { dg-final { scan-assembler "vrshl\.u32\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshlu64' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshlu64 (void)
{
uint64x1_t out_uint64x1_t;
uint64x1_t arg0_uint64x1_t;
int64x1_t arg1_int64x1_t;
out_uint64x1_t = vrshl_u64 (arg0_uint64x1_t, arg1_int64x1_t);
}
/* { dg-final { scan-assembler "vrshl\.u64\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRshlu8' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshlu8 (void)
{
uint8x8_t out_uint8x8_t;
uint8x8_t arg0_uint8x8_t;
int8x8_t arg1_int8x8_t;
out_uint8x8_t = vrshl_u8 (arg0_uint8x8_t, arg1_int8x8_t);
}
/* { dg-final { scan-assembler "vrshl\.u8\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshrQ_ns16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshrQ_ns16 (void)
{
int16x8_t out_int16x8_t;
int16x8_t arg0_int16x8_t;
out_int16x8_t = vrshrq_n_s16 (arg0_int16x8_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.s16\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshrQ_ns32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshrQ_ns32 (void)
{
int32x4_t out_int32x4_t;
int32x4_t arg0_int32x4_t;
out_int32x4_t = vrshrq_n_s32 (arg0_int32x4_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.s32\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshrQ_ns64' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshrQ_ns64 (void)
{
int64x2_t out_int64x2_t;
int64x2_t arg0_int64x2_t;
out_int64x2_t = vrshrq_n_s64 (arg0_int64x2_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.s64\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshrQ_ns8' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshrQ_ns8 (void)
{
int8x16_t out_int8x16_t;
int8x16_t arg0_int8x16_t;
out_int8x16_t = vrshrq_n_s8 (arg0_int8x16_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.s8\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshrQ_nu16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshrQ_nu16 (void)
{
uint16x8_t out_uint16x8_t;
uint16x8_t arg0_uint16x8_t;
out_uint16x8_t = vrshrq_n_u16 (arg0_uint16x8_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.u16\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshrQ_nu32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshrQ_nu32 (void)
{
uint32x4_t out_uint32x4_t;
uint32x4_t arg0_uint32x4_t;
out_uint32x4_t = vrshrq_n_u32 (arg0_uint32x4_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.u32\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshrQ_nu64' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshrQ_nu64 (void)
{
uint64x2_t out_uint64x2_t;
uint64x2_t arg0_uint64x2_t;
out_uint64x2_t = vrshrq_n_u64 (arg0_uint64x2_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.u64\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshrQ_nu8' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshrQ_nu8 (void)
{
uint8x16_t out_uint8x16_t;
uint8x16_t arg0_uint8x16_t;
out_uint8x16_t = vrshrq_n_u8 (arg0_uint8x16_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.u8\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshr_ns16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshr_ns16 (void)
{
int16x4_t out_int16x4_t;
int16x4_t arg0_int16x4_t;
out_int16x4_t = vrshr_n_s16 (arg0_int16x4_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.s16\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshr_ns32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshr_ns32 (void)
{
int32x2_t out_int32x2_t;
int32x2_t arg0_int32x2_t;
out_int32x2_t = vrshr_n_s32 (arg0_int32x2_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.s32\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshr_ns64' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshr_ns64 (void)
{
int64x1_t out_int64x1_t;
int64x1_t arg0_int64x1_t;
out_int64x1_t = vrshr_n_s64 (arg0_int64x1_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.s64\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshr_ns8' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshr_ns8 (void)
{
int8x8_t out_int8x8_t;
int8x8_t arg0_int8x8_t;
out_int8x8_t = vrshr_n_s8 (arg0_int8x8_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.s8\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshr_nu16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshr_nu16 (void)
{
uint16x4_t out_uint16x4_t;
uint16x4_t arg0_uint16x4_t;
out_uint16x4_t = vrshr_n_u16 (arg0_uint16x4_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.u16\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshr_nu32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshr_nu32 (void)
{
uint32x2_t out_uint32x2_t;
uint32x2_t arg0_uint32x2_t;
out_uint32x2_t = vrshr_n_u32 (arg0_uint32x2_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.u32\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshr_nu64' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshr_nu64 (void)
{
uint64x1_t out_uint64x1_t;
uint64x1_t arg0_uint64x1_t;
out_uint64x1_t = vrshr_n_u64 (arg0_uint64x1_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.u64\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshr_nu8' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshr_nu8 (void)
{
uint8x8_t out_uint8x8_t;
uint8x8_t arg0_uint8x8_t;
out_uint8x8_t = vrshr_n_u8 (arg0_uint8x8_t, 1);
}
/* { dg-final { scan-assembler "vrshr\.u8\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshrn_ns16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshrn_ns16 (void)
{
int8x8_t out_int8x8_t;
int16x8_t arg0_int16x8_t;
out_int8x8_t = vrshrn_n_s16 (arg0_int16x8_t, 1);
}
/* { dg-final { scan-assembler "vrshrn\.i16\[ \]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshrn_ns32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshrn_ns32 (void)
{
int16x4_t out_int16x4_t;
int32x4_t arg0_int32x4_t;
out_int16x4_t = vrshrn_n_s32 (arg0_int32x4_t, 1);
}
/* { dg-final { scan-assembler "vrshrn\.i32\[ \]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshrn_ns64' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshrn_ns64 (void)
{
int32x2_t out_int32x2_t;
int64x2_t arg0_int64x2_t;
out_int32x2_t = vrshrn_n_s64 (arg0_int64x2_t, 1);
}
/* { dg-final { scan-assembler "vrshrn\.i64\[ \]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshrn_nu16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshrn_nu16 (void)
{
uint8x8_t out_uint8x8_t;
uint16x8_t arg0_uint16x8_t;
out_uint8x8_t = vrshrn_n_u16 (arg0_uint16x8_t, 1);
}
/* { dg-final { scan-assembler "vrshrn\.i16\[ \]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshrn_nu32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshrn_nu32 (void)
{
uint16x4_t out_uint16x4_t;
uint32x4_t arg0_uint32x4_t;
out_uint16x4_t = vrshrn_n_u32 (arg0_uint32x4_t, 1);
}
/* { dg-final { scan-assembler "vrshrn\.i32\[ \]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,19 @@
/* Test the `vRshrn_nu64' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRshrn_nu64 (void)
{
uint32x2_t out_uint32x2_t;
uint64x2_t arg0_uint64x2_t;
out_uint32x2_t = vrshrn_n_u64 (arg0_uint64x2_t, 1);
}
/* { dg-final { scan-assembler "vrshrn\.i64\[ \]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRsraQ_ns16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRsraQ_ns16 (void)
{
int16x8_t out_int16x8_t;
int16x8_t arg0_int16x8_t;
int16x8_t arg1_int16x8_t;
out_int16x8_t = vrsraq_n_s16 (arg0_int16x8_t, arg1_int16x8_t, 1);
}
/* { dg-final { scan-assembler "vrsra\.s16\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRsraQ_ns32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRsraQ_ns32 (void)
{
int32x4_t out_int32x4_t;
int32x4_t arg0_int32x4_t;
int32x4_t arg1_int32x4_t;
out_int32x4_t = vrsraq_n_s32 (arg0_int32x4_t, arg1_int32x4_t, 1);
}
/* { dg-final { scan-assembler "vrsra\.s32\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRsraQ_ns64' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRsraQ_ns64 (void)
{
int64x2_t out_int64x2_t;
int64x2_t arg0_int64x2_t;
int64x2_t arg1_int64x2_t;
out_int64x2_t = vrsraq_n_s64 (arg0_int64x2_t, arg1_int64x2_t, 1);
}
/* { dg-final { scan-assembler "vrsra\.s64\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@ -0,0 +1,20 @@
/* Test the `vRsraQ_ns8' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRsraQ_ns8 (void)
{
int8x16_t out_int8x16_t;
int8x16_t arg0_int8x16_t;
int8x16_t arg1_int8x16_t;
out_int8x16_t = vrsraq_n_s8 (arg0_int8x16_t, arg1_int8x16_t, 1);
}
/* { dg-final { scan-assembler "vrsra\.s8\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@@ -0,0 +1,20 @@
/* Test the `vRsraQ_nu16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRsraQ_nu16 (void)
{
uint16x8_t out_uint16x8_t;
uint16x8_t arg0_uint16x8_t;
uint16x8_t arg1_uint16x8_t;
out_uint16x8_t = vrsraq_n_u16 (arg0_uint16x8_t, arg1_uint16x8_t, 1);
}
/* { dg-final { scan-assembler "vrsra\.u16\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@@ -0,0 +1,20 @@
/* Test the `vRsraQ_nu32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRsraQ_nu32 (void)
{
uint32x4_t out_uint32x4_t;
uint32x4_t arg0_uint32x4_t;
uint32x4_t arg1_uint32x4_t;
out_uint32x4_t = vrsraq_n_u32 (arg0_uint32x4_t, arg1_uint32x4_t, 1);
}
/* { dg-final { scan-assembler "vrsra\.u32\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@@ -0,0 +1,20 @@
/* Test the `vRsraQ_nu64' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRsraQ_nu64 (void)
{
uint64x2_t out_uint64x2_t;
uint64x2_t arg0_uint64x2_t;
uint64x2_t arg1_uint64x2_t;
out_uint64x2_t = vrsraq_n_u64 (arg0_uint64x2_t, arg1_uint64x2_t, 1);
}
/* { dg-final { scan-assembler "vrsra\.u64\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@@ -0,0 +1,20 @@
/* Test the `vRsraQ_nu8' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRsraQ_nu8 (void)
{
uint8x16_t out_uint8x16_t;
uint8x16_t arg0_uint8x16_t;
uint8x16_t arg1_uint8x16_t;
out_uint8x16_t = vrsraq_n_u8 (arg0_uint8x16_t, arg1_uint8x16_t, 1);
}
/* { dg-final { scan-assembler "vrsra\.u8\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@@ -0,0 +1,20 @@
/* Test the `vRsra_ns16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRsra_ns16 (void)
{
int16x4_t out_int16x4_t;
int16x4_t arg0_int16x4_t;
int16x4_t arg1_int16x4_t;
out_int16x4_t = vrsra_n_s16 (arg0_int16x4_t, arg1_int16x4_t, 1);
}
/* { dg-final { scan-assembler "vrsra\.s16\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@@ -0,0 +1,20 @@
/* Test the `vRsra_ns32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRsra_ns32 (void)
{
int32x2_t out_int32x2_t;
int32x2_t arg0_int32x2_t;
int32x2_t arg1_int32x2_t;
out_int32x2_t = vrsra_n_s32 (arg0_int32x2_t, arg1_int32x2_t, 1);
}
/* { dg-final { scan-assembler "vrsra\.s32\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@@ -0,0 +1,20 @@
/* Test the `vRsra_ns64' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRsra_ns64 (void)
{
int64x1_t out_int64x1_t;
int64x1_t arg0_int64x1_t;
int64x1_t arg1_int64x1_t;
out_int64x1_t = vrsra_n_s64 (arg0_int64x1_t, arg1_int64x1_t, 1);
}
/* { dg-final { scan-assembler "vrsra\.s64\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@@ -0,0 +1,20 @@
/* Test the `vRsra_ns8' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRsra_ns8 (void)
{
int8x8_t out_int8x8_t;
int8x8_t arg0_int8x8_t;
int8x8_t arg1_int8x8_t;
out_int8x8_t = vrsra_n_s8 (arg0_int8x8_t, arg1_int8x8_t, 1);
}
/* { dg-final { scan-assembler "vrsra\.s8\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@@ -0,0 +1,20 @@
/* Test the `vRsra_nu16' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRsra_nu16 (void)
{
uint16x4_t out_uint16x4_t;
uint16x4_t arg0_uint16x4_t;
uint16x4_t arg1_uint16x4_t;
out_uint16x4_t = vrsra_n_u16 (arg0_uint16x4_t, arg1_uint16x4_t, 1);
}
/* { dg-final { scan-assembler "vrsra\.u16\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

@@ -0,0 +1,20 @@
/* Test the `vRsra_nu32' ARM Neon intrinsic. */
/* This file was autogenerated by neon-testgen. */
/* { dg-do assemble } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
#include "arm_neon.h"
void test_vRsra_nu32 (void)
{
uint32x2_t out_uint32x2_t;
uint32x2_t arg0_uint32x2_t;
uint32x2_t arg1_uint32x2_t;
out_uint32x2_t = vrsra_n_u32 (arg0_uint32x2_t, arg1_uint32x2_t, 1);
}
/* { dg-final { scan-assembler "vrsra\.u32\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */
/* { dg-final { cleanup-saved-temps } } */

Some files were not shown because too many files have changed in this diff.