Makefile.in (TEXI_GCC_FILES): Add arm-neon-intrinsics.texi.

gcc/
	* Makefile.in (TEXI_GCC_FILES): Add arm-neon-intrinsics.texi.
	* config.gcc (arm*-*-*): Add arm_neon.h to extra headers.
	(with_fpu): Allow --with-fpu=neon.
	* config/arm/aof.h (ADDITIONAL_REGISTER_NAMES): Add Q0-Q15.
	* config/arm/aout.h (ADDITIONAL_REGISTER_NAMES): Add Q0-Q15.
	* config/arm/arm-modes.def (EI, OI, CI, XI): New modes.
	* config/arm/arm-protos.h (neon_immediate_valid_for_move)
	(neon_immediate_valid_for_logic, neon_output_logic_immediate)
	(neon_pairwise_reduce, neon_expand_vector_init, neon_reinterpret)
	(neon_emit_pair_result_insn, neon_disambiguate_copy)
	(neon_vector_mem_operand, neon_struct_mem_operand, output_move_quad)
	(output_move_neon): Add prototypes.
	* config/arm/arm.c (FL_NEON): New flag for NEON processor capability.
	(all_fpus): Add FPUTYPE_NEON.
	(fp_model_for_fpu): Add NEON field.
	(arm_return_in_memory): Return vectors <= 16 bytes in ARM registers.
	(arm_arg_partial_bytes): Allow NEON vectors to be passed partially
	in registers.
	(arm_legitimate_address_p): Don't support fancy addressing for NEON
	structure moves.
	(thumb2_legitimate_address_p): Likewise.
	(neon_valid_immediate): Recognize and prepare constants suitable for
	NEON instructions.
	(neon_immediate_valid_for_move): New function. Recognize and prepare
	immediates for NEON move instructions.
	(neon_immediate_valid_for_logic): New function. Recognize and
	prepare immediates for NEON logic instructions.
	(neon_output_logic_immediate): New function. Create asm string
	suitable for outputting immediate logic instructions.
	(neon_pairwise_reduce): New function. Implement reduction using
	pairwise operations.
	(neon_expand_vector_init): New function. Expand a (possibly
	non-constant) vector initialization.
	(neon_vector_mem_operand): New function. Memory operands supported
	for quad-word loads/stores to/from ARM or NEON registers. Don't
	allow base+offset addressing for core regs.
	(neon_struct_mem_operand): New function. Valid mems for NEON
	structure moves.
	(coproc_secondary_reload_class): Enable NEON registers to be loaded
	from neon_vector_mem_operand addresses without a secondary register.
	(add_minipool_forward_ref): Handle >8-byte minipool entries.
	(add_minipool_backward_ref): Likewise.
	(dump_minipool): Likewise.
	(push_minipool_fix): Likewise.
	(output_move_quad): New function. Output quad-word moves, loads and
	stores using ARM registers.
	(output_move_vfp): Add support for vectors in VFP (NEON) D
	registers.
	(output_move_neon): Output a NEON load/store to/from a quadword
	register.
	(arm_print_operand): Implement new codes:
	- 'c' for unadorned integers (without a # sign).
	- 'J', 'K' for reg+2/reg+3, reg+3/reg+2 in little/big-endian
	mode.
	- 'e', 'f' for the low and high D parts of a NEON Q register.
	- 'q' outputs a NEON Q register.
	- 'h' outputs ranges of D registers for VLDM/VSTM etc.
	- 'T' prints NEON opcode features from a coded bitmask.
	- 'F' is similar to T, but signed/unsigned codes both print as
	'i'.
	- 't' is similar to T, but 'u' is printed instead of 'p'.
	- 'O' prints 'r' if NEON instruction should perform rounding (as
	specified by bitmask), else prints nothing.
	- '#' is a punctuation character to stop operand numbers from
	running together with following digits in the assembler
	strings for instructions (when using mode attributes).
	(arm_assemble_integer): Handle extra NEON vector modes. Permute
	constant vectors in big-endian mode, where necessary.
	(arm_hard_regno_mode_ok): Allow vectors in VFP/NEON registers.
	Handle EI, OI, CI, XI modes.
	(ashlv4hi3, ashlv2si3, lshrv4hi3, lshrv2si3, ashrv4hi3)
	(ashrv2si3): Rename IWMMXT2_BUILTINs to...
	(ashlv4hi3_iwmmxt, ashlv2si3_iwmmxt, lshrv4hi3_iwmmxt)
	(lshrv2si3_iwmmxt, ashrv4hi3_iwmmxt, ashrv2si3_iwmmxt): New names.
	(neon_builtin_type_bits): Add enumeration, one bit for each vector
	type.
	(v8qi_UP, v4hi_UP, v2si_UP, v2sf_UP, di_UP, v16qi_UP, v8hi_UP)
	(v4si_UP, v4sf_UP, v2di_UP, ti_UP, ei_UP, oi_UP, UP): Define macros
	to turn v8qi, etc. into bits defined above.
	(neon_itype): New enumeration. Classifications of NEON builtins.
	(neon_builtin_datum): Define struct. Contains information about
	a single builtin (with multiple modes).
	(CF): Define helper macro for...
	(VAR1...VAR10): Define builtins with a type, name and 1-10 different
	modes.
	(neon_builtin_data): New array. Define information about builtins
	for use during initialization/expansion.
	(arm_init_neon_builtins): New function.
	(arm_init_builtins): Call arm_init_neon_builtins if TARGET_NEON is
	true.
	(neon_builtin_compare): New function.
	(locate_neon_builtin_icode): New function. Find an insn code for a
	builtin given a function code for that builtin. Also return type of
	builtin (NEON_BINOP, NEON_UNOP etc.).
	(builtin_arg): New enumeration. Types of arguments for builtins.
	(arm_expand_neon_args): New function. Expand a generic NEON builtin.
	Takes a variable argument list of builtin_arg types, terminated by
	NEON_ARG_STOP.
	(arm_expand_neon_builtin): New function. Expand a NEON builtin.
	(neon_reinterpret): New function. Expand NEON reinterpret intrinsic.
	(neon_emit_pair_result_insn): New function. Support returning pairs
	of vectors via a pointer.
	(neon_disambiguate_copy): New function. Set up operands for a
	multi-word copy such that registers do not get clobbered.
	(arm_expand_builtin): Call arm_expand_neon_builtin if fcode >=
	ARM_BUILTIN_NEON_BASE.
	(arm_file_start): Set float-abi attribute for NEON.
	(arm_vector_mode_supported_p): Enable NEON vector modes.
	(arm_mangle_map_entry): New.
	(arm_mangle_map): New.
	(arm_mangle_vector_type): New.
	* config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Define __ARM_NEON__
	when appropriate.
	(TARGET_NEON): New macro. Target supports NEON.
	(fputype): Add FPUTYPE_NEON.
	(UNITS_PER_SIMD_WORD): Define. Allow quad-word registers to be used
	for vectorization based on command-line arg.
	(NEON_REGNO_OK_FOR_NREGS): Define.
	(VALID_NEON_DREG_MODE, VALID_NEON_QREG_MODE)
	(VALID_NEON_STRUCT_MODE): Define.
	(PRINT_OPERAND_PUNCT_VALID_P): '#' is valid punctuation.
	(arm_builtins): Add ARM_BUILTIN_NEON_BASE.
	* config/arm/arm.md (VUNSPEC_POOL_16): Insert constant for unspec.
	(consttable_16): Add pattern for outputting 16-byte minipool
	entries.
	(movv2si, movv4hi, movv8qi): Remove blank expanders (redefined in
	vec-common.md).
	(vec-common.md, neon.md): Include md files.
	* config/arm/arm.opt (mvectorize-with-neon-quad): Add option.
	* config/arm/constraints.md (constraint "Dn", "Dl", "DL"): Define.
	(memory_constraint "Ut", "Un", "Us"): Define.
	* config/arm/iwmmxt.md (VMMX, VSHFT): New mode macros.
	(MMX_char): New mode attribute.
	(addv8qi3, addv4hi3, addv2si3): Remove. Replace with...
	(*add<mode>3_iwmmxt): New insn pattern.
	(subv8qi3, subv4hi3, subv2si3): Remove. Replace with...
	(*sub<mode>3_iwmmxt): New insn pattern.
	(mulv4hi3): Rename to...
	(*mulv4hi3_iwmmxt): This.
	(smaxv8qi3, smaxv4hi3, smaxv2si3, umaxv8qi3, umaxv4hi3)
	(umaxv2si3, sminv8qi3, sminv4hi3, sminv2si3, uminv8qi3)
	(uminv4hi3, uminv2si3): Remove. Replace with...
	(*smax<mode>3_iwmmxt, *umax<mode>3_iwmmxt, *smin<mode>3_iwmmxt)
	(*umin<mode>3_iwmmxt): These.
	(ashrv4hi3, ashrv2si3, ashrdi3_iwmmxt): Replace with...
	(ashr<mode>3_iwmmxt): This new pattern.
	(lshrv4hi3, lshrv2si3, lshrdi3_iwmmxt): Replace with...
	(lshr<mode>3_iwmmxt): This new pattern.
	(ashlv4hi3, ashlv2si3, ashldi3_iwmmxt): Replace with...
	(ashl<mode>3_iwmmxt): This new pattern.
	* config/arm/neon-docgen.ml: New file. Generate documentation for
	intrinsics.
	* config/arm/neon-gen.ml: New file. Generate arm_neon.h header.
	* config/arm/arm_neon.h: New (autogenerated).
	* config/arm/neon-testgen.ml: New file. Generate NEON tests
	automatically.
	* config/arm/neon.md: New file. Define NEON instructions.
	* config/arm/neon.ml: New file. Abstract description of NEON
	instructions, used to generate arm_neon.h header, documentation and
	tests.
	* config/arm/t-arm (MD_INCLUDES): Add vec-common.md, neon.md.
	* vec-common.md: New file. Shared parts for iWMMXt and NEON vector
	support.
	* doc/extend.texi (ARM Built-in Functions): Rename and remove
	extraneous comma.
	(ARM NEON Intrinsics): New subsection.
	* doc/arm-neon-intrinsics.texi: New (autogenerated).

gcc/testsuite/
	* gcc.dg/vect/vect.exp: Check is-effective-target arm_neon_hw.
	* gcc.dg/vect/tree-vect.h: Check for NEON SIMD support.
	* lib/gcc-dg.exp (cleanup-saved-temps): Fix comment.
	* lib/target-supports.exp (check_effective_target_arm_neon_ok)
	(check_effective_target_arm_neon_hw): New.
	* gcc.target/arm/neon/neon.exp: New file.
	* gcc.target/arm/neon/polytypes.c: New file.
	* gcc.target/arm/neon/v*.c (1870 files): New (autogenerated).

Co-Authored-By: Joseph Myers <joseph@codesourcery.com>
Co-Authored-By: Mark Shinwell <shinwell@codesourcery.com>
Co-Authored-By: Paul Brook <paul@codesourcery.com>

From-SVN: r126911
2007-07-25 12:28:31 +00:00 · 2007-07-25 12:28:31 +00:00 · 88f77cba02
commit 88f77cba02
parent 15d92b36a1
1902 changed files with 69377 additions and 303 deletions
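
For orientation, here is a minimal sketch of how user code might consume what this commit exposes: the `__ARM_NEON__` macro defined through `TARGET_CPU_CPP_BUILTINS` and the new `arm_neon.h` header. The names `add4_s32` and `check_add4` are illustrative and not part of the patch. The intrinsics `vld1q_s32`, `vaddq_s32` and `vst1q_s32` are standard NEON intrinsics and are assumed to be among those the autogenerated header provides. The scalar fallback keeps the file compilable where NEON is not enabled.

```c
#include <assert.h>
#include <stdint.h>

#ifdef __ARM_NEON__
#include <arm_neon.h>   /* header installed by this commit */
#endif

/* Add four 32-bit lanes: dst[i] = a[i] + b[i].  */
void add4_s32 (int32_t *dst, const int32_t *a, const int32_t *b)
{
#ifdef __ARM_NEON__
  /* One quad-word (Q register) load per input, one vector add.  */
  int32x4_t va = vld1q_s32 (a);
  int32x4_t vb = vld1q_s32 (b);
  vst1q_s32 (dst, vaddq_s32 (va, vb));
#else
  /* Scalar fallback when the compiler does not define __ARM_NEON__.  */
  int i;
  for (i = 0; i < 4; i++)
    dst[i] = a[i] + b[i];
#endif
}

/* Illustrative self-check; both paths must produce the same sums.  */
void check_add4 (void)
{
  int32_t a[4] = { 1, 2, 3, 4 };
  int32_t b[4] = { 10, 20, 30, 40 };
  int32_t d[4];
  add4_s32 (d, a, b);
  assert (d[0] == 11 && d[1] == 22 && d[2] == 33 && d[3] == 44);
}
```

On an ARM toolchain the NEON path would typically be selected with flags along the lines of `-mfpu=neon -mfloat-abi=softfp`; treat the exact flags as an assumption for your configuration.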
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,176 @@
+2007-07-25  Julian Brown  <julian@codesourcery.com>
+	    Paul Brook  <paul@codesourcery.com>
+	    Joseph Myers  <joseph@codesourcery.com>
+	    Mark Shinwell  <shinwell@codesourcery.com>
+
+	* Makefile.in (TEXI_GCC_FILES): Add arm-neon-intrinsics.texi.
+	* config.gcc (arm*-*-*): Add arm_neon.h to extra headers.
+	(with_fpu): Allow --with-fpu=neon.
+	* config/arm/aof.h (ADDITIONAL_REGISTER_NAMES): Add Q0-Q15.
+	* config/arm/aout.h (ADDITIONAL_REGISTER_NAMES): Add Q0-Q15.
+	* config/arm/arm-modes.def (EI, OI, CI, XI): New modes.
+	* config/arm/arm-protos.h (neon_immediate_valid_for_move)
+	(neon_immediate_valid_for_logic, neon_output_logic_immediate)
+	(neon_pairwise_reduce, neon_expand_vector_init, neon_reinterpret)
+	(neon_emit_pair_result_insn, neon_disambiguate_copy)
+	(neon_vector_mem_operand, neon_struct_mem_operand, output_move_quad)
+	(output_move_neon): Add prototypes.
+	* config/arm/arm.c (FL_NEON): New flag for NEON processor capability.
+	(all_fpus): Add FPUTYPE_NEON.
+	(fp_model_for_fpu): Add NEON field.
+	(arm_return_in_memory): Return vectors <= 16 bytes in ARM registers.
+	(arm_arg_partial_bytes): Allow NEON vectors to be passed partially
+	in registers.
+	(arm_legitimate_address_p): Don't support fancy addressing for NEON
+	structure moves.
+	(thumb2_legitimate_address_p): Likewise.
+	(neon_valid_immediate): Recognize and prepare constants suitable for
+	NEON instructions.
+	(neon_immediate_valid_for_move): New function. Recognize and prepare
+	immediates for NEON move instructions.
+	(neon_immediate_valid_for_logic): New function. Recognize and
+	prepare immediates for NEON logic instructions.
+	(neon_output_logic_immediate): New function. Create asm string
+	suitable for outputting immediate logic instructions.
+	(neon_pairwise_reduce): New function. Implement reduction using
+	pairwise operations.
+	(neon_expand_vector_init): New function. Expand a (possibly
+	non-constant) vector initialization.
+	(neon_vector_mem_operand): New function. Memory operands supported
+	for quad-word loads/stores to/from ARM or NEON registers. Don't
+	allow base+offset addressing for core regs.
+	(neon_struct_mem_operand): New function. Valid mems for NEON
+	structure moves.
+	(coproc_secondary_reload_class): Enable NEON registers to be loaded
+	from neon_vector_mem_operand addresses without a secondary register.
+	(add_minipool_forward_ref): Handle >8-byte minipool entries.
+	(add_minipool_backward_ref): Likewise.
+	(dump_minipool): Likewise.
+	(push_minipool_fix): Likewise.
+	(output_move_quad): New function. Output quad-word moves, loads and
+	stores using ARM registers.
+	(output_move_vfp): Add support for vectors in VFP (NEON) D
+	registers.
+	(output_move_neon): Output a NEON load/store to/from a quadword
+	register.
+	(arm_print_operand): Implement new codes:
+	- 'c' for unadorned integers (without a # sign).
+	- 'J', 'K' for reg+2/reg+3, reg+3/reg+2 in little/big-endian
+	mode.
+	- 'e', 'f' for the low and high D parts of a NEON Q register.
+	- 'q' outputs a NEON Q register.
+	- 'h' outputs ranges of D registers for VLDM/VSTM etc.
+	- 'T' prints NEON opcode features from a coded bitmask.
+	- 'F' is similar to T, but signed/unsigned codes both print as
+	'i'.
+	- 't' is similar to T, but 'u' is printed instead of 'p'.
+	- 'O' prints 'r' if NEON instruction should perform rounding (as
+	specified by bitmask), else prints nothing.
+	- '#' is a punctuation character to stop operand numbers from
+	running together with following digits in the assembler
+	strings for instructions (when using mode attributes).
+	(arm_assemble_integer): Handle extra NEON vector modes. Permute
+	constant vectors in big-endian mode, where necessary.
+	(arm_hard_regno_mode_ok): Allow vectors in VFP/NEON registers.
+	Handle EI, OI, CI, XI modes.
+	(ashlv4hi3, ashlv2si3, lshrv4hi3, lshrv2si3, ashrv4hi3)
+	(ashrv2si3): Rename IWMMXT2_BUILTINs to...
+	(ashlv4hi3_iwmmxt, ashlv2si3_iwmmxt, lshrv4hi3_iwmmxt)
+	(lshrv2si3_iwmmxt, ashrv4hi3_iwmmxt, ashrv2si3_iwmmxt): New names.
+	(neon_builtin_type_bits): Add enumeration, one bit for each vector
+	type.
+	(v8qi_UP, v4hi_UP, v2si_UP, v2sf_UP, di_UP, v16qi_UP, v8hi_UP)
+	(v4si_UP, v4sf_UP, v2di_UP, ti_UP, ei_UP, oi_UP, UP): Define macros
+	to turn v8qi, etc. into bits defined above.
+	(neon_itype): New enumeration. Classifications of NEON builtins.
+	(neon_builtin_datum): Define struct. Contains information about
+	a single builtin (with multiple modes).
+	(CF): Define helper macro for...
+	(VAR1...VAR10): Define builtins with a type, name and 1-10 different
+	modes.
+	(neon_builtin_data): New array. Define information about builtins
+	for use during initialization/expansion.
+	(arm_init_neon_builtins): New function.
+	(arm_init_builtins): Call arm_init_neon_builtins if TARGET_NEON is
+	true.
+	(neon_builtin_compare): New function.
+	(locate_neon_builtin_icode): New function. Find an insn code for a
+	builtin given a function code for that builtin. Also return type of
+	builtin (NEON_BINOP, NEON_UNOP etc.).
+	(builtin_arg): New enumeration. Types of arguments for builtins.
+	(arm_expand_neon_args): New function. Expand a generic NEON builtin.
+	Takes a variable argument list of builtin_arg types, terminated by
+	NEON_ARG_STOP.
+	(arm_expand_neon_builtin): New function. Expand a NEON builtin.
+	(neon_reinterpret): New function. Expand NEON reinterpret intrinsic.
+	(neon_emit_pair_result_insn): New function. Support returning pairs
+	of vectors via a pointer.
+	(neon_disambiguate_copy): New function. Set up operands for a
+	multi-word copy such that registers do not get clobbered.
+	(arm_expand_builtin): Call arm_expand_neon_builtin if fcode >=
+	ARM_BUILTIN_NEON_BASE.
+	(arm_file_start): Set float-abi attribute for NEON.
+	(arm_vector_mode_supported_p): Enable NEON vector modes.
+	(arm_mangle_map_entry): New.
+	(arm_mangle_map): New.
+	(arm_mangle_vector_type): New.
+	* config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Define __ARM_NEON__
+	when appropriate.
+	(TARGET_NEON): New macro. Target supports NEON.
+	(fputype): Add FPUTYPE_NEON.
+	(UNITS_PER_SIMD_WORD): Define. Allow quad-word registers to be used
+	for vectorization based on command-line arg.
+	(NEON_REGNO_OK_FOR_NREGS): Define.
+	(VALID_NEON_DREG_MODE, VALID_NEON_QREG_MODE)
+	(VALID_NEON_STRUCT_MODE): Define.
+	(PRINT_OPERAND_PUNCT_VALID_P): '#' is valid punctuation.
+	(arm_builtins): Add ARM_BUILTIN_NEON_BASE.
+	* config/arm/arm.md (VUNSPEC_POOL_16): Insert constant for unspec.
+	(consttable_16): Add pattern for outputting 16-byte minipool
+	entries.
+	(movv2si, movv4hi, movv8qi): Remove blank expanders (redefined in
+	vec-common.md).
+	(vec-common.md, neon.md): Include md files.
+	* config/arm/arm.opt (mvectorize-with-neon-quad): Add option.
+	* config/arm/constraints.md (constraint "Dn", "Dl", "DL"): Define.
+	(memory_constraint "Ut", "Un", "Us"): Define.
+	* config/arm/iwmmxt.md (VMMX, VSHFT): New mode macros.
+	(MMX_char): New mode attribute.
+	(addv8qi3, addv4hi3, addv2si3): Remove. Replace with...
+	(*add<mode>3_iwmmxt): New insn pattern.
+	(subv8qi3, subv4hi3, subv2si3): Remove. Replace with...
+	(*sub<mode>3_iwmmxt): New insn pattern.
+	(mulv4hi3): Rename to...
+	(*mulv4hi3_iwmmxt): This.
+	(smaxv8qi3, smaxv4hi3, smaxv2si3, umaxv8qi3, umaxv4hi3)
+	(umaxv2si3, sminv8qi3, sminv4hi3, sminv2si3, uminv8qi3)
+	(uminv4hi3, uminv2si3): Remove. Replace with...
+	(*smax<mode>3_iwmmxt, *umax<mode>3_iwmmxt, *smin<mode>3_iwmmxt)
+	(*umin<mode>3_iwmmxt): These.
+	(ashrv4hi3, ashrv2si3, ashrdi3_iwmmxt): Replace with...
+	(ashr<mode>3_iwmmxt): This new pattern.
+	(lshrv4hi3, lshrv2si3, lshrdi3_iwmmxt): Replace with...
+	(lshr<mode>3_iwmmxt): This new pattern.
+	(ashlv4hi3, ashlv2si3, ashldi3_iwmmxt): Replace with...
+	(ashl<mode>3_iwmmxt): This new pattern.
+	* config/arm/neon-docgen.ml: New file. Generate documentation for
+	intrinsics.
+	* config/arm/neon-gen.ml: New file. Generate arm_neon.h header.
+	* config/arm/arm_neon.h: New (autogenerated).
+	* config/arm/neon-testgen.ml: New file. Generate NEON tests
+	automatically.
+	* config/arm/neon.md: New file. Define NEON instructions.
+	* config/arm/neon.ml: New file. Abstract description of NEON
+	instructions, used to generate arm_neon.h header, documentation and
+	tests.
+	* config/arm/t-arm (MD_INCLUDES): Add vec-common.md, neon.md.
+	* vec-common.md: New file. Shared parts for iWMMXt and NEON vector
+	support.
+	* doc/extend.texi (ARM Built-in Functions): Rename and remove
+	extraneous comma.
+	(ARM NEON Intrinsics): New subsection.
+	* doc/arm-neon-intrinsics.texi: New (autogenerated).
+
 2007-07-25  Danny Smith   <dannysmith@users.sourceforge.net>

 	* config/i386/i386-protos.h (i386_pe_asm_file_end): Remove
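
The `arm_print_operand` entry above describes 'e' and 'f' as printing the low and high D parts of a NEON Q register. As a rough illustration of that relationship (not the code in `arm.c`, whose diff is not shown here), the sketch below relies on the architectural fact that Q register n overlays D registers 2n and 2n+1. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the D register holding one half of Q register QREG.
   HIGH == 0 selects the low half (what 'e' is described as printing),
   HIGH != 0 selects the high half (what 'f' is described as printing).  */
void qreg_half_name (char *buf, size_t len, int qreg, int high)
{
  snprintf (buf, len, "d%d", 2 * qreg + (high ? 1 : 0));
}

/* Illustrative self-check: q3 overlays d6 (low) and d7 (high).  */
void check_qreg_halves (void)
{
  char buf[8];
  qreg_half_name (buf, sizeof buf, 3, 0);
  assert (strcmp (buf, "d6") == 0);
  qreg_half_name (buf, sizeof buf, 3, 1);
  assert (strcmp (buf, "d7") == 0);
}
```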
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3581,7 +3581,7 @@ TEXI_GCC_FILES = gcc.texi gcc-common.texi gcc-vers.texi frontends.texi	\
 	 gcov.texi trouble.texi bugreport.texi service.texi		\
 	 contribute.texi compat.texi funding.texi gnu.texi gpl.texi	\
 	 fdl.texi contrib.texi cppenv.texi cppopts.texi			\
-	 implement-c.texi
+	 implement-c.texi arm-neon-intrinsics.texi

 TEXI_GCCINT_FILES = gccint.texi gcc-common.texi gcc-vers.texi		\
 	 contribute.texi makefile.texi configterms.texi options.texi	\
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -259,7 +259,7 @@ strongarm*-*-*)
 	;;
 arm*-*-*)
 	cpu_type=arm
-	extra_headers="mmintrin.h"
+	extra_headers="mmintrin.h arm_neon.h"
 	;;
 bfin*-*)
 	cpu_type=bfin
@@ -2841,7 +2841,7 @@ case "${target}" in

 		case "$with_fpu" in
 		"" \
-		| fpa | fpe2 | fpe3 | maverick | vfp | vfp3 )
+		| fpa | fpe2 | fpe3 | maverick | vfp | vfp3 | neon )
 			# OK
 			;;
 		*)
--- a/gcc/config/arm/aof.h
+++ b/gcc/config/arm/aof.h
@@ -239,22 +239,30 @@ do {					\
  {"r13", 13}, {"sp", 13}, 			\
  {"r14", 14}, {"lr", 14},			\
  {"r15", 15}, {"pc", 15},			\
-  {"d0", 63},					\
+  {"d0", 63}, {"q0", 63},			\
  {"d1", 65},					\
-  {"d2", 67},					\
+  {"d2", 67}, {"q1", 67},			\
  {"d3", 69},					\
-  {"d4", 71},					\
+  {"d4", 71}, {"q2", 71},			\
  {"d5", 73},					\
-  {"d6", 75},					\
+  {"d6", 75}, {"q3", 75},			\
  {"d7", 77},					\
-  {"d8", 79},					\
+  {"d8", 79}, {"q4", 79},			\
  {"d9", 81},					\
-  {"d10", 83},					\
+  {"d10", 83}, {"q5", 83},			\
  {"d11", 85},					\
-  {"d12", 87},					\
+  {"d12", 87}, {"q6", 87},			\
  {"d13", 89},					\
-  {"d14", 91},					\
-  {"d15", 93}					\
+  {"d14", 91}, {"q7", 91},			\
+  {"d15", 93},					\
+  {"q8", 95},					\
+  {"q9", 99},					\
+  {"q10", 103},					\
+  {"q11", 107},					\
+  {"q12", 111},					\
+  {"q13", 115},					\
+  {"q14", 119},					\
+  {"q15", 123}					\
 }

 #define REGISTER_PREFIX "__"
--- a/gcc/config/arm/aout.h
+++ b/gcc/config/arm/aout.h
@@ -165,22 +165,30 @@
  {"mvdx13", 40},				\
  {"mvdx14", 41},				\
  {"mvdx15", 42},				\
-  {"d0", 63},					\
+  {"d0", 63}, {"q0", 63},			\
  {"d1", 65},					\
-  {"d2", 67},					\
+  {"d2", 67}, {"q1", 67},			\
  {"d3", 69},					\
-  {"d4", 71},					\
+  {"d4", 71}, {"q2", 71},			\
  {"d5", 73},					\
-  {"d6", 75},					\
+  {"d6", 75}, {"q3", 75},			\
  {"d7", 77},					\
-  {"d8", 79},					\
+  {"d8", 79}, {"q4", 79},			\
  {"d9", 81},					\
-  {"d10", 83},					\
+  {"d10", 83}, {"q5", 83},			\
  {"d11", 85},					\
-  {"d12", 87},					\
+  {"d12", 87}, {"q6", 87},			\
  {"d13", 89},					\
-  {"d14", 91},					\
+  {"d14", 91}, {"q7", 91},			\
  {"d15", 93},					\
+  {"q8", 95},					\
+  {"q9", 99},					\
+  {"q10", 103},					\
+  {"q11", 107},					\
+  {"q12", 111},					\
+  {"q13", 115},					\
+  {"q14", 119},					\
+  {"q15", 123}					\
 }
 #endif
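
The register tables in `aof.h` and `aout.h` follow a regular numbering: each D name is two slots past the previous one, and each Q name is four slots past the previous one, starting from 63. A small sketch of that arithmetic follows. The constant name `FIRST_VFP_REGNUM` is used elsewhere in this patch, but its value of 63 is inferred from the `{"d0", 63}` entry rather than shown in the diff, and the helper functions are hypothetical.

```c
#include <assert.h>

/* Assumption: 63 is taken from the {"d0", 63} / {"q0", 63} entries.  */
#define FIRST_VFP_REGNUM 63

/* A D register spans two consecutive register slots.  */
int dreg_regno (int n) { return FIRST_VFP_REGNUM + 2 * n; }

/* A Q register spans four consecutive register slots.  */
int qreg_regno (int n) { return FIRST_VFP_REGNUM + 4 * n; }

/* Illustrative self-check against entries visible in the tables.  */
void check_register_table (void)
{
  assert (dreg_regno (15) == 93);             /* {"d15", 93} */
  assert (qreg_regno (1) == 67);              /* {"q1", 67}  */
  assert (qreg_regno (8) == 95);              /* {"q8", 95}  */
  assert (qreg_regno (15) == 123);            /* {"q15", 123} */
  assert (qreg_regno (7) == dreg_regno (14)); /* q7 shares d14's slot */
}
```

This also shows why q8 to q15 appear on their own lines: d16 to d31 have no entries in these tables, so there is no D name to pair them with.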

--- a/gcc/config/arm/arm-modes.def
+++ b/gcc/config/arm/arm-modes.def
@@ -58,3 +58,11 @@ VECTOR_MODES (INT, 16);       /* V16QI V8HI V4SI V2DI */
 VECTOR_MODES (FLOAT, 8);      /*            V4HF V2SF */
 VECTOR_MODES (FLOAT, 16);     /*       V8HF V4SF V2DF */

+/* Opaque integer modes for 3, 4, 6 or 8 Neon double registers (2 is
+   TImode).  */
+INT_MODE (EI, 24);
+INT_MODE (OI, 32);
+INT_MODE (CI, 48);
+/* ??? This should actually have 512 bits but the precision only has 9
+   bits.  */
+FRACTIONAL_INT_MODE (XI, 511, 64);
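
The byte sizes given to the new opaque modes line up with the register counts named in the comment: at 8 bytes per D register, 24, 32, 48 and 64 bytes cover 3, 4, 6 and 8 registers. The sketch below works through that arithmetic and the 9-bit precision limit that the `???` comment mentions. The helper names are hypothetical and this is not GCC code.

```c
#include <assert.h>

enum { NEON_DREG_BYTES = 8 };  /* a D register is 64 bits wide */

/* Number of D registers needed to hold BYTES bytes.  */
int neon_dregs_for_size (int bytes) { return bytes / NEON_DREG_BYTES; }

/* Largest value representable in an unsigned field of BITS bits.  */
int max_field_value (int bits) { return (1 << bits) - 1; }

/* Illustrative self-check against the mode definitions.  */
void check_struct_modes (void)
{
  assert (neon_dregs_for_size (16) == 2);  /* TImode */
  assert (neon_dregs_for_size (24) == 3);  /* EImode */
  assert (neon_dregs_for_size (32) == 4);  /* OImode */
  assert (neon_dregs_for_size (48) == 6);  /* CImode */
  assert (neon_dregs_for_size (64) == 8);  /* XImode */
  /* A 9-bit precision field tops out at 511, so 512 cannot be stored.  */
  assert (max_field_value (9) == 511);
}
```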
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -68,6 +68,19 @@ extern rtx thumb_legitimize_reload_address (rtx *, enum machine_mode, int, int,
 extern int arm_const_double_rtx (rtx);
 extern int neg_const_double_rtx_ok_for_fpa (rtx);
 extern int vfp3_const_double_rtx (rtx);
+extern int neon_immediate_valid_for_move (rtx, enum machine_mode, rtx *, int *);
+extern int neon_immediate_valid_for_logic (rtx, enum machine_mode, int, rtx *,
+					   int *);
+extern char *neon_output_logic_immediate (const char *, rtx *,
+					  enum machine_mode, int, int);
+extern void neon_pairwise_reduce (rtx, rtx, enum machine_mode,
+				  rtx (*) (rtx, rtx, rtx));
+extern void neon_expand_vector_init (rtx, rtx);
+extern void neon_reinterpret (rtx, rtx);
+extern void neon_emit_pair_result_insn (enum machine_mode,
+					rtx (*) (rtx, rtx, rtx, rtx),
+					rtx, rtx, rtx);
+extern void neon_disambiguate_copy (rtx *, rtx *, rtx *, unsigned int);
 extern enum reg_class coproc_secondary_reload_class (enum machine_mode, rtx,
 						     bool);
 extern bool arm_tls_referenced_p (rtx);
@@ -75,6 +88,8 @@ extern bool arm_cannot_force_const_mem (rtx);

 extern int cirrus_memory_offset (rtx);
 extern int arm_coproc_mem_operand (rtx, bool);
+extern int neon_vector_mem_operand (rtx, bool);
+extern int neon_struct_mem_operand (rtx);
 extern int arm_no_early_store_addr_dep (rtx, rtx);
 extern int arm_no_early_alu_shift_dep (rtx, rtx);
 extern int arm_no_early_alu_shift_value_dep (rtx, rtx);
@@ -113,7 +128,9 @@ extern const char *output_mov_long_double_arm_from_arm (rtx *);
 extern const char *output_mov_double_fpa_from_arm (rtx *);
 extern const char *output_mov_double_arm_from_fpa (rtx *);
 extern const char *output_move_double (rtx *);
+extern const char *output_move_quad (rtx *);
 extern const char *output_move_vfp (rtx *operands);
+extern const char *output_move_neon (rtx *operands);
 extern const char *output_add_immediate (rtx *);
 extern const char *arithmetic_instr (rtx, int);
 extern void output_ascii_pseudo_op (FILE *, const unsigned char *, int);
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -65,6 +65,9 @@ extern char arm_arch_name[];
 	if (TARGET_VFP)					\
 	  builtin_define ("__VFP_FP__");		\
 							\
+	if (TARGET_NEON)				\
+	  builtin_define ("__ARM_NEON__");		\
+							\
 	/* Add a define for interworking.		\
 	   Needed when building libgcc.a.  */		\
 	if (arm_cpp_interwork)				\
@@ -206,10 +209,23 @@ extern GTY(()) rtx aof_pic_label;
 /* 32-bit Thumb-2 code.  */
 #define TARGET_THUMB2			(TARGET_THUMB && arm_arch_thumb2)

+/* The following two macros concern the ability to execute coprocessor
+   instructions for VFPv3 or NEON.  TARGET_VFP3 is currently only ever
+   tested when we know we are generating for VFP hardware; we need to
+   be more careful with TARGET_NEON as noted below.  */
+
 /* FPU is VFPv3 (with twice the number of D registers).  Setting the FPU to
   Neon automatically enables VFPv3 too.  */
 #define TARGET_VFP3 (arm_fp_model == ARM_FP_MODEL_VFP \
-		     && (arm_fpu_arch == FPUTYPE_VFP3))
+		     && (arm_fpu_arch == FPUTYPE_VFP3 \
+			 || arm_fpu_arch == FPUTYPE_NEON))
+/* FPU supports Neon instructions.  The setting of this macro gets
+   revealed via __ARM_NEON__ so we add extra guards upon TARGET_32BIT
+   and TARGET_HARD_FLOAT to ensure that NEON instructions are
+   available.  */
+#define TARGET_NEON (TARGET_32BIT && TARGET_HARD_FLOAT \
+		     && arm_fp_model == ARM_FP_MODEL_VFP \
+		     && arm_fpu_arch == FPUTYPE_NEON)

 /* "DSP" multiply instructions, eg. SMULxy.  */
 #define TARGET_DSP_MULTIPLY \
@@ -282,7 +298,9 @@ enum fputype
  /* VFP.  */
  FPUTYPE_VFP,
  /* VFPv3.  */
-  FPUTYPE_VFP3
+  FPUTYPE_VFP3,
+  /* Neon.  */
+  FPUTYPE_NEON
 };

 /* Recast the floating point class to be the floating point attribute.  */
@@ -483,6 +501,12 @@ extern int arm_arch_hwdiv;

 #define UNITS_PER_WORD	4

+/* Use the option -mvectorize-with-neon-quad to override the use of doubleword
+   registers when autovectorizing for Neon, at least until multiple vector
+   widths are supported properly by the middle-end.  */
+#define UNITS_PER_SIMD_WORD \
+  (TARGET_NEON ? (TARGET_NEON_VECTORIZE_QUAD ? 16 : 8) : UNITS_PER_WORD)
+
 /* True if natural alignment is used for doubleword types.  */
 #define ARM_DOUBLEWORD_ALIGN	TARGET_AAPCS_BASED

@@ -941,6 +965,18 @@ extern int arm_structure_size_boundary;
 #define VFP_REGNO_OK_FOR_DOUBLE(REGNUM) \
  ((((REGNUM) - FIRST_VFP_REGNUM) & 1) == 0)

+/* Neon Quad values must start at a multiple of four registers.  */
+#define NEON_REGNO_OK_FOR_QUAD(REGNUM) \
+  ((((REGNUM) - FIRST_VFP_REGNUM) & 3) == 0)
+
+/* Neon structures of vectors must be in even register pairs and there
+   must be enough registers available.  Because of various patterns
+   requiring quad registers, we require them to start at a multiple of
+   four.  */
+#define NEON_REGNO_OK_FOR_NREGS(REGNUM, N) \
+  ((((REGNUM) - FIRST_VFP_REGNUM) & 3) == 0 \
+   && (LAST_VFP_REGNUM - (REGNUM) >= 2 * (N) - 1))
+
 /* The number of hard registers is 16 ARM + 8 FPA + 1 CC + 1 SFP + 1 AFP.  */
 /* + 16 Cirrus registers take us up to 43.  */
 /* Intel Wireless MMX Technology registers add 16 + 4 more.  */
@@ -994,6 +1030,21 @@ extern int arm_structure_size_boundary;
 #define VALID_IWMMXT_REG_MODE(MODE) \
 (arm_vector_mode_supported_p (MODE) || (MODE) == DImode)

+/* Modes valid for Neon D registers.  */
+#define VALID_NEON_DREG_MODE(MODE) \
+  ((MODE) == V2SImode || (MODE) == V4HImode || (MODE) == V8QImode \
+   || (MODE) == V2SFmode || (MODE) == DImode)
+
+/* Modes valid for Neon Q registers.  */
+#define VALID_NEON_QREG_MODE(MODE) \
+  ((MODE) == V4SImode || (MODE) == V8HImode || (MODE) == V16QImode \
+   || (MODE) == V4SFmode || (MODE) == V2DImode)
+
+/* Structure modes valid for Neon registers.  */
+#define VALID_NEON_STRUCT_MODE(MODE) \
+  ((MODE) == TImode || (MODE) == EImode || (MODE) == OImode \
+   || (MODE) == CImode || (MODE) == XImode)
+
 /* The order in which register should be allocated.  It is good to use ip
   since no saving is required (though calls clobber it) and it never contains
   function parameters.  It is quite good to use lr since other calls may
@@ -2409,7 +2460,7 @@ extern int making_const_table;

 #define PRINT_OPERAND_PUNCT_VALID_P(CODE)	\
  (CODE == '@' || CODE == '|' || CODE == '.'	\
-   || CODE == '(' || CODE == ')'		\
+   || CODE == '(' || CODE == ')' || CODE == '#'	\
   || (TARGET_32BIT && (CODE == '?'))		\
   || (TARGET_THUMB2 && (CODE == '!'))		\
   || (TARGET_THUMB && (CODE == '_')))
@@ -2581,6 +2632,9 @@ extern int making_const_table;
   : arm_gen_return_addr_mask ())


+/* Neon defines builtins from ARM_BUILTIN_MAX upwards, though they don't have
+   symbolic names defined here (which would require too much duplication).
+   FIXME?  */
 enum arm_builtins
 {
  ARM_BUILTIN_GETWCX,
@@ -2745,7 +2799,9 @@ enum arm_builtins

  ARM_BUILTIN_THREAD_POINTER,

-  ARM_BUILTIN_MAX
+  ARM_BUILTIN_NEON_BASE,
+
+  ARM_BUILTIN_MAX = ARM_BUILTIN_NEON_BASE  /* FIXME: Wrong!  */
 };

 /* Do not emit .note.GNU-stack by default.  */
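
The two register-alignment macros added to `arm.h` can be exercised in isolation. The sketch below copies them verbatim and supplies the two constants they depend on. Both values are assumptions: `FIRST_VFP_REGNUM` follows the `{"d0", 63}` entry in the register tables, and `LAST_VFP_REGNUM` is inferred from q15 starting at 123 and spanning four slots. Neither definition appears in this diff.

```c
#include <assert.h>

/* Assumed values, not shown in this diff (see the note above).  */
#define FIRST_VFP_REGNUM 63
#define LAST_VFP_REGNUM  126

/* Copied from the arm.h hunk.  */
#define NEON_REGNO_OK_FOR_QUAD(REGNUM) \
  ((((REGNUM) - FIRST_VFP_REGNUM) & 3) == 0)

#define NEON_REGNO_OK_FOR_NREGS(REGNUM, N) \
  ((((REGNUM) - FIRST_VFP_REGNUM) & 3) == 0 \
   && (LAST_VFP_REGNUM - (REGNUM) >= 2 * (N) - 1))

/* Illustrative self-check.  N counts D registers, each two slots wide.  */
void check_neon_regno (void)
{
  assert (NEON_REGNO_OK_FOR_QUAD (63));        /* q0 starts at d0 */
  assert (!NEON_REGNO_OK_FOR_QUAD (65));       /* d1 is not quad-aligned */
  assert (NEON_REGNO_OK_FOR_QUAD (67));        /* q1 */
  assert (NEON_REGNO_OK_FOR_NREGS (63, 8));    /* 8 D regs fit from q0 */
  assert (NEON_REGNO_OK_FOR_NREGS (123, 2));   /* 2 D regs just fit at q15 */
  assert (!NEON_REGNO_OK_FOR_NREGS (123, 3));  /* 3 would run past the end */
}
```

Although the comment speaks of even register pairs, the macro enforces the stricter multiple-of-four start, which is what the second sentence of that comment explains.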
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -51,6 +51,7 @@

 ;; UNSPEC Usage:
 ;; Note: sin and cos are no-longer used.
+;; Unspec constants for Neon are defined in neon.md.

 (define_constants
  [(UNSPEC_SIN       0)	; `sin' operation (MODE_FLOAT):
@@ -121,12 +122,14 @@
 			;   a 32-bit object.
   (VUNSPEC_POOL_8   7) ; `pool-entry(8)'.  An entry in the constant pool for
 			;   a 64-bit object.
-   (VUNSPEC_TMRC     8) ; Used by the iWMMXt TMRC instruction.
-   (VUNSPEC_TMCR     9) ; Used by the iWMMXt TMCR instruction.
-   (VUNSPEC_ALIGN8   10) ; 8-byte alignment version of VUNSPEC_ALIGN
-   (VUNSPEC_WCMP_EQ  11) ; Used by the iWMMXt WCMPEQ instructions
-   (VUNSPEC_WCMP_GTU 12) ; Used by the iWMMXt WCMPGTU instructions
-   (VUNSPEC_WCMP_GT  13) ; Used by the iwMMXT WCMPGT instructions
+   (VUNSPEC_POOL_16  8) ; `pool-entry(16)'.  An entry in the constant pool for
+			;   a 128-bit object.
+   (VUNSPEC_TMRC     9) ; Used by the iWMMXt TMRC instruction.
+   (VUNSPEC_TMCR     10) ; Used by the iWMMXt TMCR instruction.
+   (VUNSPEC_ALIGN8   11) ; 8-byte alignment version of VUNSPEC_ALIGN
+   (VUNSPEC_WCMP_EQ  12) ; Used by the iWMMXt WCMPEQ instructions
+   (VUNSPEC_WCMP_GTU 13) ; Used by the iWMMXt WCMPGTU instructions
+   (VUNSPEC_WCMP_GT  14) ; Used by the iwMMXT WCMPGT instructions
   (VUNSPEC_EH_RETURN 20); Use to override the return address for exception
 			 ; handling.
  ]
@@ -5768,27 +5771,6 @@
  "
 )

-;; Vector Moves
-(define_expand "movv2si"
-  [(set (match_operand:V2SI 0 "nonimmediate_operand" "")
-	(match_operand:V2SI 1 "general_operand" ""))]
-  "TARGET_REALLY_IWMMXT"
-{
-})
-
-(define_expand "movv4hi"
-  [(set (match_operand:V4HI 0 "nonimmediate_operand" "")
-	(match_operand:V4HI 1 "general_operand" ""))]
-  "TARGET_REALLY_IWMMXT"
-{
-})
-
-(define_expand "movv8qi"
-  [(set (match_operand:V8QI 0 "nonimmediate_operand" "")
-	(match_operand:V8QI 1 "general_operand" ""))]
-  "TARGET_REALLY_IWMMXT"
-{
-})


 ;; load- and store-multiple insns
@@ -10731,6 +10713,30 @@
  [(set_attr "length" "8")]
 )

+(define_insn "consttable_16"
+  [(unspec_volatile [(match_operand 0 "" "")] VUNSPEC_POOL_16)]
+  "TARGET_EITHER"
+  "*
+  {
+    making_const_table = TRUE;
+    switch (GET_MODE_CLASS (GET_MODE (operands[0])))
+      {
+       case MODE_FLOAT:
+        {
+          REAL_VALUE_TYPE r;
+          REAL_VALUE_FROM_CONST_DOUBLE (r, operands[0]);
+          assemble_real (r, GET_MODE (operands[0]), BITS_PER_WORD);
+          break;
+        }
+      default:
+        assemble_integer (operands[0], 16, BITS_PER_WORD, 1);
+        break;
+      }
+    return \"\";
+  }"
+  [(set_attr "length" "16")]
+)
+
 ;; Miscellaneous Thumb patterns

 (define_expand "tablejump"
@@ -10906,10 +10912,14 @@
 (include "fpa.md")
 ;; Load the Maverick co-processor patterns
 (include "cirrus.md")
+;; Vector bits common to IWMMXT and Neon
+(include "vec-common.md")
 ;; Load the Intel Wireless Multimedia Extension patterns
 (include "iwmmxt.md")
 ;; Load the VFP co-processor patterns
 (include "vfp.md")
 ;; Thumb-2 patterns
 (include "thumb2.md")
+;; Neon patterns
+(include "neon.md")

--- a/gcc/config/arm/arm.opt
+++ b/gcc/config/arm/arm.opt
@@ -153,3 +153,7 @@ Tune code for the given processor
 mwords-little-endian
 Target Report RejectNegative Mask(LITTLE_WORDS)
 Assume big endian bytes, little endian words
+
+mvectorize-with-neon-quad
+Target Report Mask(NEON_VECTORIZE_QUAD)
+Use Neon quad-word (rather than double-word) registers for vectorization
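
The new option feeds the `UNITS_PER_SIMD_WORD` macro added in `arm.h`, which picks the vector width used for autovectorization. The sketch below restates that selection as a plain function so the three outcomes are easy to see. The function and its parameters are hypothetical stand-ins for `TARGET_NEON` and `TARGET_NEON_VECTORIZE_QUAD`.

```c
#include <assert.h>

#define UNITS_PER_WORD 4  /* as defined in arm.h */

/* Mirror of the UNITS_PER_SIMD_WORD expression:
   16 bytes (Q register) with NEON and -mvectorize-with-neon-quad,
   8 bytes (D register) with NEON alone,
   otherwise the ordinary word size.  */
int units_per_simd_word (int target_neon, int vectorize_quad)
{
  return target_neon ? (vectorize_quad ? 16 : 8) : UNITS_PER_WORD;
}

/* Illustrative self-check of the three cases.  */
void check_simd_word (void)
{
  assert (units_per_simd_word (1, 1) == 16);
  assert (units_per_simd_word (1, 0) == 8);
  assert (units_per_simd_word (0, 1) == 4);  /* the option has no effect without NEON */
}
```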
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -30,10 +30,10 @@
 ;; in Thumb-1 state: I, J, K, L, M, N, O

 ;; The following multi-letter normal constraints have been used:
-;; in ARM/Thumb-2 state: Da, Db, Dc, Dv
+;; in ARM/Thumb-2 state: Da, Db, Dc, Dn, Dl, DL, Dv

 ;; The following memory constraints have been used:
-;; in ARM/Thumb-2 state: Q, Uv, Uy
+;; in ARM/Thumb-2 state: Q, Ut, Uv, Uy, Un, Us
 ;; in ARM state: Uq


@@ -164,6 +164,30 @@
      (match_test "TARGET_32BIT && arm_const_double_inline_cost (op) == 4
 		   && !(optimize_size || arm_ld_sched)")))

+(define_constraint "Dn"
+ "@internal
+  In ARM/Thumb-2 state a const_vector which can be loaded with a Neon vmov
+  immediate instruction."
+ (and (match_code "const_vector")
+      (match_test "TARGET_32BIT
+		   && imm_for_neon_mov_operand (op, GET_MODE (op))")))
+
+(define_constraint "Dl"
+ "@internal
+  In ARM/Thumb-2 state a const_vector which can be used with a Neon vorr or
+  vbic instruction."
+ (and (match_code "const_vector")
+      (match_test "TARGET_32BIT
+		   && imm_for_neon_logic_operand (op, GET_MODE (op))")))
+
+(define_constraint "DL"
+ "@internal
+  In ARM/Thumb-2 state a const_vector which can be used with a Neon vorn or
+  vand instruction."
+ (and (match_code "const_vector")
+      (match_test "TARGET_32BIT
+		   && imm_for_neon_inv_logic_operand (op, GET_MODE (op))")))
+
 (define_constraint "Dv"
 "@internal
  In ARM/Thumb-2 state a const_double which can be used with a VFP fconsts
@@ -171,6 +195,13 @@
 (and (match_code "const_double")
      (match_test "TARGET_32BIT && vfp3_const_double_rtx (op)")))

+(define_memory_constraint "Ut"
+ "@internal
+  In ARM/Thumb-2 state an address valid for loading/storing opaque structure
+  types wider than TImode."
+ (and (match_code "mem")
+      (match_test "TARGET_32BIT && neon_struct_mem_operand (op)")))
+
 (define_memory_constraint "Uv"
 "@internal
  In ARM/Thumb-2 state a valid VFP load/store address."
@@ -183,6 +214,20 @@
 (and (match_code "mem")
      (match_test "TARGET_32BIT && arm_coproc_mem_operand (op, TRUE)")))

+(define_memory_constraint "Un"
+ "@internal
+  In ARM/Thumb-2 state a valid address for Neon element and structure
+  load/store instructions."
+ (and (match_code "mem")
+      (match_test "TARGET_32BIT && neon_vector_mem_operand (op, FALSE)")))
+
+(define_memory_constraint "Us"
+ "@internal
+  In ARM/Thumb-2 state a valid address for non-offset loads/stores of
+  quad-word values in four ARM registers."
+ (and (match_code "mem")
+      (match_test "TARGET_32BIT && neon_vector_mem_operand (op, TRUE)")))
+
 (define_memory_constraint "Uq"
 "@internal
  In ARM state an address valid in ldrsb instructions."
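
The `Dl` and `DL` constraints above come as a pair: one covers constants usable with vorr or vbic, the other constants usable with vorn or vand. For immediates, an AND can be expressed as a BIC of the bitwise-inverted constant, and an OR-NOT as an ORR of the inverted constant. The sketch below only checks those two identities on a single 16-bit lane; it assumes nothing about how `imm_for_neon_logic_operand` or `imm_for_neon_inv_logic_operand` are implemented, and every function name in it is hypothetical.

```ocaml
(* Illustrative only: models one 16-bit lane with OCaml ints.
   None of these names come from the patch.  *)
let mask16 = 0xFFFF

let vorr x imm = (x lor imm) land mask16
let vbic x imm = (x land (lnot imm)) land mask16
let vand x imm = (x land imm) land mask16
let vorn x imm = (x lor (lnot imm)) land mask16

(* Bitwise inverse of a 16-bit immediate.  *)
let invert imm = (lnot imm) land mask16

let () =
  let x = 0x1234 and imm = 0xFF00 in
  (* AND with imm is BIC with the inverted immediate ...  *)
  assert (vand x imm = vbic x (invert imm));
  (* ... and ORN with imm is ORR with the inverted immediate.  *)
  assert (vorn x imm = vorr x (invert imm));
  Printf.printf "vand = %04x, vorn = %04x\n" (vand x imm) (vorn x imm)
```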
--- a/gcc/config/arm/iwmmxt.md
+++ b/gcc/config/arm/iwmmxt.md
@@ -20,6 +20,15 @@
 ;; the Free Software Foundation, 51 Franklin Street, Fifth Floor,
 ;; Boston, MA 02110-1301, USA.

+;; Integer element sizes implemented by IWMMXT.
+(define_mode_macro VMMX [V2SI V4HI V8QI])
+
+;; Integer element sizes for shifts.
+(define_mode_macro VSHFT [V4HI V2SI DI])
+
+;; Determine element size suffix from vector mode.
+(define_mode_attr MMX_char [(V8QI "b") (V4HI "h") (V2SI "w") (DI "d")])
+
 (define_insn "iwmmxt_iordi3"
  [(set (match_operand:DI         0 "register_operand" "=y,?&r,?&r")
        (ior:DI (match_operand:DI 1 "register_operand" "%y,0,r")
@@ -239,28 +248,12 @@

 ;; Vector add/subtract

-(define_insn "addv8qi3"
-  [(set (match_operand:V8QI            0 "register_operand" "=y")
-        (plus:V8QI (match_operand:V8QI 1 "register_operand"  "y")
-	           (match_operand:V8QI 2 "register_operand"  "y")))]
+(define_insn "*add<mode>3_iwmmxt"
+  [(set (match_operand:VMMX            0 "register_operand" "=y")
+        (plus:VMMX (match_operand:VMMX 1 "register_operand"  "y")
+	           (match_operand:VMMX 2 "register_operand"  "y")))]
  "TARGET_REALLY_IWMMXT"
-  "waddb%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "addv4hi3"
-  [(set (match_operand:V4HI            0 "register_operand" "=y")
-        (plus:V4HI (match_operand:V4HI 1 "register_operand"  "y")
-	           (match_operand:V4HI 2 "register_operand"  "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "waddh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "addv2si3"
-  [(set (match_operand:V2SI            0 "register_operand" "=y")
-        (plus:V2SI (match_operand:V2SI 1 "register_operand"  "y")
-	           (match_operand:V2SI 2 "register_operand"  "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "waddw%?\\t%0, %1, %2"
+  "wadd<MMX_char>%?\\t%0, %1, %2"
  [(set_attr "predicable" "yes")])

 (define_insn "ssaddv8qi3"
@@ -311,28 +304,12 @@
  "waddwus%?\\t%0, %1, %2"
  [(set_attr "predicable" "yes")])

-(define_insn "subv8qi3"
-  [(set (match_operand:V8QI             0 "register_operand" "=y")
-        (minus:V8QI (match_operand:V8QI 1 "register_operand"  "y")
-		    (match_operand:V8QI 2 "register_operand"  "y")))]
+(define_insn "*sub<mode>3_iwmmxt"
+  [(set (match_operand:VMMX             0 "register_operand" "=y")
+        (minus:VMMX (match_operand:VMMX 1 "register_operand"  "y")
+		    (match_operand:VMMX 2 "register_operand"  "y")))]
  "TARGET_REALLY_IWMMXT"
-  "wsubb%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "subv4hi3"
-  [(set (match_operand:V4HI             0 "register_operand" "=y")
-        (minus:V4HI (match_operand:V4HI 1 "register_operand"  "y")
-		    (match_operand:V4HI 2 "register_operand"  "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wsubh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "subv2si3"
-  [(set (match_operand:V2SI             0 "register_operand" "=y")
-        (minus:V2SI (match_operand:V2SI 1 "register_operand"  "y")
-		    (match_operand:V2SI 2 "register_operand"  "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wsubw%?\\t%0, %1, %2"
+  "wsub<MMX_char>%?\\t%0, %1, %2"
  [(set_attr "predicable" "yes")])

 (define_insn "sssubv8qi3"
@@ -383,7 +360,7 @@
  "wsubwus%?\\t%0, %1, %2"
  [(set_attr "predicable" "yes")])

-(define_insn "mulv4hi3"
+(define_insn "*mulv4hi3_iwmmxt"
  [(set (match_operand:V4HI            0 "register_operand" "=y")
        (mult:V4HI (match_operand:V4HI 1 "register_operand" "y")
 		   (match_operand:V4HI 2 "register_operand" "y")))]
@@ -734,100 +711,36 @@

 ;; Max/min insns

-(define_insn "smaxv8qi3"
-  [(set (match_operand:V8QI            0 "register_operand" "=y")
-        (smax:V8QI (match_operand:V8QI 1 "register_operand" "y")
-		   (match_operand:V8QI 2 "register_operand" "y")))]
+(define_insn "*smax<mode>3_iwmmxt"
+  [(set (match_operand:VMMX            0 "register_operand" "=y")
+        (smax:VMMX (match_operand:VMMX 1 "register_operand" "y")
+		   (match_operand:VMMX 2 "register_operand" "y")))]
  "TARGET_REALLY_IWMMXT"
-  "wmaxsb%?\\t%0, %1, %2"
+  "wmaxs<MMX_char>%?\\t%0, %1, %2"
  [(set_attr "predicable" "yes")])

-(define_insn "umaxv8qi3"
-  [(set (match_operand:V8QI            0 "register_operand" "=y")
-        (umax:V8QI (match_operand:V8QI 1 "register_operand" "y")
-		   (match_operand:V8QI 2 "register_operand" "y")))]
+(define_insn "*umax<mode>3_iwmmxt"
+  [(set (match_operand:VMMX            0 "register_operand" "=y")
+        (umax:VMMX (match_operand:VMMX 1 "register_operand" "y")
+		   (match_operand:VMMX 2 "register_operand" "y")))]
  "TARGET_REALLY_IWMMXT"
-  "wmaxub%?\\t%0, %1, %2"
+  "wmaxu<MMX_char>%?\\t%0, %1, %2"
  [(set_attr "predicable" "yes")])

-(define_insn "smaxv4hi3"
-  [(set (match_operand:V4HI            0 "register_operand" "=y")
-        (smax:V4HI (match_operand:V4HI 1 "register_operand" "y")
-		   (match_operand:V4HI 2 "register_operand" "y")))]
+(define_insn "*smin<mode>3_iwmmxt"
+  [(set (match_operand:VMMX            0 "register_operand" "=y")
+        (smin:VMMX (match_operand:VMMX 1 "register_operand" "y")
+		   (match_operand:VMMX 2 "register_operand" "y")))]
  "TARGET_REALLY_IWMMXT"
-  "wmaxsh%?\\t%0, %1, %2"
+  "wmins<MMX_char>%?\\t%0, %1, %2"
  [(set_attr "predicable" "yes")])

-(define_insn "umaxv4hi3"
-  [(set (match_operand:V4HI            0 "register_operand" "=y")
-        (umax:V4HI (match_operand:V4HI 1 "register_operand" "y")
-		   (match_operand:V4HI 2 "register_operand" "y")))]
+(define_insn "*umin<mode>3_iwmmxt"
+  [(set (match_operand:VMMX            0 "register_operand" "=y")
+        (umin:VMMX (match_operand:VMMX 1 "register_operand" "y")
+		   (match_operand:VMMX 2 "register_operand" "y")))]
  "TARGET_REALLY_IWMMXT"
-  "wmaxuh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "smaxv2si3"
-  [(set (match_operand:V2SI            0 "register_operand" "=y")
-        (smax:V2SI (match_operand:V2SI 1 "register_operand" "y")
-		   (match_operand:V2SI 2 "register_operand" "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wmaxsw%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "umaxv2si3"
-  [(set (match_operand:V2SI            0 "register_operand" "=y")
-        (umax:V2SI (match_operand:V2SI 1 "register_operand" "y")
-		   (match_operand:V2SI 2 "register_operand" "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wmaxuw%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "sminv8qi3"
-  [(set (match_operand:V8QI            0 "register_operand" "=y")
-        (smin:V8QI (match_operand:V8QI 1 "register_operand" "y")
-		   (match_operand:V8QI 2 "register_operand" "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wminsb%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "uminv8qi3"
-  [(set (match_operand:V8QI            0 "register_operand" "=y")
-        (umin:V8QI (match_operand:V8QI 1 "register_operand" "y")
-		   (match_operand:V8QI 2 "register_operand" "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wminub%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "sminv4hi3"
-  [(set (match_operand:V4HI            0 "register_operand" "=y")
-        (smin:V4HI (match_operand:V4HI 1 "register_operand" "y")
-		   (match_operand:V4HI 2 "register_operand" "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wminsh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "uminv4hi3"
-  [(set (match_operand:V4HI            0 "register_operand" "=y")
-        (umin:V4HI (match_operand:V4HI 1 "register_operand" "y")
-		   (match_operand:V4HI 2 "register_operand" "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wminuh%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "sminv2si3"
-  [(set (match_operand:V2SI            0 "register_operand" "=y")
-        (smin:V2SI (match_operand:V2SI 1 "register_operand" "y")
-		   (match_operand:V2SI 2 "register_operand" "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wminsw%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "uminv2si3"
-  [(set (match_operand:V2SI            0 "register_operand" "=y")
-        (umin:V2SI (match_operand:V2SI 1 "register_operand" "y")
-		   (match_operand:V2SI 2 "register_operand" "y")))]
-  "TARGET_REALLY_IWMMXT"
-  "wminuw%?\\t%0, %1, %2"
+  "wminu<MMX_char>%?\\t%0, %1, %2"
  [(set_attr "predicable" "yes")])

 ;; Pack/unpack insns.
@@ -1141,76 +1054,28 @@
  "wrordg%?\\t%0, %1, %2"
  [(set_attr "predicable" "yes")])

-(define_insn "ashrv4hi3"
-  [(set (match_operand:V4HI                0 "register_operand" "=y")
-        (ashiftrt:V4HI (match_operand:V4HI 1 "register_operand" "y")
-		       (match_operand:SI   2 "register_operand" "z")))]
+(define_insn "ashr<mode>3_iwmmxt"
+  [(set (match_operand:VSHFT                 0 "register_operand" "=y")
+        (ashiftrt:VSHFT (match_operand:VSHFT 1 "register_operand" "y")
+			(match_operand:SI    2 "register_operand" "z")))]
  "TARGET_REALLY_IWMMXT"
-  "wsrahg%?\\t%0, %1, %2"
+  "wsra<MMX_char>g%?\\t%0, %1, %2"
  [(set_attr "predicable" "yes")])

-(define_insn "ashrv2si3"
-  [(set (match_operand:V2SI                0 "register_operand" "=y")
-        (ashiftrt:V2SI (match_operand:V2SI 1 "register_operand" "y")
-		       (match_operand:SI   2 "register_operand" "z")))]
+(define_insn "lshr<mode>3_iwmmxt"
+  [(set (match_operand:VSHFT                 0 "register_operand" "=y")
+        (lshiftrt:VSHFT (match_operand:VSHFT 1 "register_operand" "y")
+			(match_operand:SI    2 "register_operand" "z")))]
  "TARGET_REALLY_IWMMXT"
-  "wsrawg%?\\t%0, %1, %2"
+  "wsrl<MMX_char>g%?\\t%0, %1, %2"
  [(set_attr "predicable" "yes")])

-(define_insn "ashrdi3_iwmmxt"
-  [(set (match_operand:DI              0 "register_operand" "=y")
-	(ashiftrt:DI (match_operand:DI 1 "register_operand" "y")
-		   (match_operand:SI   2 "register_operand" "z")))]
+(define_insn "ashl<mode>3_iwmmxt"
+  [(set (match_operand:VSHFT               0 "register_operand" "=y")
+        (ashift:VSHFT (match_operand:VSHFT 1 "register_operand" "y")
+		      (match_operand:SI    2 "register_operand" "z")))]
  "TARGET_REALLY_IWMMXT"
-  "wsradg%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "lshrv4hi3"
-  [(set (match_operand:V4HI                0 "register_operand" "=y")
-        (lshiftrt:V4HI (match_operand:V4HI 1 "register_operand" "y")
-		       (match_operand:SI   2 "register_operand" "z")))]
-  "TARGET_REALLY_IWMMXT"
-  "wsrlhg%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "lshrv2si3"
-  [(set (match_operand:V2SI                0 "register_operand" "=y")
-        (lshiftrt:V2SI (match_operand:V2SI 1 "register_operand" "y")
-		       (match_operand:SI   2 "register_operand" "z")))]
-  "TARGET_REALLY_IWMMXT"
-  "wsrlwg%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "lshrdi3_iwmmxt"
-  [(set (match_operand:DI              0 "register_operand" "=y")
-	(lshiftrt:DI (match_operand:DI 1 "register_operand" "y")
-		     (match_operand:SI 2 "register_operand" "z")))]
-  "TARGET_REALLY_IWMMXT"
-  "wsrldg%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "ashlv4hi3"
-  [(set (match_operand:V4HI              0 "register_operand" "=y")
-        (ashift:V4HI (match_operand:V4HI 1 "register_operand" "y")
-		     (match_operand:SI   2 "register_operand" "z")))]
-  "TARGET_REALLY_IWMMXT"
-  "wsllhg%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "ashlv2si3"
-  [(set (match_operand:V2SI              0 "register_operand" "=y")
-        (ashift:V2SI (match_operand:V2SI 1 "register_operand" "y")
-		       (match_operand:SI 2 "register_operand" "z")))]
-  "TARGET_REALLY_IWMMXT"
-  "wsllwg%?\\t%0, %1, %2"
-  [(set_attr "predicable" "yes")])
-
-(define_insn "ashldi3_iwmmxt"
-  [(set (match_operand:DI            0 "register_operand" "=y")
-	(ashift:DI (match_operand:DI 1 "register_operand" "y")
-		   (match_operand:SI 2 "register_operand" "z")))]
-  "TARGET_REALLY_IWMMXT"
-  "wslldg%?\\t%0, %1, %2"
+  "wsll<MMX_char>g%?\\t%0, %1, %2"
  [(set_attr "predicable" "yes")])

 (define_insn "rorv4hi3_di"
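
The iwmmxt.md hunks above fold per-mode patterns into single patterns parameterized by the `VMMX` and `VSHFT` mode macros and the `MMX_char` mode attribute. The toy model below shows how one template yields the instruction names that the removed patterns spelled out by hand. It is a sketch in OCaml, not GCC's actual expansion code; `replace_all` and `expand` are hypothetical helpers, and only the macro and attribute tables are copied from the patch.

```ocaml
(* Illustrative only: a toy model of mode-macro expansion, not GCC's
   actual implementation.  Helper names are hypothetical.  *)

(* Replace every occurrence of SUB (which must be non-empty) in S with BY.  *)
let replace_all ~sub ~by s =
  let n = String.length s and m = String.length sub in
  let buf = Buffer.create n in
  let rec go i =
    if i >= n then ()
    else if i + m <= n && String.sub s i m = sub then begin
      Buffer.add_string buf by;
      go (i + m)
    end else begin
      Buffer.add_char buf s.[i];
      go (i + 1)
    end
  in
  go 0;
  Buffer.contents buf

(* The MMX_char mode attribute from the patch.  *)
let mmx_char = [ "V8QI", "b"; "V4HI", "h"; "V2SI", "w"; "DI", "d" ]

(* The VMMX and VSHFT mode macros from the patch.  *)
let vmmx = [ "V2SI"; "V4HI"; "V8QI" ]
let vshft = [ "V4HI"; "V2SI"; "DI" ]

(* Instantiate TEMPLATE once for each mode in MODES.  *)
let expand modes template =
  List.map
    (fun mode ->
       template
       |> replace_all ~sub:"<mode>" ~by:(String.lowercase_ascii mode)
       |> replace_all ~sub:"<MMX_char>" ~by:(List.assoc mode mmx_char))
    modes

let () =
  List.iter print_endline (expand vmmx "*add<mode>3_iwmmxt: wadd<MMX_char>");
  List.iter print_endline (expand vshft "ashr<mode>3_iwmmxt: wsra<MMX_char>g")
```

Note that the `DI` entry in `MMX_char` is only reached through `VSHFT`, which is how the former `ashrdi3_iwmmxt`, `lshrdi3_iwmmxt` and `ashldi3_iwmmxt` patterns are absorbed into the shift templates.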
--- a/gcc/config/arm/neon-docgen.ml
+++ b/gcc/config/arm/neon-docgen.ml
@@ -0,0 +1,337 @@
+(* ARM NEON documentation generator.
+
+   Copyright (C) 2006 Free Software Foundation, Inc.
+   Contributed by CodeSourcery.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it under
+   the terms of the GNU General Public License as published by the Free
+   Software Foundation; either version 2, or (at your option) any later
+   version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or
+   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+   for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING.  If not, write to the Free
+   Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
+   02110-1301, USA.
+
+   This is an O'Caml program.  The O'Caml compiler is available from:
+
+     http://caml.inria.fr/
+
+   Or from your favourite OS's friendly packaging system. Tested with version
+   3.09.2, though other versions will probably work too.
+
+   Compile with:
+     ocamlc -c neon.ml
+     ocamlc -o neon-docgen neon.cmo neon-docgen.ml
+
+   Run with:
+     /path/to/neon-docgen /path/to/gcc/doc/arm-neon-intrinsics.texi
+*)
+
+open Neon
+
+(* The combined "ops" and "reinterp" table.  *)
+let ops_reinterp = reinterp @ ops
+
+(* Helper functions for extracting things from the "ops" table.  *)
+let single_opcode desired_opcode () =
+  List.fold_left (fun got_so_far ->
+                  fun row ->
+                    match row with
+                      (opcode, _, _, _, _, _) ->
+                        if opcode = desired_opcode then row :: got_so_far
+                                                   else got_so_far
+                 ) [] ops_reinterp
+
+let multiple_opcodes desired_opcodes () =
+  List.fold_left (fun got_so_far ->
+                  fun desired_opcode ->
+                    (single_opcode desired_opcode ()) @ got_so_far)
+                 [] desired_opcodes
+
+let ldx_opcode number () =
+  List.fold_left (fun got_so_far ->
+                  fun row ->
+                    match row with
+                      (opcode, _, _, _, _, _) ->
+                        match opcode with
+                          Vldx n | Vldx_lane n | Vldx_dup n when n = number ->
+                            row :: got_so_far
+                          | _ -> got_so_far
+                 ) [] ops_reinterp
+
+let stx_opcode number () =
+  List.fold_left (fun got_so_far ->
+                  fun row ->
+                    match row with
+                      (opcode, _, _, _, _, _) ->
+                        match opcode with
+                          Vstx n | Vstx_lane n when n = number ->
+                            row :: got_so_far
+                          | _ -> got_so_far
+                 ) [] ops_reinterp
+
+let tbl_opcode () =
+  List.fold_left (fun got_so_far ->
+                  fun row ->
+                    match row with
+                      (opcode, _, _, _, _, _) ->
+                        match opcode with
+                          Vtbl _ -> row :: got_so_far
+                          | _ -> got_so_far
+                 ) [] ops_reinterp
+
+let tbx_opcode () =
+  List.fold_left (fun got_so_far ->
+                  fun row ->
+                    match row with
+                      (opcode, _, _, _, _, _) ->
+                        match opcode with
+                          Vtbx _ -> row :: got_so_far
+                          | _ -> got_so_far
+                 ) [] ops_reinterp
+
+(* The groups of intrinsics.  *)
+let intrinsic_groups =
+  [ "Addition", single_opcode Vadd;
+    "Multiplication", single_opcode Vmul;
+    "Multiply-accumulate", single_opcode Vmla;
+    "Multiply-subtract", single_opcode Vmls;
+    "Subtraction", single_opcode Vsub;
+    "Comparison (equal-to)", single_opcode Vceq;
+    "Comparison (greater-than-or-equal-to)", single_opcode Vcge;
+    "Comparison (less-than-or-equal-to)", single_opcode Vcle;
+    "Comparison (greater-than)", single_opcode Vcgt;
+    "Comparison (less-than)", single_opcode Vclt;
+    "Comparison (absolute greater-than-or-equal-to)", single_opcode Vcage;
+    "Comparison (absolute less-than-or-equal-to)", single_opcode Vcale;
+    "Comparison (absolute greater-than)", single_opcode Vcagt;
+    "Comparison (absolute less-than)", single_opcode Vcalt;
+    "Test bits", single_opcode Vtst;
+    "Absolute difference", single_opcode Vabd;
+    "Absolute difference and accumulate", single_opcode Vaba;
+    "Maximum", single_opcode Vmax;
+    "Minimum", single_opcode Vmin;
+    "Pairwise add", single_opcode Vpadd;
+    "Pairwise add, widen and accumulate", single_opcode Vpada;
+    "Folding maximum", single_opcode Vpmax;
+    "Folding minimum", single_opcode Vpmin;
+    "Reciprocal step", multiple_opcodes [Vrecps; Vrsqrts];
+    "Vector shift left", single_opcode Vshl;
+    "Vector shift left by constant", single_opcode Vshl_n;
+    "Vector shift right by constant", single_opcode Vshr_n;
+    "Vector shift right by constant and accumulate", single_opcode Vsra_n;
+    "Vector shift right and insert", single_opcode Vsri;
+    "Vector shift left and insert", single_opcode Vsli;
+    "Absolute value", single_opcode Vabs;
+    "Negation", single_opcode Vneg;
+    "Bitwise not", single_opcode Vmvn;
+    "Count leading sign bits", single_opcode Vcls;
+    "Count leading zeros", single_opcode Vclz;
+    "Count number of set bits", single_opcode Vcnt;
+    "Reciprocal estimate", single_opcode Vrecpe;
+    "Reciprocal square-root estimate", single_opcode Vrsqrte;
+    "Get lanes from a vector", single_opcode Vget_lane;
+    "Set lanes in a vector", single_opcode Vset_lane;
+    "Create vector from literal bit pattern", single_opcode Vcreate;
+    "Set all lanes to the same value",
+      multiple_opcodes [Vdup_n; Vmov_n; Vdup_lane];
+    "Combining vectors", single_opcode Vcombine;
+    "Splitting vectors", multiple_opcodes [Vget_high; Vget_low];
+    "Conversions", multiple_opcodes [Vcvt; Vcvt_n];
+    "Move, narrowing", single_opcode Vmovn;
+    "Move, long", single_opcode Vmovl;
+    "Table lookup", tbl_opcode;
+    "Extended table lookup", tbx_opcode;
+    "Multiply, lane", single_opcode Vmul_lane;
+    "Long multiply, lane", single_opcode Vmull_lane;
+    "Saturating doubling long multiply, lane", single_opcode Vqdmull_lane;
+    "Saturating doubling multiply high, lane", single_opcode Vqdmulh_lane;
+    "Multiply-accumulate, lane", single_opcode Vmla_lane;
+    "Multiply-subtract, lane", single_opcode Vmls_lane;
+    "Vector multiply by scalar", single_opcode Vmul_n;
+    "Vector long multiply by scalar", single_opcode Vmull_n;
+    "Vector saturating doubling long multiply by scalar",
+      single_opcode Vqdmull_n;
+    "Vector saturating doubling multiply high by scalar",
+      single_opcode Vqdmulh_n;
+    "Vector multiply-accumulate by scalar", single_opcode Vmla_n;
+    "Vector multiply-subtract by scalar", single_opcode Vmls_n;
+    "Vector extract", single_opcode Vext;
+    "Reverse elements", multiple_opcodes [Vrev64; Vrev32; Vrev16];
+    "Bit selection", single_opcode Vbsl;
+    "Transpose elements", single_opcode Vtrn;
+    "Zip elements", single_opcode Vzip;
+    "Unzip elements", single_opcode Vuzp;
+    "Element/structure loads, VLD1 variants", ldx_opcode 1;
+    "Element/structure stores, VST1 variants", stx_opcode 1;
+    "Element/structure loads, VLD2 variants", ldx_opcode 2;
+    "Element/structure stores, VST2 variants", stx_opcode 2;
+    "Element/structure loads, VLD3 variants", ldx_opcode 3;
+    "Element/structure stores, VST3 variants", stx_opcode 3;
+    "Element/structure loads, VLD4 variants", ldx_opcode 4;
+    "Element/structure stores, VST4 variants", stx_opcode 4;
+    "Logical operations (AND)", single_opcode Vand;
+    "Logical operations (OR)", single_opcode Vorr;
+    "Logical operations (exclusive OR)", single_opcode Veor;
+    "Logical operations (AND-NOT)", single_opcode Vbic;
+    "Logical operations (OR-NOT)", single_opcode Vorn;
+    "Reinterpret casts", single_opcode Vreinterp ]
+
+(* Given an intrinsic shape, produce a string to document the corresponding
+   operand shapes.  *)
+let rec analyze_shape shape =
+  let rec n_things n thing =
+    match n with
+      0 -> []
+    | n -> thing :: (n_things (n - 1) thing)
+  in
+  let rec analyze_shape_elt reg_no elt =
+    match elt with
+      Dreg -> "@var{d" ^ (string_of_int reg_no) ^ "}"
+    | Qreg -> "@var{q" ^ (string_of_int reg_no) ^ "}"
+    | Corereg -> "@var{r" ^ (string_of_int reg_no) ^ "}"
+    | Immed -> "#@var{0}"
+    | VecArray (1, elt) ->
+        let elt_regexp = analyze_shape_elt 0 elt in
+          "@{" ^ elt_regexp ^ "@}"
+    | VecArray (n, elt) ->
+      let rec f m =
+        match m with
+          0 -> []
+        | m -> (analyze_shape_elt (m - 1) elt) :: (f (m - 1))
+      in
+      let ops = List.rev (f n) in
+        "@{" ^ (commas (fun x -> x) ops "") ^ "@}"
+    | (PtrTo elt | CstPtrTo elt) ->
+      "[" ^ (analyze_shape_elt reg_no elt) ^ "]"
+    | Element_of_dreg -> (analyze_shape_elt reg_no Dreg) ^ "[@var{0}]"
+    | Element_of_qreg -> (analyze_shape_elt reg_no Qreg) ^ "[@var{0}]"
+    | All_elements_of_dreg -> (analyze_shape_elt reg_no Dreg) ^ "[]"
+  in
+    match shape with
+      All (n, elt) -> commas (analyze_shape_elt 0) (n_things n elt) ""
+    | Long -> (analyze_shape_elt 0 Qreg) ^ ", " ^ (analyze_shape_elt 0 Dreg) ^
+              ", " ^ (analyze_shape_elt 0 Dreg)
+    | Long_noreg elt -> (analyze_shape_elt 0 elt) ^ ", " ^
+              (analyze_shape_elt 0 elt)
+    | Wide -> (analyze_shape_elt 0 Qreg) ^ ", " ^ (analyze_shape_elt 0 Qreg) ^
+              ", " ^ (analyze_shape_elt 0 Dreg)
+    | Wide_noreg elt -> analyze_shape (Long_noreg elt)
+    | Narrow -> (analyze_shape_elt 0 Dreg) ^ ", " ^ (analyze_shape_elt 0 Qreg) ^
+                ", " ^ (analyze_shape_elt 0 Qreg)
+    | Use_operands elts -> commas (analyze_shape_elt 0) (Array.to_list elts) ""
+    | By_scalar Dreg ->
+        analyze_shape (Use_operands [| Dreg; Dreg; Element_of_dreg |])
+    | By_scalar Qreg ->
+        analyze_shape (Use_operands [| Qreg; Qreg; Element_of_dreg |])
+    | By_scalar _ -> assert false
+    | Wide_lane ->
+        analyze_shape (Use_operands [| Qreg; Dreg; Element_of_dreg |])
+    | Wide_scalar ->
+        analyze_shape (Use_operands [| Qreg; Dreg; Element_of_dreg |])
+    | Pair_result elt ->
+      let elt_regexp = analyze_shape_elt 0 elt in
+      let elt_regexp' = analyze_shape_elt 1 elt in
+        elt_regexp ^ ", " ^ elt_regexp'
+    | Unary_scalar _ -> "FIXME Unary_scalar"
+    | Binary_imm elt -> analyze_shape (Use_operands [| elt; elt; Immed |])
+    | Narrow_imm -> analyze_shape (Use_operands [| Dreg; Qreg; Immed |])
+    | Long_imm -> analyze_shape (Use_operands [| Qreg; Dreg; Immed |])
+
+(* Document a single intrinsic.  *)
+let describe_intrinsic first chan
+                       (elt_ty, (_, features, shape, name, munge, _)) =
+  let c_arity, new_elt_ty = munge shape elt_ty in
+  let c_types = strings_of_arity c_arity in
+  Printf.fprintf chan "@itemize @bullet\n";
+  let item_code = if first then "@item" else "@itemx" in
+    Printf.fprintf chan "%s %s %s_%s (" item_code (List.hd c_types)
+                   (intrinsic_name name) (string_of_elt elt_ty);
+    Printf.fprintf chan "%s)\n" (commas (fun ty -> ty) (List.tl c_types) "");
+    if not (List.exists (fun feature -> feature = No_op) features) then
+    begin
+      let print_one_insn name =
+        Printf.fprintf chan "@code{";
+        let no_suffix = (new_elt_ty = NoElts) in
+        let name_with_suffix =
+          if no_suffix then name
+          else name ^ "." ^ (string_of_elt_dots new_elt_ty)
+        in
+        let possible_operands = analyze_all_shapes features shape
+                                                   analyze_shape
+        in
+	let rec print_one_possible_operand op =
+	  Printf.fprintf chan "%s %s}" name_with_suffix op
+        in
+          (* If the intrinsic expands to multiple instructions, we assume
+             they are all of the same form.  *)
+          print_one_possible_operand (List.hd possible_operands)
+      in
+      let rec print_insns names =
+        match names with
+          [] -> ()
+        | [name] -> print_one_insn name
+        | name::names -> (print_one_insn name;
+                          Printf.fprintf chan " @emph{or} ";
+                          print_insns names)
+      in
+      let insn_names = get_insn_names features name in
+        Printf.fprintf chan "@*@emph{Form of expected instruction(s):} ";
+        print_insns insn_names;
+        Printf.fprintf chan "\n"
+    end;
+    Printf.fprintf chan "@end itemize\n";
+    Printf.fprintf chan "\n\n"
+
+(* Document a group of intrinsics.  *)
+let document_group chan (group_title, group_extractor) =
+  (* Extract the rows in question from the ops table and then turn them
+     into a list of intrinsics.  *)
+  let intrinsics =
+    List.fold_left (fun got_so_far ->
+                    fun row ->
+                      match row with
+                        (_, _, _, _, _, elt_tys) ->
+                          List.fold_left (fun got_so_far' ->
+                                          fun elt_ty ->
+                                            (elt_ty, row) :: got_so_far')
+                                         got_so_far elt_tys
+                   ) [] (group_extractor ())
+  in
+    (* Emit the title for this group.  *)
+    Printf.fprintf chan "@subsubsection %s\n\n" group_title;
+    (* Emit a description of each intrinsic.  *)
+    List.iter (describe_intrinsic true chan) intrinsics;
+    (* Close this group.  *)
+    Printf.fprintf chan "\n\n"
+
+let gnu_header chan =
+  List.iter (fun s -> Printf.fprintf chan "%s\n" s) [
+  "@c Copyright (C) 2006 Free Software Foundation, Inc.";
+  "@c This is part of the GCC manual.";
+  "@c For copying conditions, see the file gcc.texi.";
+  "";
+  "@c This file is generated automatically using gcc/config/arm/neon-docgen.ml";
+  "@c Please do not edit manually."]
+
+(* Program entry point.  *)
+let _ =
+  if Array.length Sys.argv <> 2 then
+    failwith "Usage: neon-docgen <output filename>"
+  else
+  let file = Sys.argv.(1) in
+    try
+      let chan = open_out file in
+        gnu_header chan;
+        List.iter (document_group chan) intrinsic_groups;
+        close_out chan
+    with Sys_error sys ->
+      failwith ("Could not create output file " ^ file ^ ": " ^ sys)
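
The extractor helpers in neon-docgen.ml (`single_opcode`, `multiple_opcodes` and friends) all follow the same pattern: fold over the table, cons matching rows onto an accumulator, and take a trailing unit argument so that the walk only happens when a group is documented. A cut-down sketch of that pattern follows. The two-field rows, the reduced `opcode` type and the table contents are stand-ins invented for illustration; the real rows in neon.ml are six-field tuples.

```ocaml
(* Illustrative only: a cut-down stand-in for the "ops" table in neon.ml.
   The opcode type and the table contents here are hypothetical.  *)
type opcode = Vadd | Vsub | Vrecps | Vrsqrts

let ops = [ (Vadd, "vadd"); (Vrecps, "vrecps"); (Vsub, "vsub");
            (Vadd, "vaddl"); (Vrsqrts, "vrsqrts") ]

(* Same shape as single_opcode in the patch: the trailing unit argument
   delays the table walk until the extractor is actually called.  *)
let single_opcode desired_opcode () =
  List.fold_left
    (fun got_so_far ((opcode, _) as row) ->
       if opcode = desired_opcode then row :: got_so_far else got_so_far)
    [] ops

let multiple_opcodes desired_opcodes () =
  List.fold_left
    (fun got_so_far desired_opcode ->
       single_opcode desired_opcode () @ got_so_far)
    [] desired_opcodes

(* A group is a title paired with an unevaluated extractor.  *)
let groups =
  [ "Addition", single_opcode Vadd;
    "Reciprocal step", multiple_opcodes [Vrecps; Vrsqrts] ]

let () =
  List.iter
    (fun (title, extractor) ->
       let names = List.map snd (extractor ()) in
       Printf.printf "%s: %s\n" title (String.concat ", " names))
    groups
```

Because matching rows are consed onto the accumulator, each extractor returns its rows in reverse table order, and `multiple_opcodes` places rows for later opcodes ahead of rows for earlier ones.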
--- a/gcc/config/arm/neon-gen.ml
+++ b/gcc/config/arm/neon-gen.ml
@@ -0,0 +1,419 @@
+(* Auto-generate ARM Neon intrinsics header file.
+   Copyright (C) 2006, 2007 Free Software Foundation, Inc.
+   Contributed by CodeSourcery.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it under
+   the terms of the GNU General Public License as published by the Free
+   Software Foundation; either version 2, or (at your option) any later
+   version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or
+   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+   for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING.  If not, write to the Free
+   Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
+   02110-1301, USA.
+
+   This is an O'Caml program.  The O'Caml compiler is available from:
+
+     http://caml.inria.fr/
+
+   Or from your favourite OS's friendly packaging system. Tested with version
+   3.09.2, though other versions will probably work too.
+
+   Compile with:
+     ocamlc -c neon.ml
+     ocamlc -o neon-gen neon.cmo neon-gen.ml
+
+   Run with:
+     ./neon-gen > arm_neon.h
+*)
+
+open Neon
+
+(* The format codes used in the following functions are documented at:
+     http://caml.inria.fr/pub/docs/manual-ocaml/libref/Format.html\
+     #6_printflikefunctionsforprettyprinting
+   (one line, remove the backslash.)
+*)
+
+(* The following functions can be used to approximate GNU indentation style.  *)
+let start_function () =
+  Format.printf "@[<v 0>";
+  ref 0
+
+let end_function nesting =
+  match !nesting with
+    0 -> Format.printf "@;@;@]"
+  | _ -> failwith ("Bad nesting (ending function at level "
+                   ^ (string_of_int !nesting) ^ ")")
+
+let open_braceblock nesting =
+  begin match !nesting with
+    0 -> Format.printf "@,@<0>{@[<v 2>@,"
+  | _ -> Format.printf "@,@[<v 2>  @<0>{@[<v 2>@,"
+  end;
+  incr nesting
+
+let close_braceblock nesting =
+  decr nesting;
+  match !nesting with
+    0 -> Format.printf "@]@,@<0>}"
+  | _ -> Format.printf "@]@,@<0>}@]"
+
+let print_function arity fnname body =
+  let ffmt = start_function () in
+  Format.printf "__extension__ static __inline ";
+  let inl = "__attribute__ ((__always_inline__))" in
+  begin match arity with
+    Arity0 ret ->
+      Format.printf "%s %s@,%s (void)" (string_of_vectype ret) inl fnname
+  | Arity1 (ret, arg0) ->
+      Format.printf "%s %s@,%s (%s __a)" (string_of_vectype ret) inl fnname
+                                        (string_of_vectype arg0)
+  | Arity2 (ret, arg0, arg1) ->
+      Format.printf "%s %s@,%s (%s __a, %s __b)"
+        (string_of_vectype ret) inl fnname (string_of_vectype arg0)
+	(string_of_vectype arg1)
+  | Arity3 (ret, arg0, arg1, arg2) ->
+      Format.printf "%s %s@,%s (%s __a, %s __b, %s __c)"
+        (string_of_vectype ret) inl fnname (string_of_vectype arg0)
+	(string_of_vectype arg1) (string_of_vectype arg2)
+  | Arity4 (ret, arg0, arg1, arg2, arg3) ->
+      Format.printf "%s %s@,%s (%s __a, %s __b, %s __c, %s __d)"
+        (string_of_vectype ret) inl fnname (string_of_vectype arg0)
+	(string_of_vectype arg1) (string_of_vectype arg2)
+        (string_of_vectype arg3)
+  end;
+  open_braceblock ffmt;
+  let rec print_lines = function
+    [] -> ()
+  | [line] -> Format.printf "%s" line
+  | line::lines -> Format.printf "%s@," line; print_lines lines in
+  print_lines body;
+  close_braceblock ffmt;
+  end_function ffmt
+
+let return_by_ptr features = List.mem ReturnPtr features
+
+let union_string num elts base =
+  let itype = inttype_for_array num elts in
+  let iname = string_of_inttype itype
+  and sname = string_of_vectype (T_arrayof (num, elts)) in
+  Printf.sprintf "union { %s __i; %s __o; } %s" sname iname base
+
+let rec signed_ctype = function
+    T_uint8x8 | T_poly8x8 -> T_int8x8
+  | T_uint8x16 | T_poly8x16 -> T_int8x16
+  | T_uint16x4 | T_poly16x4 -> T_int16x4
+  | T_uint16x8 | T_poly16x8 -> T_int16x8
+  | T_uint32x2 -> T_int32x2
+  | T_uint32x4 -> T_int32x4
+  | T_uint64x1 -> T_int64x1
+  | T_uint64x2 -> T_int64x2
+  (* Cast to types defined by mode in arm.c, not random types pulled in from
+     the <stdint.h> header in use. This fixes incompatible pointer errors when
+     compiling with C++.  *)
+  | T_uint8 | T_int8 -> T_intQI
+  | T_uint16 | T_int16 -> T_intHI
+  | T_uint32 | T_int32 -> T_intSI
+  | T_uint64 | T_int64 -> T_intDI
+  | T_poly8 -> T_intQI
+  | T_poly16 -> T_intHI
+  | T_arrayof (n, elt) -> T_arrayof (n, signed_ctype elt)
+  | T_ptrto elt -> T_ptrto (signed_ctype elt)
+  | T_const elt -> T_const (signed_ctype elt)
+  | x -> x
+
+let add_cast ctype cval =
+  let stype = signed_ctype ctype in
+  if ctype <> stype then
+    Printf.sprintf "(%s) %s" (string_of_vectype stype) cval
+  else
+    cval
+
+let cast_for_return to_ty = "(" ^ (string_of_vectype to_ty) ^ ")"
+
+(* Return a tuple of a list of declarations to go at the start of the function,
+   and a list of statements needed to return THING.  *)
+let return arity return_by_ptr thing =
+  match arity with
+    Arity0 (ret) | Arity1 (ret, _) | Arity2 (ret, _, _) | Arity3 (ret, _, _, _)
+  | Arity4 (ret, _, _, _, _) ->
+    match ret with
+      T_arrayof (num, vec) ->
+        if return_by_ptr then
+          let sname = string_of_vectype ret in
+          [Printf.sprintf "%s __rv;" sname],
+          [thing ^ ";"; "return __rv;"]
+        else
+          let uname = union_string num vec "__rv" in
+          [uname ^ ";"], ["__rv.__o = " ^ thing ^ ";"; "return __rv.__i;"]
+    | T_void -> [], [thing ^ ";"]
+    | _ ->
+        [], ["return " ^ (cast_for_return ret) ^ thing ^ ";"]
+
+let rec element_type ctype =
+  match ctype with
+    T_arrayof (_, v) -> element_type v
+  | _ -> ctype
+
+let params return_by_ptr ps =
+  let pdecls = ref [] in
+  let ptype t p =
+    match t with
+      T_arrayof (num, elts) ->
+        let uname = union_string num elts (p ^ "u") in
+        let decl = Printf.sprintf "%s = { %s };" uname p in
+        pdecls := decl :: !pdecls;
+        p ^ "u.__o"
+    | _ -> add_cast t p in
+  let plist = match ps with
+    Arity0 _ -> []
+  | Arity1 (_, t1) -> [ptype t1 "__a"]
+  | Arity2 (_, t1, t2) -> [ptype t1 "__a"; ptype t2 "__b"]
+  | Arity3 (_, t1, t2, t3) -> [ptype t1 "__a"; ptype t2 "__b"; ptype t3 "__c"]
+  | Arity4 (_, t1, t2, t3, t4) ->
+      [ptype t1 "__a"; ptype t2 "__b"; ptype t3 "__c"; ptype t4 "__d"] in
+  match ps with
+    Arity0 ret | Arity1 (ret, _) | Arity2 (ret, _, _) | Arity3 (ret, _, _, _)
+  | Arity4 (ret, _, _, _, _) ->
+      if return_by_ptr then
+        !pdecls, add_cast (T_ptrto (element_type ret)) "&__rv.val[0]" :: plist
+      else
+        !pdecls, plist
+
+let modify_params features plist =
+  let is_flipped =
+    List.exists (function Flipped _ -> true | _ -> false) features in
+  if is_flipped then
+    match plist with
+      [ a; b ] -> [ b; a ]
+    | _ ->
+      failwith ("Don't know how to flip args " ^ (String.concat ", " plist))
+  else
+    plist
+
+(* !!! Decide whether to add an extra information word based on the shape
+   form.  *)
+let extra_word shape features paramlist bits =
+  let use_word =
+    match shape with
+      All _ | Long | Long_noreg _ | Wide | Wide_noreg _ | Narrow
+    | By_scalar _ | Wide_scalar | Wide_lane | Binary_imm _ | Long_imm
+    | Narrow_imm -> true
+    | _ -> List.mem InfoWord features
+  in
+    if use_word then
+      paramlist @ [string_of_int bits]
+    else
+      paramlist
+
+(* Bit 0 represents signed (1) vs unsigned (0), or float (1) vs poly (0).
+   Bit 1 represents floats & polynomials (1), or ordinary integers (0).
+   Bit 2 represents rounding (1) vs none (0).  *)
+let infoword_value elttype features =
+  let bits01 =
+    match elt_class elttype with
+      Signed | ConvClass (Signed, _) | ConvClass (_, Signed) -> 0b001
+    | Poly -> 0b010
+    | Float -> 0b011
+    | _ -> 0b000
+  and rounding_bit = if List.mem Rounding features then 0b100 else 0b000 in
+  bits01 lor rounding_bit
+
+(* "Cast" type operations will throw an exception in mode_of_elt (actually in
+   elt_width, called from there). Deal with that here, and generate a suffix
+   with multiple modes (<to><from>).  *)
+let rec mode_suffix elttype shape =
+  try
+    let mode = mode_of_elt elttype shape in
+    string_of_mode mode
+  with MixedMode (dst, src) ->
+    let dstmode = mode_of_elt dst shape
+    and srcmode = mode_of_elt src shape in
+    string_of_mode dstmode ^ string_of_mode srcmode
+
+let print_variant opcode features shape name (ctype, asmtype, elttype) =
+  let bits = infoword_value elttype features in
+  let modesuf = mode_suffix elttype shape in
+  let return_by_ptr = return_by_ptr features in
+  let pdecls, paramlist = params return_by_ptr ctype in
+  let paramlist' = modify_params features paramlist in
+  let paramlist'' = extra_word shape features paramlist' bits in
+  let parstr = String.concat ", " paramlist'' in
+  let builtin = Printf.sprintf "__builtin_neon_%s%s (%s)"
+                  (builtin_name features name) modesuf parstr in
+  let rdecls, stmts = return ctype return_by_ptr builtin in
+  let body = pdecls @ rdecls @ stmts
+  and fnname = (intrinsic_name name) ^ "_" ^ (string_of_elt elttype) in
+  print_function ctype fnname body
+
+(* When this function processes the element types in the ops table, it rewrites
+   them in a list of tuples (a,b,c):
+     a : C type as an "arity", e.g. Arity1 (T_poly8x8, T_poly8x8)
+     b : Asm type : a single, processed element type, e.g. P16. This is the
+         type which should be attached to the asm opcode.
+     c : Variant type : the unprocessed type for this variant (e.g. in add
+         instructions which don't care about the sign, b might be i16 and c
+         might be s16.)
+*)
+
+let print_op (opcode, features, shape, name, munge, types) =
+  let sorted_types = List.sort compare types in
+  let munged_types = List.map
+    (fun elt -> let c, asm = munge shape elt in c, asm, elt) sorted_types in
+  List.iter
+    (fun variant -> print_variant opcode features shape name variant)
+    munged_types
+
+let print_ops ops =
+  List.iter print_op ops
+
+(* Output type definitions. Table entries are:
+     cbase : "C" name for the type.
+     abase : "ARM" base name for the type (i.e. int in int8x8_t).
+     esize : element size.
+     enum : element count.
+*)
+
+let deftypes () =
+  let typeinfo = [
+    (* Doubleword vector types.  *)
+    "__builtin_neon_qi", "int", 8, 8;
+    "__builtin_neon_hi", "int", 16, 4;
+    "__builtin_neon_si", "int", 32, 2;
+    "__builtin_neon_di", "int", 64, 1;
+    "__builtin_neon_sf", "float", 32, 2;
+    "__builtin_neon_poly8", "poly", 8, 8;
+    "__builtin_neon_poly16", "poly", 16, 4;
+    "__builtin_neon_uqi", "uint", 8, 8;
+    "__builtin_neon_uhi", "uint", 16, 4;
+    "__builtin_neon_usi", "uint", 32, 2;
+    "__builtin_neon_udi", "uint", 64, 1;
+
+    (* Quadword vector types.  *)
+    "__builtin_neon_qi", "int", 8, 16;
+    "__builtin_neon_hi", "int", 16, 8;
+    "__builtin_neon_si", "int", 32, 4;
+    "__builtin_neon_di", "int", 64, 2;
+    "__builtin_neon_sf", "float", 32, 4;
+    "__builtin_neon_poly8", "poly", 8, 16;
+    "__builtin_neon_poly16", "poly", 16, 8;
+    "__builtin_neon_uqi", "uint", 8, 16;
+    "__builtin_neon_uhi", "uint", 16, 8;
+    "__builtin_neon_usi", "uint", 32, 4;
+    "__builtin_neon_udi", "uint", 64, 2
+  ] in
+  List.iter
+    (fun (cbase, abase, esize, enum) ->
+      let attr =
+        match enum with
+          1 -> ""
+        | _ -> Printf.sprintf "\t__attribute__ ((__vector_size__ (%d)))"
+                              (esize * enum / 8) in
+      Format.printf "typedef %s %s%dx%d_t%s;@\n" cbase abase esize enum attr)
+    typeinfo;
+  Format.print_newline ();
+  (* Extra types not in <stdint.h>.  *)
+  Format.printf "typedef __builtin_neon_sf float32_t;\n";
+  Format.printf "typedef __builtin_neon_poly8 poly8_t;\n";
+  Format.printf "typedef __builtin_neon_poly16 poly16_t;\n"
+
+(* Output structs containing arrays, for load & store instructions etc.  *)
+
+let arrtypes () =
+  let typeinfo = [
+    "int", 8;    "int", 16;
+    "int", 32;   "int", 64;
+    "uint", 8;   "uint", 16;
+    "uint", 32;  "uint", 64;
+    "float", 32; "poly", 8;
+    "poly", 16
+  ] in
+  let writestruct elname elsize regsize arrsize =
+    let elnum = regsize / elsize in
+    let structname =
+      Printf.sprintf "%s%dx%dx%d_t" elname elsize elnum arrsize in
+    let sfmt = start_function () in
+    Format.printf "typedef struct %s" structname;
+    open_braceblock sfmt;
+    Format.printf "%s%dx%d_t val[%d];" elname elsize elnum arrsize;
+    close_braceblock sfmt;
+    Format.printf " %s;" structname;
+    end_function sfmt;
+  in
+    for n = 2 to 4 do
+      List.iter
+        (fun (elname, elsize) ->
+          writestruct elname elsize 64 n;
+          writestruct elname elsize 128 n)
+        typeinfo
+    done
+
+let print_lines = List.iter (fun s -> Format.printf "%s@\n" s)
+
+(* Do it.  *)
+
+let _ =
+  print_lines [
+"/* ARM NEON intrinsics include file. This file is generated automatically";
+"   using neon-gen.ml.  Please do not edit manually.";
+"";
+"   Copyright (C) 2006, 2007 Free Software Foundation, Inc.";
+"   Contributed by CodeSourcery.";
+"";
+"   This file is part of GCC.";
+"";
+"   GCC is free software; you can redistribute it and/or modify it";
+"   under the terms of the GNU General Public License as published";
+"   by the Free Software Foundation; either version 2, or (at your";
+"   option) any later version.";
+"";
+"   GCC is distributed in the hope that it will be useful, but WITHOUT";
+"   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY";
+"   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public";
+"   License for more details.";
+"";
+"   You should have received a copy of the GNU General Public License";
+"   along with GCC; see the file COPYING.  If not, write to the";
+"   Free Software Foundation, 51 Franklin Street, Fifth Floor, Boston,";
+"   MA 02110-1301, USA.  */";
+"";
+"/* As a special exception, if you include this header file into source";
+"   files compiled by GCC, this header file does not by itself cause";
+"   the resulting executable to be covered by the GNU General Public";
+"   License.  This exception does not however invalidate any other";
+"   reasons why the executable file might be covered by the GNU General";
+"   Public License.  */";
+"";
+"#ifndef _GCC_ARM_NEON_H";
+"#define _GCC_ARM_NEON_H 1";
+"";
+"#ifndef __ARM_NEON__";
+"#error You must enable NEON instructions (e.g. -mfloat-abi=softfp -mfpu=neon) to use arm_neon.h";
+"#else";
+"";
+"#ifdef __cplusplus";
+"extern \"C\" {";
+"#endif";
+"";
+"#include <stdint.h>";
+""];
+  deftypes ();
+  arrtypes ();
+  Format.print_newline ();
+  print_ops ops;
+  Format.print_newline ();
+  print_ops reinterp;
+  print_lines [
+"#ifdef __cplusplus";
+"}";
+"#endif";
+"#endif";
+"#endif"]
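The info-word encoding documented above `infoword_value` in neon-gen.ml (bit 0 for signed or float, bit 1 for float or polynomial, bit 2 for rounding) can be tried in isolation. The following is only a sketch: the `elt_class` type here is a simplified stand-in for the one in neon.ml, which this part of the patch does not show, and the `ConvClass` case is left out.

```ocaml
(* Simplified stand-in for the element classes used by neon-gen.ml.
   The real type lives in neon.ml; these constructors are assumptions.  *)
type elt_class = Signed | Unsigned | Poly | Float

(* Bits 0-1 come from the element class, bit 2 from the rounding flag.  *)
let infoword cls ~rounding =
  let bits01 =
    match cls with
    | Signed -> 0b001
    | Poly -> 0b010
    | Float -> 0b011
    | Unsigned -> 0b000
  in
  bits01 lor (if rounding then 0b100 else 0b000)

let () =
  List.iter
    (fun (name, cls, rounding) ->
      Printf.printf "%-18s -> %d\n" name (infoword cls ~rounding))
    [ "unsigned", Unsigned, false;
      "signed", Signed, false;
      "poly", Poly, false;
      "float", Float, false;
      "signed, rounding", Signed, true ]
```

The resulting integer is what `extra_word` appends as the last argument of the `__builtin_neon_*` call for the shapes that take an information word.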
--- a/gcc/config/arm/neon-testgen.ml
+++ b/gcc/config/arm/neon-testgen.ml
@@ -0,0 +1,277 @@
+(* Auto-generate ARM Neon intrinsics tests.
+   Copyright (C) 2006 Free Software Foundation, Inc.
+   Contributed by CodeSourcery.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it under
+   the terms of the GNU General Public License as published by the Free
+   Software Foundation; either version 2, or (at your option) any later
+   version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or
+   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+   for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING.  If not, write to the Free
+   Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
+   02110-1301, USA.
+
+   This is an O'Caml program.  The O'Caml compiler is available from:
+
+     http://caml.inria.fr/
+
+   Or from your favourite OS's friendly packaging system. Tested with version
+   3.09.2, though other versions will probably work too.
+
+   Compile with:
+     ocamlc -c neon.ml
+     ocamlc -o neon-testgen neon.cmo neon-testgen.ml
+
+   Run with:
+     cd /path/to/gcc/testsuite/gcc.target/arm/neon
+     /path/to/neon-testgen
+*)
+
+open Neon
+
+type c_type_flags = Pointer | Const
+
+(* Open a test source file.  *)
+let open_test_file dir name =
+  try
+    open_out (dir ^ "/" ^ name ^ ".c")
+  with Sys_error str ->
+    failwith ("Could not create test source file " ^ name ^ ": " ^ str)
+
+(* Emit prologue code to a test source file.  *)
+let emit_prologue chan test_name =
+  Printf.fprintf chan "/* Test the `%s' ARM Neon intrinsic.  */\n" test_name;
+  Printf.fprintf chan "/* This file was autogenerated by neon-testgen.  */\n\n";
+  Printf.fprintf chan "/* { dg-do assemble } */\n";
+  Printf.fprintf chan "/* { dg-require-effective-target arm_neon_ok } */\n";
+  Printf.fprintf chan
+                 "/* { dg-options \"-save-temps -O0 -mfpu=neon -mfloat-abi=softfp\" } */\n";
+  Printf.fprintf chan "\n#include \"arm_neon.h\"\n\n";
+  Printf.fprintf chan "void test_%s (void)\n{\n" test_name
+
+(* Emit declarations of local variables that are going to be passed
+   to an intrinsic, together with one to take a returned value if needed.  *)
+let emit_automatics chan c_types =
+  let emit () =
+    ignore (
+      List.fold_left (fun arg_number -> fun (flags, ty) ->
+                        let pointer_bit =
+                          if List.mem Pointer flags then "*" else ""
+                        in
+                          (* Const arguments to builtins are directly
+                             written in as constants.  *)
+                          if not (List.mem Const flags) then
+                            Printf.fprintf chan "  %s %sarg%d_%s;\n"
+                                           ty pointer_bit arg_number ty;
+                        arg_number + 1)
+                     0 (List.tl c_types))
+  in
+    match c_types with
+      (_, return_ty) :: tys ->
+        if return_ty <> "void" then
+          (* The intrinsic returns a value.  *)
+          (Printf.fprintf chan "  %s out_%s;\n" return_ty return_ty;
+           emit ())
+        else
+          (* The intrinsic does not return a value.  *)
+          emit ()
+    | _ -> assert false
+
+(* Emit code to call an intrinsic.  *)
+let emit_call chan const_valuator c_types name elt_ty =
+  (if snd (List.hd c_types) <> "void" then
+     Printf.fprintf chan "  out_%s = " (snd (List.hd c_types))
+   else
+     Printf.fprintf chan "  ");
+  Printf.fprintf chan "%s_%s (" (intrinsic_name name) (string_of_elt elt_ty);
+  let print_arg chan arg_number (flags, ty) =
+    (* If the argument is of const type, then directly write in the
+       constant now.  *)
+    if List.mem Const flags then
+      match const_valuator with
+        None ->
+          if List.mem Pointer flags then
+            Printf.fprintf chan "0"
+          else
+            Printf.fprintf chan "1"
+      | Some f -> Printf.fprintf chan "%s" (string_of_int (f arg_number))
+    else
+      Printf.fprintf chan "arg%d_%s" arg_number ty
+  in
+  let rec print_args arg_number tys =
+    match tys with
+      [] -> ()
+    | [ty] -> print_arg chan arg_number ty
+    | ty::tys ->
+      print_arg chan arg_number ty;
+      Printf.fprintf chan ", ";
+      print_args (arg_number + 1) tys
+  in
+    print_args 0 (List.tl c_types);
+    Printf.fprintf chan ");\n"
+
+(* Emit epilogue code to a test source file.  *)
+let emit_epilogue chan features regexps =
+  let no_op = List.exists (fun feature -> feature = No_op) features in
+    Printf.fprintf chan "}\n\n";
+    (if not no_op then
+       List.iter (fun regexp ->
+                   Printf.fprintf chan
+                     "/* { dg-final { scan-assembler \"%s\" } } */\n" regexp)
+                regexps
+     else
+       ()
+    );
+    Printf.fprintf chan "/* { dg-final { cleanup-saved-temps } } */\n"
+
+(* Check a list of C types to determine which ones are pointers and which
+   ones are const.  *)
+let check_types tys =
+  let tys' =
+    List.map (fun ty ->
+                let len = String.length ty in
+                  if len > 2 && String.get ty (len - 2) = ' '
+                             && String.get ty (len - 1) = '*'
+                  then ([Pointer], String.sub ty 0 (len - 2))
+                  else ([], ty)) tys
+  in
+    List.map (fun (flags, ty) ->
+                if String.length ty > 6 && String.sub ty 0 6 = "const "
+                then (Const :: flags, String.sub ty 6 ((String.length ty) - 6))
+                else (flags, ty)) tys'
+
+(* Given an intrinsic shape, produce a regexp that will match
+   the right-hand sides of instructions generated by an intrinsic of
+   that shape.  *)
+let rec analyze_shape shape =
+  let rec n_things n thing =
+    match n with
+      0 -> []
+    | n -> thing :: (n_things (n - 1) thing)
+  in
+  let rec analyze_shape_elt elt =
+    match elt with
+      Dreg -> "\\[dD\\]\\[0-9\\]+"
+    | Qreg -> "\\[qQ\\]\\[0-9\\]+"
+    | Corereg -> "\\[rR\\]\\[0-9\\]+"
+    | Immed -> "#\\[0-9\\]+"
+    | VecArray (1, elt) ->
+        let elt_regexp = analyze_shape_elt elt in
+          "((\\\\\\{" ^ elt_regexp ^ "\\\\\\})|(" ^ elt_regexp ^ "))"
+    | VecArray (n, elt) ->
+      let elt_regexp = analyze_shape_elt elt in
+      let alt1 = elt_regexp ^ "-" ^ elt_regexp in
+      let alt2 = commas (fun x -> x) (n_things n elt_regexp) "" in
+        "\\\\\\{((" ^ alt1 ^ ")|(" ^ alt2 ^ "))\\\\\\}"
+    | (PtrTo elt | CstPtrTo elt) ->
+      "\\\\\\[" ^ (analyze_shape_elt elt) ^ "\\\\\\]"
+    | Element_of_dreg -> (analyze_shape_elt Dreg) ^ "\\\\\\[\\[0-9\\]+\\\\\\]"
+    | Element_of_qreg -> (analyze_shape_elt Qreg) ^ "\\\\\\[\\[0-9\\]+\\\\\\]"
+    | All_elements_of_dreg -> (analyze_shape_elt Dreg) ^ "\\\\\\[\\\\\\]"
+  in
+    match shape with
+      All (n, elt) -> commas analyze_shape_elt (n_things n elt) ""
+    | Long -> (analyze_shape_elt Qreg) ^ ", " ^ (analyze_shape_elt Dreg) ^
+              ", " ^ (analyze_shape_elt Dreg)
+    | Long_noreg elt -> (analyze_shape_elt elt) ^ ", " ^ (analyze_shape_elt elt)
+    | Wide -> (analyze_shape_elt Qreg) ^ ", " ^ (analyze_shape_elt Qreg) ^
+              ", " ^ (analyze_shape_elt Dreg)
+    | Wide_noreg elt -> analyze_shape (Long_noreg elt)
+    | Narrow -> (analyze_shape_elt Dreg) ^ ", " ^ (analyze_shape_elt Qreg) ^
+                ", " ^ (analyze_shape_elt Qreg)
+    | Use_operands elts -> commas analyze_shape_elt (Array.to_list elts) ""
+    | By_scalar Dreg ->
+        analyze_shape (Use_operands [| Dreg; Dreg; Element_of_dreg |])
+    | By_scalar Qreg ->
+        analyze_shape (Use_operands [| Qreg; Qreg; Element_of_dreg |])
+    | By_scalar _ -> assert false
+    | Wide_lane ->
+        analyze_shape (Use_operands [| Qreg; Dreg; Element_of_dreg |])
+    | Wide_scalar ->
+        analyze_shape (Use_operands [| Qreg; Dreg; Element_of_dreg |])
+    | Pair_result elt ->
+      let elt_regexp = analyze_shape_elt elt in
+        elt_regexp ^ ", " ^ elt_regexp
+    | Unary_scalar _ -> "FIXME Unary_scalar"
+    | Binary_imm elt -> analyze_shape (Use_operands [| elt; elt; Immed |])
+    | Narrow_imm -> analyze_shape (Use_operands [| Dreg; Qreg; Immed |])
+    | Long_imm -> analyze_shape (Use_operands [| Qreg; Dreg; Immed |])
+
+(* Generate tests for one intrinsic.  *)
+let test_intrinsic dir opcode features shape name munge elt_ty =
+  (* Open the test source file.  *)
+  let test_name = name ^ (string_of_elt elt_ty) in
+  let chan = open_test_file dir test_name in
+  (* Work out what argument and return types the intrinsic has.  *)
+  let c_arity, new_elt_ty = munge shape elt_ty in
+  let c_types = check_types (strings_of_arity c_arity) in
+  (* Extract any constant valuator (a function specifying what constant
+     values are to be written into the intrinsic call) from the features
+     list.  *)
+  let const_valuator =
+    try
+      match (List.find (fun feature -> match feature with
+                                         Const_valuator _ -> true
+				       | _ -> false) features) with
+        Const_valuator f -> Some f
+      | _ -> assert false
+    with Not_found -> None
+  in
+  (* Work out what instruction name(s) to expect.  *)
+  let insns = get_insn_names features name in
+  let no_suffix = (new_elt_ty = NoElts) in
+  let insns =
+    if no_suffix then insns
+                 else List.map (fun insn ->
+                                  let suffix = string_of_elt_dots new_elt_ty in
+                                    insn ^ "\\." ^ suffix) insns
+  in
+  (* Construct a regexp to match against the expected instruction name(s).  *)
+  let insn_regexp =
+    match insns with
+      [] -> assert false
+    | [insn] -> insn
+    | _ ->
+      let rec calc_regexp insns cur_regexp =
+        match insns with
+          [] -> cur_regexp
+        | [insn] -> cur_regexp ^ "(" ^ insn ^ "))"
+        | insn::insns -> calc_regexp insns (cur_regexp ^ "(" ^ insn ^ ")|")
+      in calc_regexp insns "("
+  in
+  (* Construct regexps to match against the instructions that this
+     intrinsic expands to.  Watch out for any writeback character and
+     comments after the instruction.  *)
+  let regexps = List.map (fun regexp -> insn_regexp ^ "\\[ \t\\]+" ^ regexp ^
+			  "!?\\(\\[ \t\\]+@\\[a-zA-Z0-9 \\]+\\)?\\n")
+                         (analyze_all_shapes features shape analyze_shape)
+  in
+    (* Emit file and function prologues.  *)
+    emit_prologue chan test_name;
+    (* Emit local variable declarations.  *)
+    emit_automatics chan c_types;
+    Printf.fprintf chan "\n";
+    (* Emit the call to the intrinsic.  *)
+    emit_call chan const_valuator c_types name elt_ty;
+    (* Emit the function epilogue and the DejaGNU scan-assembler directives.  *)
+    emit_epilogue chan features regexps;
+    (* Close the test file.  *)
+    close_out chan
+
+(* Generate tests for one element of the "ops" table.  *)
+let test_intrinsic_group dir (opcode, features, shape, name, munge, types) =
+  List.iter (test_intrinsic dir opcode features shape name munge) types
+
+(* Program entry point.  *)
+let _ =
+  let directory = if Array.length Sys.argv <> 1 then Sys.argv.(1) else "." in
+    List.iter (test_intrinsic_group directory) (reinterp @ ops)
+
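The two-pass classification done by `check_types` in neon-testgen.ml (strip a trailing " *", then strip a leading "const ") can be exercised on its own. This sketch restates the same logic with helper names of its own; the sample type strings are hypothetical, since the output of `strings_of_arity` is not shown in this part of the patch.

```ocaml
type c_type_flags = Pointer | Const

(* Same logic as check_types in the patch: first detect a trailing " *",
   then detect a leading "const " on what remains.  *)
let check_types tys =
  let strip_pointer ty =
    let len = String.length ty in
    if len > 2 && String.sub ty (len - 2) 2 = " *"
    then ([Pointer], String.sub ty 0 (len - 2))
    else ([], ty)
  in
  let strip_const (flags, ty) =
    let len = String.length ty in
    if len > 6 && String.sub ty 0 6 = "const "
    then (Const :: flags, String.sub ty 6 (len - 6))
    else (flags, ty)
  in
  List.map (fun ty -> strip_const (strip_pointer ty)) tys

let string_of_flag = function Pointer -> "Pointer" | Const -> "Const"

let () =
  (* Hypothetical inputs, chosen to hit each branch.  *)
  List.iter
    (fun (flags, ty) ->
      Printf.printf "%s [%s]\n" ty
        (String.concat "; " (List.map string_of_flag flags)))
    (check_types ["int8x8_t"; "const int"; "const int8_t *"])
```

`emit_automatics` and `emit_call` then use the `Const` flag to decide whether to declare a local variable or write a literal constant into the call.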
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
--- a/gcc/config/arm/neon.ml
+++ b/gcc/config/arm/neon.ml
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -470,3 +470,43 @@
       (match_test "((unsigned HOST_WIDE_INT) INTVAL (op)) < 64")))


+;; Neon predicates
+
+(define_predicate "const_multiple_of_8_operand"
+  (match_code "const_int")
+{
+  unsigned HOST_WIDE_INT val = INTVAL (op);
+  return (val & 7) == 0;
+})
+
+(define_predicate "imm_for_neon_mov_operand"
+  (match_code "const_vector")
+{
+  return neon_immediate_valid_for_move (op, mode, NULL, NULL);
+})
+
+(define_predicate "imm_for_neon_logic_operand"
+  (match_code "const_vector")
+{
+  return neon_immediate_valid_for_logic (op, mode, 0, NULL, NULL);
+})
+
+(define_predicate "imm_for_neon_inv_logic_operand"
+  (match_code "const_vector")
+{
+  return neon_immediate_valid_for_logic (op, mode, 1, NULL, NULL);
+})
+
+(define_predicate "neon_logic_op2"
+  (ior (match_operand 0 "imm_for_neon_logic_operand")
+       (match_operand 0 "s_register_operand")))
+
+(define_predicate "neon_inv_logic_op2"
+  (ior (match_operand 0 "imm_for_neon_inv_logic_operand")
+       (match_operand 0 "s_register_operand")))
+
+;; TODO: We could check lane numbers more precisely based on the mode.
+(define_predicate "neon_lane_number"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) >= 0 && INTVAL (op) <= 7")))
+
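The tests performed by `const_multiple_of_8_operand` and `neon_lane_number` are plain integer checks, so they can be modelled outside the compiler. The sketch below mirrors the two conditions as written in predicates.md; `is_lane_number_for` is a hypothetical tightening in the direction the TODO suggests and is not part of the patch.

```ocaml
(* Mirrors const_multiple_of_8_operand: the low three bits must be clear.  *)
let is_multiple_of_8 v = v land 7 = 0

(* Mirrors neon_lane_number as written: any constant from 0 to 7.  *)
let is_lane_number v = v >= 0 && v <= 7

(* Hypothetical mode-aware variant: bound the lane by the number of
   elements in the vector (e.g. 4 for a V4HI operand).  *)
let is_lane_number_for ~nelts v = v >= 0 && v < nelts

let () =
  List.iter
    (fun v ->
      Printf.printf "%3d: mult8=%b lane=%b lane(4 elts)=%b\n" v
        (is_multiple_of_8 v) (is_lane_number v)
        (is_lane_number_for ~nelts:4 v))
    [0; 5; 7; 8; 24]
```

With the predicate as committed, lane 5 is accepted for every mode, including modes with only four elements; a mode-aware check would reject it there.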
--- a/gcc/config/arm/t-arm
+++ b/gcc/config/arm/t-arm
@@ -9,8 +9,10 @@ MD_INCLUDES= 	$(srcdir)/config/arm/arm-tune.md \
 		$(srcdir)/config/arm/arm926ejs.md \
 		$(srcdir)/config/arm/cirrus.md \
 		$(srcdir)/config/arm/fpa.md \
+		$(srcdir)/config/arm/vec-common.md \
 		$(srcdir)/config/arm/iwmmxt.md \
 		$(srcdir)/config/arm/vfp.md \
+		$(srcdir)/config/arm/neon.md \
 		$(srcdir)/config/arm/thumb2.md

 s-config s-conditions s-flags s-codes s-constants s-emit s-recog s-preds \
--- a/gcc/config/arm/vec-common.md
+++ b/gcc/config/arm/vec-common.md
@@ -0,0 +1,107 @@
+;; Machine Description for shared bits common to IWMMXT and Neon.
+;; Copyright (C) 2006 Free Software Foundation, Inc.
+;; Written by CodeSourcery.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 2, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING.  If not, write to the Free
+;; Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
+;; 02110-1301, USA.
+
+;; Vector Moves
+
+;; All integer and float modes supported by Neon and IWMMXT.
+(define_mode_macro VALL [V2DI V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
+
+;; All integer and float modes supported by Neon and IWMMXT, except V2DI.
+(define_mode_macro VALLW [V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
+
+;; All integer modes supported by Neon and IWMMXT
+(define_mode_macro VINT [V2DI V2SI V4HI V8QI V4SI V8HI V16QI])
+
+;; All integer modes supported by Neon and IWMMXT, except V2DI
+(define_mode_macro VINTW [V2SI V4HI V8QI V4SI V8HI V16QI])
+
+(define_expand "mov<mode>"
+  [(set (match_operand:VALL 0 "nonimmediate_operand" "")
+	(match_operand:VALL 1 "general_operand" ""))]
+  "TARGET_NEON
+   || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
+{
+})
+
+;; Vector arithmetic. Expanders are blank, then unnamed insns implement
+;; patterns separately for IWMMXT and Neon.
+
+(define_expand "add<mode>3"
+  [(set (match_operand:VALL 0 "s_register_operand" "")
+        (plus:VALL (match_operand:VALL 1 "s_register_operand" "")
+                   (match_operand:VALL 2 "s_register_operand" "")))]
+  "TARGET_NEON
+   || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
+{
+})
+
+(define_expand "sub<mode>3"
+  [(set (match_operand:VALL 0 "s_register_operand" "")
+        (minus:VALL (match_operand:VALL 1 "s_register_operand" "")
+                    (match_operand:VALL 2 "s_register_operand" "")))]
+  "TARGET_NEON
+   || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
+{
+})
+
+(define_expand "mul<mode>3"
+  [(set (match_operand:VALLW 0 "s_register_operand" "")
+        (mult:VALLW (match_operand:VALLW 1 "s_register_operand" "")
+		    (match_operand:VALLW 2 "s_register_operand" "")))]
+  "TARGET_NEON || (<MODE>mode == V4HImode && TARGET_REALLY_IWMMXT)"
+{
+})
+
+(define_expand "smin<mode>3"
+  [(set (match_operand:VALLW 0 "s_register_operand" "")
+	(smin:VALLW (match_operand:VALLW 1 "s_register_operand" "")
+		    (match_operand:VALLW 2 "s_register_operand" "")))]
+  "TARGET_NEON
+   || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
+{
+})
+
+(define_expand "umin<mode>3"
+  [(set (match_operand:VINTW 0 "s_register_operand" "")
+	(umin:VINTW (match_operand:VINTW 1 "s_register_operand" "")
+		    (match_operand:VINTW 2 "s_register_operand" "")))]
+  "TARGET_NEON
+   || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
+{
+})
+
+(define_expand "smax<mode>3"
+  [(set (match_operand:VALLW 0 "s_register_operand" "")
+	(smax:VALLW (match_operand:VALLW 1 "s_register_operand" "")
+		    (match_operand:VALLW 2 "s_register_operand" "")))]
+  "TARGET_NEON
+   || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
+{
+})
+
+(define_expand "umax<mode>3"
+  [(set (match_operand:VINTW 0 "s_register_operand" "")
+	(umax:VINTW (match_operand:VINTW 1 "s_register_operand" "")
+		    (match_operand:VINTW 2 "s_register_operand" "")))]
+  "TARGET_NEON
+   || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
+{
+})
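The four mode macros in vec-common.md are related by simple set differences, and each `define_expand` named with `<mode>` stands for one pattern per listed mode, with `<mode>` replaced by the lower-cased mode name. A rough model of that, assuming nothing beyond the mode lists given above (the helper `expand_name` is illustrative only):

```ocaml
(* Mode lists copied from the define_mode_macro lines.  *)
let vall =
  ["V2DI"; "V2SI"; "V4HI"; "V8QI"; "V2SF"; "V4SI"; "V8HI"; "V16QI"; "V4SF"]

(* VALLW and VINTW drop V2DI; VINT drops the two float modes.  *)
let vallw = List.filter (fun m -> m <> "V2DI") vall
let vint = List.filter (fun m -> m <> "V2SF" && m <> "V4SF") vall
let vintw = List.filter (fun m -> m <> "V2DI") vint

(* Replace the <mode> placeholder with the lower-cased mode name.  *)
let expand_name prefix suffix modes =
  List.map (fun m -> prefix ^ String.lowercase_ascii m ^ suffix) modes

let () =
  List.iter print_endline (expand_name "add" "3" vall);
  List.iter print_endline (expand_name "umin" "3" vintw)
```

This makes it easy to see, for instance, that no `mulv2di3` or `uminv2sf3` expander is defined by this file.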
--- a/gcc/doc/arm-neon-intrinsics.texi
+++ b/gcc/doc/arm-neon-intrinsics.texi
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -6404,7 +6404,8 @@ instructions, but allow the compiler to schedule those calls.

 @menu
 * Alpha Built-in Functions::
-* ARM Built-in Functions::
+* ARM iWMMXt Built-in Functions::
+* ARM NEON Intrinsics::
 * Blackfin Built-in Functions::
 * FR-V Built-in Functions::
 * X86 Built-in Functions::
@@ -6497,11 +6498,11 @@ void *__builtin_thread_pointer (void)
 void __builtin_set_thread_pointer (void *)
 @end smallexample

-@node ARM Built-in Functions
-@subsection ARM Built-in Functions
+@node ARM iWMMXt Built-in Functions
+@subsection ARM iWMMXt Built-in Functions

 These built-in functions are available for the ARM family of
-processors, when the @option{-mcpu=iwmmxt} switch is used:
+processors when the @option{-mcpu=iwmmxt} switch is used:

 @smallexample
 typedef int v2si __attribute__ ((vector_size (8)));
@@ -6644,6 +6645,14 @@ long long __builtin_arm_wxor (long long, long long)
 long long __builtin_arm_wzero ()
 @end smallexample

+@node ARM NEON Intrinsics
+@subsection ARM NEON Intrinsics
+
+These built-in intrinsics for the ARM Advanced SIMD extension are available
+when the @option{-mfpu=neon} switch is used:
+
+@include arm-neon-intrinsics.texi
+
 @node Blackfin Built-in Functions
 @subsection Blackfin Built-in Functions

--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,17 @@
+2007-07-25  Julian Brown  <julian@codesourcery.com>
+	    Paul Brook  <paul@codesourcery.com>
+	    Joseph Myers  <joseph@codesourcery.com>
+	    Mark Shinwell  <shinwell@codesourcery.com>
+
+	* gcc.dg/vect/vect.exp: Check is-effective-target arm_neon_hw.
+	* gcc.dg/vect/tree-vect.h: Check for NEON SIMD support.
+	* lib/gcc-dg.exp (cleanup-saved-temps): Fix comment.
+	* lib/target-supports.exp (check_effective_target_arm_neon_ok)
+	(check_effective_target_arm_neon_hw): New.
+	* gcc.target/arm/neon/neon.exp: New file.
+	* gcc.target/arm/neon/polytypes.c: New file.
+	* gcc.target/arm/neon/v*.c (1870 files): New (autogenerated).
+
 2007-07-25  Janis Johnson  <janis187@us.ibm.com>

 	* gcc.c-torture/unsorted/dump-noaddr.c: Reduce string length for
--- a/gcc/testsuite/g++.dg/abi/mangle-neon.C
+++ b/gcc/testsuite/g++.dg/abi/mangle-neon.C
@@ -0,0 +1,47 @@
+// Test that ARM NEON vector types have their names mangled correctly. 
+
+// { dg-do compile }
+// { dg-require-effective-target arm_neon_ok }
+// { dg-options "-mfpu=neon -mfloat-abi=softfp" }
+
+#include <arm_neon.h>
+
+void f0 (int8x8_t a) {}
+void f1 (int16x4_t a) {}
+void f2 (int32x2_t a) {}
+void f3 (uint8x8_t a) {}
+void f4 (uint16x4_t a) {}
+void f5 (uint32x2_t a) {}
+void f6 (float32x2_t a) {}
+void f7 (poly8x8_t a) {}
+void f8 (poly16x4_t a) {}
+
+void f9 (int8x16_t a) {}
+void f10 (int16x8_t a) {}
+void f11 (int32x4_t a) {}
+void f12 (uint8x16_t a) {}
+void f13 (uint16x8_t a) {}
+void f14 (uint32x4_t a) {}
+void f15 (float32x4_t a) {}
+void f16 (poly8x16_t a) {}
+void f17 (poly16x8_t a) {}
+
+// { dg-final { scan-assembler "_Z2f015__simd64_int8_t:" } }
+// { dg-final { scan-assembler "_Z2f116__simd64_int16_t:" } }
+// { dg-final { scan-assembler "_Z2f216__simd64_int32_t:" } }
+// { dg-final { scan-assembler "_Z2f316__simd64_uint8_t:" } }
+// { dg-final { scan-assembler "_Z2f417__simd64_uint16_t:" } }
+// { dg-final { scan-assembler "_Z2f517__simd64_uint32_t:" } }
+// { dg-final { scan-assembler "_Z2f618__simd64_float32_t:" } }
+// { dg-final { scan-assembler "_Z2f716__simd64_poly8_t:" } }
+// { dg-final { scan-assembler "_Z2f817__simd64_poly16_t:" } }
+// { dg-final { scan-assembler "_Z2f916__simd128_int8_t:" } }
+// { dg-final { scan-assembler "_Z3f1017__simd128_int16_t:" } }
+// { dg-final { scan-assembler "_Z3f1117__simd128_int32_t:" } }
+// { dg-final { scan-assembler "_Z3f1217__simd128_uint8_t:" } }
+// { dg-final { scan-assembler "_Z3f1318__simd128_uint16_t:" } }
+// { dg-final { scan-assembler "_Z3f1418__simd128_uint32_t:" } }
+// { dg-final { scan-assembler "_Z3f1519__simd128_float32_t:" } }
+// { dg-final { scan-assembler "_Z3f1617__simd128_poly8_t:" } }
+// { dg-final { scan-assembler "_Z3f1718__simd128_poly16_t:" } }
+
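The expected labels in mangle-neon.C follow the Itanium C++ ABI scheme for a non-template function: `_Z`, then the length-prefixed function name, then each parameter type. Judging from the scan-assembler strings, each NEON vector type is mangled as a length-prefixed internal name such as `__simd64_int8_t`. The C sketch below rebuilds such labels under that assumption; `mangle_unary_fn` is a hypothetical helper written for this illustration and is not part of GCC or the testsuite.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the mangled name of `void FN (TYPE_NAME)', assuming TYPE_NAME is
   mangled as a plain length-prefixed name.  Hypothetical helper for
   illustration only.  Returns the number of characters that the full
   name needs (as snprintf does), excluding the terminating NUL.  */
static int
mangle_unary_fn (char *buf, size_t size, const char *fn,
                 const char *type_name)
{
  assert (buf != NULL && fn != NULL && type_name != NULL);
  return snprintf (buf, size, "_Z%zu%s%zu%s",
                   strlen (fn), fn, strlen (type_name), type_name);
}
```

For example, `mangle_unary_fn (buf, sizeof buf, "f0", "__simd64_int8_t")` yields the first label the test scans for, minus the trailing colon that marks the assembler label.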
--- a/gcc/testsuite/gcc.dg/vect/tree-vect.h
+++ b/gcc/testsuite/gcc.dg/vect/tree-vect.h
@@ -21,6 +21,18 @@ void check_vect (void)
  asm volatile (".byte 0xf2,0x0f,0x10,0xc0");
 #elif defined(__sparc__)
  asm volatile (".word\t0x81b007c0");
+#elif defined(__arm__)
+  {
+    /* On some processors without NEON support, this instruction may
+       be a no-op, on others it may trap, so check that it executes
+       correctly.  */
+    long long a = 0, b = 1;
+    asm ("vorr %P0, %P1, %P2"
+	 : "=w" (a)
+	 : "0" (a), "w" (b));
+    if (a != 1)
+      exit (0);
+  }
 #endif
  signal (SIGILL, SIG_DFL);
 }
--- a/gcc/testsuite/gcc.dg/vect/vect.exp
+++ b/gcc/testsuite/gcc.dg/vect/vect.exp
@@ -83,6 +83,13 @@ if [istarget "powerpc*-*-*"] {
    }
 } elseif [istarget "ia64-*-*"] {
    set dg-do-what-default run
+} elseif [is-effective-target arm_neon_ok] {
+    lappend DEFAULT_VECTCFLAGS "-mfpu=neon" "-mfloat-abi=softfp"
+    if [is-effective-target arm_neon_hw] {
+      set dg-do-what-default run
+    } else {
+      set dg-do-what-default compile
+    }
 } else {
    return
 }
--- a/gcc/testsuite/gcc.target/arm/neon/neon.exp
+++ b/gcc/testsuite/gcc.target/arm/neon/neon.exp
@@ -0,0 +1,35 @@
+# Copyright (C) 1997, 2004, 2006 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Exit immediately if this isn't an ARM target.
+if ![istarget arm*-*-*] then {
+  return
+}
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \
+	"" ""
+
+# All done.
+dg-finish
--- a/gcc/testsuite/gcc.target/arm/neon/polytypes.c
+++ b/gcc/testsuite/gcc.target/arm/neon/polytypes.c
@@ -0,0 +1,47 @@
+/* Check that NEON polynomial vector types are suitably incompatible with
+   integer vector types of the same layout.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-mfpu=neon -mfloat-abi=softfp" } */
+
+#include <arm_neon.h>
+
+void s64_8 (int8x8_t a) {}
+void u64_8 (uint8x8_t a) {}
+void p64_8 (poly8x8_t a) {}
+void s64_16 (int16x4_t a) {}
+void u64_16 (uint16x4_t a) {}
+void p64_16 (poly16x4_t a) {}
+
+void s128_8 (int8x16_t a) {}
+void u128_8 (uint8x16_t a) {}
+void p128_8 (poly8x16_t a) {}
+void s128_16 (int16x8_t a) {}
+void u128_16 (uint16x8_t a) {}
+void p128_16 (poly16x8_t a) {}
+
+void foo ()
+{
+  poly8x8_t v64_8;
+  poly16x4_t v64_16;
+  poly8x16_t v128_8;
+  poly16x8_t v128_16;
+
+  s64_8 (v64_8); /* { dg-error "use -flax-vector-conversions.*incompatible type for argument 1 of 's64_8'" } */
+  u64_8 (v64_8); /* { dg-error "incompatible type for argument 1 of 'u64_8'" } */
+  p64_8 (v64_8);
+
+  s64_16 (v64_16); /* { dg-error "incompatible type for argument 1 of 's64_16'" } */
+  u64_16 (v64_16); /* { dg-error "incompatible type for argument 1 of 'u64_16'" } */
+  p64_16 (v64_16);
+
+  s128_8 (v128_8); /* { dg-error "incompatible type for argument 1 of 's128_8'" } */
+  u128_8 (v128_8); /* { dg-error "incompatible type for argument 1 of 'u128_8'" } */
+  p128_8 (v128_8);
+
+  s128_16 (v128_16); /* { dg-error "incompatible type for argument 1 of 's128_16'" } */
+  u128_16 (v128_16); /* { dg-error "incompatible type for argument 1 of 'u128_16'" } */
+  p128_16 (v128_16);
+}
+
--- a/gcc/testsuite/gcc.target/arm/neon/vRaddhns16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRaddhns16.c
@@ -0,0 +1,20 @@
+/* Test the `vRaddhns16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRaddhns16 (void)
+{
+  int8x8_t out_int8x8_t;
+  int16x8_t arg0_int16x8_t;
+  int16x8_t arg1_int16x8_t;
+
+  out_int8x8_t = vraddhn_s16 (arg0_int16x8_t, arg1_int16x8_t);
+}
+
+/* { dg-final { scan-assembler "vraddhn\.i16\[ 	\]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRaddhns32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRaddhns32.c
@@ -0,0 +1,20 @@
+/* Test the `vRaddhns32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRaddhns32 (void)
+{
+  int16x4_t out_int16x4_t;
+  int32x4_t arg0_int32x4_t;
+  int32x4_t arg1_int32x4_t;
+
+  out_int16x4_t = vraddhn_s32 (arg0_int32x4_t, arg1_int32x4_t);
+}
+
+/* { dg-final { scan-assembler "vraddhn\.i32\[ 	\]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRaddhns64.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRaddhns64.c
@@ -0,0 +1,20 @@
+/* Test the `vRaddhns64' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRaddhns64 (void)
+{
+  int32x2_t out_int32x2_t;
+  int64x2_t arg0_int64x2_t;
+  int64x2_t arg1_int64x2_t;
+
+  out_int32x2_t = vraddhn_s64 (arg0_int64x2_t, arg1_int64x2_t);
+}
+
+/* { dg-final { scan-assembler "vraddhn\.i64\[ 	\]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRaddhnu16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRaddhnu16.c
@@ -0,0 +1,20 @@
+/* Test the `vRaddhnu16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRaddhnu16 (void)
+{
+  uint8x8_t out_uint8x8_t;
+  uint16x8_t arg0_uint16x8_t;
+  uint16x8_t arg1_uint16x8_t;
+
+  out_uint8x8_t = vraddhn_u16 (arg0_uint16x8_t, arg1_uint16x8_t);
+}
+
+/* { dg-final { scan-assembler "vraddhn\.i16\[ 	\]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRaddhnu32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRaddhnu32.c
@@ -0,0 +1,20 @@
+/* Test the `vRaddhnu32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRaddhnu32 (void)
+{
+  uint16x4_t out_uint16x4_t;
+  uint32x4_t arg0_uint32x4_t;
+  uint32x4_t arg1_uint32x4_t;
+
+  out_uint16x4_t = vraddhn_u32 (arg0_uint32x4_t, arg1_uint32x4_t);
+}
+
+/* { dg-final { scan-assembler "vraddhn\.i32\[ 	\]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRaddhnu64.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRaddhnu64.c
@@ -0,0 +1,20 @@
+/* Test the `vRaddhnu64' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRaddhnu64 (void)
+{
+  uint32x2_t out_uint32x2_t;
+  uint64x2_t arg0_uint64x2_t;
+  uint64x2_t arg1_uint64x2_t;
+
+  out_uint32x2_t = vraddhn_u64 (arg0_uint64x2_t, arg1_uint64x2_t);
+}
+
+/* { dg-final { scan-assembler "vraddhn\.i64\[ 	\]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
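The signed and unsigned vraddhn tests scan for the same `.i16`, `.i32` and `.i64` mnemonics, which fits an operation whose result bits do not depend on signedness. As a sketch, assuming the architectural definition (add modulo 2^16, add a rounding constant of half the discarded range, keep the high half), one 16-bit lane can be modelled in scalar C as follows. The name `raddhn16_lane` is hypothetical and exists only for this illustration.

```c
#include <stdint.h>

/* Scalar model of one lane of vraddhn_s16/vraddhn_u16, returned as raw
   bits.  Assumption: the sum wraps modulo 2^16 before the high byte is
   taken, so the same bits result for signed and unsigned inputs.  */
static uint8_t
raddhn16_lane (uint16_t a, uint16_t b)
{
  /* Widen so the intermediate sum cannot overflow, then add the
     rounding constant 0x80 (half of the discarded low byte's range).  */
  uint32_t sum = (uint32_t) a + (uint32_t) b + 0x80u;

  /* Bits 15..8 of the sum; any carry out of bit 15 is dropped.  */
  return (uint8_t) ((sum >> 8) & 0xffu);
}
```

A caller modelling the signed intrinsic would reinterpret the returned byte as a two's-complement `int8_t`.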
--- a/gcc/testsuite/gcc.target/arm/neon/vRhaddQs16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRhaddQs16.c
@@ -0,0 +1,20 @@
+/* Test the `vRhaddQs16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRhaddQs16 (void)
+{
+  int16x8_t out_int16x8_t;
+  int16x8_t arg0_int16x8_t;
+  int16x8_t arg1_int16x8_t;
+
+  out_int16x8_t = vrhaddq_s16 (arg0_int16x8_t, arg1_int16x8_t);
+}
+
+/* { dg-final { scan-assembler "vrhadd\.s16\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRhaddQs32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRhaddQs32.c
@@ -0,0 +1,20 @@
+/* Test the `vRhaddQs32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRhaddQs32 (void)
+{
+  int32x4_t out_int32x4_t;
+  int32x4_t arg0_int32x4_t;
+  int32x4_t arg1_int32x4_t;
+
+  out_int32x4_t = vrhaddq_s32 (arg0_int32x4_t, arg1_int32x4_t);
+}
+
+/* { dg-final { scan-assembler "vrhadd\.s32\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRhaddQs8.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRhaddQs8.c
@@ -0,0 +1,20 @@
+/* Test the `vRhaddQs8' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRhaddQs8 (void)
+{
+  int8x16_t out_int8x16_t;
+  int8x16_t arg0_int8x16_t;
+  int8x16_t arg1_int8x16_t;
+
+  out_int8x16_t = vrhaddq_s8 (arg0_int8x16_t, arg1_int8x16_t);
+}
+
+/* { dg-final { scan-assembler "vrhadd\.s8\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRhaddQu16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRhaddQu16.c
@@ -0,0 +1,20 @@
+/* Test the `vRhaddQu16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRhaddQu16 (void)
+{
+  uint16x8_t out_uint16x8_t;
+  uint16x8_t arg0_uint16x8_t;
+  uint16x8_t arg1_uint16x8_t;
+
+  out_uint16x8_t = vrhaddq_u16 (arg0_uint16x8_t, arg1_uint16x8_t);
+}
+
+/* { dg-final { scan-assembler "vrhadd\.u16\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRhaddQu32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRhaddQu32.c
@@ -0,0 +1,20 @@
+/* Test the `vRhaddQu32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRhaddQu32 (void)
+{
+  uint32x4_t out_uint32x4_t;
+  uint32x4_t arg0_uint32x4_t;
+  uint32x4_t arg1_uint32x4_t;
+
+  out_uint32x4_t = vrhaddq_u32 (arg0_uint32x4_t, arg1_uint32x4_t);
+}
+
+/* { dg-final { scan-assembler "vrhadd\.u32\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRhaddQu8.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRhaddQu8.c
@@ -0,0 +1,20 @@
+/* Test the `vRhaddQu8' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRhaddQu8 (void)
+{
+  uint8x16_t out_uint8x16_t;
+  uint8x16_t arg0_uint8x16_t;
+  uint8x16_t arg1_uint8x16_t;
+
+  out_uint8x16_t = vrhaddq_u8 (arg0_uint8x16_t, arg1_uint8x16_t);
+}
+
+/* { dg-final { scan-assembler "vrhadd\.u8\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRhadds16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRhadds16.c
@@ -0,0 +1,20 @@
+/* Test the `vRhadds16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRhadds16 (void)
+{
+  int16x4_t out_int16x4_t;
+  int16x4_t arg0_int16x4_t;
+  int16x4_t arg1_int16x4_t;
+
+  out_int16x4_t = vrhadd_s16 (arg0_int16x4_t, arg1_int16x4_t);
+}
+
+/* { dg-final { scan-assembler "vrhadd\.s16\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRhadds32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRhadds32.c
@@ -0,0 +1,20 @@
+/* Test the `vRhadds32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRhadds32 (void)
+{
+  int32x2_t out_int32x2_t;
+  int32x2_t arg0_int32x2_t;
+  int32x2_t arg1_int32x2_t;
+
+  out_int32x2_t = vrhadd_s32 (arg0_int32x2_t, arg1_int32x2_t);
+}
+
+/* { dg-final { scan-assembler "vrhadd\.s32\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRhadds8.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRhadds8.c
@@ -0,0 +1,20 @@
+/* Test the `vRhadds8' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRhadds8 (void)
+{
+  int8x8_t out_int8x8_t;
+  int8x8_t arg0_int8x8_t;
+  int8x8_t arg1_int8x8_t;
+
+  out_int8x8_t = vrhadd_s8 (arg0_int8x8_t, arg1_int8x8_t);
+}
+
+/* { dg-final { scan-assembler "vrhadd\.s8\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRhaddu16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRhaddu16.c
@@ -0,0 +1,20 @@
+/* Test the `vRhaddu16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRhaddu16 (void)
+{
+  uint16x4_t out_uint16x4_t;
+  uint16x4_t arg0_uint16x4_t;
+  uint16x4_t arg1_uint16x4_t;
+
+  out_uint16x4_t = vrhadd_u16 (arg0_uint16x4_t, arg1_uint16x4_t);
+}
+
+/* { dg-final { scan-assembler "vrhadd\.u16\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRhaddu32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRhaddu32.c
@@ -0,0 +1,20 @@
+/* Test the `vRhaddu32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRhaddu32 (void)
+{
+  uint32x2_t out_uint32x2_t;
+  uint32x2_t arg0_uint32x2_t;
+  uint32x2_t arg1_uint32x2_t;
+
+  out_uint32x2_t = vrhadd_u32 (arg0_uint32x2_t, arg1_uint32x2_t);
+}
+
+/* { dg-final { scan-assembler "vrhadd\.u32\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRhaddu8.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRhaddu8.c
@@ -0,0 +1,20 @@
+/* Test the `vRhaddu8' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRhaddu8 (void)
+{
+  uint8x8_t out_uint8x8_t;
+  uint8x8_t arg0_uint8x8_t;
+  uint8x8_t arg1_uint8x8_t;
+
+  out_uint8x8_t = vrhadd_u8 (arg0_uint8x8_t, arg1_uint8x8_t);
+}
+
+/* { dg-final { scan-assembler "vrhadd\.u8\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
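The vrhadd tests only check that the expected mnemonic is emitted; they do not execute the result. For reference, a scalar sketch of what one 8-bit lane computes, assuming the usual definition of a rounding halving add, (a + b + 1) >> 1 evaluated without intermediate overflow. Unlike vraddhn, signedness matters here, which matches the distinct `.s8` and `.u8` suffixes in the patterns. The helper names are hypothetical.

```c
#include <stdint.h>

/* One lane of vrhadd_u8: widen, add with a rounding bias of 1, halve.  */
static uint8_t
rhadd_u8_lane (uint8_t a, uint8_t b)
{
  return (uint8_t) (((unsigned) a + (unsigned) b + 1u) >> 1);
}

/* One lane of vrhadd_s8: the same computation on signed values, with
   the halving rounded toward negative infinity.  */
static int8_t
rhadd_s8_lane (int8_t a, int8_t b)
{
  int sum = (int) a + (int) b + 1;

  /* Floor division by 2, written so that it does not rely on how the
     compiler right-shifts negative values.  */
  int half = (sum >= 0) ? sum / 2 : -((-sum + 1) / 2);
  return (int8_t) half;
}
```

The result always fits the lane type, since the widened sum lies between -255 and 511 and is then halved.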
--- a/gcc/testsuite/gcc.target/arm/neon/vRshlQs16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshlQs16.c
@@ -0,0 +1,20 @@
+/* Test the `vRshlQs16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshlQs16 (void)
+{
+  int16x8_t out_int16x8_t;
+  int16x8_t arg0_int16x8_t;
+  int16x8_t arg1_int16x8_t;
+
+  out_int16x8_t = vrshlq_s16 (arg0_int16x8_t, arg1_int16x8_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.s16\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshlQs32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshlQs32.c
@@ -0,0 +1,20 @@
+/* Test the `vRshlQs32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshlQs32 (void)
+{
+  int32x4_t out_int32x4_t;
+  int32x4_t arg0_int32x4_t;
+  int32x4_t arg1_int32x4_t;
+
+  out_int32x4_t = vrshlq_s32 (arg0_int32x4_t, arg1_int32x4_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.s32\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshlQs64.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshlQs64.c
@@ -0,0 +1,20 @@
+/* Test the `vRshlQs64' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshlQs64 (void)
+{
+  int64x2_t out_int64x2_t;
+  int64x2_t arg0_int64x2_t;
+  int64x2_t arg1_int64x2_t;
+
+  out_int64x2_t = vrshlq_s64 (arg0_int64x2_t, arg1_int64x2_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.s64\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshlQs8.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshlQs8.c
@@ -0,0 +1,20 @@
+/* Test the `vRshlQs8' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshlQs8 (void)
+{
+  int8x16_t out_int8x16_t;
+  int8x16_t arg0_int8x16_t;
+  int8x16_t arg1_int8x16_t;
+
+  out_int8x16_t = vrshlq_s8 (arg0_int8x16_t, arg1_int8x16_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.s8\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshlQu16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshlQu16.c
@@ -0,0 +1,20 @@
+/* Test the `vRshlQu16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshlQu16 (void)
+{
+  uint16x8_t out_uint16x8_t;
+  uint16x8_t arg0_uint16x8_t;
+  int16x8_t arg1_int16x8_t;
+
+  out_uint16x8_t = vrshlq_u16 (arg0_uint16x8_t, arg1_int16x8_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.u16\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshlQu32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshlQu32.c
@@ -0,0 +1,20 @@
+/* Test the `vRshlQu32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshlQu32 (void)
+{
+  uint32x4_t out_uint32x4_t;
+  uint32x4_t arg0_uint32x4_t;
+  int32x4_t arg1_int32x4_t;
+
+  out_uint32x4_t = vrshlq_u32 (arg0_uint32x4_t, arg1_int32x4_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.u32\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshlQu64.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshlQu64.c
@@ -0,0 +1,20 @@
+/* Test the `vRshlQu64' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshlQu64 (void)
+{
+  uint64x2_t out_uint64x2_t;
+  uint64x2_t arg0_uint64x2_t;
+  int64x2_t arg1_int64x2_t;
+
+  out_uint64x2_t = vrshlq_u64 (arg0_uint64x2_t, arg1_int64x2_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.u64\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshlQu8.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshlQu8.c
@@ -0,0 +1,20 @@
+/* Test the `vRshlQu8' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshlQu8 (void)
+{
+  uint8x16_t out_uint8x16_t;
+  uint8x16_t arg0_uint8x16_t;
+  int8x16_t arg1_int8x16_t;
+
+  out_uint8x16_t = vrshlq_u8 (arg0_uint8x16_t, arg1_int8x16_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.u8\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshls16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshls16.c
@@ -0,0 +1,20 @@
+/* Test the `vRshls16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshls16 (void)
+{
+  int16x4_t out_int16x4_t;
+  int16x4_t arg0_int16x4_t;
+  int16x4_t arg1_int16x4_t;
+
+  out_int16x4_t = vrshl_s16 (arg0_int16x4_t, arg1_int16x4_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.s16\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshls32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshls32.c
@@ -0,0 +1,20 @@
+/* Test the `vRshls32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshls32 (void)
+{
+  int32x2_t out_int32x2_t;
+  int32x2_t arg0_int32x2_t;
+  int32x2_t arg1_int32x2_t;
+
+  out_int32x2_t = vrshl_s32 (arg0_int32x2_t, arg1_int32x2_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.s32\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshls64.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshls64.c
@@ -0,0 +1,20 @@
+/* Test the `vRshls64' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshls64 (void)
+{
+  int64x1_t out_int64x1_t;
+  int64x1_t arg0_int64x1_t;
+  int64x1_t arg1_int64x1_t;
+
+  out_int64x1_t = vrshl_s64 (arg0_int64x1_t, arg1_int64x1_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.s64\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshls8.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshls8.c
@@ -0,0 +1,20 @@
+/* Test the `vRshls8' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshls8 (void)
+{
+  int8x8_t out_int8x8_t;
+  int8x8_t arg0_int8x8_t;
+  int8x8_t arg1_int8x8_t;
+
+  out_int8x8_t = vrshl_s8 (arg0_int8x8_t, arg1_int8x8_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.s8\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshlu16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshlu16.c
@@ -0,0 +1,20 @@
+/* Test the `vRshlu16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshlu16 (void)
+{
+  uint16x4_t out_uint16x4_t;
+  uint16x4_t arg0_uint16x4_t;
+  int16x4_t arg1_int16x4_t;
+
+  out_uint16x4_t = vrshl_u16 (arg0_uint16x4_t, arg1_int16x4_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.u16\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshlu32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshlu32.c
@@ -0,0 +1,20 @@
+/* Test the `vRshlu32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshlu32 (void)
+{
+  uint32x2_t out_uint32x2_t;
+  uint32x2_t arg0_uint32x2_t;
+  int32x2_t arg1_int32x2_t;
+
+  out_uint32x2_t = vrshl_u32 (arg0_uint32x2_t, arg1_int32x2_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.u32\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshlu64.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshlu64.c
@@ -0,0 +1,20 @@
+/* Test the `vRshlu64' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshlu64 (void)
+{
+  uint64x1_t out_uint64x1_t;
+  uint64x1_t arg0_uint64x1_t;
+  int64x1_t arg1_int64x1_t;
+
+  out_uint64x1_t = vrshl_u64 (arg0_uint64x1_t, arg1_int64x1_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.u64\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshlu8.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshlu8.c
@@ -0,0 +1,20 @@
+/* Test the `vRshlu8' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshlu8 (void)
+{
+  uint8x8_t out_uint8x8_t;
+  uint8x8_t arg0_uint8x8_t;
+  int8x8_t arg1_int8x8_t;
+
+  out_uint8x8_t = vrshl_u8 (arg0_uint8x8_t, arg1_int8x8_t);
+}
+
+/* { dg-final { scan-assembler "vrshl\.u8\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshrQ_ns16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshrQ_ns16.c
@@ -0,0 +1,19 @@
+/* Test the `vRshrQ_ns16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshrQ_ns16 (void)
+{
+  int16x8_t out_int16x8_t;
+  int16x8_t arg0_int16x8_t;
+
+  out_int16x8_t = vrshrq_n_s16 (arg0_int16x8_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.s16\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshrQ_ns32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshrQ_ns32.c
@@ -0,0 +1,19 @@
+/* Test the `vRshrQ_ns32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshrQ_ns32 (void)
+{
+  int32x4_t out_int32x4_t;
+  int32x4_t arg0_int32x4_t;
+
+  out_int32x4_t = vrshrq_n_s32 (arg0_int32x4_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.s32\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshrQ_ns64.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshrQ_ns64.c
@@ -0,0 +1,19 @@
+/* Test the `vRshrQ_ns64' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshrQ_ns64 (void)
+{
+  int64x2_t out_int64x2_t;
+  int64x2_t arg0_int64x2_t;
+
+  out_int64x2_t = vrshrq_n_s64 (arg0_int64x2_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.s64\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshrQ_ns8.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshrQ_ns8.c
@@ -0,0 +1,19 @@
+/* Test the `vRshrQ_ns8' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshrQ_ns8 (void)
+{
+  int8x16_t out_int8x16_t;
+  int8x16_t arg0_int8x16_t;
+
+  out_int8x16_t = vrshrq_n_s8 (arg0_int8x16_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.s8\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshrQ_nu16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshrQ_nu16.c
@@ -0,0 +1,19 @@
+/* Test the `vRshrQ_nu16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshrQ_nu16 (void)
+{
+  uint16x8_t out_uint16x8_t;
+  uint16x8_t arg0_uint16x8_t;
+
+  out_uint16x8_t = vrshrq_n_u16 (arg0_uint16x8_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.u16\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshrQ_nu32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshrQ_nu32.c
@@ -0,0 +1,19 @@
+/* Test the `vRshrQ_nu32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshrQ_nu32 (void)
+{
+  uint32x4_t out_uint32x4_t;
+  uint32x4_t arg0_uint32x4_t;
+
+  out_uint32x4_t = vrshrq_n_u32 (arg0_uint32x4_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.u32\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshrQ_nu64.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshrQ_nu64.c
@@ -0,0 +1,19 @@
+/* Test the `vRshrQ_nu64' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshrQ_nu64 (void)
+{
+  uint64x2_t out_uint64x2_t;
+  uint64x2_t arg0_uint64x2_t;
+
+  out_uint64x2_t = vrshrq_n_u64 (arg0_uint64x2_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.u64\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshrQ_nu8.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshrQ_nu8.c
@@ -0,0 +1,19 @@
+/* Test the `vRshrQ_nu8' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshrQ_nu8 (void)
+{
+  uint8x16_t out_uint8x16_t;
+  uint8x16_t arg0_uint8x16_t;
+
+  out_uint8x16_t = vrshrq_n_u8 (arg0_uint8x16_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.u8\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshr_ns16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshr_ns16.c
@@ -0,0 +1,19 @@
+/* Test the `vRshr_ns16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshr_ns16 (void)
+{
+  int16x4_t out_int16x4_t;
+  int16x4_t arg0_int16x4_t;
+
+  out_int16x4_t = vrshr_n_s16 (arg0_int16x4_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.s16\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshr_ns32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshr_ns32.c
@@ -0,0 +1,19 @@
+/* Test the `vRshr_ns32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshr_ns32 (void)
+{
+  int32x2_t out_int32x2_t;
+  int32x2_t arg0_int32x2_t;
+
+  out_int32x2_t = vrshr_n_s32 (arg0_int32x2_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.s32\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshr_ns64.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshr_ns64.c
@@ -0,0 +1,19 @@
+/* Test the `vRshr_ns64' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshr_ns64 (void)
+{
+  int64x1_t out_int64x1_t;
+  int64x1_t arg0_int64x1_t;
+
+  out_int64x1_t = vrshr_n_s64 (arg0_int64x1_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.s64\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshr_ns8.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshr_ns8.c
@@ -0,0 +1,19 @@
+/* Test the `vRshr_ns8' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshr_ns8 (void)
+{
+  int8x8_t out_int8x8_t;
+  int8x8_t arg0_int8x8_t;
+
+  out_int8x8_t = vrshr_n_s8 (arg0_int8x8_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.s8\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshr_nu16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshr_nu16.c
@@ -0,0 +1,19 @@
+/* Test the `vRshr_nu16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshr_nu16 (void)
+{
+  uint16x4_t out_uint16x4_t;
+  uint16x4_t arg0_uint16x4_t;
+
+  out_uint16x4_t = vrshr_n_u16 (arg0_uint16x4_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.u16\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshr_nu32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshr_nu32.c
@@ -0,0 +1,19 @@
+/* Test the `vRshr_nu32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshr_nu32 (void)
+{
+  uint32x2_t out_uint32x2_t;
+  uint32x2_t arg0_uint32x2_t;
+
+  out_uint32x2_t = vrshr_n_u32 (arg0_uint32x2_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.u32\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshr_nu64.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshr_nu64.c
@@ -0,0 +1,19 @@
+/* Test the `vRshr_nu64' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshr_nu64 (void)
+{
+  uint64x1_t out_uint64x1_t;
+  uint64x1_t arg0_uint64x1_t;
+
+  out_uint64x1_t = vrshr_n_u64 (arg0_uint64x1_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.u64\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshr_nu8.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshr_nu8.c
@@ -0,0 +1,19 @@
+/* Test the `vRshr_nu8' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshr_nu8 (void)
+{
+  uint8x8_t out_uint8x8_t;
+  uint8x8_t arg0_uint8x8_t;
+
+  out_uint8x8_t = vrshr_n_u8 (arg0_uint8x8_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshr\.u8\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshrn_ns16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshrn_ns16.c
@@ -0,0 +1,19 @@
+/* Test the `vRshrn_ns16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshrn_ns16 (void)
+{
+  int8x8_t out_int8x8_t;
+  int16x8_t arg0_int16x8_t;
+
+  out_int8x8_t = vrshrn_n_s16 (arg0_int16x8_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshrn\.i16\[ 	\]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshrn_ns32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshrn_ns32.c
@@ -0,0 +1,19 @@
+/* Test the `vRshrn_ns32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshrn_ns32 (void)
+{
+  int16x4_t out_int16x4_t;
+  int32x4_t arg0_int32x4_t;
+
+  out_int16x4_t = vrshrn_n_s32 (arg0_int32x4_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshrn\.i32\[ 	\]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshrn_ns64.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshrn_ns64.c
@@ -0,0 +1,19 @@
+/* Test the `vRshrn_ns64' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshrn_ns64 (void)
+{
+  int32x2_t out_int32x2_t;
+  int64x2_t arg0_int64x2_t;
+
+  out_int32x2_t = vrshrn_n_s64 (arg0_int64x2_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshrn\.i64\[ 	\]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshrn_nu16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshrn_nu16.c
@@ -0,0 +1,19 @@
+/* Test the `vRshrn_nu16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshrn_nu16 (void)
+{
+  uint8x8_t out_uint8x8_t;
+  uint16x8_t arg0_uint16x8_t;
+
+  out_uint8x8_t = vrshrn_n_u16 (arg0_uint16x8_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshrn\.i16\[ 	\]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshrn_nu32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshrn_nu32.c
@@ -0,0 +1,19 @@
+/* Test the `vRshrn_nu32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshrn_nu32 (void)
+{
+  uint16x4_t out_uint16x4_t;
+  uint32x4_t arg0_uint32x4_t;
+
+  out_uint16x4_t = vrshrn_n_u32 (arg0_uint32x4_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshrn\.i32\[ 	\]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRshrn_nu64.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRshrn_nu64.c
@@ -0,0 +1,19 @@
+/* Test the `vRshrn_nu64' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRshrn_nu64 (void)
+{
+  uint32x2_t out_uint32x2_t;
+  uint64x2_t arg0_uint64x2_t;
+
+  out_uint32x2_t = vrshrn_n_u64 (arg0_uint64x2_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrshrn\.i64\[ 	\]+\[dD\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRsraQ_ns16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRsraQ_ns16.c
@@ -0,0 +1,20 @@
+/* Test the `vRsraQ_ns16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRsraQ_ns16 (void)
+{
+  int16x8_t out_int16x8_t;
+  int16x8_t arg0_int16x8_t;
+  int16x8_t arg1_int16x8_t;
+
+  out_int16x8_t = vrsraq_n_s16 (arg0_int16x8_t, arg1_int16x8_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrsra\.s16\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRsraQ_ns32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRsraQ_ns32.c
@@ -0,0 +1,20 @@
+/* Test the `vRsraQ_ns32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRsraQ_ns32 (void)
+{
+  int32x4_t out_int32x4_t;
+  int32x4_t arg0_int32x4_t;
+  int32x4_t arg1_int32x4_t;
+
+  out_int32x4_t = vrsraq_n_s32 (arg0_int32x4_t, arg1_int32x4_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrsra\.s32\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRsraQ_ns64.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRsraQ_ns64.c
@@ -0,0 +1,20 @@
+/* Test the `vRsraQ_ns64' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRsraQ_ns64 (void)
+{
+  int64x2_t out_int64x2_t;
+  int64x2_t arg0_int64x2_t;
+  int64x2_t arg1_int64x2_t;
+
+  out_int64x2_t = vrsraq_n_s64 (arg0_int64x2_t, arg1_int64x2_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrsra\.s64\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRsraQ_ns8.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRsraQ_ns8.c
@@ -0,0 +1,20 @@
+/* Test the `vRsraQ_ns8' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRsraQ_ns8 (void)
+{
+  int8x16_t out_int8x16_t;
+  int8x16_t arg0_int8x16_t;
+  int8x16_t arg1_int8x16_t;
+
+  out_int8x16_t = vrsraq_n_s8 (arg0_int8x16_t, arg1_int8x16_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrsra\.s8\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRsraQ_nu16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRsraQ_nu16.c
@@ -0,0 +1,20 @@
+/* Test the `vRsraQ_nu16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRsraQ_nu16 (void)
+{
+  uint16x8_t out_uint16x8_t;
+  uint16x8_t arg0_uint16x8_t;
+  uint16x8_t arg1_uint16x8_t;
+
+  out_uint16x8_t = vrsraq_n_u16 (arg0_uint16x8_t, arg1_uint16x8_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrsra\.u16\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRsraQ_nu32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRsraQ_nu32.c
@@ -0,0 +1,20 @@
+/* Test the `vRsraQ_nu32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRsraQ_nu32 (void)
+{
+  uint32x4_t out_uint32x4_t;
+  uint32x4_t arg0_uint32x4_t;
+  uint32x4_t arg1_uint32x4_t;
+
+  out_uint32x4_t = vrsraq_n_u32 (arg0_uint32x4_t, arg1_uint32x4_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrsra\.u32\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRsraQ_nu64.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRsraQ_nu64.c
@@ -0,0 +1,20 @@
+/* Test the `vRsraQ_nu64' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRsraQ_nu64 (void)
+{
+  uint64x2_t out_uint64x2_t;
+  uint64x2_t arg0_uint64x2_t;
+  uint64x2_t arg1_uint64x2_t;
+
+  out_uint64x2_t = vrsraq_n_u64 (arg0_uint64x2_t, arg1_uint64x2_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrsra\.u64\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRsraQ_nu8.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRsraQ_nu8.c
@@ -0,0 +1,20 @@
+/* Test the `vRsraQ_nu8' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRsraQ_nu8 (void)
+{
+  uint8x16_t out_uint8x16_t;
+  uint8x16_t arg0_uint8x16_t;
+  uint8x16_t arg1_uint8x16_t;
+
+  out_uint8x16_t = vrsraq_n_u8 (arg0_uint8x16_t, arg1_uint8x16_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrsra\.u8\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRsra_ns16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRsra_ns16.c
@@ -0,0 +1,20 @@
+/* Test the `vRsra_ns16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRsra_ns16 (void)
+{
+  int16x4_t out_int16x4_t;
+  int16x4_t arg0_int16x4_t;
+  int16x4_t arg1_int16x4_t;
+
+  out_int16x4_t = vrsra_n_s16 (arg0_int16x4_t, arg1_int16x4_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrsra\.s16\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRsra_ns32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRsra_ns32.c
@@ -0,0 +1,20 @@
+/* Test the `vRsra_ns32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRsra_ns32 (void)
+{
+  int32x2_t out_int32x2_t;
+  int32x2_t arg0_int32x2_t;
+  int32x2_t arg1_int32x2_t;
+
+  out_int32x2_t = vrsra_n_s32 (arg0_int32x2_t, arg1_int32x2_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrsra\.s32\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRsra_ns64.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRsra_ns64.c
@@ -0,0 +1,20 @@
+/* Test the `vRsra_ns64' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRsra_ns64 (void)
+{
+  int64x1_t out_int64x1_t;
+  int64x1_t arg0_int64x1_t;
+  int64x1_t arg1_int64x1_t;
+
+  out_int64x1_t = vrsra_n_s64 (arg0_int64x1_t, arg1_int64x1_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrsra\.s64\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRsra_ns8.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRsra_ns8.c
@@ -0,0 +1,20 @@
+/* Test the `vRsra_ns8' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRsra_ns8 (void)
+{
+  int8x8_t out_int8x8_t;
+  int8x8_t arg0_int8x8_t;
+  int8x8_t arg1_int8x8_t;
+
+  out_int8x8_t = vrsra_n_s8 (arg0_int8x8_t, arg1_int8x8_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrsra\.s8\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRsra_nu16.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRsra_nu16.c
@@ -0,0 +1,20 @@
+/* Test the `vRsra_nu16' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRsra_nu16 (void)
+{
+  uint16x4_t out_uint16x4_t;
+  uint16x4_t arg0_uint16x4_t;
+  uint16x4_t arg1_uint16x4_t;
+
+  out_uint16x4_t = vrsra_n_u16 (arg0_uint16x4_t, arg1_uint16x4_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrsra\.u16\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */
--- a/gcc/testsuite/gcc.target/arm/neon/vRsra_nu32.c
+++ b/gcc/testsuite/gcc.target/arm/neon/vRsra_nu32.c
@@ -0,0 +1,20 @@
+/* Test the `vRsra_nu32' ARM Neon intrinsic.  */
+/* This file was autogenerated by neon-testgen.  */
+
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-save-temps -O0 -mfpu=neon -mfloat-abi=softfp" } */
+
+#include "arm_neon.h"
+
+void test_vRsra_nu32 (void)
+{
+  uint32x2_t out_uint32x2_t;
+  uint32x2_t arg0_uint32x2_t;
+  uint32x2_t arg1_uint32x2_t;
+
+  out_uint32x2_t = vrsra_n_u32 (arg0_uint32x2_t, arg1_uint32x2_t, 1);
+}
+
+/* { dg-final { scan-assembler "vrsra\.u32\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #\[0-9\]+!?\(\[ 	\]+@\[a-zA-Z0-9 \]+\)?\n" } } */
+/* { dg-final { cleanup-saved-temps } } */