re PR target/36079 (cld instruction is not emitted anymore.)
PR target/36079
* configure.ac: Handle --enable-cld.
* configure: Regenerated.
* config.gcc: Add USE_IX86_CLD to tm_defines for x86 targets.
* config/i386/i386.h (struct machine_function): Add needs_cld field.
(ix86_current_function_needs_cld): New define.
* config/i386/i386.md (UNSPEC_CLD): New unspec volatile constant.
(cld): New isns pattern.
(strmov_singleop, rep_mov, strset_singleop, rep_stos, cmpstrnqi_nz_1,
cmpstrnqi_1, strlenqi_1): Set ix86_current_function_needs_cld flag.
* config/i386/i386.opt (mcld): New option.
* config/i386/i386.c (ix86_expand_prologue): Emit cld insn if
TARGET_CLD and ix86_current_function_needs_cld.
(override_options): Use -mcld by default for 32-bit code if
USE_IX86_CLD.
* doc/install.texi (Options specification): Document --enable-cld.
* doc/invoke.texi (Machine Dependent Options)
[i386 and x86-64 Options]: Add -mcld option.
(Intel 386 and AMD x86-64 Options): Document -mcld option.

From-SVN: r135792
parent
71995c2c69
commit
922e3e33b2
10 changed files with 122 additions and 34 deletions
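The commit message describes a two-step mechanism: expanding a string-operation pattern marks the current function as needing `cld`, and the prologue expander then emits `cld` only when the option is on and the mark is set. The following is a minimal standalone sketch of that decision in C. The names `fn_state`, `note_string_op`, and `prologue_emits_cld` are hypothetical stand-ins for illustration and are not GCC's actual data structures or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-function state that the commit
   extends with a needs_cld field. */
struct fn_state {
    bool needs_cld;   /* set once any string-op pattern is expanded */
};

/* Hypothetical: models what the modified expanders do, i.e. record
   that the function being compiled uses a string instruction. */
void note_string_op(struct fn_state *fn)
{
    assert(fn != NULL);
    fn->needs_cld = true;
}

/* Hypothetical: models the prologue check. cld is emitted only when
   the option is enabled AND the function uses string instructions. */
bool prologue_emits_cld(const struct fn_state *fn, bool target_cld)
{
    assert(fn != NULL);
    return target_cld && fn->needs_cld;
}
```

Under these assumptions, a function that never expands a string pattern gets no `cld` even with the option enabled, so the one-byte cost is paid only where it can matter.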
|
@ -1,3 +1,27 @@
|
|||
2008-05-23 Uros Bizjak <ubizjak@gmail.com>
|
||||
Jakub Jelinek <jakub@redhat.com>
|
||||
|
||||
PR target/36079
|
||||
* configure.ac: Handle --enable-cld.
|
||||
* configure: Regenerated.
|
||||
* config.gcc: Add USE_IX86_CLD to tm_defines for x86 targets.
|
||||
* config/i386/i386.h (struct machine_function): Add needs_cld field.
|
||||
(ix86_current_function_needs_cld): New define.
|
||||
* config/i386/i386.md (UNSPEC_CLD): New unspec volatile constant.
|
||||
(cld): New isns pattern.
|
||||
(strmov_singleop, rep_mov, strset_singleop, rep_stos, cmpstrnqi_nz_1,
|
||||
cmpstrnqi_1, strlenqi_1): Set ix86_current_function_needs_cld flag.
|
||||
* config/i386/i386.opt (mcld): New option.
|
||||
* config/i386/i386.c (ix86_expand_prologue): Emit cld insn if
|
||||
TARGET_CLD and ix86_current_function_needs_cld.
|
||||
(override_options): Use -mcld by default for 32-bit code if
|
||||
USE_IX86_CLD.
|
||||
|
||||
* doc/install.texi (Options specification): Document --enable-cld.
|
||||
* doc/invoke.texi (Machine Dependent Options)
|
||||
[i386 and x86-64 Options]: Add -mcld option.
|
||||
(Intel 386 and AMD x86-64 Options): Document -mcld option.
|
||||
|
||||
2008-05-23 Kai Tietz <kai.tietz@onevison.com>
|
||||
* config/i386/i386.c (return_in_memory_32): Add ATTRIBUTE_UNUSED.
|
||||
(return_in_memory_64): Likewise.
|
||||
|
@ -58,9 +82,8 @@
|
|||
(vector_alignment_reachable_p): Likewise.
|
||||
* tree-vect-transform.c (vectorizable_load): Likewise.
|
||||
* tree-vectorizer.c (vect_supportable_dr_alignment): Likewise.
|
||||
|
||||
* tree-vectorizer.c (get_vectype_for_scalar_type): Pass mode of
|
||||
scalar_type to UNITS_PER_SIMD_WORD.
|
||||
(get_vectype_for_scalar_type): Pass mode of scalar_type
|
||||
to UNITS_PER_SIMD_WORD.
|
||||
|
||||
* config/arm/arm.h (UNITS_PER_SIMD_WORD): Updated.
|
||||
* config/i386/i386.h (UNITS_PER_SIMD_WORD): Likewise.
|
||||
|
@ -206,27 +229,21 @@
|
|||
2008-05-20 David Daney <ddaney@avtrex.com>
|
||||
|
||||
* config/mips/mips.md (UNSPEC_SYNC_NEW_OP_12,
|
||||
UNSPEC_SYNC_OLD_OP_12,
|
||||
UNSPEC_SYNC_EXCHANGE_12): New define_constants.
|
||||
(UNSPEC_SYNC_EXCHANGE, UNSPEC_MEMORY_BARRIER,
|
||||
UNSPEC_SET_GOT_VERSION,
|
||||
UNSPEC_SYNC_OLD_OP_12, UNSPEC_SYNC_EXCHANGE_12): New define_constants.
|
||||
(UNSPEC_SYNC_EXCHANGE, UNSPEC_MEMORY_BARRIER, UNSPEC_SET_GOT_VERSION,
|
||||
UNSPEC_UPDATE_GOT_VERSION): Renumber.
|
||||
(optab, insn): Add 'plus' and 'minus' to define_code_attr.
|
||||
(atomic_hiqi_op): New define_code_iterator.
|
||||
(sync_compare_and_swap<mode>): Call
|
||||
mips_expand_atomic_qihi instead of
|
||||
(sync_compare_and_swap<mode>): Call mips_expand_atomic_qihi instead of
|
||||
mips_expand_compare_and_swap_12.
|
||||
(compare_and_swap_12): Use MIPS_COMPARE_AND_SWAP_12 instead of
|
||||
MIPS_COMPARE_AND_SWAP_12_0. Pass argument to
|
||||
MIPS_COMPARE_AND_SWAP_12.
|
||||
MIPS_COMPARE_AND_SWAP_12_0. Pass argument to MIPS_COMPARE_AND_SWAP_12.
|
||||
(sync_<optab><mode>, sync_old_<optab><mode>,
|
||||
sync_new_<optab><mode>, sync_nand<mode>, sync_old_nand<mode>,
|
||||
sync_new_nand<mode>): New define_expands for HI and QI mode
|
||||
operands.
|
||||
sync_new_nand<mode>): New define_expands for HI and QI mode operands.
|
||||
(sync_<optab>_12, sync_old_<optab>_12, sync_new_<optab>_12,
|
||||
sync_nand_12, sync_old_nand_12, sync_new_nand_12): New insns.
|
||||
(sync_lock_test_and_set<mode>): New define_expand for HI and QI
|
||||
modes.
|
||||
(sync_lock_test_and_set<mode>): New define_expand for HI and QI modes.
|
||||
(test_and_set_12): New insn.
|
||||
(sync_old_add<mode>, sync_new_add<mode>, sync_old_<optab><mode>,
|
||||
sync_new_<optab><mode>, sync_old_nand<mode>,
|
||||
|
@ -284,10 +301,12 @@
|
|||
2008-05-20 Jan Sjodin <jan.sjodin@amd.com>
|
||||
Sebastian Pop <sebastian.pop@amd.com>
|
||||
|
||||
* tree-loop-linear.c (gather_interchange_stats): Look in the access matrix,
|
||||
and never look at the tree representation of the memory accesses.
|
||||
* tree-loop-linear.c (gather_interchange_stats): Look in the access
|
||||
matrix, and never look at the tree representation of the memory
|
||||
accesses.
|
||||
(linear_transform_loops): Computes parameters and access matrices.
|
||||
* tree-data-ref.c (compute_data_dependences_for_loop): Returns false when fails.
|
||||
* tree-data-ref.c (compute_data_dependences_for_loop): Returns false
|
||||
when fails.
|
||||
(access_matrix_get_index_for_parameter): New.
|
||||
* tree-data-ref.h (struct access_matrix): New.
|
||||
(AM_LOOP_NEST_NUM, AM_NB_INDUCTION_VARS, AM_PARAMETERS, AM_MATRIX,
|
||||
|
@ -333,15 +352,15 @@
|
|||
|
||||
PR tree-optimization/36206
|
||||
* tree-chrec.h (chrec_fold_op): New.
|
||||
* tree-data-ref.c (initialize_matrix_A): Traverse NOP_EXPR, PLUS_EXPR, and
|
||||
other trees.
|
||||
* tree-data-ref.c (initialize_matrix_A): Traverse NOP_EXPR, PLUS_EXPR,
|
||||
and other trees.
|
||||
|
||||
2008-05-20 Nathan Sidwell <nathan@codesourcery.com>
|
||||
|
||||
* c-incpath.c (INO_T_EQ): Do not define on non-inode systems.
|
||||
(DIRS_EQ): New.
|
||||
(remove_duplicates): Do not set inode on non-inode systems. Use
|
||||
DIRS_EQ.
|
||||
(remove_duplicates): Do not set inode on non-inode systems.
|
||||
Use DIRS_EQ.
|
||||
|
||||
2008-05-20 Sandra Loosemore <sandra@codesourcery.com>
|
||||
|
||||
|
@ -349,8 +368,7 @@
|
|||
|
||||
2008-05-20 Richard Guenther <rguenther@suse.de>
|
||||
|
||||
* tree-ssa-reassoc.c (fini_reassoc): Use the statistics
|
||||
infrastructure.
|
||||
* tree-ssa-reassoc.c (fini_reassoc): Use the statistics infrastructure.
|
||||
* tree-ssa-sccvn.c (process_scc): Likewise.
|
||||
* tree-ssa-sink.c (execute_sink_code): Likewise.
|
||||
* tree-ssa-threadupdate.c (thread_through_all_blocks): Likewise.
|
||||
|
|
|
@ -397,8 +397,16 @@ then
|
|||
fi
|
||||
|
||||
case ${target} in
|
||||
i[34567]86-*-*)
|
||||
if test $enable_cld = yes; then
|
||||
tm_defines="${tm_defines} USE_IX86_CLD=1"
|
||||
fi
|
||||
;;
|
||||
x86_64-*-*)
|
||||
tm_file="i386/biarch64.h ${tm_file}"
|
||||
if test $enable_cld = yes; then
|
||||
tm_defines="${tm_defines} USE_IX86_CLD=1"
|
||||
fi
|
||||
;;
|
||||
esac
|
||||
|
||||
|
|
|
@ -2764,6 +2764,12 @@ override_options (void)
|
|||
can be optimized to ap = __builtin_next_arg (0). */
|
||||
if (!TARGET_64BIT || TARGET_64BIT_MS_ABI)
|
||||
targetm.expand_builtin_va_start = NULL;
|
||||
|
||||
#ifdef USE_IX86_CLD
|
||||
/* Use -mcld by default for 32-bit code if configured with --enable-cld. */
|
||||
if (!TARGET_64BIT)
|
||||
target_flags |= MASK_CLD & ~target_flags_explicit;
|
||||
#endif
|
||||
}
|
||||
|
||||
/* Return true if this goes in large data/bss. */
|
||||
|
@ -6597,6 +6603,10 @@ ix86_expand_prologue (void)
|
|||
emit_insn (gen_prologue_use (pic_offset_table_rtx));
|
||||
emit_insn (gen_blockage ());
|
||||
}
|
||||
|
||||
/* Emit cld instruction if stringops are used in the function. */
|
||||
if (TARGET_CLD && ix86_current_function_needs_cld)
|
||||
emit_insn (gen_cld ());
|
||||
}
|
||||
|
||||
/* Emit code to restore saved registers using MOV insns. First register
|
||||
|
|
|
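The `override_options` hunk above uses the idiom `target_flags |= MASK_CLD & ~target_flags_explicit;` to turn an option on by default without overriding an explicit user choice. A small standalone sketch of that bit arithmetic follows. The constant `DEMO_MASK_CLD` and the function `apply_cld_default` are hypothetical names chosen for illustration, and the bit value is arbitrary rather than GCC's real mask.

```c
#include <assert.h>

/* Hypothetical bit value; the real MASK_CLD is generated by GCC's
   option machinery and may differ. */
#define DEMO_MASK_CLD 0x1u

/* Returns flags with the CLD bit set by default, unless the same bit
   is set in explicit_flags, meaning the user passed -mcld or -mno-cld
   on the command line. In that case flags is returned unchanged. */
unsigned apply_cld_default(unsigned flags, unsigned explicit_flags)
{
    assert((DEMO_MASK_CLD & (DEMO_MASK_CLD - 1)) == 0); /* single bit */
    return flags | (DEMO_MASK_CLD & ~explicit_flags);
}
```

With this arrangement, an explicit `-mno-cld` leaves the bit clear (explicit bit set, flag bit clear), an explicit `-mcld` leaves it set, and only the untouched case picks up the configured default.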
@ -2432,8 +2432,9 @@ struct machine_function GTY(())
|
|||
int save_varrargs_registers;
|
||||
int accesses_prev_frame;
|
||||
int optimize_mode_switching[MAX_386_ENTITIES];
|
||||
/* Set by ix86_compute_frame_layout and used by prologue/epilogue expander to
|
||||
determine the style used. */
|
||||
int needs_cld;
|
||||
/* Set by ix86_compute_frame_layout and used by prologue/epilogue
|
||||
expander to determine the style used. */
|
||||
int use_fast_prologue_epilogue;
|
||||
/* Number of saved registers USE_FAST_PROLOGUE_EPILOGUE has been computed
|
||||
for. */
|
||||
|
@ -2453,6 +2454,7 @@ struct machine_function GTY(())
|
|||
#define ix86_stack_locals (cfun->machine->stack_locals)
|
||||
#define ix86_save_varrargs_registers (cfun->machine->save_varrargs_registers)
|
||||
#define ix86_optimize_mode_switching (cfun->machine->optimize_mode_switching)
|
||||
#define ix86_current_function_needs_cld (cfun->machine->needs_cld)
|
||||
#define ix86_tls_descriptor_calls_expanded_in_cfun \
|
||||
(cfun->machine->tls_descriptor_call_expanded_p)
|
||||
/* Since tls_descriptor_call_expanded is not cleared, even if all TLS
|
||||
|
|
|
@ -213,6 +213,7 @@
|
|||
(UNSPECV_XCHG 12)
|
||||
(UNSPECV_LOCK 13)
|
||||
(UNSPECV_PROLOGUE_USE 14)
|
||||
(UNSPECV_CLD 15)
|
||||
])
|
||||
|
||||
;; Constants to represent pcomtrue/pcomfalse variants
|
||||
|
@ -18374,6 +18375,14 @@
|
|||
|
||||
;; Block operation instructions
|
||||
|
||||
(define_insn "cld"
|
||||
[(unspec_volatile [(const_int 0)] UNSPECV_CLD)]
|
||||
""
|
||||
"cld"
|
||||
[(set_attr "length" "1")
|
||||
(set_attr "length_immediate" "0")
|
||||
(set_attr "modrm" "0")])
|
||||
|
||||
(define_expand "movmemsi"
|
||||
[(use (match_operand:BLK 0 "memory_operand" ""))
|
||||
(use (match_operand:BLK 1 "memory_operand" ""))
|
||||
|
@ -18446,7 +18455,7 @@
|
|||
(set (match_operand 2 "register_operand" "")
|
||||
(match_operand 5 "" ""))])]
|
||||
"TARGET_SINGLE_STRINGOP || optimize_size"
|
||||
"")
|
||||
"ix86_current_function_needs_cld = 1;")
|
||||
|
||||
(define_insn "*strmovdi_rex_1"
|
||||
[(set (mem:DI (match_operand:DI 2 "register_operand" "0"))
|
||||
|
@ -18563,7 +18572,7 @@
|
|||
(match_operand 3 "memory_operand" ""))
|
||||
(use (match_dup 4))])]
|
||||
""
|
||||
"")
|
||||
"ix86_current_function_needs_cld = 1;")
|
||||
|
||||
(define_insn "*rep_movdi_rex64"
|
||||
[(set (match_operand:DI 2 "register_operand" "=c") (const_int 0))
|
||||
|
@ -18723,7 +18732,7 @@
|
|||
(set (match_operand 0 "register_operand" "")
|
||||
(match_operand 3 "" ""))])]
|
||||
"TARGET_SINGLE_STRINGOP || optimize_size"
|
||||
"")
|
||||
"ix86_current_function_needs_cld = 1;")
|
||||
|
||||
(define_insn "*strsetdi_rex_1"
|
||||
[(set (mem:DI (match_operand:DI 1 "register_operand" "0"))
|
||||
|
@ -18817,7 +18826,7 @@
|
|||
(use (match_operand 3 "register_operand" ""))
|
||||
(use (match_dup 1))])]
|
||||
""
|
||||
"")
|
||||
"ix86_current_function_needs_cld = 1;")
|
||||
|
||||
(define_insn "*rep_stosdi_rex64"
|
||||
[(set (match_operand:DI 1 "register_operand" "=c") (const_int 0))
|
||||
|
@ -18993,7 +19002,7 @@
|
|||
(clobber (match_operand 1 "register_operand" ""))
|
||||
(clobber (match_dup 2))])]
|
||||
""
|
||||
"")
|
||||
"ix86_current_function_needs_cld = 1;")
|
||||
|
||||
(define_insn "*cmpstrnqi_nz_1"
|
||||
[(set (reg:CC FLAGS_REG)
|
||||
|
@ -19040,7 +19049,7 @@
|
|||
(clobber (match_operand 1 "register_operand" ""))
|
||||
(clobber (match_dup 2))])]
|
||||
""
|
||||
"")
|
||||
"ix86_current_function_needs_cld = 1;")
|
||||
|
||||
(define_insn "*cmpstrnqi_1"
|
||||
[(set (reg:CC FLAGS_REG)
|
||||
|
@ -19109,7 +19118,7 @@
|
|||
(clobber (match_operand 1 "register_operand" ""))
|
||||
(clobber (reg:CC FLAGS_REG))])]
|
||||
""
|
||||
"")
|
||||
"ix86_current_function_needs_cld = 1;")
|
||||
|
||||
(define_insn "*strlenqi_1"
|
||||
[(set (match_operand:SI 0 "register_operand" "=&c")
|
||||
|
|
|
@ -250,6 +250,10 @@ Support SSE5 built-in functions and code generation
|
|||
|
||||
;; Instruction support
|
||||
|
||||
mcld
|
||||
Target Report Mask(CLD)
|
||||
Generate cld instruction in the function prologue.
|
||||
|
||||
mabm
|
||||
Target Report RejectNegative Var(x86_abm)
|
||||
Support code generation of Advanced Bit Manipulation (ABM) instructions.
|
||||
|
|
9
gcc/configure
vendored
|
@ -1046,6 +1046,7 @@ Optional Features:
|
|||
--enable-sjlj-exceptions
|
||||
arrange to use setjmp/longjmp exception handling
|
||||
--enable-secureplt enable -msecure-plt by default for PowerPC
|
||||
--enable-cld enable -mcld by default for 32bit x86
|
||||
--disable-win32-registry
|
||||
disable lookup of installation paths in the
|
||||
Registry on Windows hosts
|
||||
|
@ -13709,6 +13710,14 @@ if test "${enable_secureplt+set}" = set; then
|
|||
|
||||
fi;
|
||||
|
||||
# Check whether --enable-cld or --disable-cld was given.
|
||||
if test "${enable_cld+set}" = set; then
|
||||
enableval="$enable_cld"
|
||||
|
||||
else
|
||||
enable_cld=no
|
||||
fi;
|
||||
|
||||
# Windows32 Registry support for specifying GCC installation paths.
|
||||
# Check whether --enable-win32-registry or --disable-win32-registry was given.
|
||||
if test "${enable_win32_registry+set}" = set; then
|
||||
|
|
|
@ -1528,6 +1528,10 @@ AC_ARG_ENABLE(secureplt,
|
|||
[ --enable-secureplt enable -msecure-plt by default for PowerPC],
|
||||
[], [])
|
||||
|
||||
AC_ARG_ENABLE(cld,
|
||||
[ --enable-cld enable -mcld by default for 32bit x86], [],
|
||||
[enable_cld=no])
|
||||
|
||||
# Windows32 Registry support for specifying GCC installation paths.
|
||||
AC_ARG_ENABLE(win32-registry,
|
||||
[ --disable-win32-registry
|
||||
|
|
|
@ -1214,6 +1214,16 @@ Using the GNU Compiler Collection (GCC)},
|
|||
See ``RS/6000 and PowerPC Options'' in the main manual
|
||||
@end ifhtml
|
||||
|
||||
@item --enable-cld
|
||||
This option enables @option{-mcld} by default for 32-bit x86 targets.
|
||||
@ifnothtml
|
||||
@xref{i386 and x86-64 Options,, i386 and x86-64 Options, gcc,
|
||||
Using the GNU Compiler Collection (GCC)},
|
||||
@end ifnothtml
|
||||
@ifhtml
|
||||
See ``i386 and x86-64 Options'' in the main manual
|
||||
@end ifhtml
|
||||
|
||||
@item --enable-win32-registry
|
||||
@itemx --enable-win32-registry=@var{key}
|
||||
@itemx --disable-win32-registry
|
||||
|
|
|
@ -553,7 +553,7 @@ Objective-C and Objective-C++ Dialects}.
|
|||
-masm=@var{dialect} -mno-fancy-math-387 @gol
|
||||
-mno-fp-ret-in-387 -msoft-float @gol
|
||||
-mno-wide-multiply -mrtd -malign-double @gol
|
||||
-mpreferred-stack-boundary=@var{num} -mcx16 -msahf -mrecip @gol
|
||||
-mpreferred-stack-boundary=@var{num} -mcld -mcx16 -msahf -mrecip @gol
|
||||
-mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 @gol
|
||||
-maes -mpclmul @gol
|
||||
-msse4a -m3dnow -mpopcnt -mabm -msse5 @gol
|
||||
|
@ -10814,6 +10814,20 @@ supported architecture, using the appropriate flags. In particular,
|
|||
the file containing the CPU detection code should be compiled without
|
||||
these options.
|
||||
|
||||
@item -mcld
|
||||
@opindex mcld
|
||||
This option instructs GCC to emit a @code{cld} instruction in the prologue
|
||||
of functions that use string instructions. String instructions depend on
|
||||
the DF flag to select between autoincrement or autodecrement mode. While the
|
||||
ABI specifies the DF flag to be cleared on function entry, some operating
|
||||
systems violate this specification by not clearing the DF flag in their
|
||||
exception dispatchers. The exception handler can be invoked with the DF flag
|
||||
set which leads to wrong direction mode, when string instructions are used.
|
||||
This option can be enabled by default on 32-bit x86 targets by configuring
|
||||
GCC with the @option{--enable-cld} configure option. Generation of @code{cld}
|
||||
instructions can be suppressed with the @option{-mno-cld} compiler option
|
||||
in this case.
|
||||
|
||||
@item -mcx16
|
||||
@opindex mcx16
|
||||
This option will enable GCC to use CMPXCHG16B instruction in generated code.
|
||||
|
|
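The `-mcld` documentation above says that string instructions depend on the DF flag to select autoincrement or autodecrement mode. The sketch below models that behaviour in portable C so the failure mode is easy to see. The function `model_rep_movsb` is a hypothetical software model of a byte-wise repeated string move, not a GCC or libc function, and it ignores everything about the real instruction except the direction of pointer stepping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a byte-wise repeated string move.
   df == false: both pointers step upward after each byte (DF clear).
   df == true:  both pointers step downward after each byte (DF set).
   The caller must make sure every byte touched lies inside the
   buffers for the chosen direction. */
void model_rep_movsb(unsigned char *dst, const unsigned char *src,
                     size_t count, bool df)
{
    assert(dst != NULL && src != NULL);
    while (count-- > 0) {
        *dst = *src;
        if (df) {
            dst--;
            src--;
        } else {
            dst++;
            src++;
        }
    }
}
```

Calling the model with the same start pointers and count but `df` set copies the bytes below the start address instead of those above it. That is the wrong-direction effect the documentation describes for code entered with DF set, and it is what a `cld` in the prologue guards against.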