procyberian/gcc - Masscollabs Services: Beyond Sharing, Liberating The Software World

Author	SHA1	Message	Date
Andrew Pinski	c3fecec65c	aarch64: Fix testcase pr112105.c This testcase started to fail with r15-268-g9dbff9c05520a7. When late_combine was added, it was turned on for -O2+ only, so this testcase still failed. This changes the option to be -O2 instead of -O and the testcase started to pass again. tested for aarch64-linux-gnu. gcc/testsuite/ChangeLog: * gcc.target/aarch64/pr112105.c: Change to be -O2 rather than -O1. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>	2025-02-19 09:25:19 -08:00
David Malcolm	ee6619b124	input: give file_cache_slot its own copy of the file path [PR118919] input.cc's file_cache was borrowing copies of the file name. This could lead to use-after-free when writing out sarif output from Fortran, which frees its filenames before the sarif output is fully written out. Fix by taking a copy in file_cache_slot. gcc/ChangeLog: PR other/118919 * input.cc (file_cache_slot::m_file_path): Make non-const. (file_cache_slot::evict): Free m_file_path. (file_cache_slot::create): Store a copy of file_path if non-null. (file_cache_slot::~file_cache_slot): Free m_file_path. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2025-02-19 09:46:43 -05:00
David Malcolm	58b90139e0	analyzer: handle more IFN_UBSAN_* as no-ops [PR118300] Previously the analyzer treated IFN_UBSAN_BOUNDS as a no-op, but the other IFN_UBSAN_* were unrecognized and conservatively treated as having arbitrary behavior. Treat IFN_UBSAN_NULL and IFN_UBSAN_PTR also as no-ops, which should make -fanalyzer behave better with -fsanitize=undefined. gcc/analyzer/ChangeLog: PR analyzer/118300 * kf.cc (class kf_ubsan_bounds): Replace this with... (class kf_ubsan_noop): ...this. (register_sanitizer_builtins): Use it to handle IFN_UBSAN_NULL, IFN_UBSAN_BOUNDS, and IFN_UBSAN_PTR as no-ops. (register_known_functions): Drop handling of IFN_UBSAN_BOUNDS here, as it's now handled by register_sanitizer_builtins above. gcc/testsuite/ChangeLog: PR analyzer/118300 * gcc.dg/analyzer/ubsan-pr118300.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2025-02-19 09:44:46 -05:00
Pan Li	25256ec1df	Vect: Fix ICE when vect_verify_loop_lens acts on relevant mode [PR116351] This patch fixes an ICE similar to the one below. Assume we have the sample code: 1 │ int a, b, c; 2 │ short d, e, f; 3 │ long g (long h) { return h; } 4 │ 5 │ void i () { 6 │ for (; b; ++b) { 7 │ f = 5 >> a ? d : d << a; 8 │ e &= c \| g(f); 9 │ } 10 │ } It will ICE when compiled with -O3 -march=rv64gc_zve64f -mrvv-vector-bits=zvl: during GIMPLE pass: vect pr116351-1.c: In function ‘i’: pr116351-1.c:8:6: internal compiler error: in get_len_load_store_mode, at optabs-tree.cc:655 8 \| void i () { \| ^ 0x44d6b9d internal_error(char const, ...) /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic-global-context.cc:517 0x44a26a6 fancy_abort(char const, int, char const) /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic.cc:1722 0x19e4309 get_len_load_store_mode(machine_mode, bool, internal_fn, vec<int, va_heap, vl_ptr>) /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/optabs-tree.cc:655 0x1fada40 vect_verify_loop_lens /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:1566 0x1fb2b07 vect_analyze_loop_2 /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3037 0x1fb4302 vect_analyze_loop_1 /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3478 0x1fb4e9a vect_analyze_loop(loop, gimple, vec_info_shared) /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3638 0x203c2dc try_vectorize_loop_1 /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1095 0x203c839 try_vectorize_loop /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1212 0x203cb2c execute During vectorization the override_widen pattern is matched, and DImode then becomes the vector_mode in loop_info. 
After that the loop_vinfo steps into vect_analyze_xx with the flow below: vect_analyze_loop_2 \|- vect_pattern_recog // over-widening and set loop_vinfo->vector_mode to DImode \|- ... \|- vect_analyze_loop_operations \|- stmt_info->def_type == vect_reduction_def \|- stmt_info->slp_type == pure_slp \|- vectorizable_lc_phi // Not Hit \|- vectorizable_induction // Not Hit \|- vectorizable_reduction // Not Hit \|- vectorizable_recurr // Not Hit \|- vectorizable_live_operation // Not Hit \|- vect_analyze_stmt \|- stmt_info->relevant == vect_unused_in_scope \|- stmt_info->live == false \|- p pattern_stmt_info == (stmt_vec_info) 0x0 \|- return opt_result::success (); OR \|- PURE_SLP_STMT (stmt_info) && !node then dump "handled only by SLP analysis\n" \|- Early return opt_result::success (); \|- vectorizable_load/store/call_convert/... // Not Hit \|- LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P && !LOOP_VINFO_MASKS(loop_vinfo).is_empty () \|- vect_verify_loop_lens (loop_vinfo) \|- assert (VECTOR_MODE_P (loop_vinfo->vector_mode)); // Hits the assert, resulting in the ICE Finally, the DImode in loop_vinfo hits the assert (VECTOR_MODE_P (mode)) in vect_verify_loop_lens. To fix the ICE, this patch returns false directly if the loop_vinfo has a relevant mode such as DImode; similar cases may still be mis-optimized, and we will try to cover those in separate patches. The test suites below passed for this patch. * The rv64gcv full regression test. * The x86 bootstrap test. * The x86 full regression test. PR middle-end/116351 gcc/ChangeLog: * tree-vect-loop.cc (vect_verify_loop_lens): Return false if the loop_vinfo has relevant mode such as DImode. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr116351-1.c: New test. * gcc.target/riscv/rvv/base/pr116351-2.c: New test. * gcc.target/riscv/rvv/base/pr116351.h: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2025-02-19 21:06:54 +08:00
Xi Ruoyao	427386042f	LoongArch: Use normal RTL pattern instead of UNSPEC for {x,}vsr{a,l}ri instructions Allowing (t + (1ul << imm >> 1)) >> imm to be recognized as a rounding shift operation. gcc/ChangeLog: * config/loongarch/lasx.md (UNSPEC_LASX_XVSRARI): Remove. (UNSPEC_LASX_XVSRLRI): Remove. (lasx_xvsrari_<lsxfmt>): Remove. (lasx_xvsrlri_<lsxfmt>): Remove. * config/loongarch/lsx.md (UNSPEC_LSX_VSRARI): Remove. (UNSPEC_LSX_VSRLRI): Remove. (lsx_vsrari_<lsxfmt>): Remove. (lsx_vsrlri_<lsxfmt>): Remove. * config/loongarch/simd.md (simd_<optab>_imm_round_<mode>): New define_insn. (<simd_isa>_<x>v<insn>ri_<simdfmt>): New define_expand. gcc/testsuite/ChangeLog: * gcc.target/loongarch/vect-shift-imm-round.c: New test.	2025-02-19 14:34:46 +08:00
Xi Ruoyao	cef5f23adb	LoongArch: Implement [su]dot_prod* for LSX and LASX modes Although it's just a special case of "a widening product of which the result used for reduction," having these standard names allows the dot product pattern to be recognized earlier, which may benefit optimization. Also fix some test failures with the test cases: - gcc.dg/vect/vect-reduc-chain-2.c - gcc.dg/vect/vect-reduc-chain-3.c - gcc.dg/vect/vect-reduc-chain-dot-slp-3.c - gcc.dg/vect/vect-reduc-chain-dot-slp-4.c gcc/ChangeLog: * config/loongarch/simd.md (wvec_half): New define_mode_attr. (<su>dot_prod<wvec_half><mode>): New define_expand. gcc/testsuite/ChangeLog: * gcc.target/loongarch/wide-mul-reduc-2.c (dg-final): Scan DOT_PROD_EXPR in optimized tree.	2025-02-19 14:34:45 +08:00
Xi Ruoyao	7c54e46b20	LoongArch: Implement vec_widen_mult_{even,odd}_* for LSX and LASX modes Since PR116142 has been fixed, now we can add the standard names so the compiler will generate better code if the result of a widening product is reduced. gcc/ChangeLog: * config/loongarch/simd.md (even_odd): New define_int_attr. (vec_widen_<su>mult_<even_odd>_<mode>): New define_expand. gcc/testsuite/ChangeLog: * gcc.target/loongarch/wide-mul-reduc-1.c: New test. * gcc.target/loongarch/wide-mul-reduc-2.c: New test.	2025-02-19 14:34:45 +08:00
Xi Ruoyao	7dda671512	LoongArch: Simplify lsx_vpick description Like what we've done for {lsx_,lasx_x}v{add,sub,mul}l{ev,od}, use special predicates instead of hard-coded const vectors. This is not suitable for LASX where lasx_xvpick has a different semantic. gcc/ChangeLog: * config/loongarch/simd.md (LVEC): New define_mode_attr. (simdfmt_as_i): Make it same as simdfmt for integer vector modes. (_f): New define_mode_attr. * config/loongarch/lsx.md (lsx_vpickev_b): Remove. (lsx_vpickev_h): Remove. (lsx_vpickev_w): Remove. (lsx_vpickev_w_f): Remove. (lsx_vpickod_b): Remove. (lsx_vpickod_h): Remove. (lsx_vpickod_w): Remove. (lsx_vpickev_w_f): Remove. (lsx_pick_evod_<mode>): New define_insn. (lsx_<x>vpick<ev_od>_<simdfmt_as_i><_f>): New define_expand.	2025-02-19 14:34:45 +08:00
Xi Ruoyao	f727a4c57e	LoongArch: Simplify {lsx_,lasx_x}vmaddw description Like what we've done for {lsx_,lasx_x}v{add,sub,mul}l{ev,od}, use special predicates and TImode RTL instead of hard-coded const vectors and UNSPECs. Also reorder two operands of the outer plus in the template, so combine will recognize {x,}vadd + {x,}vmulw{ev,od} => {x,}vmaddw{ev,od}. gcc/ChangeLog: * config/loongarch/lasx.md (UNSPEC_LASX_XVMADDWEV): Remove. (UNSPEC_LASX_XVMADDWEV2): Remove. (UNSPEC_LASX_XVMADDWEV3): Remove. (UNSPEC_LASX_XVMADDWOD): Remove. (UNSPEC_LASX_XVMADDWOD2): Remove. (UNSPEC_LASX_XVMADDWOD3): Remove. (lasx_xvmaddwev_h_b<u>): Remove. (lasx_xvmaddwev_w_h<u>): Remove. (lasx_xvmaddwev_d_w<u>): Remove. (lasx_xvmaddwev_q_d): Remove. (lasx_xvmaddwod_h_b<u>): Remove. (lasx_xvmaddwod_w_h<u>): Remove. (lasx_xvmaddwod_d_w<u>): Remove. (lasx_xvmaddwod_q_d): Remove. (lasx_xvmaddwev_q_du): Remove. (lasx_xvmaddwod_q_du): Remove. (lasx_xvmaddwev_h_bu_b): Remove. (lasx_xvmaddwev_w_hu_h): Remove. (lasx_xvmaddwev_d_wu_w): Remove. (lasx_xvmaddwev_q_du_d): Remove. (lasx_xvmaddwod_h_bu_b): Remove. (lasx_xvmaddwod_w_hu_h): Remove. (lasx_xvmaddwod_d_wu_w): Remove. (lasx_xvmaddwod_q_du_d): Remove. * config/loongarch/lsx.md (UNSPEC_LSX_VMADDWEV): Remove. (UNSPEC_LSX_VMADDWEV2): Remove. (UNSPEC_LSX_VMADDWEV3): Remove. (UNSPEC_LSX_VMADDWOD): Remove. (UNSPEC_LSX_VMADDWOD2): Remove. (UNSPEC_LSX_VMADDWOD3): Remove. (lsx_vmaddwev_h_b<u>): Remove. (lsx_vmaddwev_w_h<u>): Remove. (lsx_vmaddwev_d_w<u>): Remove. (lsx_vmaddwev_q_d): Remove. (lsx_vmaddwod_h_b<u>): Remove. (lsx_vmaddwod_w_h<u>): Remove. (lsx_vmaddwod_d_w<u>): Remove. (lsx_vmaddwod_q_d): Remove. (lsx_vmaddwev_q_du): Remove. (lsx_vmaddwod_q_du): Remove. (lsx_vmaddwev_h_bu_b): Remove. (lsx_vmaddwev_w_hu_h): Remove. (lsx_vmaddwev_d_wu_w): Remove. (lsx_vmaddwev_q_du_d): Remove. (lsx_vmaddwod_h_bu_b): Remove. (lsx_vmaddwod_w_hu_h): Remove. (lsx_vmaddwod_d_wu_w): Remove. (lsx_vmaddwod_q_du_d): Remove. 
* config/loongarch/simd.md (simd_maddw_evod_<mode>_<su>): New define_insn. (<simd_isa>_<x>vmaddw<ev_od>_<simdfmt_w>_<simdfmt><u>): New define_expand. (simd_maddw_evod_<mode>_hetero): New define_insn. (<simd_isa>_<x>vmaddw<ev_od>_<simdfmt_w>_<simdfmt>u_<simdfmt>): New define_expand. (<simd_isa>_maddw<ev_od>_q_d<u>_punned): New define_expand. (<simd_isa>_maddw<ev_od>_q_du_d_punned): New define_expand. * config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vmaddwev_q_d): Define as a macro to override it with the punned expand. (CODE_FOR_lsx_vmaddwev_q_du): Likewise. (CODE_FOR_lsx_vmaddwev_q_du_d): Likewise. (CODE_FOR_lsx_vmaddwod_q_d): Likewise. (CODE_FOR_lsx_vmaddwod_q_du): Likewise. (CODE_FOR_lsx_vmaddwod_q_du_d): Likewise. (CODE_FOR_lasx_xvmaddwev_q_d): Likewise. (CODE_FOR_lasx_xvmaddwev_q_du): Likewise. (CODE_FOR_lasx_xvmaddwev_q_du_d): Likewise. (CODE_FOR_lasx_xvmaddwod_q_d): Likewise. (CODE_FOR_lasx_xvmaddwod_q_du): Likewise. (CODE_FOR_lasx_xvmaddwod_q_du_d): Likewise.	2025-02-19 14:34:45 +08:00
Xi Ruoyao	2ca759fc52	LoongArch: Simplify {lsx_,lasx_x}vh{add,sub}w description Like what we've done for {lsx_,lasx_x}v{add,sub,mul}l{ev,od}, use special predicates and TImode RTL instead of hard-coded const vectors and UNSPECs. gcc/ChangeLog: * config/loongarch/lasx.md (UNSPEC_LASX_XVHADDW_Q_D): Remove. (UNSPEC_LASX_XVHSUBW_Q_D): Remove. (UNSPEC_LASX_XVHADDW_QU_DU): Remove. (UNSPEC_LASX_XVHSUBW_QU_DU): Remove. (lasx_xvh<addsub:optab>w_h<u>_b<u>): Remove. (lasx_xvh<addsub:optab>w_w<u>_h<u>): Remove. (lasx_xvh<addsub:optab>w_d<u>_w<u>): Remove. (lasx_xvhaddw_q_d): Remove. (lasx_xvhsubw_q_d): Remove. (lasx_xvhaddw_qu_du): Remove. (lasx_xvhsubw_qu_du): Remove. (reduc_plus_scal_v4di): Call gen_lasx_haddw_q_d_punned instead of gen_lasx_xvhaddw_q_d. (reduc_plus_scal_v8si): Likewise. * config/loongarch/lsx.md (UNSPEC_LSX_VHADDW_Q_D): Remove. (UNSPEC_ASX_VHSUBW_Q_D): Remove. (UNSPEC_ASX_VHADDW_QU_DU): Remove. (UNSPEC_ASX_VHSUBW_QU_DU): Remove. (lsx_vh<addsub:optab>w_h<u>_b<u>): Remove. (lsx_vh<addsub:optab>w_w<u>_h<u>): Remove. (lsx_vh<addsub:optab>w_d<u>_w<u>): Remove. (lsx_vhaddw_q_d): Remove. (lsx_vhsubw_q_d): Remove. (lsx_vhaddw_qu_du): Remove. (lsx_vhsubw_qu_du): Remove. (reduc_plus_scal_v2di): Change the temporary register mode to V1TI, and pun the mode calling gen_vec_extractv2didi. (reduc_plus_scal_v4si): Change the temporary register mode to V1TI. * config/loongarch/simd.md (simd_h<optab>w_<mode>_<su>): New define_insn. (<simd_isa>_<x>vh<optab>w_<simdfmt_w><u>_<simdfmt><u>): New define_expand. (<simd_isa>_h<optab>w_q<u>_d<u>_punned): New define_expand. * config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vhaddw_q_d): Define as a macro to override with punned expand. (CODE_FOR_lsx_vhaddw_qu_du): Likewise. (CODE_FOR_lsx_vhsubw_q_d): Likewise. (CODE_FOR_lsx_vhsubw_qu_du): Likewise. (CODE_FOR_lasx_xvhaddw_q_d): Likewise. (CODE_FOR_lasx_xvhaddw_qu_du): Likewise. (CODE_FOR_lasx_xvhsubw_q_d): Likewise. (CODE_FOR_lasx_xvhsubw_qu_du): Likewise.	2025-02-19 14:34:45 +08:00
Xi Ruoyao	a36c15aa66	LoongArch: Simplify {lsx_,lasx_x}v{add,sub,mul}l{ev,od} description These pattern definitions are tediously long, invoking 32 UNSPECs and many hard-coded long const vectors. To simplify them, first we use the TImode vector operations instead of the UNSPECs, then we adopt an approach in AArch64: using a special predicate to match the const vectors for odd/even indices for define_insn's, and generate those vectors in define_expand's. For "backward compatibility" we need to provide a "punned" version for the operations invoking TImode vectors as the intrinsics still expect DImode vectors. The stat is "201 insertions, 905 deletions." gcc/ChangeLog: * config/loongarch/lasx.md (UNSPEC_LASX_XVADDWEV): Remove. (UNSPEC_LASX_XVADDWEV2): Remove. (UNSPEC_LASX_XVADDWEV3): Remove. (UNSPEC_LASX_XVSUBWEV): Remove. (UNSPEC_LASX_XVSUBWEV2): Remove. (UNSPEC_LASX_XVMULWEV): Remove. (UNSPEC_LASX_XVMULWEV2): Remove. (UNSPEC_LASX_XVMULWEV3): Remove. (UNSPEC_LASX_XVADDWOD): Remove. (UNSPEC_LASX_XVADDWOD2): Remove. (UNSPEC_LASX_XVADDWOD3): Remove. (UNSPEC_LASX_XVSUBWOD): Remove. (UNSPEC_LASX_XVSUBWOD2): Remove. (UNSPEC_LASX_XVMULWOD): Remove. (UNSPEC_LASX_XVMULWOD2): Remove. (UNSPEC_LASX_XVMULWOD3): Remove. (lasx_xv<addsubmul:optab>wev_h_b<u>): Remove. (lasx_xv<addsubmul:optab>wev_w_h<u>): Remove. (lasx_xv<addsubmul:optab>wev_d_w<u>): Remove. (lasx_xvaddwev_q_d): Remove. (lasx_xvsubwev_q_d): Remove. (lasx_xvmulwev_q_d): Remove. (lasx_xv<addsubmul:optab>wod_h_b<u>): Remove. (lasx_xv<addsubmul:optab>wod_w_h<u>): Remove. (lasx_xv<addsubmul:optab>wod_d_w<u>): Remove. (lasx_xvaddwod_q_d): Remove. (lasx_xvsubwod_q_d): Remove. (lasx_xvmulwod_q_d): Remove. (lasx_xvaddwev_q_du): Remove. (lasx_xvsubwev_q_du): Remove. (lasx_xvmulwev_q_du): Remove. (lasx_xvaddwod_q_du): Remove. (lasx_xvsubwod_q_du): Remove. (lasx_xvmulwod_q_du): Remove. (lasx_xv<addmul:optab>wev_h_bu_b): Remove. (lasx_xv<addmul:optab>wev_w_hu_h): Remove. (lasx_xv<addmul:optab>wev_d_wu_w): Remove. 
(lasx_xv<addmul:optab>wod_h_bu_b): Remove. (lasx_xv<addmul:optab>wod_w_hu_h): Remove. (lasx_xv<addmul:optab>wod_d_wu_w): Remove. (lasx_xvaddwev_q_du_d): Remove. (lasx_xvsubwev_q_du_d): Remove. (lasx_xvmulwev_q_du_d): Remove. (lasx_xvaddwod_q_du_d): Remove. (lasx_xvsubwod_q_du_d): Remove. * config/loongarch/lsx.md (UNSPEC_LSX_XVADDWEV): Remove. (UNSPEC_LSX_VADDWEV2): Remove. (UNSPEC_LSX_VADDWEV3): Remove. (UNSPEC_LSX_VSUBWEV): Remove. (UNSPEC_LSX_VSUBWEV2): Remove. (UNSPEC_LSX_VMULWEV): Remove. (UNSPEC_LSX_VMULWEV2): Remove. (UNSPEC_LSX_VMULWEV3): Remove. (UNSPEC_LSX_VADDWOD): Remove. (UNSPEC_LSX_VADDWOD2): Remove. (UNSPEC_LSX_VADDWOD3): Remove. (UNSPEC_LSX_VSUBWOD): Remove. (UNSPEC_LSX_VSUBWOD2): Remove. (UNSPEC_LSX_VMULWOD): Remove. (UNSPEC_LSX_VMULWOD2): Remove. (UNSPEC_LSX_VMULWOD3): Remove. (lsx_v<addsubmul:optab>wev_h_b<u>): Remove. (lsx_v<addsubmul:optab>wev_w_h<u>): Remove. (lsx_v<addsubmul:optab>wev_d_w<u>): Remove. (lsx_vaddwev_q_d): Remove. (lsx_vsubwev_q_d): Remove. (lsx_vmulwev_q_d): Remove. (lsx_v<addsubmul:optab>wod_h_b<u>): Remove. (lsx_v<addsubmul:optab>wod_w_h<u>): Remove. (lsx_v<addsubmul:optab>wod_d_w<u>): Remove. (lsx_vaddwod_q_d): Remove. (lsx_vsubwod_q_d): Remove. (lsx_vmulwod_q_d): Remove. (lsx_vaddwev_q_du): Remove. (lsx_vsubwev_q_du): Remove. (lsx_vmulwev_q_du): Remove. (lsx_vaddwod_q_du): Remove. (lsx_vsubwod_q_du): Remove. (lsx_vmulwod_q_du): Remove. (lsx_v<addmul:optab>wev_h_bu_b): Remove. (lsx_v<addmul:optab>wev_w_hu_h): Remove. (lsx_v<addmul:optab>wev_d_wu_w): Remove. (lsx_v<addmul:optab>wod_h_bu_b): Remove. (lsx_v<addmul:optab>wod_w_hu_h): Remove. (lsx_v<addmul:optab>wod_d_wu_w): Remove. (lsx_vaddwev_q_du_d): Remove. (lsx_vsubwev_q_du_d): Remove. (lsx_vmulwev_q_du_d): Remove. (lsx_vaddwod_q_du_d): Remove. (lsx_vsubwod_q_du_d): Remove. (lsx_vmulwod_q_du_d): Remove. * config/loongarch/loongarch-modes.def: Add V4TI and V1DI. * config/loongarch/loongarch-protos.h (loongarch_gen_stepped_int_parallel): New function prototype. 
* config/loongarch/loongarch.cc (loongarch_print_operand): Accept 'O' for printing "ev" or "od." (loongarch_gen_stepped_int_parallel): Implement. * config/loongarch/predicates.md (vect_par_cnst_even_or_odd_half): New define_predicate. * config/loongarch/simd.md (WVEC_HALF): New define_mode_attr. (simdfmt_w): Likewise. (zero_one): New define_int_iterator. (ev_od): New define_int_attr. (simd_<optab>w_evod_<mode:IVEC>_<su>): New define_insn. (<simd_isa>_<x>v<optab>w<ev_od>_<simdfmt_w>_<simdfmt><u>): New define_expand. (simd_<optab>w_evod_<mode>_hetero): New define_insn. (<simd_isa>_<x>v<optab>w<ev_od>_<simdfmt_w>_<simdfmt>u_<simdfmt>): New define_expand. (DIVEC): New define_mode_iterator. (<simd_isa>_<optab>w<ev_od>_q_d<u>_punned): New define_expand. (<simd_isa>_<optab>w<ev_od>_q_du_d_punned): Likewise. * config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vaddwev_q_d): Define as a macro to override it with the punned expand. (CODE_FOR_lsx_vaddwev_q_du): Likewise. (CODE_FOR_lsx_vsubwev_q_d): Likewise. (CODE_FOR_lsx_vsubwev_q_du): Likewise. (CODE_FOR_lsx_vmulwev_q_d): Likewise. (CODE_FOR_lsx_vmulwev_q_du): Likewise. (CODE_FOR_lsx_vaddwod_q_d): Likewise. (CODE_FOR_lsx_vaddwod_q_du): Likewise. (CODE_FOR_lsx_vsubwod_q_d): Likewise. (CODE_FOR_lsx_vsubwod_q_du): Likewise. (CODE_FOR_lsx_vmulwod_q_d): Likewise. (CODE_FOR_lsx_vmulwod_q_du): Likewise. (CODE_FOR_lsx_vaddwev_q_du_d): Likewise. (CODE_FOR_lsx_vmulwev_q_du_d): Likewise. (CODE_FOR_lsx_vaddwod_q_du_d): Likewise. (CODE_FOR_lsx_vmulwod_q_du_d): Likewise. (CODE_FOR_lasx_xvaddwev_q_d): Likewise. (CODE_FOR_lasx_xvaddwev_q_du): Likewise. (CODE_FOR_lasx_xvsubwev_q_d): Likewise. (CODE_FOR_lasx_xvsubwev_q_du): Likewise. (CODE_FOR_lasx_xvmulwev_q_d): Likewise. (CODE_FOR_lasx_xvmulwev_q_du): Likewise. (CODE_FOR_lasx_xvaddwod_q_d): Likewise. (CODE_FOR_lasx_xvaddwod_q_du): Likewise. (CODE_FOR_lasx_xvsubwod_q_d): Likewise. (CODE_FOR_lasx_xvsubwod_q_du): Likewise. (CODE_FOR_lasx_xvmulwod_q_d): Likewise. 
(CODE_FOR_lasx_xvmulwod_q_du): Likewise. (CODE_FOR_lasx_xvaddwev_q_du_d): Likewise. (CODE_FOR_lasx_xvmulwev_q_du_d): Likewise. (CODE_FOR_lasx_xvaddwod_q_du_d): Likewise. (CODE_FOR_lasx_xvmulwod_q_du_d): Likewise.	2025-02-19 14:34:44 +08:00
Xi Ruoyao	ac1b058629	LoongArch: Allow moving TImode vectors We have some vector instructions for operations on 128-bit integer, i.e. TImode, vectors. Previously they had been modeled with unspecs, but it's more natural to just model them with TImode vector RTL expressions. In preparation, allow moving V1TImode and V2TImode vectors in LSX and LASX registers so we won't get a reload failure when we start to save TImode vectors in these registers. This implicitly depends on the vrepli optimization: without it we'd try "vrepli.q" which does not really exist and trigger an ICE. gcc/ChangeLog: * config/loongarch/lsx.md (mov<LSX:mode>): Remove. (movmisalign<LSX:mode>): Remove. (mov<LSX:mode>_lsx): Remove. * config/loongarch/lasx.md (mov<LASX:mode>): Remove. (movmisalign<LASX:mode>): Remove. (mov<LASX:mode>_lasx): Remove. * config/loongarch/loongarch-modes.def (V1TI): Add. (V2TI): Mention in the comment. * config/loongarch/loongarch.md (mode): Add V1TI and V2TI. * config/loongarch/simd.md (ALLVEC_TI): New mode iterator. (mov<ALLVEC_TI:mode>): New define_expand. (movmisalign<ALLVEC_TI:mode>): Likewise. (mov<ALLVEC_TI:mode>_simd): New define_insn_and_split.	2025-02-19 14:34:44 +08:00
Xi Ruoyao	ed9794546d	LoongArch: Try harder using vrepli instructions to materialize const vectors For a = (v4si){0xdddddddd, 0xdddddddd, 0xdddddddd, 0xdddddddd} we just want vrepli.b $vr0, 0xdd but the compiler actually produces a load: la.local $r14,.LC0 vld $vr0,$r14,0 It's because we only tried vrepli.d which wouldn't work. Try all vrepli instructions when materializing const int vectors to fix it. gcc/ChangeLog: * config/loongarch/loongarch-protos.h (loongarch_const_vector_vrepli): New function prototype. * config/loongarch/loongarch.cc (loongarch_const_vector_vrepli): Implement. (loongarch_const_insns): Call loongarch_const_vector_vrepli instead of loongarch_const_vector_same_int_p. (loongarch_split_vector_move_p): Likewise. (loongarch_output_move): Use loongarch_const_vector_vrepli to pun operand[1] into a better mode if it's a const int vector, and decide the suffix of [x]vrepli with the new mode. * config/loongarch/constraints.md (YI): Call loongarch_const_vector_vrepli instead of loongarch_const_vector_same_int_p. gcc/testsuite/ChangeLog: * gcc.target/loongarch/vrepli.c: New test.	2025-02-19 14:34:44 +08:00
Xi Ruoyao	ea3ebe4815	LoongArch: Accept ADD, IOR or XOR when combining objects with no bits in common [PR115478] Since r15-1120, multi-word shifts/rotates produce PLUS instead of IOR. It's generally a good thing (allowing us to use our alsl instruction or similar instructions on other architectures), but it's preventing us from using bytepick. For example, if we shift a __int128 by 16 bits, the higher word can be produced via a single bytepick.d instruction with immediate 2, but we got: srli.d $r12,$r4,48 slli.d $r5,$r5,16 slli.d $r4,$r4,16 add.d $r5,$r12,$r5 jr $r1 This didn't work with GCC 14, but after r15-6490 it's supposed to work if IOR is used instead of PLUS. To fix this, add a code iterator to match IOR, XOR, and PLUS and use it instead of just IOR if we know the operands have no overlapping bits. gcc/ChangeLog: PR target/115478 * config/loongarch/loongarch.md (any_or_plus): New define_code_iterator. (bstrins_<mode>_for_ior_mask): Use any_or_plus instead of ior. (bytepick_w_<bytepick_imm>): Likewise. (bytepick_d_<bytepick_imm>): Likewise. (bytepick_d_<bytepick_imm>_rev): Likewise. gcc/testsuite/ChangeLog: PR target/115478 * gcc.target/loongarch/bytepick_shift_128.c: New test.	2025-02-19 14:34:25 +08:00
Jeff Law	3e93035fcc	[PR middle-end/113525] Drop obsolete options from documentation The sibling and unshare passes were dropped as distinct passes 10+ years ago. Docs weren't ever updated. This just removes them; given their age I don't think we need to keep them around any longer. PR middle-end/113525 gcc/ * doc/invoke.texi (dump-rtl-sibling): Drop documentation for pass removed long ago. (dump-rtl-unshare): Likewise.	2025-02-18 19:47:15 -07:00
GCC Administrator	db7b21ac87	Daily bump.	2025-02-19 00:18:02 +00:00
Andi Kleen	29482d4e53	Fix description of file-cache-lines/file-cache-files params The file-cache-lines / file-cache-files tunables were documented in the wrong section. Fix that. Reported-by: Filip Kastl Committed as obvious. gcc/ChangeLog: * doc/invoke.texi:	2025-02-18 15:40:28 -08:00
David Malcolm	fcdcccdbf8	analyzer: add more properties to sarif output Add some more properties to the analyzer's sarif output, to help with debugging -fanalyzer. gcc/analyzer/ChangeLog: * diagnostic-manager.cc (saved_diagnostic::maybe_add_sarif_properties): Add various properties for debugging, for m_stmt, m_var, and m_duplicates. Remove stray 'if' statement. Capture the kind of the pending_diagnostic. * region-model.cc (poisoned_value_diagnostic::maybe_add_sarif_properties): New. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2025-02-18 16:54:57 -05:00
David Malcolm	196e8dbddc	sarif output: fix alphabetization in sarif_scheme_handler::make_sink No functional change intended. Signed-off-by: David Malcolm <dmalcolm@redhat.com> gcc/ChangeLog: * opts-diagnostic.cc (sarif_scheme_handler::make_sink): Put properties in alphabetical order. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2025-02-18 16:54:52 -05:00
Roman Kagan	e129b8d768	libgcc: i386/linux-unwind.h: always rely on sys/ucontext.h When gcc is built for x86_64-linux-musl target, stack unwinding from within signal handler stops at the innermost signal frame. The reason for this behavior is that the signal trampoline is not accompanied by appropriate CFI directives, and the fallback path in libgcc to recognize it by the code sequence is only enabled for glibc except 2.0. The latter is motivated by the lack of sys/ucontext.h in that glibc version. Given that all relevant libc-s ship sys/ucontext.h for over a decade, and that other arches aren't shy of unconditionally using it, follow suit and remove the preprocessor condition, too. libgcc/ChangeLog: * config/i386/linux-unwind.h: Remove preprocessor condition to enable fallback path for all libc-s. Signed-off-by: Roman Kagan <rkagan@amazon.de>	2025-02-18 20:35:42 +01:00
Robin Dapp	44d4a1086d	RISC-V: Fix ratio in vsetvl fuse rule [PR115703]. In PR115703 we fuse two vsetvls: Fuse curr info since prev info compatible with it: prev_info: VALID (insn 438, bb 2) Demand fields: demand_ge_sew demand_non_zero_avl SEW=32, VLMUL=m1, RATIO=32, MAX_SEW=64 TAIL_POLICY=agnostic, MASK_POLICY=agnostic AVL=(reg:DI 0 zero) VL=(reg:DI 9 s1 [312]) curr_info: VALID (insn 92, bb 20) Demand fields: demand_ratio_and_ge_sew demand_avl SEW=64, VLMUL=m1, RATIO=64, MAX_SEW=64 TAIL_POLICY=agnostic, MASK_POLICY=agnostic AVL=(const_int 4 [0x4]) VL=(nil) prev_info after fused: VALID (insn 438, bb 2) Demand fields: demand_ratio_and_ge_sew demand_avl SEW=64, VLMUL=mf2, RATIO=64, MAX_SEW=64 TAIL_POLICY=agnostic, MASK_POLICY=agnostic AVL=(const_int 4 [0x4]) VL=(nil). The result is vsetvl zero, zero, e64, mf2, ta, ma. The previous vsetvl set vl = 4 but here we wrongly set it to vl = 2. As all the following vsetvls only ever change the ratio we never recover. The issue is quite difficult to trigger because we can often deduce the value of d at runtime. Then every check for the value of d will be optimized away. The last known bad commit is r15-3458-g5326306e7d9d36. With that commit the output is wrong but -fno-schedule-insns makes it correct. From the next commit on the issue is latent. I still added the PR's test as scan and run check even if they don't trigger right now. Not sure if the run test will ever fail but well. I verified that the patch fixes the issue when applied on top of r15-3458-g5326306e7d9d36. PR target/115703 gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc: Use max_sew for calculating the new LMUL. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr115703-run.c: New test. * gcc.target/riscv/rvv/autovec/pr115703.c: New test.	2025-02-18 16:52:09 +01:00
John David Anglin	8c03fbd776	testsuite: Include stdint.h instead of stdint-gcc.h in some tests When use_gcc_stdint=provide, the stdint-gcc.h header is not provided. 2025-02-18 John David Anglin <danglin@gcc.gnu.org> gcc/testsuite/ChangeLog: PR testsuite/116986 * gcc.dg/crc-builtin-rev-target32.c: Include stdint.h instead of stdint-gcc.h. * gcc.dg/crc-builtin-rev-target64.c: Likewise. * gcc.dg/crc-builtin-target32.c: Likewise. * gcc.dg/crc-builtin-target64.c: Likewise. * gcc.dg/torture/pr115387-2.c: Likewise.	2025-02-18 10:36:48 -05:00
Tobias Burnus	8d922a8039	gfortran.dg/gomp/metadirective-3.f90: xfail on offload_nvptx Currently, 'target' with a nested metadirective creating a 'teams' will fail with a bogus error ("‘target’ construct with nested ‘teams’ construct contains directives outside of the ‘teams’ construct"). That's tracked at PR118694 - and, hence, expected. However, the testcase metadirective-3.f90 triggers this when compiling for 'target offload_nvptx' (otherwise, the code is optimized away). Use xfail to silence the error as it is known and there is a tracking PR. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/metadirective-3.f90: Add xfail when compiling for offload_nvptx.	2025-02-18 15:48:39 +01:00
Richard Sandiford	c5752c1f01	late-combine: Tighten register class check [PR108840] gcc.target/aarch64/pr108840.c has failed since r15-268-g9dbff9c05520 (which means that I really ought to have looked at it earlier). The test wants us to fold an SImode AND into all shifts that use it. This is something that late-combine is supposed to do, but: (1) the pre-RA pass chickened out because of a register pressure check (2) the post-RA pass can't handle it, because the shift uses are in QImode and the sets are in SImode Both are things that would be good to fix. But (1) is particularly silly. The constraints on the AND have "rk" for the destination (so allowing the stack pointer) and "r" for the first source. Including the stack pointer made the destination seem more permissive than the source. The intention was instead to check whether there are any allocatable registers in the destination class that aren't present in the source. That's enough for all tests but the last one. The last one still fails because combine merges the final shift with the move into the hard return register, giving an arithmetic instruction with a hard register destination. Pre-RA late-combine currently punts on those, again due to register pressure concerns. That too is something I'd like to relax, but not for GCC 15. In the interim, the best thing seems to be to disable combine for the test. gcc/ PR rtl-optimization/108840 * late-combine.cc (late_combine::check_register_pressure): Take only allocatable registers into account when checking the permissiveness of register classes. gcc/testsuite/ PR rtl-optimization/108840 * gcc.target/aarch64/pr108840.c: Run at -O2 but disable combine.	2025-02-18 11:00:57 +00:00
Alex Coplan	facdce9028	pair-fusion: Tweak wording in dump message [PR118320] As discussed in https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675978.html this tweaks the dump message added with the fix for PR118320 since it doesn't just apply to load pairs. gcc/ChangeLog: PR rtl-optimization/118320 * pair-fusion.cc (pair_fusion_bb_info::fuse_pair): Tweak wording in dump message when punting on invalid use arrays.	2025-02-18 10:48:50 +00:00
Soumya AR	8606ab346b	aarch64: Use generic_armv8_a_prefetch_tune in generic_armv8_a.h generic_armv8_a.h defines generic_armv8_a_prefetch_tune but still uses generic_prefetch_tune in generic_armv8_a_tunings. This patch updates the pointer to generic_armv8_a_prefetch_tune. This patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. Signed-off-by: Soumya AR <soumyaa@nvidia.com> gcc/ChangeLog: * config/aarch64/tuning_models/generic_armv8_a.h: Updated prefetch struct pointer.	2025-02-18 14:11:06 +05:30
Richard Biener	6b8a8c9fd6	tree-optimization/98845 - ICE with tail-merging and DCE/DSE disabled The following shows that tail-merging will make dead SSA defs live in paths where it wasn't before, possibly introducing UB or as in this case, uses of abnormals that eventually fail coalescing later. The fix is to register such defs for stmt comparison. PR tree-optimization/98845 * tree-ssa-tail-merge.cc (stmt_local_def): Consider a def with no uses not local. * gcc.dg/pr98845.c: New testcase. * gcc.dg/pr81192.c: Adjust.	2025-02-18 08:55:47 +01:00
Jin Ma	b22f191b7c	RISC-V: Fix failed tests for regression due to fix ICE patch Ref: https://github.com/ewlu/gcc-precommit-ci/issues/3096#issue-2854419069 gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/bug-9.c: Added new failure check. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-17.c: Likewise. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-18.c: Likewise. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-19.c: Likewise. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-20.c: Likewise. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-21.c: Likewise. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-22.c: Likewise. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-23.c: Likewise. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-24.c: Likewise. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-25.c: Likewise. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-26.c: Likewise. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-27.c: Likewise. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-28.c: Likewise. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-29.c: Likewise. * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-3.c: Likewise.	2025-02-18 14:37:04 +08:00
Pan Li	17b95cfc31	RISC-V: Fix ICE for target attributes has different xlen size This patch would like to avoid the ICE when the target attribute specifies an xlen different from the cmd, i.e. compiling with rv64gc but with target attribute rv32gcv_zbb. For example as below: 1 │ long foo (long a, long b) 2 │ __attribute__((target("arch=rv32gcv_zbb"))); 3 │ 4 │ long foo (long a, long b) 5 │ { 6 │ return a + (b * 2); 7 │ } When compiled with rv64gc -O3, it will ICE similar to below during RTL pass: fwprop1 test.c: In function ‘foo’: test.c:10:1: internal compiler error: in add_use, at rtl-ssa/accesses.cc:1234 10 \| } \| ^ 0x44d6b9d internal_error(char const, ...) /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic-global-context.cc:517 0x44a26a6 fancy_abort(char const, int, char const) /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/diagnostic.cc:1722 0x408fac9 rtl_ssa::function_info::add_use(rtl_ssa::use_info) /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/accesses.cc:1234 0x40a5eea rtl_ssa::function_info::create_reg_use(rtl_ssa::function_info::build_info&, rtl_ssa::insn_info, rtl_ssa::resource_info) /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/insns.cc:496 0x4456738 rtl_ssa::function_info::add_artificial_accesses(rtl_ssa::function_info::build_info&, df_ref_flags) /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/blocks.cc:900 0x4457297 rtl_ssa::function_info::start_block(rtl_ssa::function_info::build_info&, rtl_ssa::bb_info) /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/blocks.cc:1082 0x4453627 rtl_ssa::function_info::bb_walker::before_dom_children(basic_block_def) /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/blocks.cc:118 0x3e9f3fb dom_walker::walk(basic_block_def) /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/domwalk.cc:311 0x445806f 
rtl_ssa::function_info::process_all_blocks() /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/blocks.cc:1298 0x40a22d3 rtl_ssa::function_info::function_info(function) /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/rtl-ssa/functions.cc:51 0x3ec3f80 fwprop_init /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/fwprop.cc:893 0x3ec420d fwprop /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/fwprop.cc:963 0x3ec43ad execute Considering that we are in stage 4, we just report an error for the above scenario when we detect that the cmd xlen differs from that of the target attribute during the target hook TARGET_OPTION_VALID_ATTRIBUTE_P implementation. PR target/118540 gcc/ChangeLog: * config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch): Report error when cmd xlen differs from target attribute. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr118540-1.c: New test. * gcc.target/riscv/rvv/base/pr118540-2.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2025-02-18 14:23:17 +08:00
Haochen Jiang	101e3101e0	i386: Re-order i386.opt.urls The order of i386.opt.urls needs to be the same as i386.opt. gcc/ChangeLog: * config/i386/i386.opt.urls: Adjust the order for avx10.2 and avx10.2-512 due to their order change in i386.opt.	2025-02-18 10:59:26 +08:00
Alexandre Oliva	f039584e9e	[testsuite] fix check-function-bodies usage The existing usage comment for check-function-bodies is presumably a typo, as it doesn't match existing uses. Fix it. for gcc/testsuite/ChangeLog * lib/scanasm.exp (check-function-bodies): Fix usage comment.	2025-02-17 23:26:30 -03:00
Alexandre Oliva	3768bedf7b	[ifcombine] cope with signbit tests of extended values A compare with zero may be taken as a sign bit test by fold_truth_andor_for_ifcombine, but the operand may be extended from a narrower field. If the operand was narrower, the bitsize will reflect the narrowing conversion, but if it was wider, we'll only know whether the field is sign- or zero-extended from unsignedp, but we won't know whether it needed to be extended, because arg will have changed to the narrower variable when we get to the point in which we can compute the arg width. If it's sign-extended, we're testing the right bit, but if it's zero-extended, there isn't any bit we can test. Instead of punting and leaving the foldable compare to be figured out by another pass, arrange for the sign bit resulting from the widening zero-extension to be taken as zero, so that the modified compare will yield the desired result. While at that, avoid swapping the right-hand compare operands when we've already determined that it was a signbit test: it is no use to even try. for gcc/ChangeLog PR tree-optimization/118805 * gimple-fold.cc (fold_truth_andor_for_ifcombine): Detect and cope with zero-extension in signbit tests. Reject swapping right-compare operands if rsignbit. for gcc/testsuite/ChangeLog PR tree-optimization/118805 * gcc.dg/field-merge-26.c: New.	2025-02-17 23:17:21 -03:00
GCC Administrator	938bda49de	Daily bump.	2025-02-18 00:18:41 +00:00
Tobias Burnus	8268c8256d	OpenMP/Fortran: extend 'adjust_args' clause, fixes for it and declare variant [PR115271] On the extension side, it implements OpenMP 6.0's numeric values/ranges for the adjust_args arguments, including 'omp_num_args'. And it adds parser support for need_device_addr. It also implements the post-OpenMP-6.0 clarification of OpenMP spec Issue #4443 regarding type(c_ptr) with dimension being invalid for need_device_ptr. To be done: Adding full support for need_device_addr (optional, array descriptor, ...). On the invalid side, it removes a bogus c_ptr check that went through all adjust_args without checking for need_device_ptr and the current scope. And it finally also processes 'declare variant' in an INTERFACE block, which is part of PR115271, but it does not handle .mod file yet - the main issue tracked in that PR. PR fortran/115271 gcc/fortran/ChangeLog: * gfortran.h (gfc_omp_namelist): Change need_device_ptr to adj_args union and add more flags. * openmp.cc (gfc_match_omp_declare_variant, gfc_resolve_omp_declare): For adjust_args, handle need_device_addr and numeric values/ranges besides dummy argument names. (resolve_omp_dispatch): Remove a bogus adjust_args check. * trans-decl.cc (gfc_handle_omp_declare_variant): New. (gfc_generate_module_vars, gfc_generate_function_code): Call it. * trans-openmp.cc (gfc_trans_omp_declare_variant): Handle numeric values/ranges besides dummy argument names. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/adjust-args-1.f90: Update dg-.* expectations. * gfortran.dg/gomp/adjust-args-2.f90: Likewise. * gfortran.dg/gomp/adjust-args-2a.f90: Likewise. * gfortran.dg/gomp/adjust-args-3.f90: Likewise. * gfortran.dg/gomp/adjust-args-4.f90: Remove array from c_ptr. * gfortran.dg/gomp/adjust-args-5.f90: Likewise. * gfortran.dg/gomp/adjust-args-11.f90: Likewise. Add check that INTERFACE is now handled in subroutines and in modules. * gfortran.dg/gomp/adjust-args-13.f90: New test. 
* gfortran.dg/gomp/adjust-args-14.f90: New test. * gfortran.dg/gomp/adjust-args-15.f90: New test. * gfortran.dg/gomp/declare-variant-21.f90: New test.	2025-02-17 22:52:34 +01:00
Uros Bizjak	565d4e7554	i386: Simplify PARALLEL RTX scan in ix86_find_all_reg_use UNSPEC and UNSPEC_VOLATILE never store. Remove unnecessary checks and simplify RTX scan in ix86_find_all_reg_use to scan only for SET RTX in the PARALLEL. gcc/ChangeLog: * config/i386/i386.cc (ix86_find_all_reg_use): Scan only for SET RTX in PARALLEL.	2025-02-17 20:49:19 +01:00
Uros Bizjak	09684c53bc	middle-end: Fixup constant integers when expanding __builtin_crc [PR118288] Constant integers with MSB set have to be represented as corresponding signed integers. Use gen_int_mode to emit them in the correct way. PR middle-end/118288 gcc/ChangeLog: * builtins.cc (expand_builtin_crc_table_based): Use gen_int_mode to emit constant integers with MSB set. gcc/testsuite/ChangeLog: * gcc.dg/pr118288.c: New test.	2025-02-17 20:47:53 +01:00
Marek Polacek	1787119229	c++: add fixed test [PR102455] Fixed by r13-4564 but the tests are very different. PR c++/102455 gcc/testsuite/ChangeLog: * g++.dg/ext/vector43.C: New test.	2025-02-17 12:36:18 -05:00
Jason Merrill	720c8f6852	c++: extended temps and statement-exprs [PR118763] My last patch for 118856 broke the test for 118763 (which my testing didn't catch, for some reason), because it effectively reverted Jakub's recent fix (r15-7415) for that bug. It seems we need a new flag to indicate internal temporaries. In that patch Jakub wondered if other uses of CLEANUP_EH_ONLY would have the same issue with jumps out of a statement-expr, and indeed it seems that maybe_push_temp_cleanup and now set_up_extended_ref_temp have the same problem. Since maybe_push_temp_cleanup already uses a flag, we can easily stop setting CLEANUP_EH_ONLY there as well. Since set_up_extended_ref_temp doesn't, working around this issue there will be more involved. PR c++/118856 PR c++/118763 gcc/cp/ChangeLog: * cp-tree.h (TARGET_EXPR_INTERNAL_P): New. * call.cc (extend_temps_r): Check it instead of CLEANUP_EH_ONLY. * tree.cc (get_internal_target_expr): Set it instead. * typeck2.cc (maybe_push_temp_cleanup): Don't set CLEANUP_EH_ONLY. gcc/testsuite/ChangeLog: * g++.dg/ext/stmtexpr29.C: New test.	2025-02-17 17:19:25 +00:00
Marek Polacek	5954c5a7c2	c++: add fixed test [PR96364] We were rejecting this, but the test compiles correctly since r14-6346. PR c++/96364 gcc/testsuite/ChangeLog: * g++.dg/cpp0x/gen-attrs-88.C: New test.	2025-02-17 12:15:23 -05:00
Richard Biener	dfd0ced98f	tree-optimization/118895 - ICE during PRE When we simplify a NARY during PHI translation we have to make sure to not inject not available operands into it given that might violate the valueization hook constraints and we'd pick up invalid context-sensitive data in further simplification or as in this case later ICE when we try to insert the expression. PR tree-optimization/118895 * tree-ssa-sccvn.cc (vn_nary_build_or_lookup_1): Only allow CSE if we can verify the result is available. * gcc.dg/pr118895.c: New testcase.	2025-02-17 14:41:25 +01:00
Georg-Johann Lay	230678c19c	AVR: ad target/118764 - Mention CVT availability in device-specs comment. gcc/ PR target/118764 * config/avr/gen-avr-mmcu-specs.cc (print_mcu) [has CVT]: Mention CVT in header comment of generated specs file.	2025-02-17 14:35:08 +01:00
Matthew Malcomson	9335ff73a5	gcc: testsuite: Fix builtin-speculation-overloads[14].C testism When making warnings trigger a failure in template substitution I could not find any way to trigger the warning about builtin speculation not being available on the given target. Turns out I misread the code -- this warning happens when the speculation_barrier pattern is not defined. Here we add an effective target to represent "__builtin_speculation_safe_value is available on this target" and use that to adjust our test on SFINAE behaviour accordingly. N.b. this means that we get extra testing -- not just that things work on targets which support __builtin_speculation_safe_value, but also that the behaviour works on targets which don't support it. Tested with AArch64 native, AArch64 cross compiler, and RISC-V cross compiler (just running the tests that I've changed). Ok for trunk? gcc/testsuite/ChangeLog: PR target/117991 * g++.dg/template/builtin-speculation-overloads.def: SUCCESS argument in SPECULATION_ASSERTS now uses a macro `true_def` instead of the literal `true` for arguments which should work with `__builtin_speculation_safe_value`. * g++.dg/template/builtin-speculation-overloads1.C: Define `true_def` macro on command line to compiler according to the effective target representing that `__builtin_speculation_safe_value` does something on this target. * g++.dg/template/builtin-speculation-overloads4.C: Likewise. * lib/target-supports.exp (check_effective_target_speculation_barrier_defined): New. Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>	2025-02-17 11:09:51 +00:00
Thomas Koenig	b57e6e1b38	Avoid shift wider than unsigned HOST_WIDE_INT on unsigned integer exponentiation. This patch is a variation of Jakub's patch in the PR, which avoids overflow on the mask used for exponentiation and fixes unsigned HOST_WIDE_INT. I tried testing this on a POWER machine, but --with-build-config=bootstrap-ubsan fails bootstrap there. gcc/fortran/ChangeLog: PR fortran/118862 * trans-expr.cc (gfc_conv_cst_int_power): Use functions for unsigned wide integer. (gfc_conv_cst_uint_power): Avoid generating the mask if it would overflow an unsigned HOST_WIDE_INT. Format fixes.	2025-02-17 08:19:29 +01:00
Haochen Jiang	46276080e7	i386: Regenerate i386.opt.urls We need to regenerate i386.opt.urls after removing -mavx10.1. gcc/ChangeLog: * config/i386/i386.opt.urls: Regenerated.	2025-02-17 14:05:02 +08:00
Haochen Jiang	9ea56e2a3e	i386: Re-alias avx10.2 to 512 bit and deprecate -mno-avx10.2-[256,512] As mentioned in avx10.1 option deprecate patch, based on the feedback we got, we would like to re-alias avx10.x to 512 bit. For -mno- options, also mentioned in the previous patch, it is confusing what it is disabling when it comes to avx10. So we will only provide -mno-avx10.x options from AVX10.2, disabling the whole AVX10.x. gcc/ChangeLog: * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_AVX10_1_UNSET): Adjust macro. (OPTION_MASK_ISA2_AVX10_2_256_UNSET): Removed. (OPTION_MASK_ISA2_AVX10_2_512_UNSET): Ditto. (OPTION_MASK_ISA2_AVX10_2_UNSET): New. (ix86_handle_option): Remove disable part for avx10.2-256. Rename avx10.2-512 switch case to avx10.2 and adjust disable part macro. * common/config/i386/i386-isas.h: Adjust avx10.2 and avx10.2-512. * config/i386/driver-i386.cc (host_detect_local_cpu): Do not append -mno-avx10.x-256 for -march=native. * config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p): Adjust avx10.2 and avx10.2-512. * config/i386/i386.opt: Reject Negative for mavx10.2-256. Alias mavx10.2-512 to mavx10.2. Reject Negative for mavx10.2-512. * doc/extend.texi: Adjust documentation. * doc/sourcebuild.texi: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-512-vminmaxbf16-2.c: Add missing avx10_2_512 check. * gcc.target/i386/avx10_2-512-vminmaxpd-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminmaxph-2.c: Ditto. * gcc.target/i386/avx10_2-512-vminmaxps-2.c: Ditto. * gcc.target/i386/avx10-check.h: Change avx10.2 to avx10.2-256. * gcc.target/i386/avx10_2-bf16-1.c: Ditto. * gcc.target/i386/avx10_2-bf16-vector-cmp-1.c: Ditto. * gcc.target/i386/avx10_2-bf16-vector-fma-1.c: Ditto. * gcc.target/i386/avx10_2-bf16-vector-operations-1.c: Ditto. * gcc.target/i386/avx10_2-bf16-vector-smaxmin-1.c: Ditto. * gcc.target/i386/avx10_2-builtin-1.c: Ditto. * gcc.target/i386/avx10_2-builtin-2.c: Ditto. * gcc.target/i386/avx10_2-comibf-1.c: Ditto. 
* gcc.target/i386/avx10_2-comibf-2.c: Ditto. * gcc.target/i386/avx10_2-comibf-3.c: Ditto. * gcc.target/i386/avx10_2-comibf-4.c: Ditto. * gcc.target/i386/avx10_2-compare-1.c: Ditto. * gcc.target/i386/avx10_2-compare-1b.c: Ditto. * gcc.target/i386/avx10_2-convert-1.c: Ditto. * gcc.target/i386/avx10_2-media-1.c: Ditto. * gcc.target/i386/avx10_2-minmax-1.c: Ditto. * gcc.target/i386/avx10_2-movrs-1.c: Ditto. * gcc.target/i386/avx10_2-partial-bf16-vector-fast-math-1.c: Ditto. * gcc.target/i386/avx10_2-partial-bf16-vector-fma-1.c: Ditto. * gcc.target/i386/avx10_2-partial-bf16-vector-operations-1.c: Ditto. * gcc.target/i386/avx10_2-partial-bf16-vector-smaxmin-1.c: Ditto. * gcc.target/i386/avx10_2-rounding-1.c: Ditto. * gcc.target/i386/avx10_2-rounding-2.c: Ditto. * gcc.target/i386/avx10_2-rounding-3.c: Ditto. * gcc.target/i386/avx10_2-satcvt-1.c: Ditto. * gcc.target/i386/avx10_2-vaddbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vcmpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vcomisbf16-1.c: Ditto. * gcc.target/i386/avx10_2-vcomisbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vcvt2ph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvt2ph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvt2ph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvt2ph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvt2ps2phx-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbf162ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbf162iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtbiasph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvthf82ph-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtph2bf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtph2bf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtph2hf8-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtph2hf8s-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtph2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvtph2iubs-2.c: Ditto. 
* gcc.target/i386/avx10_2-vcvtps2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttbf162ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttbf162iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2dqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttpd2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttph2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttph2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2dqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2ibs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2iubs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2qqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2udqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttps2uqqs-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttsd2sis-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttsd2usis-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttss2sis-2.c: Ditto. * gcc.target/i386/avx10_2-vcvttss2usis-2.c: Ditto. * gcc.target/i386/avx10_2-vdivbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vdpphps-2.c: Ditto. * gcc.target/i386/avx10_2-vfmaddXXXbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfmsubXXXbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfnmaddXXXbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfnmsubXXXbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vfpclassbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vgetexpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vgetmantbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vmaxbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vminbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxpd-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxph-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxps-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxsd-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxsh-2.c: Ditto. * gcc.target/i386/avx10_2-vminmaxss-2.c: Ditto. * gcc.target/i386/avx10_2-vmovd-1.c: Ditto. * gcc.target/i386/avx10_2-vmovd-2.c: Ditto. * gcc.target/i386/avx10_2-vmovw-1.c: Ditto. 
* gcc.target/i386/avx10_2-vmovw-2.c: Ditto. * gcc.target/i386/avx10_2-vmpsadbw-2.c: Ditto. * gcc.target/i386/avx10_2-vmulbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbssd-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbssds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbsud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbsuds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbuud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpbuuds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwsud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwsuds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwusd-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwusds-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwuud-2.c: Ditto. * gcc.target/i386/avx10_2-vpdpwuuds-2.c: Ditto. * gcc.target/i386/avx10_2-vrcpbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vreducebf16-2.c: Ditto. * gcc.target/i386/avx10_2-vrndscalebf16-2.c: Ditto. * gcc.target/i386/avx10_2-vrsqrtbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vscalefbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vsqrtbf16-2.c: Ditto. * gcc.target/i386/avx10_2-vsubbf16-2.c: Ditto. * gcc.target/i386/funcspec-56.inc: Ditto. * gcc.target/i386/part-vect-vec_cmpbf.c: Ditto. * gcc.target/i386/pr117495.c: Ditto. * gcc.target/i386/sm4-avx10_2-1.c: Ditto. * gcc.target/i386/sm4-check.h: Ditto. * gcc.target/i386/vnniint16-auto-vectorize-3.c: Ditto. * gcc.target/i386/vnniint8-auto-vectorize-3.c: Ditto. * lib/target-supports.exp: Ditto.	2025-02-17 11:10:08 +08:00
Haochen Jiang	e4f4a5c85e	i386: Deprecate -m[no-]avx10.1 and make -mno-avx10.1-512 to disable the whole AVX10.1 Based on the feedback we got, we would like to re-alias avx10.x to 512 bit in the future. This leaves the current avx10.1 alias to 256 bit inconsistent. Since it has been there for GCC 14.1 and GCC 14.2, we decide to deprecate avx10.1 alias. The current proposal is not adding it back in the future, but it might change if necessary. For -mno- options, it is confusing what it is disabling when it comes to avx10. Since there is barely usage enabling AVX10 with 512 bit then disabling it, we will only provide -mno-avx10.x options in the future, disabling the whole AVX10.x. If someone really wants to disable 512 bit after enabling it, -mavx10.x-512 -mno-avx10.x -mavx10.x-256 is the only way to do that since we also do not want to break the usual expression on -m- options enabling everything mentioned. However, for avx10.1, since we deprecated avx10.1, there is no reason we should have -mno-avx10.1. Thus, we need to keep -mno-avx10.1-[256,512]. To avoid confusion, we will make -mno-avx10.1-512 to disable the whole AVX10.1 set to match the future -mno-avx10.x. gcc/ChangeLog: * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_AVX2_UNSET): Change AVX10.1 unset macro. (OPTION_MASK_ISA2_AVX10_1_256_UNSET): Removed. (OPTION_MASK_ISA2_AVX10_1_512_UNSET): Removed. (OPTION_MASK_ISA2_AVX10_1_UNSET): New. (ix86_handle_option): Adjust AVX10.1 unset macro. * common/config/i386/i386-isas.h: Remove avx10.1. * config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p): Ditto. (ix86_option_override_internal): Adjust warning message. * config/i386/i386.opt: Remove mavx10.1. * doc/extend.texi: Remove avx10.1 and adjust doc. * doc/sourcebuild.texi: Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10-check.h: Change to avx10.1-256. * gcc.target/i386/avx10_1-1.c: Ditto. * gcc.target/i386/avx10_1-13.c: Ditto. * gcc.target/i386/avx10_1-14.c: Ditto. 
* gcc.target/i386/avx10_1-21.c: Ditto. * gcc.target/i386/avx10_1-22.c: Ditto. * gcc.target/i386/avx10_1-23.c: Ditto. * gcc.target/i386/avx10_1-24.c: Ditto. * gcc.target/i386/avx10_1-3.c: Ditto. * gcc.target/i386/avx10_1-5.c: Ditto. * gcc.target/i386/avx10_1-6.c: Ditto. * gcc.target/i386/avx10_1-8.c: Ditto. * gcc.target/i386/pr117946.c: Ditto. * gcc.target/i386/avx10_1-12.c: Adjust warning message. * gcc.target/i386/avx10_1-19.c: Ditto. * gcc.target/i386/avx10_1-17.c: Adjust to no-avx10.1-512.	2025-02-17 11:10:07 +08:00
Haochen Jiang	e15216046d	i386: Do not check vector size conflict when AVX512 is not explicitly set [PR 118815] When AVX512 is not explicitly set, we should not take EVEX512 bit into consideration when checking vector size. It will solve the intrin header file reporting warnings when compiling with -Wsystem-headers. However, there is side effect on the usage for '-march=xxx -mavx10.1-256', where xxx is with AVX512. It will not report warning on vector size for now. Since it is a rare usage, we will take it. gcc/ChangeLog: PR target/118815 * config/i386/i386-options.cc (ix86_option_override_internal): Do not check vector size conflict when AVX512 is not explicitly set. gcc/testsuite/ChangeLog: PR target/118815 * gcc.target/i386/pr118815.c: New test.	2025-02-17 11:01:36 +08:00
Lulu Cheng	ae14d7d04d	LoongArch: Fix the issue of function jump out of range caused by crtbeginS.o [PR118844]. Due to the presence of R_LARCH_B26 in /usr/lib/gcc/loongarch64-linux-gnu/14/crtbeginS.o, its addressing range is [PC-128MiB, PC+128MiB-4]. This means that when the code segment size exceeds 128MB, linking with lld will definitely fail (ld will not fail because the order of the two is different). The linking order: lld: crtbeginS.o + .text + .plt ld : .plt + crtbeginS.o + .text To solve this issue, add '-mcmodel=extreme' when compiling crtbeginS.o. PR target/118844 libgcc/ChangeLog: * config/loongarch/t-crtstuff: Add '-mcmodel=extreme' to CRTSTUFF_T_CFLAGS_S.	2025-02-17 10:15:39 +08:00
GCC Administrator	2ef2b206c4	Daily bump.	2025-02-17 00:16:48 +00:00
Jakub Jelinek	68e74199c6	[PR target/118248] Avoid bogus alloca call in RISC-V backend This is Jakub's patch and Ian's testcase for the slightly vexing fault building the D runtime with an s390x-x-riscv cross compiler. The core issue is we're allocating a vector to hold temporary registers unconditionally, including cases where the vector isn't needed because the loop isn't going to iterate. In the cases where the vector isn't needed the length is computed with an expression (x / y) - 1 where x / y will be zero. The alloca(-1) on the s390 platform triggers a fault. We haven't seen the fault with an x86 cross, but we can certainly see the bogus value being passed to alloca with a debugger. Jakub's patch just conditionalizes the whole block in a sensible way. So it looks larger than it really is. I thought it might be better to do a bit of manual CSE on this code to make it even more obvious, but I think we're ultimately OK here. Ian provided the testcase, collapsed down into equivalent C code. Again, it doesn't fault on an x86-x-riscv, but I can see the incorrect behavior with a debugger. And a shout-out to Stefan for providing a docker based reproducer, it really helped track this down. PR target/118248 gcc/ * config/riscv/riscv-string.cc (riscv_block_move_straight): Only allocate REGS buffer if it will be needed. gcc/testsuite * gcc.target/riscv/pr118248.c: New test.	2025-02-16 11:19:20 -07:00
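
The last entry above (PR target/118248) describes a size expression of the form (x / y) - 1 that becomes -1 when the loop will not iterate. The following is a minimal sketch of that pattern and of guarding the allocation, not the actual riscv_block_move_straight code: the names scratch_slots and copy_with_scratch are hypothetical, and malloc stands in for alloca so the sketch stays portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the size expression described above:
   (x / y) - 1, which evaluates to -1 whenever x < y.  */
static long scratch_slots (long length, long chunk)
{
  assert (chunk > 0);
  return length / chunk - 1;
}

/* Allocate the scratch buffer only when the copy loop will iterate.
   Returns the number of slots used (0 if the buffer was skipped).  */
static size_t copy_with_scratch (long length, long chunk)
{
  assert (chunk > 0 && length >= 0);
  if (length / chunk == 0)
    return 0;  /* Loop body never runs, so no buffer is requested.  */

  size_t slots = (size_t) scratch_slots (length, chunk);
  /* malloc is an assumption here (the commit talks about alloca);
     request at least one element so a zero-slot case stays well defined.  */
  long *regs = malloc ((slots ? slots : 1) * sizeof *regs);
  if (regs == NULL)
    return 0;
  for (size_t i = 0; i < slots; i++)
    regs[i] = (long) i;  /* Placeholder for per-iteration temporaries.  */
  free (regs);
  return slots;
}
```

Calling copy_with_scratch (3, 8) skips the allocation entirely, whereas an unguarded call would have passed the -1 from scratch_slots (3, 8) on as a size.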

217696 commits