For the testcase
void __cond_swap(double* __x, double* __y) {
bool __r = (*__x < *__y);
auto __tmp = __r ? *__x : *__y;
*__y = __r ? *__y : *__x;
*__x = __tmp;
}
GCC-14 with -O2 and -march=x86-64 options generates the following code:
__cond_swap(double*, double*):
movsd xmm1, QWORD PTR [rdi]
movsd xmm0, QWORD PTR [rsi]
comisd xmm0, xmm1
jbe .L2
movq rax, xmm1
movapd xmm1, xmm0
movq xmm0, rax
.L2:
movsd QWORD PTR [rsi], xmm1
movsd QWORD PTR [rdi], xmm0
ret
rax is used to save and restore the DFmode value. In RA, both GENERAL_REGS
and SSE_REGS cost zero since we didn't disparage the
alternative in the movdf_internal pattern, so according to the register
allocation order, GENERAL_REGS is allocated. The patch adds '?' to the
(r,v) and (v,r) alternatives, just like we did for the movsf/hf/bf_internal
patterns; after that we get optimal RA.
__cond_swap:
.LFB0:
.cfi_startproc
movsd (%rdi), %xmm1
movsd (%rsi), %xmm0
comisd %xmm1, %xmm0
jbe .L2
movapd %xmm1, %xmm2
movapd %xmm0, %xmm1
movapd %xmm2, %xmm0
.L2:
movsd %xmm1, (%rsi)
movsd %xmm0, (%rdi)
ret
gcc/ChangeLog:
PR target/110170
* config/i386/i386.md (movdf_internal): Disparage slightly for
2 alternatives (r,v) and (v,r) by adding constraint modifier
'?'.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr110170-3.c: New test.
PR106907 has a few warnings spotted by cppcheck, among them a redundant
initialization issue: the initial value of 'new_addr' was overwritten
before it was read. Update the source by removing the unnecessary
initialization of 'new_addr'.
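For illustration, a minimal, self-contained sketch of the pattern being removed
(the name 'new_addr' is reused here purely for illustration; the real code lives
in rs6000_expand_vector_extract):
#include <stdio.h>
/* The cppcheck "redundant initialization" pattern: the initial value is a
   dead store because it is overwritten before it is ever read.  */
int
main (void)
{
  int new_addr = 0;   /* dead store: overwritten before any read */
  new_addr = 42;      /* first value actually used */
  printf ("%d\n", new_addr);
  return 0;
}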
2023-07-06 Jeevitha Palanisamy <jeevitha@linux.ibm.com>
gcc/
PR target/106907
* config/rs6000/rs6000.cc (rs6000_expand_vector_extract): Remove redundant
initialization of new_addr.
If a loop is unrolled during vectorization (i.e. suggested_unroll_factor > 1),
the VFs of both the main and epilog loops are enlarged. The epilog vect loop
is specifically for a loop with small iteration counts, so a large VF may hurt
performance.
This patch unscales the main loop VF by suggested_unroll_factor while selecting
the epilog loop VF, so that it will be the same as for a vectorized loop without
unrolling (i.e. suggested_unroll_factor = 1).
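As a rough numeric sketch of the idea (the values below are assumed for
illustration and are not taken from the patch):
#include <stdio.h>
/* Divide the unroll factor back out of the main loop VF when choosing the
   epilog VF, so the choice matches the non-unrolled case.  */
int
main (void)
{
  unsigned main_vf = 16;                 /* assumed main loop VF with unrolling */
  unsigned suggested_unroll_factor = 4;  /* assumed unroll factor */
  unsigned epilog_base_vf = main_vf / suggested_unroll_factor;
  printf ("epilog VF selection starts from %u\n", epilog_base_vf);  /* 4 */
  return 0;
}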
gcc/ChangeLog:
PR tree-optimization/110474
* tree-vect-loop.cc (vect_analyze_loop_2): Unscale the VF by the
suggested unroll factor while selecting the epilog vect loop VF.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/pr110474.c: New testcase.
Rather than creating long call chains, put the onus for finishing
the evaluation on the caller.
* gimple-range-gori.cc (compute_operand_range): After calling
compute_operand2_range, recursively call self if needed.
(compute_operand2_range): Turn into a leaf function.
(gori_compute::compute_operand1_and_operand2_range): Finish
operand2 calculation.
* gimple-range-gori.h (compute_operand2_range): Remove name param.
Rather than creating long call chains, put the onus for finishing
the evaluation on the caller.
* gimple-range-gori.cc (compute_operand_range): After calling
compute_operand1_range, recursively call self if needed.
(compute_operand1_range): Turn into a leaf function.
(gori_compute::compute_operand1_and_operand2_range): Finish
operand1 calculation.
* gimple-range-gori.h (compute_operand1_range): Remove name param.
Move the check for co-dependency between 2 operands into
compute_operand_range, resulting in a much cleaner
compute_operand1_and_operand2_range routine.
* gimple-range-gori.cc (compute_operand_range): Check for
operand interdependence when both op1 and op2 are computed.
(compute_operand1_and_operand2_range): No checks required now.
compute_operand1_range and compute_operand2_range were both doing
relation discovery between the 2 operands... move it into a common area.
* gimple-range-gori.cc (compute_operand_range): Check for
a relation between op1 and op2 and use that instead.
(compute_operand1_range): Don't look for a relation override.
(compute_operand2_range): Ditto.
This testcase is causing some timeout issues. This patch splits the
testcase up by individual set algorithm.
libstdc++-v3:/ChangeLog:
* testsuite/25_algorithms/pstl/alg_sorting/set.cc: Delete
file.
* testsuite/25_algorithms/pstl/alg_sorting/set_difference.cc:
New file.
* testsuite/25_algorithms/pstl/alg_sorting/set_intersection.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_sorting/set_symmetric_difference.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_sorting/set_union.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_sorting/set_util.h:
Likewise.
The mod-subtract optimization with ncounts==1 produced incorrect edge
probabilities due to incorrect conditional probability calculation. This
patch fixes the calculation.
Signed-off-by: Filip Kastl <filip.kastl@gmail.com>
gcc/ChangeLog:
* value-prof.cc (gimple_mod_subtract_transform): Correct edge
prob calculation.
Also change some internal variables to bool.
gcc/ChangeLog:
* sched-int.h (struct haifa_sched_info): Change the can_schedule_ready_p,
schedule_more_p and contributes_to_priority indirect function
types from int to bool.
(no_real_insns_p): Change return type from int to bool.
(contributes_to_priority): Ditto.
* haifa-sched.cc (no_real_insns_p): Change return type from
int to bool and adjust function body accordingly.
* modulo-sched.cc (try_scheduling_node_in_cycle): Change "success"
variable type from int to bool.
(ps_insn_advance_column): Change return type from int to bool.
(ps_has_conflicts): Ditto. Change "has_conflicts"
variable type from int to bool.
* sched-deps.cc (deps_may_trap_p): Change return type from int to bool.
(conditions_mutex_p): Ditto.
* sched-ebb.cc (schedule_more_p): Ditto.
(ebb_contributes_to_priority): Change return type from
int to bool and adjust function body accordingly.
* sched-rgn.cc (is_cfg_nonregular): Ditto.
(check_live_1): Ditto.
(is_pfree): Ditto.
(find_conditional_protection): Ditto.
(is_conditionally_protected): Ditto.
(is_prisky): Ditto.
(is_exception_free): Ditto.
(haifa_find_rgns): Change "unreachable" and "too_large_failure"
variables from int to bool.
(extend_rgns): Change "rescan" variable from int to bool.
(check_live): Change return type from
int to bool and adjust function body accordingly.
(can_schedule_ready_p): Ditto.
(schedule_more_p): Ditto.
(contributes_to_priority): Ditto.
In gimple-isel we already deduce a vec_set pattern from an
ARRAY_REF(VIEW_CONVERT_EXPR). This patch does the same for a
vec_extract.
The code is largely similar to the vec_set one
including the addition of a can_vec_extract_var_idx_p function
in optabs.cc to check if the backend can handle a register
operand as index. We already have can_vec_extract in
optabs-query but that one checks whether we can extract
specific modes.
With the introduction of an internal function for vec_extract
the expander must not FAIL. For vec_set this has already been
the case so adjust the documentation accordingly.
Additionally, clarify the wording of the vector-vector case for
vec_extract.
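For illustration, the kind of source-level extract with a variable index that
this enables (a sketch, not the actual testcase):
typedef int v4si __attribute__ ((vector_size (16)));
/* The subscript with a non-constant index is represented internally as an
   ARRAY_REF of a VIEW_CONVERT_EXPR, which gimple-isel can now turn into a
   vec_extract internal function call when the target supports it.  */
int
get_elem (v4si v, int idx)
{
  return v[idx];
}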
gcc/ChangeLog:
* doc/md.texi: Document that vec_set and vec_extract must not
fail.
* gimple-isel.cc (gimple_expand_vec_set_expr): Rename this...
(gimple_expand_vec_set_extract_expr): ...to this.
(gimple_expand_vec_exprs): Call renamed function.
* internal-fn.cc (vec_extract_direct): Add.
(expand_vec_extract_optab_fn): New function to expand
vec_extract optab.
(direct_vec_extract_optab_supported_p): Add.
* internal-fn.def (VEC_EXTRACT): Add.
* optabs.cc (can_vec_extract_var_idx_p): New function.
* optabs.h (can_vec_extract_var_idx_p): Declare.
This patch adds a gen_lowpart in the vec_extract expander so it properly
works with a variable index and adds tests.
gcc/ChangeLog:
* config/riscv/autovec.md: Add gen_lowpart.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c: Add
tests for variable index.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-zvfh-run.c:
Ditto.
This patch would like to take FRM_DYN const rtx as the rounding mode
operand according to the RVV spec, which takes the dyn as the only
rounding mode for floating-point.
Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins.cc
(function_expander::use_exact_insn): Use FRM_DYN instead of const0.
Signed-off-by: Pan Li <pan2.li@intel.com>
This fixes a bug in the autovec FP narrowing patterns which resulted in
a combine ICE. Combine would try to e.g. simplify a unary operation via
simplify_const_unary_operation, which obviously expects a float_truncate
and not a truncate for a floating-point mode.
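A sketch of the kind of FP narrowing loop those autovec patterns cover
(illustrative only; not the actual reproducer):
/* double -> float conversion over a loop; the RTL for the vectorized
   narrowing must use float_truncate, not truncate.  */
void
narrow (float *restrict dst, double *restrict src, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = (float) src[i];
}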
gcc/ChangeLog:
* config/riscv/autovec.md: Use float_truncate.
RISC-V lowers the TYPE_PRECISION for MODE_VECTOR_BOOL vectors in order
to distinguish between VNx1BI, VNx2BI, VNx4BI and VNx8BI.
This patch adjusts uses of MODE_VECTOR_BOOL to use GET_MODE_PRECISION
instead of GET_MODE_BITSIZE.
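For illustration, a loop whose vectorization produces a MODE_VECTOR_BOOL mask
(a sketch, not one of the new test bodies):
/* The vector comparison result is a boolean mask vector internally, whose
   TYPE_PRECISION RISC-V lowers to distinguish VNx1BI..VNx8BI.  */
void
cmp_mask (int *restrict out, int *restrict a, int *restrict b, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = a[i] < b[i];
}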
The RISC-V tests are provided by Juzhe.
Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>
gcc/c-family/ChangeLog:
* c-common.cc (c_common_type_for_mode): Use GET_MODE_PRECISION.
gcc/ChangeLog:
* simplify-rtx.cc (native_encode_rtx): Ditto.
(native_decode_vector_rtx): Ditto.
(simplify_const_vector_byte_offset): Ditto.
(simplify_const_vector_subreg): Ditto.
* tree.cc (build_truth_vector_type_for_mode): Ditto.
* varasm.cc (output_constant_pool_2): Ditto.
gcc/fortran/ChangeLog:
* trans-types.cc (gfc_type_for_mode): Ditto.
gcc/go/ChangeLog:
* go-lang.cc (go_langhook_type_for_mode): Ditto.
gcc/lto/ChangeLog:
* lto-lang.cc (lto_type_for_mode): Ditto.
gcc/rust/ChangeLog:
* backend/rust-tree.cc (c_common_type_for_mode): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-10.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-11.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-12.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-13.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-14.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-8.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/bitmask-9.c: New test.
MIPSr6 supports unaligned memory access with the normal lh/sh/lw/sw/ld/sd
instructions, and thus lwl/lwr/ldl/ldr and swl/swr/sdl/sdr are removed.
In the microarchitecture, these memory access instructions issue 2
operations if the address is not aligned, which is like what the lwl family
does.
For some situations (such as accessing a page boundary) on some
microarchitectures, the unaligned access may not be good enough,
and the kernel should trap and emulate it; in that case the kernel may need
the -mno-unaligned-access option.
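For reference, the shape of code exercised by the new expand-block-move tests
might look like this (a sketch; the actual tests may differ):
#include <string.h>
/* A small fixed-size copy that mips_expand_block_move can open-code; with
   -mno-unaligned-access on r6 it must not be expanded when src or dest may
   be unaligned.  */
void
copy32 (char *dst, const char *src)
{
  memcpy (dst, src, 32);
}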
gcc/
* config/mips/mips.cc (mips_expand_block_move): Don't expand for
r6 with the -mno-unaligned-access option if one or both of src and
dest are unaligned.  Restructure to return directly if length is not constant.
(mips_block_move_straight): Use emit_move if ISA_HAS_UNALIGNED_ACCESS.
gcc/testsuite/
* gcc.target/mips/expand-block-move-r6-no-unaligned.c: New test.
* gcc.target/mips/expand-block-move-r6.c: New test.
gcc.dg/vect/slp-perm-9.c is reported to FAIL with -march=cascadelake
now, which is because we now vectorize the epilogue with V2HImode
vectors after the recent change to scrap too large vector
epilogues during analysis time rather than during transform.
The following adjusts the testcase to always use the existing alternate
N, which avoids epilogue vectorization.
* gcc.dg/vect/slp-perm-9.c: Always use alternate N.
The test installed by "x86: make VPTERNLOG* usable on less than 512-bit
operands with just AVX512F" won't succeed on 32-bit, as floating point
operations are done there (by default) without using SIMD insns.
gcc/testsuite/
* gcc.target/i386/avx512f-copysign.c: Suppress for 32-bit.
Following the two-operand bitwise operations, add another splitter to also
deal with not followed by broadcast all on its own, which can be
expressed as a simple embedded broadcast instead once a broadcast operand
is actually permitted in the respective insn. While there, also permit
a broadcast operand in the corresponding expander.
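A sketch of the source-level shape the new splitter targets (illustrative
only, not the actual pr100711-6.c body):
typedef long long v8di __attribute__ ((vector_size (64)));
/* Broadcast of ~x across a vector: with the splitter this can become a
   single vpternlog with an embedded broadcast operand.  */
v8di
bcast_not (long long x)
{
  v8di v = { ~x, ~x, ~x, ~x, ~x, ~x, ~x, ~x };
  return v;
}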
gcc/
PR target/100711
* config/i386/sse.md: New splitters to simplify
not;vec_duplicate as a singular vpternlog.
(one_cmpl<mode>2): Allow broadcast for operand 1.
(<mask_codefor>one_cmpl<mode>2<mask_name>): Likewise.
gcc/testsuite/
PR target/100711
* gcc.target/i386/pr100711-6.c: New test.
With the respective two-operand bitwise operations now expressible by a
single VPTERNLOG, add splitters to also deal with the ior and xor
counterparts of the original and-only case. Note that the splitters need
to be separate, as the placement of "not" differs in the final insns
(*iornot<mode>3, *xnor<mode>3) which are intended to pick up one half of
the result.
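For illustration, the ior counterpart at the source level might look like
this (a sketch, not one of the actual new tests):
typedef long long v8di __attribute__ ((vector_size (64)));
/* not of a broadcast scalar combined with a vector via ior; the splitter
   can rewrite this so *iornot<mode>3 picks it up.  */
v8di
orn_bcast (v8di a, long long b)
{
  v8di vb = { b, b, b, b, b, b, b, b };
  return a | ~vb;
}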
gcc/
PR target/100711
* config/i386/sse.md: New splitters to simplify
not;vec_duplicate;{ior,xor} as vec_duplicate;{iornot,xnor}.
gcc/testsuite/
PR target/100711
* gcc.target/i386/pr100711-4.c: New test.
* gcc.target/i386/pr100711-5.c: New test.
The intended broadcast (with AVX512) can very well be done right from
memory.
gcc/
PR target/100711
* config/i386/sse.md: Permit non-immediate operand 1 in AVX2
form of splitter for PR target/100711.
The following adjusts the tree.def documentation about VEC_PERM_EXPR
which wasn't adjusted when the restrictions of permutes with constant
mask were relaxed.
PR middle-end/110541
* tree.def (VEC_PERM_EXPR): Adjust documentation to reflect
reality.
When it's the memory operand which is to be inverted, using VPANDN*
requires a further load instruction. The same can be achieved by a
single VPTERNLOG*. Add two new alternatives (for plain memory and
embedded broadcast), adjusting the predicate for the first operand
accordingly.
Two pre-existing testcases actually end up being affected (improved) by
the change, which is reflected in updated expectations there.
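A sketch of the affected shape, with the operand to be inverted coming from
memory (illustrative only):
typedef unsigned long long v8du __attribute__ ((vector_size (64)));
/* a & ~(*p): with the new alternatives this can be a single vpternlog
   instead of a load followed by vpandn.  */
v8du
andnot_mem (v8du a, const v8du *p)
{
  return a & ~*p;
}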
gcc/
PR target/93768
* config/i386/sse.md (*andnot<mode>3): Add new alternatives
for memory form operand 1.
gcc/testsuite/
PR target/93768
* gcc.target/i386/avx512f-andn-di-zmm-2.c: New test.
* gcc.target/i386/avx512f-andn-si-zmm-2.c: Adjust expectations
towards generated code.
* gcc.target/i386/pr100711-3.c: Adjust expectations for 32-bit
code.
All combinations of and, ior, xor, and not involving two operands can be
expressed that way in a single insn.
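For illustration, two of the two-operand forms now emitted as a single insn
(a sketch along the lines of the new orn tests; the function names are made up):
typedef int v16si __attribute__ ((vector_size (64)));
/* a | ~b and ~(a ^ b) each map onto one vpternlog.  */
v16si ornot (v16si a, v16si b) { return a | ~b; }
v16si xnor  (v16si a, v16si b) { return ~(a ^ b); }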
gcc/
PR target/93768
* config/i386/i386.cc (ix86_rtx_costs): Further special-case
bitwise vector operations.
* config/i386/sse.md (*iornot<mode>3): New insn.
(*xnor<mode>3): Likewise.
(*<nlogic><mode>3): Likewise.
(andor): New code iterator.
(nlogic): New code attribute.
(ternlog_nlogic): Likewise.
gcc/testsuite/
PR target/93768
* gcc.target/i386/avx512-binop-not-1.h: New.
* gcc.target/i386/avx512-binop-not-2.h: New.
* gcc.target/i386/avx512f-orn-si-zmm-1.c: New test.
* gcc.target/i386/avx512f-orn-si-zmm-2.c: New test.
These tests fail with -std=gnu++98/-D_GLIBCXX_DEBUG in the runtest
flags. They should require the c++11 effective target.
libstdc++-v3/ChangeLog:
* testsuite/23_containers/forward_list/debug/iterator1_neg.cc:
Skip as UNSUPPORTED for C++98 mode.
* testsuite/23_containers/forward_list/debug/iterator3_neg.cc:
Likewise.
libstdc++-v3/ChangeLog:
PR libstdc++/110542
* include/bits/stl_uninitialized.h (__uninitialized_default_n):
Do not use std::fill_n during constant evaluation.
Similar to r14-2052-gdd2eb972a5b063, replace the try-block with RAII
types for deallocating storage and destroying elements.
libstdc++-v3/ChangeLog:
* include/bits/vector.tcc (_M_default_append): Replace try-block
with RAII types.
A mips16e2-related test fails after the ifcvt change. The mips16e2
addition also causes a test for an unrelated module to fail.
This patch adjusts branch costs when running the two affected tests.
These tests should not require the -mbranch-cost option, and
this issue needs to be addressed.
gcc/testsuite/ChangeLog:
* gcc.target/mips/mips16e2-cmov.c: Adjust branch cost to
encourage if-conversion.
* gcc.target/mips/movcc-3.c: Same as above.
The problem here is that we might produce values outside the type's
min/max (and/or its valid values, e.g. signed booleans). The fix is to
use an integer type which has the same precision and signedness
as the original type.
Note that two_value_replacement in phiopt had the same issue in previous
versions, though I don't know if a problem will show up there.
OK? Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
PR tree-optimization/110487
* match.pd (a !=/== CST1 ? CST2 : CST3): Always
build a nonstandard integer type and use that.
This fixes the first part of this bug, where `a ? -1 : 0`
would cause a value of 1 to leak into the signed boolean value.
It fixes the problem by casting to an integer type of
the same size/signedness before doing the negation and
then casting to the type of the expression.
OK? Bootstrapped and tested on x86_64.
gcc/ChangeLog:
* match.pd (a?-1:0): Cast to an integer type
rather than the original type before doing the negation.
(a?0:-1): Likewise.
gcc/ChangeLog:
* config/xtensa/xtensa.cc (machine_function, xtensa_expand_prologue):
Change to use HARD_REG_BIT and its macros.
* config/xtensa/xtensa.md
(peephole2: regmove elimination during DFmode input reload):
Likewise.
The following makes sure to not make conditional undefs in PHI arguments
unconditional by folding cond ? arg1 : arg2.
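A rough sketch of the issue (not the actual pr110491.c testcase):
/* 'x' is only defined when c is nonzero, so its PHI argument on the other
   path is undefined; folding the COND_EXPR must not make that maybe-undef
   value flow unconditionally.  */
int
f (int c, int v)
{
  int x;
  if (c)
    x = v;
  return c ? x : 0;
}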
PR tree-optimization/110491
* tree-ssa-phiopt.cc (match_simplify_replacement): Check
whether the PHI args are possibly undefined before folding
the COND_EXPR.
* gcc.dg/torture/pr110491.c: New testcase.
We already extended the machine mode from 8 to 16 bits, but there is
still one place missing in the streamer: it has a hard-coded array of
size 256 for the machine modes.
In the lto pass, we memset the array over MAX_MACHINE_MODE entries, but
the value of MAX_MACHINE_MODE will grow as more and more modes are
added, while the machine mode array in tree-streamer is still left at 256.
Then, when MAX_MACHINE_MODE is greater than 256, the memset in
lto_output_init_mode_table will unexpectedly touch memory out of range.
This patch takes MAX_MACHINE_MODE as the size of the
array in the streamer, to make sure there is no potential unexpected
memory access in the future. Meanwhile, this patch also adjusts some
places which assume MAX_MACHINE_MODE <= 256.
Care is taken that for offload compilation, we interpret the stream-in
data in terms of the host 'MAX_MACHINE_MODE' ('file_data->mode_bits'),
which very likely is different from the offload device
'MAX_MACHINE_MODE'.
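A self-contained sketch of the bug class being fixed (the real code uses
MAX_MACHINE_MODE and streamer_mode_table; the count below is an assumption
for illustration):
#include <string.h>
enum { ASSUMED_MODE_COUNT = 300 };                       /* mode count grown past 256 */
static unsigned char old_mode_table[256];                /* old, hard-coded size */
static unsigned char new_mode_table[ASSUMED_MODE_COUNT]; /* sized by the mode count */
int
main (void)
{
  /* The old code effectively did the equivalent of
     memset (old_mode_table, 0, ASSUMED_MODE_COUNT), writing past the end
     once the mode count exceeded 256; sizing the table by the mode count
     keeps the memset in bounds.  */
  memset (old_mode_table, 0, sizeof old_mode_table);
  memset (new_mode_table, 0, ASSUMED_MODE_COUNT);
  return 0;
}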
gcc/
* lto-streamer-in.cc (lto_input_mode_table): Stream in the mode
bits for machine mode table.
* lto-streamer-out.cc (lto_write_mode_table): Stream out the
HOST machine mode bits.
* lto-streamer.h (struct lto_file_decl_data): New field mode_bits.
* tree-streamer.cc (streamer_mode_table): Take MAX_MACHINE_MODE
as the table size.
* tree-streamer.h (streamer_mode_table): Ditto.
(bp_pack_machine_mode): Take 1 << ceil_log2 (MAX_MACHINE_MODE)
as the packing limit.
(bp_unpack_machine_mode): Ditto with 'file_data->mode_bits'.
gcc/lto/
* lto-common.cc (lto_file_finalize) [!ACCEL_COMPILER]: Initialize
'file_data->mode_bits'.
Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
The following removes gimple_uses_undefined_value_p and instead
uses the conservative mark_ssa_maybe_undefs in PHI-OPT, the last
user of the other API.
* tree-ssa-phiopt.cc (pass_phiopt::execute): Mark SSA undefs.
(empty_bb_or_one_feeding_into_p): Check for them.
* tree-ssa.h (gimple_uses_undefined_value_p): Remove.
* tree-ssa.cc (gimple_uses_undefined_value_p): Likewise.
slp_done_for_suggested_uf is used directly in vect_analyze_loop_2
without initialization, which is undefined behavior. Initialize it to false
according to the discussion.
gcc/ChangeLog:
PR tree-optimization/110531
* tree-vect-loop.cc (vect_analyze_loop_1): Initialize
slp_done_for_suggested_uf to false.
The following replaces the simplistic gimple_uses_undefined_value_p
with the conservative mark_ssa_maybe_undefs approach as already
used by LIM and IVOPTs. This is to avoid exposing an unconditional
uninitialized read on a path from entry by if-combine.
PR tree-optimization/110228
* tree-ssa-ifcombine.cc (pass_tree_ifcombine::execute):
Mark SSA may-undefs.
(bb_no_side_effects_p): Check stmt uses for undefs.
* gcc.dg/torture/pr110228.c: New testcase.
* gcc.dg/uninit-pr101912.c: Un-XFAIL.
When we compute liveness and relevance we have to make sure to
handle live but not relevant stmts in a way we can later vectorize
them. When the stmt uses only operands that do not need vectorization
we can just leave such stmts in place - but not in the case they
are recognized as patterns. Since we don't have a way to cancel
pattern recognition we have to force mark such stmts as relevant.
PR tree-optimization/110436
* tree-vect-stmts.cc (vect_mark_relevant): Expand dumping,
force live but not relevant pattern stmts relevant.
* gcc.dg/pr110436.c: New testcase.