procyberian/gcc - Masscollabs Services: Beyond Sharing , Liberating The Software World

Author	SHA1	Message	Date
Alex Coplan	e4d3e7f9ad	testsuite: Rename scanltranstree.exp -> scanltrans.exp Since r15-3254-g3f51f0dc88ec21c1ec79df694200f10ef85915f4 added scan-ltrans-rtl* variants to scanltranstree.exp, it no longer makes sense to have "tree" in the name. This renames the file accordingly and updates users. libatomic/ChangeLog: * testsuite/lib/libatomic.exp: Load scanltrans.exp instead of scanltranstree.exp. libgomp/ChangeLog: * testsuite/lib/libgomp.exp: Load scanltrans.exp instead of scanltranstree.exp. libitm/ChangeLog: * testsuite/lib/libitm.exp: Load scanltrans.exp instead of scanltranstree.exp. libphobos/ChangeLog: * testsuite/lib/libphobos-dg.exp: Load scanltrans.exp instead of scanltranstree.exp. libvtv/ChangeLog: * testsuite/lib/libvtv.exp: Load scanltrans.exp instead of scanltranstree.exp. gcc/testsuite/ChangeLog: * gcc.dg-selftests/dg-final.exp: Load scanltrans.exp instead of scanltranstree.exp. * lib/gcc-dg.exp: Likewise. * lib/scanltranstree.exp: Rename to ... * lib/scanltrans.exp: ... this.	2024-09-02 10:07:09 +01:00
Richard Sandiford	2865719efb	Rename gimple_asm_input_p to gimple_asm_basic_p Following on from the earlier tree rename, this patch renames gimple_asm_input_p to gimple_asm_basic_p, and similarly for related names. gcc/ * doc/gimple.texi (gimple_asm_basic_p): Document. (gimple_asm_set_basic): Likewise. * gimple.h (GF_ASM_INPUT): Rename to... (GF_ASM_BASIC): ...this. (gimple_asm_set_input): Rename to... (gimple_asm_set_basic): ...this. (gimple_asm_input_p): Rename to... (gimple_asm_basic_p): ...this. * cfgexpand.cc (expand_asm_stmt): Update after above renaming. * gimple.cc (gimple_asm_clobbers_memory_p): Likewise. * gimplify.cc (gimplify_asm_expr): Likewise. * ipa-icf-gimple.cc (func_checker::compare_gimple_asm): Likewise. * tree-cfg.cc (stmt_can_terminate_bb_p): Likewise.	2024-09-02 09:56:56 +01:00
Richard Sandiford	a4b6c6ab0b	Rename ASM_INPUT_P to ASM_BASIC_P ASM_INPUT_P is so named because it causes the eventual rtl insn pattern to be a top-level ASM_INPUT rather than an ASM_OPERANDS. However, this name has caused confusion, partly due to earlier documentation. The name also sounds related to ASM_INPUTS but is for a different piece of state. This patch renames it to ASM_BASIC_P, with the inverse meaning an extended asm. ("Basic asm" is the term used in extend.texi.) gcc/ * doc/generic.texi (ASM_BASIC_P): Document. * tree.h (ASM_INPUT_P): Rename to... (ASM_BASIC_P): ...this. (ASM_VOLATILE_P, ASM_INLINE_P): Reindent. * gimplify.cc (gimplify_asm_expr): Update after above renaming. * tree-core.h (tree_base): Likewise. gcc/c/ * c-typeck.cc (build_asm_expr): Rename ASM_INPUT_P to ASM_BASIC_P. gcc/cp/ * pt.cc (tsubst_stmt): Rename ASM_INPUT_P to ASM_BASIC_P. * parser.cc (cp_parser_asm_definition): Likewise. gcc/d/ * toir.cc (IRVisitor): Rename ASM_INPUT_P to ASM_BASIC_P. gcc/jit/ * jit-playback.cc (playback::block::add_extended_asm): Rename ASM_INPUT_P to ASM_BASIC_P. gcc/m2/ * gm2-gcc/m2block.cc (flush_pending_note): Rename ASM_INPUT_P to ASM_BASIC_P. * gm2-gcc/m2statement.cc (m2statement_BuildAsm): Likewise.	2024-09-02 09:56:56 +01:00
Tobias Burnus	5cbfb3a799	lto/lto.cc: Fix build with not HAVE_WORKING_FORK gcc/lto/ChangeLog: * lto.cc: Add missing HAVE_WORKING_FORK.	2024-09-02 10:29:36 +02:00
Tobias Burnus	6640a59fed	lto-wrapper: Honor -save-temps for ltrans' makefile gcc/ChangeLog: * lto-wrapper.cc (run_gcc): Honor -save-temps for makefile name.	2024-09-02 10:28:29 +02:00
Eric Botcazou	571d0450b2	ada: Diagnose too large size clause on floating-point type The problem is that the size clause changes the floating-point format used for the type, but it must not when this format is the widest format that is supported in hardware on the target. Instead a padding type must be built and the associated warning given. gcc/ada/ * gcc-interface/decl.cc (gnat_to_gnu_entity): Cap the Esize of a floating-point type to the size of the widest format supported in hardware if it is explicity defined.	2024-09-02 10:22:50 +02:00
Viljar Indus	1c9a6d8203	ada: Create usage entry for -gnatw_l gcc/ada/ * doc/gnat_ugn/building_executable_programs_with_gnat.rst: update documentation for the -gnatw_l switch. * usage.adb: Add -gnatw_l entry. * gnat_ugn.texi: Regenerate.	2024-09-02 10:22:50 +02:00
Ronan Desplanques	2df253f35e	ada: Fix standard output stream for gnatcmd output Before this patch, the gnat command sent to standard error pieces of information that are a better match for standard output. This patch makes this information go to standard output. gcc/ada/ * gnatcmd.adb (GNATCmd): Fix standard output stream.	2024-09-02 10:22:50 +02:00
Ronan Desplanques	91f0a3a5a5	ada: Fix minor issues in -gnaty0's documentation Before this patch, the documentation of -gnaty0 used 0-based indexing for column numbers while 1-based indexing is used everywhere else. This patch makes this documentation use 1-based indexing, and also adds a missing parenthesis. gcc/ada/ * doc/gnat_ugn/building_executable_programs_with_gnat.rst: Fix minor issues. * gnat_ugn.texi: Regenerate.	2024-09-02 10:22:50 +02:00
Bob Duff	a004c28c50	ada: Documentation for generic type inference ...plus minor improvements to existing documentation. gcc/ada/ * doc/gnat_rm/gnat_language_extensions.rst: I assume "extended set of extensions" was a typo for "experimental set of extensions", because "extended extensions" is repetitive and redundant. "in addition" clarifies that the one subsumes the other. Add a reminder at the start of each subsection about what switch/pragma enables what extensions. Add new section about "Inference of Dependent Types in Generic Instantiations". * gnat_rm.texi: Regenerate.	2024-09-02 10:22:50 +02:00
Patrick Bernardi	34437eb472	ada: Small fixes for FreeBSD Size of pthread data types now need to be defined for FreeBSD ports. Traceback support for AArch64 FreeBSD is now defined. gcc/ada/ * s-oscons-tmplt.c: Define sizes of pthread data types on FreeBSD. * tracebak.c: Use GCC unwinder and adjust PC appropriately on aarch64-freebsd.	2024-09-02 10:22:50 +02:00
Marc Poulhiès	cb690aa1ce	ada: Also reset scope for some nested declaration When changing the scope for entities found in the entry body that is mutated into a procedure, the compiler needs to look deeper than only the top level entities as expansion may produce object declarations which scopes are also the entry. For example, the tree after expansion may look like: procedure This_Is_An_Entry_Proc is ... O1 : Typ := do TMP1 : OTyp := ...; ... in TMP1; O1's scope needs to be reset to This_Is_An_Entry_Proc, but so does TMP1's scope. This change also fix a small oversight where N_Implicit_Label_Declaration scope must be reset and its content skipped. gcc/ada/ * exp_ch9.adb (Reset_Scopes_To): Adjust comment. (Reset_Scopes_To.Reset_Scope): Adjust the scope reset for object declaration. In particular, visit the children nodes if any. Also extend the handling of other declarations to N_Implicit_Label_Declaration.	2024-09-02 10:22:49 +02:00
Piotr Trojanek	905ab329cc	ada: Cleanup expansion of object declarations Replace repeated calls to Sloc with uses of local constant Loc. Code cleanup; behavior is unaffected. gcc/ada/ * exp_ch3.adb (Expand_N_Object_Declaration): Replace calls to Sloc with uses of Loc; turn variable Prag into constant.	2024-09-02 10:22:49 +02:00
Piotr Trojanek	78acc6d85f	ada: Remove repeated guards in validity checks Routine Insert_Valid_Check only applies checks when Expr_Known_Valid query returns False; there is no need to call this query before inserting checks. Code cleanup; behavior is unaffected. gcc/ada/ * exp_imgv.adb (Expand_User_Defined_Enumeration_Image) (Expand_Image_Attribute): Remove redundant guards.	2024-09-02 10:22:49 +02:00
Jakub Jelinek	25d51fb7d0	ranger: Fix up range computation for CLZ [PR116486] The initial CLZ gimple-range-op.cc implementation handled just the case where second argument to .CLZ is equal to prec, but in r15-1014 I've added also handling of the -1 case. As the following testcase shows, incorrectly though for the case where the first argument has [0,0] range. If the second argument is prec, then the result should be [prec,prec] and that was handled correctly, but when the second argument is -1, the result should be [-1,-1] but instead it was incorrectly computed as [prec-1,prec-1] (when second argument is prec, mini is 0 and maxi is prec, while when second argument is -1, mini is -1 and maxi is prec-1). Fixed thusly (the actual handling is then similar to the CTZ [0,0] case). 2024-09-02 Jakub Jelinek <jakub@redhat.com> PR middle-end/116486 * gimple-range-op.cc (cfn_clz::fold_range): If lh is [0,0] and mini is -1, return [-1,-1] range rather than [prec-1,prec-1]. * gcc.dg/bitint-109.c: New test.	2024-09-02 09:44:09 +02:00
Richard Biener	9aaedfc414	load and store-lanes with SLP The following is a prototype for how to represent load/store-lanes within SLP. I've for now settled with having a single load node with multiple permute nodes acting as selection, one for each loaded lane and a single store node fed from all stored lanes. For for (int i = 0; i < 1024; ++i) { a[2i] = b[2i] + 7; a[2i+1] = b[2i+1] * 3; } you have the following SLP graph where I explain how things are set up and code-generated: t.c:23:21: note: SLP graph after lowering permutations: t.c:23:21: note: node 0x50dc8b0 (max_nunits=1, refcnt=1) vector(4) int t.c:23:21: note: op template: _6 = _7; t.c:23:21: note: stmt 0 _6 = _7; t.c:23:21: note: stmt 1 _12 = _13; t.c:23:21: note: children 0x50dc488 0x50dc6e8 This is the store node, it's marked with ldst_lanes = true during SLP discovery. This node code-generates vect_array.65[0] = vect__7.61_29; vect_array.65[1] = vect__13.62_28; MEM <int[8]> [(int )vectp_a.63_27] = .STORE_LANES (vect_array.65); ... t.c:23:21: note: node 0x50dc520 (max_nunits=4, refcnt=2) vector(4) int t.c:23:21: note: op: VEC_PERM_EXPR t.c:23:21: note: stmt 0 _5 = _4; t.c:23:21: note: lane permutation { 0[0] } t.c:23:21: note: children 0x50dc948 t.c:23:21: note: node 0x50dc780 (max_nunits=4, refcnt=2) vector(4) int t.c:23:21: note: op: VEC_PERM_EXPR t.c:23:21: note: stmt 0 _11 = _10; t.c:23:21: note: lane permutation { 0[1] } t.c:23:21: note: children 0x50dc948 These are the selection nodes, marked with ldst_lanes = true. They code generate nothing. t.c:23:21: note: node 0x50dc948 (max_nunits=4, refcnt=3) vector(4) int t.c:23:21: note: op template: _5 = _4; t.c:23:21: note: stmt 0 _5 = _4; t.c:23:21: note: stmt 1 _11 = _10; t.c:23:21: note: load permutation { 0 1 } This is the load node, marked with ldst_lanes = true (the load permutation is only accurate when taking into account the lane permute in the selection nodes). It code generates vect_array.58 = .LOAD_LANES (MEM <int[8]> [(int )vectp_b.56_33]); vect__5.59_31 = vect_array.58[0]; vect__5.60_30 = vect_array.58[1]; This scheme allows to leave code generation in vectorizable_load/store mostly as-is. While this should support both load-lanes and (masked) store-lanes the decision to do either is done during SLP discovery time and cannot be reversed without altering the SLP tree - as-is the SLP tree is not usable for non-store-lanes on the store side, the load side is OK representation-wise but will very likely fail permute handling as the lowering to deal with the two input vector restriction isn't done - but of course since the permute node is marked as to be ignored that doesn't work out. So I've put restrictions in place that fail vectorization if a load/store-lane SLP tree is later classified differently by get_load_store_type. I'll note that for example gcc.target/aarch64/sve/mask_struct_store_3.c will not get SLP store-lanes used because the full store SLPs just fine though we then fail to handle the "splat" load-permutation t2.c:5:21: note: node 0x4db2630 (max_nunits=4, refcnt=2) vector([4,4]) int t2.c:5:21: note: op template: _6 = _5; t2.c:5:21: note: stmt 0 _6 = _5; t2.c:5:21: note: stmt 1 _6 = _5; t2.c:5:21: note: stmt 2 _6 = _5; t2.c:5:21: note: stmt 3 _6 = _5; t2.c:5:21: note: load permutation { 0 0 0 0 } the load permute lowering code currently doesn't consider it worth lowering single loads from a group (or in this case not grouped loads). The expectation is the target can handle this by two interleaves with itself. So what we see here is that while the explicit SLP representation is helpful in some cases, in cases like this it would require changing it when we make decisions how to vectorize. My idea is that this all will change a lot when we re-do SLP discovery (for loops) and when we get rid of non-SLP as I think vectorizable_ should be allowed to alter the SLP graph during analysis. The patch also removes the code cancelling SLP if we can use load/store-lanes from the main loop vector analysis code and re-implements it as re-discovering the SLP instance with forced single-lane splits so SLP load/store-lanes scheme can be used. This is now done after SLP discovery and SLP pattern recog are complete to not disturb the latter but per SLP instance instead of being a global decision on the whole loop. This is a behavioral change that for example shows in gcc.dg/vect/slp-perm-6.c on ARM where we formerly used SLP permutes but now a mix of SLP without permutes and load/store lanes. The previous flaky heuristic is now flaky in a different way. Testing on RISC-V and aarch64 reveal several testcases that require adjustment as to now expect SLP even when load/store lanes are being used. If in doubt I've adjusted them to the final expectation which will lead to one or two new FAILs where we still do the SLP cancelling. I have a followup that implements that while remaining in SLP that's in final testing. Note that gcc.dg/vect/slp-42.c and gcc.dg/vect/pr68445.c will FAIL on aarch64 with SVE because for some odd reason vect_stridedN is true for any N for check_effective_target_vect_fully_masked targets but SVE cannot do ld8 while risc-v can. I have not bothered to adjust target tests that now fail assembly-scan. * tree-vectorizer.h (_slp_tree::ldst_lanes): New flag to mark load, store and permute nodes. * tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize ldst_lanes. (vect_build_slp_instance): For stores iff the target prefers store-lanes discover single-lane sub-groups, do not perform interleaving lowering but mark the node with ldst_lanes. Also allow i == 0 - fatal failure - for splitting up a store group when we're not doing single-lane discovery already. (vect_lower_load_permutations): When the target supports load lanes and the loads all fit the pattern split out a single level of permutes only and mark the load and permute nodes with ldst_lanes. (vectorizable_slp_permutation_1): Handle the load-lane permute forwarding of vector defs. (vect_analyze_slp): After SLP pattern recog is finished see if there are any SLP instances that would benefit from using load/store-lanes and re-discover those with forced single lanes. * tree-vect-stmts.cc (get_group_load_store_type): Support load/store-lanes for SLP. (vectorizable_store): Support SLP code generation for store-lanes. (vectorizable_load): Support SLP code generation for load-lanes. * tree-vect-loop.cc (vect_analyze_loop_2): Do not cancel SLP when store-lanes can be used. * gcc.dg/vect/slp-55.c: New testcase. * gcc.dg/vect/slp-56.c: Likewise. * gcc.dg/vect/slp-11c.c: Adjust. * gcc.dg/vect/slp-53.c: Likewise. * gcc.dg/vect/slp-cond-1.c: Likewise. * gcc.dg/vect/vect-complex-5.c: Likewise. * gcc.dg/vect/slp-1.c: Likewise. * gcc.dg/vect/slp-54.c: Remove riscv XFAIL. * gcc.dg/vect/slp-perm-5.c: Adjust. * gcc.dg/vect/slp-perm-7.c: Likewise. * gcc.dg/vect/slp-perm-8.c: Likewise. * gcc.dg/vect/slp-multitypes-11.c: Likewise. * gcc.dg/vect/slp-multitypes-11-big-array.c: Likewise. * gcc.dg/vect/slp-perm-9.c: Remove expected SLP fail due to three-vector permute. * gcc.dg/vect/slp-perm-6.c: Remove XFAIL. * gcc.dg/vect/slp-perm-1.c: Adjust. * gcc.dg/vect/slp-perm-2.c: Likewise. * gcc.dg/vect/slp-perm-3.c: Likewise. * gcc.dg/vect/slp-perm-4.c: Likewise. * gcc.dg/vect/pr68445.c: Likewise. * gcc.dg/vect/slp-11b.c: Likewise. * gcc.dg/vect/slp-2.c: Likewise. * gcc.dg/vect/slp-23.c: Likewise. * gcc.dg/vect/slp-33.c: Likewise. * gcc.dg/vect/slp-42.c: Likewise. * gcc.dg/vect/slp-46.c: Likewise. * gcc.dg/vect/slp-perm-10.c: Likewise.	2024-09-02 08:50:32 +02:00
Richard Biener	464067a242	lower SLP load permutation to interleaving The following emulates classical interleaving for SLP load permutes that we are unlikely handling natively. This is to handle cases where interleaving (or load/store-lanes) is the optimal choice for vectorizing even when we are doing that within SLP. An example would be void foo (int * __restrict a, int * b) { for (int i = 0; i < 16; ++i) { a[4i + 0] = b[4i + 0] * 3; a[4i + 1] = b[4i + 1] + 3; a[4i + 2] = (b[4i + 2] * 3 + 3); a[4i + 3] = b[4i + 3] * 3; } } where currently the SLP store is merging four single-lane SLP sub-graphs but none of the loads in it can be code-generated with V4SImode vectors and a VF of four as the permutes would need three vectors. The patch introduces a lowering phase after SLP discovery but before SLP pattern recognition or permute optimization that analyzes all loads from the same dataref group and creates an interleaving scheme starting from an unpermuted load. What can be handled is power-of-two group size and a group size of three. The possibility for doing the interleaving with a load-lanes like instruction is done as followup. For a group-size of three this is done by using the non-interleaving fallback code which then creates at VF == 4 from { { a0, b0, c0 }, { a1, b1, c1 }, { a2, b2, c2 }, { a3, b3, c3 } } the intermediate vectors { c0, c0, c1, c1 } and { c2, c2, c3, c3 } to produce { c0, c1, c2, c3 }. This turns out to be more effective than the scheme implemented for non-SLP for SSE and only slightly worse for AVX512 and a bit more worse for AVX2. It seems to me that this would extend to other non-power-of-two group-sizes though (but the patch does not). Optimal schemes are likely difficult to lay out in VF agnostic form. I'll note that while the lowering assumes even/odd extract is generally available for all vector element sizes (which is probably a good assumption), it doesn't in any way constrain the other permutes it generates based on target availability. Again difficult to do in a VF agnostic way (but at least currently the vector type is fixed). I'll also note that the SLP store side merges lanes in a way producing three-vector permutes for store group-size of three, so the testcase uses a store group-size of four. The patch has a fallback for when there are multi-lane groups and the resulting permutes to not fit interleaving. Code generation is not optimal when this triggers and might be worse than doing single-lane group interleaving. The patch handles gaps by representing them with NULL entries in SLP_TREE_SCALAR_STMTS for the unpermuted load node. The SLP discovery changes could be elided if we manually build the load node instead. SLP load nodes covering enough lanes to not need intermediate permutes are retained as having a load-permutation and do not use the single SLP load node for each dataref group. That's something we might want to change, making load-permutation something purely local to SLP discovery (but then SLP discovery could do part of the lowering). The patch misses CSEing intermediate generated permutes and registering them with the bst_map which is possibly required for SLP pattern detection in some cases - this re-spin of the patch moves the lowering after SLP pattern detection. * tree-vect-slp.cc (vect_build_slp_tree_1): Handle NULL stmt. (vect_build_slp_tree_2): Likewise. Release load permutation when there's a NULL in SLP_TREE_SCALAR_STMTS and assert there's no actual permutation in that case. (vllp_cmp): New function. (vect_lower_load_permutations): Likewise. (vect_analyze_slp): Call it. * gcc.dg/vect/slp-11a.c: Expect SLP. * gcc.dg/vect/slp-12a.c: Likewise. * gcc.dg/vect/slp-51.c: New testcase. * gcc.dg/vect/slp-52.c: New testcase.	2024-09-02 08:49:20 +02:00
Xianmiao Qu	eca320bfe3	[PATCH] RISC-V: Optimize the cost of the DFmode register move for RV32. Currently, in RV32, even with the D extension enabled, the cost of DFmode register moves is still set to 'COSTS_N_INSNS (2)'. This results in the 'lower-subreg' pass splitting DFmode register moves into two SImode SUBREG register moves, leading to the generation of many redundant instructions. As an example, consider the following test case: double foo (int t, double a, double b) { if (t > 0) return a; else return b; } When compiling with -march=rv32imafdc -mabi=ilp32d, the following code is generated: .cfi_startproc addi sp,sp,-32 .cfi_def_cfa_offset 32 fsd fa0,8(sp) fsd fa1,16(sp) lw a4,8(sp) lw a5,12(sp) lw a2,16(sp) lw a3,20(sp) bgt a0,zero,.L1 mv a4,a2 mv a5,a3 .L1: sw a4,24(sp) sw a5,28(sp) fld fa0,24(sp) addi sp,sp,32 .cfi_def_cfa_offset 0 jr ra .cfi_endproc After adjust the DFmode register move's cost to 'COSTS_N_INSNS (1)', the generated code is as follows, with a significant reduction in the number of instructions. .cfi_startproc ble a0,zero,.L5 ret .L5: fmv.d fa0,fa1 ret .cfi_endproc gcc/ * config/riscv/riscv.cc (riscv_rtx_costs): Optimize the cost of the DFmode register move for RV32. gcc/testsuite/ * gcc.target/riscv/rv32-movdf-cost.c: New test.	2024-09-01 22:28:38 -06:00
Jeff Law	0562976d62	[committed][PR rtl-optimization/116544] Fix test for promoted subregs This is a small bug in the ext-dce code's handling of promoted subregs. Essentially when we see a promoted subreg we need to make additional bit groups live as various parts of the RTL path know that an extension of a suitably promoted subreg can be trivially eliminated. When I added support for dealing with this quirk I failed to account for the larger modes properly and it ignored the case when the size of the inner object was > 32 bits. Opps. This does _not_ fix the outstanding x86 issue. That's caused by something completely different and more concerning ;( Bootstrapped and regression tested on x86. Obviously fixes the testcase on riscv as well. Pushing to the trunk. PR rtl-optimization/116544 gcc/ * ext-dce.cc (ext_dce_process_uses): Fix thinko in promoted subreg handling. gcc/testsuite/ * gcc.dg/torture/pr116544.c: New test.	2024-09-01 22:18:29 -06:00
Levy Hsu	f77435aa39	i386: Support vec_cmp for V8BF/V16BF/V32BF in AVX10.2 gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_use_mask_cmp_p): Add BFmode for int mask cmp. * config/i386/sse.md (vec_cmp<mode><avx512fmaskmodelower>): New vec_cmp expand for VBF modes. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-512-bf-vector-cmpp-1.c: New test. * gcc.target/i386/avx10_2-bf-vector-cmpp-1.c: Ditto.	2024-09-02 10:25:35 +08:00
Levy Hsu	e19f65b0be	i386: Support vectorized BF16 sqrt with AVX10.2 instruction gcc/ChangeLog: * config/i386/sse.md: Expand VF2H to VF2HB with VBF modes.	2024-09-02 10:24:48 +08:00
Levy Hsu	29ef601973	i386: Support vectorized BF16 smaxmin with AVX10.2 instructions gcc/ChangeLog: * config/i386/sse.md (<code><mode>3): New define expand pattern for BF smaxmin. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-512-bf-vector-smaxmin-1.c: New test. * gcc.target/i386/avx10_2-bf-vector-smaxmin-1.c: New test.	2024-09-02 10:24:47 +08:00
Levy Hsu	6d294fb8ac	i386: Support vectorized BF16 FMA with AVX10.2 instructions gcc/ChangeLog: * config/i386/sse.md: Add V8BF/V16BF/V32BF to mode iterator FMAMODEM. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-512-bf-vector-fma-1.c: New test. * gcc.target/i386/avx10_2-bf-vector-fma-1.c: New test.	2024-09-02 10:24:46 +08:00
Levy Hsu	f82fa0da4d	i386: Support vectorized BF16 add/sub/mul/div with AVX10.2 instructions AVX10.2 introduces several non-exception instructions for BF16 vector. Enable vectorized BF add/sub/mul/div operation by supporting standard optab for them. gcc/ChangeLog: * config/i386/sse.md (div<mode>3): New expander for BFmode div. (VF_BHSD): New mode iterator with vector BFmodes. (<insn><mode>3<mask_name><round_name>): Change mode to VF_BHSD. (mul<mode>3<mask_name><round_name>): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-512-bf-vector-operations-1.c: New test. * gcc.target/i386/avx10_2-bf-vector-operations-1.c: Ditto.	2024-09-02 10:24:45 +08:00
Hu, Lin1	3b1decef83	i386: Optimize generate insn for AVX10.2 compare gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_fp_compare): Add UNSPEC to support the optimization. * config/i386/i386.cc (ix86_fp_compare_code_to_integer): Add NE/EQ. * config/i386/i386.md (cmpx<unord><MODEF:mode>): New define_insn. (cmpx<unord>hf): Ditto. * config/i386/predicates.md (ix86_trivial_fp_comparison_operator): Add ne/eq. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-compare-1b.c: New test.	2024-09-02 10:24:36 +08:00
Hu, Lin1	86f5031c80	i386: Optimize ordered and nonequal Currently, when we input !__builtin_isunordered (a, b) && (a != b), gcc will emit ucomiss %xmm1, %xmm0 movl $1, %ecx setp %dl setnp %al cmovne %ecx, %edx andl %edx, %eax movzbl %al, %eax In fact, xorl %eax, %eax ucomiss %xmm1, %xmm0 setne %al is better. gcc/ChangeLog: * match.pd: Optimize (and ordered non-equal) to (not (or unordered equal)) gcc/testsuite/ChangeLog: * gcc.target/i386/optimize_one.c: New test.	2024-09-02 10:24:31 +08:00
Haochen Jiang	b1f9fbb6da	i386: Auto vectorize sdot_prod, usdot_prod, udot_prod with AVX10.2 instructions gcc/ChangeLog: * config/i386/sse.md (VI1_AVX512VNNIBW): New. (VI2_AVX10_2): Ditto. (sdot_prod<mode>): Add AVX10.2 to auto vectorize and combine 512 bit part. (udot_prod<mode>): Ditto. (sdot_prodv64qi): Removed. (udot_prodv64qi): Ditto. (usdot_prod<mode>): Add AVX10.2 to auto vectorize. (udot_prod<mode>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/vnniint16-auto-vectorize-2.c: Only define TEST when not defined. * gcc.target/i386/vnniint8-auto-vectorize-2.c: Ditto. * gcc.target/i386/vnniint16-auto-vectorize-3.c: New test. * gcc.target/i386/vnniint16-auto-vectorize-4.c: Ditto. * gcc.target/i386/vnniint8-auto-vectorize-3.c: Ditto. * gcc.target/i386/vnniint8-auto-vectorize-4.c: Ditto.	2024-09-02 10:24:29 +08:00
Pan Li	5239902210	RISC-V: Add testcases for unsigned scalar quad and oct .SAT_TRUNC form 3 This patch would like to add test cases for the unsigned scalar quad and oct .SAT_TRUNC form 3. Aka: Form 3: #define DEF_SAT_U_TRUC_FMT_3(NT, WT) \ NT __attribute__((noinline)) \ sat_u_truc_##WT##_to_##NT##_fmt_3 (WT x) \ { \ WT max = (WT)(NT)-1; \ return x <= max ? (NT)x : (NT) max; \ } QUAD: DEF_SAT_U_TRUC_FMT_3 (uint16_t, uint64_t) DEF_SAT_U_TRUC_FMT_3 (uint8_t, uint32_t) OCT: DEF_SAT_U_TRUC_FMT_3 (uint8_t, uint64_t) The below test is passed for this patch. * The rv64gcv regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_u_trunc-16.c: New test. * gcc.target/riscv/sat_u_trunc-17.c: New test. * gcc.target/riscv/sat_u_trunc-18.c: New test. * gcc.target/riscv/sat_u_trunc-run-16.c: New test. * gcc.target/riscv/sat_u_trunc-run-17.c: New test. * gcc.target/riscv/sat_u_trunc-run-18.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-09-02 09:26:42 +08:00
Pan Li	ea81e21d53	RISC-V: Add testcases for unsigned scalar quad and oct .SAT_TRUNC form 2 This patch would like to add test cases for the unsigned scalar quad and oct .SAT_TRUNC form 2. Aka: Form 2: #define DEF_SAT_U_TRUC_FMT_2(NT, WT) \ NT __attribute__((noinline)) \ sat_u_truc_##WT##_to_##NT##_fmt_2 (WT x) \ { \ WT max = (WT)(NT)-1; \ return x > max ? (NT) max : (NT)x; \ } QUAD: DEF_SAT_U_TRUC_FMT_2 (uint16_t, uint64_t) DEF_SAT_U_TRUC_FMT_2 (uint8_t, uint32_t) OCT: DEF_SAT_U_TRUC_FMT_2 (uint8_t, uint64_t) The below test is passed for this patch. * The rv64gcv regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_u_trunc-10.c: New test. * gcc.target/riscv/sat_u_trunc-11.c: New test. * gcc.target/riscv/sat_u_trunc-12.c: New test. * gcc.target/riscv/sat_u_trunc-run-10.c: New test. * gcc.target/riscv/sat_u_trunc-run-11.c: New test. * gcc.target/riscv/sat_u_trunc-run-12.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-09-02 09:26:37 +08:00
Pan Li	56ed1dfa79	RISC-V: Add testcases for form 4 of unsigned vector .SAT_ADD IMM This patch would like to add test cases for the unsigned vector .SAT_ADD when one of the operand is IMM. Form 4: #define DEF_VEC_SAT_U_ADD_IMM_FMT_4(T, IMM) \ T __attribute__((noinline)) \ vec_sat_u_add_imm##IMM##_##T##_fmt_4 (T out, T in, unsigned limit) \ { \ unsigned i; \ T ret; \ for (i = 0; i < limit; i++) \ { \ out[i] = __builtin_add_overflow (in[i], IMM, &ret) == 0 ? ret : -1; \ } \ } DEF_VEC_SAT_U_ADD_IMM_FMT_4(uint64_t, 123) The below test are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-13.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-14.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-15.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-16.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-13.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-14.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-15.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-16.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-09-02 09:25:45 +08:00
Pan Li	72f3e9021e	RISC-V: Add testcases for form 3 of unsigned vector .SAT_ADD IMM This patch would like to add test cases for the unsigned vector .SAT_ADD when one of the operand is IMM. Form 3: #define DEF_VEC_SAT_U_ADD_IMM_FMT_3(T, IMM) \ T __attribute__((noinline)) \ vec_sat_u_add_imm##IMM##_##T##_fmt_3 (T out, T in, unsigned limit) \ { \ unsigned i; \ T ret; \ for (i = 0; i < limit; i++) \ { \ out[i] = __builtin_add_overflow (in[i], IMM, &ret) ? -1 : ret; \ } \ } DEF_VEC_SAT_U_ADD_IMM_FMT_3(uint64_t, 123) The below test are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-10.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-11.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-12.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-9.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-10.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-11.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-12.c: New test. * gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-9.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-09-02 09:25:39 +08:00
Pan Li	e96d4bf6a6	RISC-V: Refactor gen zero_extend rtx for SAT_* when expand SImode in RV64 In previous, we have some specially handling for both the .SAT_ADD and .SAT_SUB for unsigned int. There are similar to take care of SImode in RV64 for zero extend. Thus refactor these two helper function into one for possible code duplication. The below test suite are passed for this patch. * The rv64gcv fully regression test. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_gen_zero_extend_rtx): Merge the zero_extend handing from func riscv_gen_unsigned_xmode_reg. (riscv_gen_unsigned_xmode_reg): Remove. (riscv_expand_ussub): Leverage riscv_gen_zero_extend_rtx instead of riscv_gen_unsigned_xmode_reg. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_u_sub-11.c: Adjust asm check. * gcc.target/riscv/sat_u_sub-15.c: Ditto. * gcc.target/riscv/sat_u_sub-19.c: Ditto. * gcc.target/riscv/sat_u_sub-23.c: Ditto. * gcc.target/riscv/sat_u_sub-27.c: Ditto. * gcc.target/riscv/sat_u_sub-3.c: Ditto. * gcc.target/riscv/sat_u_sub-31.c: Ditto. * gcc.target/riscv/sat_u_sub-35.c: Ditto. * gcc.target/riscv/sat_u_sub-39.c: Ditto. * gcc.target/riscv/sat_u_sub-43.c: Ditto. * gcc.target/riscv/sat_u_sub-47.c: Ditto. * gcc.target/riscv/sat_u_sub-7.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-11.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-11_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-11_2.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-15.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-15_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-15_2.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-3.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-3_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-3_2.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-7.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-7_1.c: Ditto. * gcc.target/riscv/sat_u_sub_imm-7_2.c: Ditto. Signed-off-by: Pan Li <pan2.li@intel.com>	2024-09-02 09:24:44 +08:00
GCC Administrator	880834d3e7	Daily bump.	2024-09-02 00:16:51 +00:00
Andrew Pinski	592a335de5	slsr: Use simple_dce_from_worklist in SLSR [PR116554] While working on a phiopt patch, it was noticed that SLSR would leave around some unused ssa names. Let's add simple_dce_from_worklist usage to SLSR to remove the dead statements. This should give a small improvemnent for passes afterwards. Boostrapped and tested on x86_64. gcc/ChangeLog: PR tree-optimization/116554 * gimple-ssa-strength-reduction.cc: Include tree-ssa-dce.h. (replace_mult_candidate): Add sdce_worklist argument, mark the rhs1/rhs2 for maybe dceing. (replace_unconditional_candidate): Add sdce_worklist argument, Update call to replace_mult_candidate. (replace_conditional_candidate): Add sdce_worklist argument, update call to replace_mult_candidate. (replace_uncond_cands_and_profitable_phis): Add sdce_worklist argument, update call to replace_conditional_candidate, replace_unconditional_candidate, and replace_uncond_cands_and_profitable_phis. (replace_one_candidate): Add sdce_worklist argument, mark the orig_rhs1/orig_rhs2 for maybe dceing. (replace_profitable_candidates): Add sdce_worklist argument, update call to replace_one_candidate and replace_profitable_candidates. (analyze_candidates_and_replace): Call simple_dce_from_worklist and update calls to replace_profitable_candidates, and replace_uncond_cands_and_profitable_phis. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>	2024-08-31 23:42:59 -07:00
Hans-Peter Nilsson	f22788c7c0	testsuite: Prune compilation messages for modules tests All testsuite compiler-calls pass default_target_compile in the dejagnu installation (typically /usr/share/dejagnu/target.exp) which also calls the dejagnu-installed prune_warnings. Normally, tests using the dg framework (most or all tests these days) compile and link by calling various wrappers that end up calling dg-test in the dejagnu installation, typically installed as /usr/share/dejagnu/dg.exp. That, besides the compiler call, also calls ${tool}-dg-prune (g++-dg-prune) on the messages, which in turn ends up calling prune_gcc_output in gcc/testsuite/lib/prune.exp. That gcc-specific "pruning" function handles more cases than the dejagnu prune_warnings, and also has updated patterns. But, module_do_it in modules.exp calls the lower-level ${tool}_target_compile "directly", i.e. g++_target_compile defined in gcc/testsuite/lib/g++.exp. That does not call ${tool}-dg-prune, meaning those test-cases miss the gcc-specific pruning. Noticed while testing a dejagnu update that handled the miniscule "in" in the warning (line-breaks added below besides the original one after "(void)':") "/path/to/cris-elf/bin/ld: /gccobj/cris-elf/./libstdc++-v3/src/.libs/libstdc++.a(random.o): in function `std::(anonymous namespace)::__libc_getentropy(void)': /gccsrc/libstdc++-v3/src/c++11/random.cc:183: warning: _getentropy is not implemented and will always fail" The line saying "in function" rather than "In function" (from the binutils linker since 2018) is pruned by prune_gcc_output. The prune_warnings in dejagnu-1.6.3 and earlier handles the second line separately. It's an unfortunate wart that neither consumes the delimiting line-break, leaving to the callers to prune residual empty lines. See prune_warnings in dejagnu (default_target_compile and dg-test) for those other line-break fixups, as alluded in the comment. * g++.dg/modules/modules.exp (module_do_it): Prune compilation messages.	2024-09-01 02:27:56 +02:00
GCC Administrator	49fd9b33bd	Daily bump.	2024-09-01 00:25:25 +00:00
Roger Sayle	bac00c3422	i386: Support read-modify-write memory operands in STV. This patch enables STV when the first operand of a TImode binary logic operand (AND, IOR or XOR) is a memory operand, which is commonly the case with read-modify-write instructions. A different motivating example from the one given previously is: __int128 m, p, q; void foo() { m ^= (p & q); } Currently with -O2 -mavx the RMW instructions are rejected by STV, resulting in scalar code: foo: movq p(%rip), %rax movq p+8(%rip), %rdx andq q(%rip), %rax andq q+8(%rip), %rdx xorq %rax, m(%rip) xorq %rdx, m+8(%rip) ret With this patch they become scalar-to-vector candidates: foo: vmovdqa p(%rip), %xmm0 vpand q(%rip), %xmm0, %xmm0 vpxor m(%rip), %xmm0, %xmm0 vmovdqa %xmm0, m(%rip) ret 2024-08-31 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * config/i386/i386-features.cc (timode_scalar_to_vector_candidate_p): Support the first operand of AND, IOR and XOR being MEM_P, i.e. a read-modify-write insn. gcc/testsuite/ChangeLog * gcc.target/i386/movti-2.c: Change dg-options to -Os. * gcc.target/i386/movti-4.c: Expected output of original movti-2.c.	2024-08-31 14:19:33 -06:00
Andrew Pinski	2ac27bd503	libobjc: Add cast to void* to disable warning for casting between incompatible function types [PR89586] Even though __objc_get_forward_imp returns an IMP type, it will be casted to a compatable function type before calling it. So we adding a cast to `void` will disable warning about the incompatible type. Pushed after bootstrap/test on x86_64. libobjc/ChangeLog: PR libobjc/89586 sendmsg.c (__objc_get_forward_imp): Add cast to `void*` before casting to IMP. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>	2024-08-31 12:04:22 -07:00
Georg-Johann Lay	df89afbb77	AVR: Run pass avr-fuse-add a second time after pass_cprop_hardreg. gcc/ * config/avr/avr-passes.cc (avr_pass_fuse_add) <clone>: Override. * config/avr/avr-passes.def (avr_pass_fuse_add): Run again after pass_cprop_hardreg.	2024-08-31 20:30:40 +02:00
Georg-Johann Lay	60fc5501dd	AVR: Tidy pass avr-fuse-add. gcc/ * config/avr/avr-protos.h (avr_split_tiny_move): Rename to avr_split_fake_addressing_move. * config/avr/avr-passes.def: Same. * config/avr/avr-passes.cc: Same. (avr_pass_data_fuse_add) <tv_id>: Set to TV_MACH_DEP. * config/avr/avr.md (split-lpmx): Remove a define_split. Such splits are performed by avr_split_fake_addressing_move.	2024-08-31 20:28:27 +02:00
Iain Sandoe	7f27d1f1b9	testsuite, c++, coroutines: Avoid 'unused' warnings [NFC]. The 'torture' section of the coroutine tests is primarily about checking correct operation of the generated code. It should, ideally, be possible to run this part of the testsuite with '-Wall' and expect no fails. In the case that we wish to test for a specific diagnostic (and that it does not appear over a range of optimisation/debug conditions) then we should make that explict (as done, for example, in pr109867.C). The tests amended here have warnings because of unused entities; in many cases those are relevant to the test, and so we just mark them with __attribute__((__unused__)). We amend the debug output in coro.h to avoid similar warnings when print output is disabled (the default). gcc/testsuite/ChangeLog: * g++.dg/coroutines/coro.h: Use a variadic macro for PRINTF to avoid unused warnings when output is disabled. * g++.dg/coroutines/torture/co-await-04-control-flow.C: Avoid unused warnings. * g++.dg/coroutines/torture/co-ret-13-template-2.C: Likewise. * g++.dg/coroutines/torture/exceptions-test-01-n4849-a.C: Likewise. * g++.dg/coroutines/torture/local-var-04-hiding-nested-scopes.C: Likewise. * g++.dg/coroutines/torture/pr109867.C: Likewise. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>	2024-08-31 17:33:31 +01:00
Iain Sandoe	2c27189da4	testsuite, c++, coroutines: Correct a test intent. The intention of the series of tests numberef pr95615-* is to verify that entities created by the ramp and potentially needing destruction are correctly handled when exceptions are thrown. Because of a typo, one case was not being checked correctly (the return object). This patch amends the check to test that the returned object is properly deleted. gcc/testsuite/ChangeLog: * g++.dg/coroutines/torture/pr95615.inc: Check tha the task object produced by get_return_object is correctly deleted on exception. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>	2024-08-31 17:31:39 +01:00
Iain Sandoe	049a927c10	c++, coroutines: Make and use a frame access helper. In the review of earlier patches it was suggested that we might make use of finish_class_access_expr instead of doing a lookup for the member and then a build_class_access_expr call. finish_class_access_expr does a lot more work than we need and ends up calling build_class_access_expr anyway. So, instead, this patch makes a new helper to do the lookup and build and uses that helper everywhere except instances in the ramp function that we are going to handle separately. gcc/cp/ChangeLog: * coroutines.cc (coro_build_frame_access_expr): New. (transform_await_expr): Use coro_build_frame_access_expr. (transform_local_var_uses): Likewise. (build_actor_fn): Likewise. (build_destroy_fn): Likewise. (cp_coroutine_transform::build_ramp_function): Likewise. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>	2024-08-31 17:31:08 +01:00
John David Anglin	b7e9f36108	hppa: Enable PA 2.0 symbolic operands on ELF32 targets The GNU ELF32 linker has been fixed and it can now handle PA 2.0 symbolic relocations. This only affects non-pic code generation. 2024-08-31 John David Anglin <danglin@gcc.gnu.org> gcc/ChangeLog: * config/pa/pa.cc (pa_emit_move_sequence): Remove symbolic memory work arounds for TARGET_ELF32. (pa_legitimate_address_p): Likewise. Allow symbolic operands. Adjust comment. * config/pa/pa.md: Replace reg_or_0_or_nonsymb_mem_operand with reg_or_0_or_mem_operand predicate in various unnamed move insns. * config/pa/predicates.md (floating_point_store_memory_operand): Update comment. Remove symbolic memory work arounds for TARGET_ELF32. (nonsymb_mem_operand): Rename to mem_operand. Allow symbolic memory operands. (reg_or_0_or_nonsymb_mem_operand): Rename to reg_or_0_or_mem_operand. Allow symbolic memory operands.	2024-08-31 12:20:14 -04:00
Andrew Pinski	ceda727daf	phiopt: Ignore some nop statements in heursics [PR116098] The heurstics that was added for PR71016, try to search to see if the conversion was being moved away from its definition. The problem is the heurstics would stop if there was a non GIMPLE_ASSIGN (and already ignores debug statements) and in this case we would have a GIMPLE_LABEL that was not being ignored. So we should need to ignore GIMPLE_NOP, GIMPLE_LABEL and GIMPLE_PREDICT. Note this is now similar to how gimple_empty_block_p behaves. Note this fixes the wrong code that was reported by moving the VCE (conversion) out before the phiopt/match could convert it into an bit_ior and move the VCE out with the VCE being conditionally valid. Bootstrapped and tested on x86_64-linux-gnu. Also built and tested for aarch64-linux-gnu. PR tree-optimization/116098 gcc/ChangeLog: * tree-ssa-phiopt.cc (factor_out_conditional_operation): Ignore nops, labels and predicts for heuristic for conversion with a constant. gcc/testsuite/ChangeLog: * c-c++-common/torture/pr116098-1.c: New test. * gcc.target/aarch64/csel-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>	2024-08-31 09:16:40 -07:00
Andrew Pinski	457805cf59	testsuite: Change what is being tested for pr66726-2.c r14-575-g6d6c17e45f62cf changed the debug dump message but the testcase pr66726-2.c was not updated for the change. The testcase was searching to make sure we didn't factor out a conversion but the testcase was no longer testing that so we needed to update what was being searched for. Tested on x86_64-linux. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr66726-2.c: Update scan dump message. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>	2024-08-31 09:16:40 -07:00
Harald Anlauf	79b5b50402	Fortran: downgrade use associated namelist group name to legacy extension The Fortran standard disallows use associated names as namelist group name (e.g. F2003:C581, but also later standards). This feature is a gfortran legacy extension, and we should give a warning even for -std=gnu. gcc/fortran/ChangeLog: * match.cc (gfc_match_namelist): Downgrade feature from GNU to legacy extension. gcc/testsuite/ChangeLog: * gfortran.dg/pr88169_3.f90: Adjust pattern.	2024-08-31 16:23:38 +02:00
Jakub Jelinek	afd9558b94	c++: Add unsequenced C++ testcase This is the testcase I wrote originally and which on top of the https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659154.html patch didn't behave the way I wanted (no warning and no optimizations of [[unsequenced]] function templates which don't have pointer/reference arguments. Posting this separately, because it depends on the above mentioned patch as well as the PR116175 https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659157.html patch. 2024-08-31 Jakub Jelinek <jakub@redhat.com> * g++.dg/ext/attr-unsequenced-1.C: New test.	2024-08-31 16:03:20 +02:00
Jakub Jelinek	dd346b6138	c: Add support for unsequenced and reproducible attributes C23 added in N2956 ( https://open-std.org/JTC1/SC22/WG14/www/docs/n2956.htm ) two new attributes, which are described as similar to GCC const and pure attributes, but they aren't really same and it seems that even the paper is missing some of the differences. The paper says unsequenced is the same as const on functions without pointer arguments and reproducible is the same as pure on such functions (except that they are function type attributes rather than function declaration ones), but it seems the paper doesn't consider the finiteness GCC relies on (aka non-DECL_LOOPING_CONST_OR_PURE_P) - the paper only talks about using the attributes for CSE etc., not for DCE. The following patch introduces (for now limited) support for those attributes, both as standard C23 attributes and as GNU extensions (the difference is that the patch is then less strict on where it allows them, like other function type attributes they can be specified on function declarations as well and apply to the type, while C23 standard ones must go on the function declarators (i.e. after closing paren after function parameters) or in type specifiers of function type. If function doesn't have any pointer/reference arguments, the patch adds additional internal attribute with " noptr" suffix which then is used by flags_from_decl_or_type to handle those easy cases as ECF_CONST\|ECF_LOOPING_CONST_OR_PURE or ECF_PURE\|ECF_LOOPING_CONST_OR_PURE The harder cases aren't handled right now, I'd hope they can be handled incrementally. I wonder whether we shouldn't emit a warning for the gcc.dg/c23-attr-{reproducible,unsequenced}-5.c cases, while the standard clearly specifies that composite types should union the attributes and it is what GCC implements for decades, for ?: that feels dangerous for the new attributes, it would be much better to be conservative on say (cond ? unsequenced_function : normal_function) (args) There is no diagnostics on incorrect [[unsequenced]] or [[reproducible]] function definitions, while I think diagnosing non-const static/TLS declarations in the former could be easy, the rest feels hard. E.g. the const/pure discovery can just punt on everything it doesn't understand, but complete diagnostics would need to understand it. 2024-08-31 Jakub Jelinek <jakub@redhat.com> PR c/116130 gcc/ * doc/extend.texi (unsequenced, reproducible): Document new function type attributes. * calls.cc (flags_from_decl_or_type): Handle "unsequenced noptr" and "reproducible noptr" attributes. gcc/c-family/ * c-attribs.cc (c_common_gnu_attributes): Add entries for "unsequenced", "reproducible", "unsequenced noptr" and "reproducible noptr" attributes. (handle_unsequenced_attribute): New function. (handle_reproducible_attribute): Likewise. * c-common.h (handle_unsequenced_attribute): Declare. (handle_reproducible_attribute): Likewise. * c-lex.cc (c_common_has_attribute): Return 202311 for standard unsequenced and reproducible attributes. gcc/c/ * c-decl.cc (handle_std_unsequenced_attribute): New function. (handle_std_reproducible_attribute): Likewise. (std_attributes): Add entries for "unsequenced" and "reproducible" attributes. (c_warn_type_attributes): Add TYPE argument. Allow unsequenced or reproducible attributes if it is FUNCTION_TYPE. (groktypename): Adjust c_warn_type_attributes caller. (grokdeclarator): Likewise. (finish_declspecs): Likewise. * c-parser.cc (c_parser_declaration_or_fndef): Likewise. * c-tree.h (c_warn_type_attributes): Add TYPE argument. gcc/testsuite/ * c-c++-common/attr-reproducible-1.c: New test. * c-c++-common/attr-reproducible-2.c: New test. * c-c++-common/attr-unsequenced-1.c: New test. * c-c++-common/attr-unsequenced-2.c: New test. * gcc.dg/c23-attr-reproducible-1.c: New test. * gcc.dg/c23-attr-reproducible-2.c: New test. * gcc.dg/c23-attr-reproducible-3.c: New test. * gcc.dg/c23-attr-reproducible-4.c: New test. * gcc.dg/c23-attr-reproducible-5.c: New test. * gcc.dg/c23-attr-reproducible-5-aux.c: New file. * gcc.dg/c23-attr-unsequenced-1.c: New test. * gcc.dg/c23-attr-unsequenced-2.c: New test. * gcc.dg/c23-attr-unsequenced-3.c: New test. * gcc.dg/c23-attr-unsequenced-4.c: New test. * gcc.dg/c23-attr-unsequenced-5.c: New test. * gcc.dg/c23-attr-unsequenced-5-aux.c: New file. * gcc.dg/c23-has-c-attribute-2.c: Add tests for unsequenced and reproducible attributes.	2024-08-31 15:58:23 +02:00
Georg-Johann Lay	dc476e5f68	AVR: Don't print a space after , when printing instructions. gcc/ * config/avr/avr.cc: Follow the convention to not add a space after comma when printing instructions.	2024-08-31 12:50:55 +02:00

1 2 3 4 5 ...

213446 commits