procyberian/gcc - Masscollabs Services: Beyond Sharing , Liberating The Software World

Author	SHA1	Message	Date
GCC Administrator	829d597548	Daily bump.	2023-06-03 00:16:48 +00:00
liuhongt	57b30f0134	Don't try bswap + rotate when TYPE_PRECISION(n->type) > n->range. For the testcase in the PR, we have br64 = br; br64 = ((br64 << 16) & 0x000000ff00000000ull) \| (br64 & 0x0000ff00ull); n->n: 0x3000000200. n->range: 32. n->type: uint64. The original code assumes n->range is same as TYPE PRECISION(n->type), and tries to rotate the mask from 0x300000200 -> 0x20300 which is incorrect. The patch fixed this bug by not trying bswap + rotate when TYPE_PRECISION(n->type) is not equal to n->range. gcc/ChangeLog: PR tree-optimization/110067 * gimple-ssa-store-merging.cc (find_bswap_or_nop): Don't try bswap + rotate when TYPE_PRECISION(n->type) > n->range. gcc/testsuite/ChangeLog: * gcc.target/i386/pr110067.c: New test.	2023-06-03 08:09:31 +08:00
liuhongt	4933704086	i386: Add missing vector truncate patterns [PR92658]. Add missing insn patterns for v2si -> v2hi/v2qi and v2hi-> v2qi vector truncate. gcc/ChangeLog: PR target/92658 * config/i386/mmx.md (truncv2hiv2qi2): New define_insn. (truncv2si<mode>2): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr92658-avx512bw-trunc-2.c: New test.	2023-06-03 08:09:07 +08:00
Andrew Pinski	64ca6aa74b	rtl-optimization: [PR102733] DSE removing address which only differ by address space. The problem here is DSE was not taking into account the address space which meant if you had two addresses say `fs:0` and `gs:0` (on x86_64), DSE would think they were the same and remove the first store. This fixes that issue by adding a check for the address space too. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. PR rtl-optimization/102733 gcc/ChangeLog: * dse.cc (store_info): Add addrspace field. (record_store): Record the address space and check to make sure they are the same. gcc/testsuite/ChangeLog: * gcc.target/i386/addr-space-6.c: New test.	2023-06-02 19:45:17 +00:00
Andrew Pinski	df0853d72d	Fix PR 110042: ifcvt regression due to paradoxical subregs After r14-1014-gc5df248509b489364c573e8, GCC started to emit directly a zero_extract for `(t1&0x8)!=0`. This introduced a small regression where ifcvt would not do the ifconversion as there is now a paradoxical subreg in the dest which was being rejected. Since paradoxical subreg set the whole register, we can treat it as the same as a reg in the two places. OK? Bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-gnu. gcc/ChangeLog: PR rtl-optimization/110042 * ifcvt.cc (bbs_ok_for_cmove_arith): Allow paradoxical subregs. (bb_valid_for_noce_process_p): Strip the subreg for the SET_DEST. gcc/testsuite/ChangeLog: PR rtl-optimization/110042 * gcc.target/aarch64/csel_bfx_2.c: New test.	2023-06-02 19:22:13 +00:00
Iain Sandoe	84d080a29a	Darwin, PPC: Fix struct layout with pragma pack [PR110044]. This bug was essentially that darwin_rs6000_special_round_type_align() was ignoring externally-imposed capping of field alignment. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> PR target/110044 gcc/ChangeLog: * config/rs6000/rs6000.cc (darwin_rs6000_special_round_type_align): Make sure that we do not have a cap on field alignment before altering the struct layout based on the type alignment of the first entry. gcc/testsuite/ChangeLog: * gcc.target/powerpc/darwin-abi-13-0.c: New test. * gcc.target/powerpc/darwin-abi-13-1.c: New test. * gcc.target/powerpc/darwin-abi-13-2.c: New test. * gcc.target/powerpc/darwin-structs-0.h: New test.	2023-06-02 20:03:58 +01:00
Steve Kargl	fae09dfc0e	Fortran: fix diagnostics for SELECT RANK [PR100607] gcc/fortran/ChangeLog: PR fortran/100607 * resolve.cc (resolve_select_rank): Remove duplicate error. (resolve_fl_var_and_proc): Prevent NULL pointer dereference and suppress error message for temporary. gcc/testsuite/ChangeLog: PR fortran/100607 * gfortran.dg/select_rank_6.f90: New test.	2023-06-02 19:47:01 +02:00
David Faust	934da923a7	btf: fix bootstrap -Wformat errors [PR110073] Commit `7aae58b04b` "btf: improve -dA comments for testsuite" broke bootstrap on a number of architectures because it introduced some new -Wformat errors. Fix those errors by properly using PRIu64 and a small refactor to the offending code. Based on the suggested patch from Rainer Orth. PR debug/110073 gcc/ChangeLog: * btfout.cc (btf_absolute_func_id): New function. (btf_asm_func_type): Call it here. Change index parameter from size_t to ctf_id_t. Use PRIu64 formatter.	2023-06-02 09:28:32 -07:00
Alex Coplan	f2e60a00c7	btf: Fix -Wformat errors g:7aae58b04b92303ccda3ead600be98f0d4b7f462 introduced -Wformat errors breaking bootstrap on some targets. This patch fixes that. Committed as obvious. gcc/ChangeLog: * btfout.cc (btf_asm_type): Use PRIu64 instead of %lu for uint64_t. (btf_asm_datasec_type): Likewise.	2023-06-02 16:58:24 +01:00
Jason Merrill	9872d56661	c++: fix explicit/copy problem [PR109247] In the testcase, the user wants the assignment to use the operator= declared in the class, but because [over.match.list] says that explicit constructors are also considered for list-initialization, as affirmed in CWG1228, we end up choosing the implicitly-declared copy assignment operator, using the explicit constructor template for the argument, which is ill-formed. Other implementations haven't implemented CWG1228, so we keep getting bug reports. Discussion in CWG led to the idea for this targeted relaxation: if we use an explicit constructor for the conversion to the argument of a copy or move special member function, that makes the candidate worse than another. DR 2735 PR c++/109247 gcc/cp/ChangeLog: * call.cc (sfk_copy_or_move): New. (joust): Add tiebreaker for explicit conv and copy ctor. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/initlist-explicit3.C: New test.	2023-06-02 11:46:21 -04:00
Carl Love	957798e44e	rs6000: Fix arguments for __builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrhx The third argument for __builtin_altivec_tr_stxvrhx should be short * not int . Similarly, the third argument for __builtin_altivec_tr_stxvrwx should be int not short . This patch fixes the arguments in the two builtins. A runnable test case is added to test the __builtin_altivec_tr_stxvrbx, __builtin_altivec_tr_stxvrhx, __builtin_altivec_tr_stxvrwx and __builtin_altivec_tr_stxvrdx builtins. gcc/ config/rs6000/rs6000-builtins.def (__builtin_altivec_tr_stxvrhx, __builtin_altivec_tr_stxvrwx): Fix type of third argument. gcc/testsuite/ * gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c: New test for __builtin_altivec_tr_stxvrbx, __builtin_altivec_tr_stxvrhx, __builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrdx.	2023-06-02 11:11:12 -04:00
Jason Merrill	4d935f52b0	c++: make initializer_list array static again [PR110070] After the maybe_init_list_as_* patches, I noticed that we were putting the array of strings into .rodata, but then memcpying it into an automatic array, which is pointless; we should be able to use it directly. This doesn't happen automatically because TREE_ADDRESSABLE is set (since r12-657 for PR100464), and so gimplify_init_constructor won't promote the variable to static. Theoretically we could do escape analysis to recognize that the address, though taken, never leaves the function; that would allow promotion when we're only using the address for indexing within the function, as in initlist-opt2.C. But this would be a new pass. And in initlist-opt1.C, we're passing the array address to another function, so it definitely escapes; it's only safe in this case because it's calling a standard library function that we know only uses it for indexing. So, a flag seems needed. I first thought to put the flag on the TARGET_EXPR, but the VAR_DECL seems more appropriate. In a previous revision of the patch I called this flag DECL_NOT_OBSERVABLE, but I think DECL_MERGEABLE is a better name, especially if we're going to apply it to the backing array of initializer_list, which is observable. I then also check it in places that check for -fmerge-all-constants, so that multiple equivalent initializer-lists can also be combined. And then it seemed to make sense for [[no_unique_address]] to have this meaning for user-written variables. I think the note in [dcl.init.list]/6 intended to allow this kind of merging for initializer_lists, but it didn't actually work; for an explicit array with the same initializer, if the address escapes the program could tell whether the same variable in two frames have the same address. P2752 is trying to correct this defect, so I'm going to assume that this is the intent. PR c++/110070 PR c++/105838 gcc/ChangeLog: * tree.h (DECL_MERGEABLE): New. * tree-core.h (struct tree_decl_common): Mention it. * gimplify.cc (gimplify_init_constructor): Check it. * cgraph.cc (symtab_node::address_can_be_compared_p): Likewise. * varasm.cc (categorize_decl_for_section): Likewise. gcc/cp/ChangeLog: * call.cc (maybe_init_list_as_array): Set DECL_MERGEABLE. (convert_like_internal) [ck_list]: Set it. (set_up_extended_ref_temp): Copy it. * tree.cc (handle_no_unique_addr_attribute): Set it. gcc/testsuite/ChangeLog: * g++.dg/tree-ssa/initlist-opt1.C: Check for static array. * g++.dg/tree-ssa/initlist-opt2.C: Likewise. * g++.dg/tree-ssa/initlist-opt4.C: New test. * g++.dg/opt/icf1.C: New test. * g++.dg/opt/icf2.C: New test. * g++.dg/opt/icf3.C: New test. * g++.dg/tree-ssa/array-temp1.C: Revert r12-657 change.	2023-06-02 10:54:39 -04:00
Uros Bizjak	99566c0c6b	reg-stack: Change return type of predicate functions from int to bool Also change some internal variables to bool and recode handling of boolean varialbes to not use bitwise or. gcc/ChangeLog: * rtl.h (stack_regs_mentioned): Change return type from int to bool. * reg-stack.cc (struct_block_info_def): Change "done" to bool. (stack_regs_mentioned_p): Change return type from int to bool and adjust function body accordingly. (stack_regs_mentioned): Ditto. (check_asm_stack_operands): Ditto. Change "malformed_asm" variable to bool. (move_for_stack_reg): Recode handling of control_flow_insn_deleted. (swap_rtx_condition_1): Change return type from int to bool and adjust function body accordingly. Change "r" variable to bool. (swap_rtx_condition): Change return type from int to bool and adjust function body accordingly. (subst_stack_regs_pat): Recode handling of control_flow_insn_deleted. (subst_stack_regs): Ditto. (convert_regs_entry): Change return type from int to bool and adjust function body accordingly. Change "inserted" variable to bool. (convert_regs_1): Recode handling of control_flow_insn_deleted. (convert_regs_2): Recode handling of cfg_altered. (convert_regs): Ditto. Change "inserted" variable to bool.	2023-06-02 16:17:46 +02:00
Jason Merrill	e7cc4d703b	varasm: check float size In PR95226, the testcase was failing because we tried to output_constant a NOP_EXPR to float from a double REAL_CST, and so we output a double where the caller wanted a float. That doesn't happen anymore, but with the output_constant hunk we will ICE in that situation rather than emit the wrong number of bytes. Part of the problem was that initializer_constant_valid_p_1 returned true for that NOP_EXPR, because it compared the sizes of integer types but not floating-point types. So the C++ front end assumed it didn't need to fold the initializer. PR c++/95226 gcc/ChangeLog: * varasm.cc (output_constant) [REAL_TYPE]: Check that sizes match. (initializer_constant_valid_p_1): Compare float precision.	2023-06-02 10:08:59 -04:00
David Malcolm	ef768035ae	analyzer: implement various atomic builtins [PR109015] This patch implements many of the __atomic_* builtins from sync-builtins.def as known_function subclasses within the analyzer. gcc/analyzer/ChangeLog: PR analyzer/109015 * kf.cc (class kf_atomic_exchange): New. (class kf_atomic_exchange_n): New. (class kf_atomic_fetch_op): New. (class kf_atomic_op_fetch): New. (class kf_atomic_load): New. (class kf_atomic_load_n): New. (class kf_atomic_store_n): New. (register_atomic_builtins): New function. (register_known_functions): Call register_atomic_builtins. gcc/testsuite/ChangeLog: PR analyzer/109015 * gcc.dg/analyzer/atomic-builtins-1.c: New test. * gcc.dg/analyzer/atomic-builtins-haproxy-proxy.c: New test. * gcc.dg/analyzer/atomic-builtins-qemu-sockets.c: New test. * gcc.dg/analyzer/atomic-types-1.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2023-06-02 09:28:30 -04:00
David Malcolm	b8a916726e	analyzer: regions in different memory spaces can't alias gcc/analyzer/ChangeLog: * store.cc (store::eval_alias_1): Regions in different memory spaces can't alias. Signed-off-by: David Malcolm <dmalcolm@redhat.com>	2023-06-02 09:28:30 -04:00
David Edelsohn	23f352972f	testsuite: Require LTO for pr107557-[12].c pr107557-[12].c invoke -flto option but do not check that the target support LTO. This patch adds dg-require lto to the testcases. * gcc.dg/pr107557-1.c: Require LTO support. * gcc.dg/pr107557-2.c: Require LTO support. Signed-off-by: David Edelsohn <dje.gcc@gmail.com>	2023-06-02 09:30:48 -04:00
Alexander Monakov	9f926f3a0c	doc: clarify semantics of vector bitwise shifts Explicitly say that attempted shift past element bit width is UB for vector types. Mention that integer promotions do not happen. gcc/ChangeLog: * doc/extend.texi (Vector Extensions): Clarify bitwise shift semantics.	2023-06-02 15:17:30 +03:00
Ju-Zhe Zhong	bffc52838e	VECT: Change flow of decrement IV Follow Richi's suggestion, I change current decrement IV flow from: do { remain -= MIN (vf, remain); } while (remain != 0); into: do { old_remain = remain; len = MIN (vf, remain); remain -= vf; } while (old_remain >= vf); to enhance SCEV. Include fixes from kewen. This patch will need to wait for Kewen's test feedback. Testing on X86 is on-going Co-Authored by: Kewen Lin <linkw@linux.ibm.com> PR tree-optimization/109971 gcc/ChangeLog: * tree-vect-loop-manip.cc (vect_set_loop_controls_directly): Change decrement IV flow. (vect_set_loop_condition_partial_vectors): Ditto.	2023-06-02 19:49:54 +08:00
Georg-Johann Lay	7bf89a919f	target/110088: Improve operation of l-reg with const after move from d-reg. After reload, there may be sequences like lreg = dreg lreg = lreg <op> const with an LD_REGS dreg, non-LD_REGS lreg, and <op> in PLUS, IOR, AND. If dreg dies after the first insn, it is possible to use dreg = dreg <op> const lreg = dreg instead which is more efficient. gcc/ PR target/110088 * config/avr/avr.md: Add an RTL peephole to optimize operations on non-LD_REGS after a move from LD_REGS. (piaop): New code iterator.	2023-06-02 12:41:07 +02:00
François Dumont	4d866c6bb4	libstdc++: Fix broken _GLIBCXX_PARALLEL mode Add missing <parallel/search.h> include in <parallel/algobase.h>. libstdc++-v3/ChangeLog: * include/parallel/algobase.h: Include <parallel/search.h>.	2023-06-02 11:44:14 +02:00
Thomas Schwinge	04abe1944d	Support parallel testing in libgomp: fallback Perl 'flock' [PR66005] Follow-up to commit `6c3b30ef9e` "Support parallel testing in libgomp, part II [PR66005]" ("..., and enable if 'flock' is available for serializing execution testing"), where we saw: > On my Dell Precision 7530 laptop: > > $ uname -srvi > Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64 > $ grep '^model name' < /proc/cpuinfo \| uniq -c > 12 model name : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz > $ nvidia-smi -L > GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea) > > ... [...]: case (c) standard configuration, no offloading > configured, [...] > $ \time make check-target-libgomp > > Case (c), baseline; [...]: > > 1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k > 1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k > > Case (c), parallelized [using 'flock']: > > [...] > -j12 GCC_TEST_PARALLEL_SLOTS=12 > 2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k > 2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k Quite the same when instead of 'flock' using this fallback Perl 'flock': 2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata 505216maxresident)k 2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata 505216maxresident)k PR testsuite/66005 gcc/ * doc/install.texi: Document (optional) Perl usage for parallel testing of libgomp. libgomp/ * testsuite/lib/libgomp.exp: 'flock' through stdout. * testsuite/flock: New. * configure.ac (FLOCK): Point to that if no 'flock' available, but 'perl' is. * configure: Regenerate.	2023-06-02 09:51:15 +02:00
Thomas Schwinge	49153588ab	Remove stale Autoconf checks for Perl Subversion r110220 (Git commit `03b8fe495d`) for PR25884 "libgomp should not require perl to compile" removed all '$(PERL)' usage from libgomp -- but didn't remove the then-unused Autoconf Perl check itself. Later, this Autoconf Perl check appears to have been copied from libgomp into other GCC libraries, likewise unused. libgomp/ * configure.ac (PERL): Remove. * configure: Regenerate. * Makefile.in: Likewise. * testsuite/Makefile.in: Likewise. libatomic/ * configure.ac (PERL): Remove. * configure: Regenerate. * Makefile.in: Likewise. * testsuite/Makefile.in: Likewise. libgm2/ * configure.ac (PERL): Remove. * configure: Regenerate. * Makefile.in: Likewise. * libm2cor/Makefile.in: Likewise. * libm2iso/Makefile.in: Likewise. * libm2log/Makefile.in: Likewise. * libm2min/Makefile.in: Likewise. * libm2pim/Makefile.in: Likewise. libitm/ * configure.ac (PERL): Remove. * configure: Regenerate. * Makefile.in: Likewise. * testsuite/Makefile.in: Likewise.	2023-06-02 09:51:14 +02:00
Thomas Schwinge	9edb672571	Back to requiring "Perl version 5.6.1 (or later)" [PR82856] With Subversion r265695 (Git commit `22e0527251`) "Update GCC to autoconf 2.69, automake 1.15.1 (PR bootstrap/82856)" we're back to normal; per Automake 1.15.1 'configure.ac' still "[...] perl 5.6 or better is required [...]". PR bootstrap/82856 gcc/ * doc/install.texi (Perl): Back to requiring "Perl version 5.6.1 (or later)".	2023-06-02 09:51:14 +02:00
Paul Thomas	3c2eba4b7a	Fortran: Fix some problems blocking associate meta-bug [PR87477] 2023-06-02 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/87477 * parse.cc (parse_associate): Replace the existing evaluation of the target rank with calls to gfc_resolve_ref and gfc_expression_rank. Identify untyped target function results with structure constructors by finding the appropriate derived type. * resolve.cc (resolve_symbol): Allow associate variables to be assumed shape. gcc/testsuite/ PR fortran/87477 * gfortran.dg/associate_54.f90 : Cope with extra error. PR fortran/102109 * gfortran.dg/pr102109.f90 : New test. PR fortran/102112 * gfortran.dg/pr102112.f90 : New test. PR fortran/102190 * gfortran.dg/pr102190.f90 : New test. PR fortran/102532 * gfortran.dg/pr102532.f90 : New test. PR fortran/109948 * gfortran.dg/pr109948.f90 : New test. PR fortran/99326 * gfortran.dg/pr99326.f90 : New test.	2023-06-02 08:41:45 +01:00
Juzhe-Zhong	a06b9435b9	RISC-V: Add _mu C++ overloaded intrinsics for load && viota && vid Base on these: https://github.com/riscv-non-isa/rvv-intrinsic-doc/issues/232 https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/233 Add _mu C++ overloaded intrinsics for load && viota && vid. Co-authored-by: KuanLin Chen <best124612@gmail.com> gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: Add _mu overloaded intrinsics. * config/riscv/riscv-vector-builtins-shapes.cc (struct fault_load_def): Ditto.	2023-06-02 15:27:46 +08:00
Juzhe-Zhong	265357d401	RISC-V: Optimize reverse series index vector This patch optimizes the following seriese vector: [nunits - 1, nunits - 2, ...., 0] Before this patch: vid vmul vadd After this patch: vid vrsub This patch is an obvious and simple optimization, ok for trunk? gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_vec_series): Optimize reverse series index vector. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c: Add assembly check.	2023-06-02 15:11:02 +08:00
Juzhe-Zhong	37ff12b96d	RISC-V: Fix warning in predicated.md Notice there is warning in predicates.md: ../../../riscv-gcc/gcc/config/riscv/predicates.md: In function ‘bool arith_operand_or_mode_mask(rtx, machine_mode)’: ../../../riscv-gcc/gcc/config/riscv/predicates.md:33:14: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] (match_test "INTVAL (op) == GET_MODE_MASK (HImode) ../../../riscv-gcc/gcc/config/riscv/predicates.md:34:20: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] \|\| INTVAL (op) == GET_MODE_MASK (SImode)")))) gcc/ChangeLog: * config/riscv/predicates.md: Change INTVAL into UINTVAL.	2023-06-02 14:10:40 +08:00
YunQiang Su	4fe6e12204	MAINTAINERS: Add myself as MIPS port maintainer ChangeLog: * MAINTAINERS (CPU Port Maintainers): Add myself as MIPS port maintainer. (Write After Approval): Remove myself.	2023-06-02 10:06:37 +08:00
Pan Li	691805ff8a	RISC-V: Add test for vfloat16_t (non tuple) types This patch would like to add some test cases of vfloat16_t (non tuple), no 'zvfh' or 'zvfhmin' will meet unknown type. Signed-off-by: Pan Li <pan2.li@intel.com> gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/abi-16.c: Add test cases. * gcc.target/riscv/rvv/base/user-7.c: Likewise.	2023-06-02 09:08:46 +08:00
Juzhe-Zhong	d5ea84cdd9	RISC-V: Add __RISCV_ prefix to VXRM and FRM enum According to doc: https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/222/files https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/226 Add __RISCV_ prefix to VXRM and FRM enum. gcc/ChangeLog: * config/riscv/riscv-vector-builtins.cc (DEF_RVV_VXRM_ENUM): Add __RISCV_ prefix. (DEF_RVV_FRM_ENUM): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/frm-1.c: Ditto. * gcc.target/riscv/rvv/base/vxrm-1.c: Ditto. * gcc.target/riscv/rvv/base/vxrm-10.c: Ditto. * gcc.target/riscv/rvv/base/vxrm-11.c: Ditto. * gcc.target/riscv/rvv/base/vxrm-12.c: Ditto. * gcc.target/riscv/rvv/base/vxrm-6.c: Ditto. * gcc.target/riscv/rvv/base/vxrm-7.c: Ditto. * gcc.target/riscv/rvv/base/vxrm-8.c: Ditto. * gcc.target/riscv/rvv/base/vxrm-9.c: Ditto.	2023-06-02 09:07:54 +08:00
Juzhe-Zhong	91430b73a0	RISC-V: Add vwadd.wv/vwsub.wv auto-vectorization lowering optimization 1. This patch optimize the codegen of the following auto-vectorization codes: void foo (int32_t * __restrict a, int64_t * __restrict b, int64_t * __restrict c, int n) { for (int i = 0; i < n; i++) c[i] = (int64_t)a[i] + b[i]; } Combine instruction from: ... vsext.vf2 vadd.vv ... into: ... vwadd.wv ... Since for PLUS operation, GCC prefer the following RTL operand order when combining: (plus: (sign_extend:..) (reg:) instead of (plus: (reg:..) (sign_extend:) which is different from MINUS pattern. I split patterns of vwadd/vwsub, and add dedicated patterns for them. 2. This patch not only optimize the case as above (1) mentioned, also enhance vwadd.vv/vwsub.vv optimization for complicate PLUS/MINUS codes, consider this following codes: __attribute__ ((noipa)) void vwadd_int16_t_int8_t (int16_t __restrict dst, int16_t __restrict dst2, int16_t __restrict dst3, int8_t __restrict a, int8_t __restrict b, int8_t __restrict a2, int8_t __restrict b2, int n) { for (int i = 0; i < n; i++) { dst[i] = (int16_t) a[i] + (int16_t) b[i]; dst2[i] = (int16_t) a2[i] + (int16_t) b[i]; dst3[i] = (int16_t) a2[i] + (int16_t) a[i]; } } Before this patch: ... vsetvli zero,a6,e8,mf2,ta,ma vle8.v v2,0(a3) vle8.v v1,0(a4) vsetvli t1,zero,e16,m1,ta,ma vsext.vf2 v3,v2 vsext.vf2 v2,v1 vadd.vv v1,v2,v3 vsetvli zero,a6,e16,m1,ta,ma vse16.v v1,0(a0) vle8.v v4,0(a5) vsetvli t1,zero,e16,m1,ta,ma vsext.vf2 v1,v4 vadd.vv v2,v1,v2 ... After this patch: ... vsetvli zero,a6,e8,mf2,ta,ma vle8.v v3,0(a4) vle8.v v1,0(a3) vsetvli t4,zero,e8,mf2,ta,ma vwadd.vv v2,v1,v3 vsetvli zero,a6,e16,m1,ta,ma vse16.v v2,0(a0) vle8.v v2,0(a5) vsetvli t4,zero,e8,mf2,ta,ma vwadd.vv v4,v3,v2 vsetvli zero,a6,e16,m1,ta,ma vse16.v v4,0(a1) vsetvli t4,zero,e8,mf2,ta,ma sub a7,a7,a6 vwadd.vv v3,v2,v1 vsetvli zero,a6,e16,m1,ta,ma vse16.v v3,0(a2) ... The reason why current upstream GCC can not optimize codes using vwadd thoroughly is combine PASS needs intermediate RTL IR (extend one of the operand pattern (vwadd.wv)), then base on this intermediate RTL IR, extend the other operand to generate vwadd.vv. So vwadd.wv/vwsub.wv definitely helps to vwadd.vv/vwsub.vv code optimizations. gcc/ChangeLog: config/riscv/riscv-vector-builtins-bases.cc: Change vwadd.wv/vwsub.wv intrinsic API expander * config/riscv/vector.md (@pred_single_widen_<plus_minus:optab><any_extend:su><mode>): Remove it. (@pred_single_widen_sub<any_extend:su><mode>): New pattern. (@pred_single_widen_add<any_extend:su><mode>): New pattern. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/widen/widen-5.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen-6.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen-complicate-1.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen-complicate-2.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen_run-5.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen_run-6.c: New test.	2023-06-02 09:05:10 +08:00
Juzhe-Zhong	bf9eee73f3	RISC-V: Support RVV permutation auto-vectorization This patch supports vector permutation for VLS only by vec_perm pattern. We will support TARGET_VECTORIZE_VEC_PERM_CONST to support VLA permutation in the future. Fixed following comments from Robin. gcc/ChangeLog: * config/riscv/autovec.md (vec_perm<mode>): New pattern. * config/riscv/predicates.md (vector_perm_operand): New predicate. * config/riscv/riscv-protos.h (enum insn_type): New enum. (expand_vec_perm): New function. * config/riscv/riscv-v.cc (const_vec_all_in_range_p): Ditto. (gen_const_vector_dup): Ditto. (emit_vlmax_gather_insn): Ditto. (emit_vlmax_masked_gather_mu_insn): Ditto. (expand_vec_perm): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-1.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-2.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-3.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-5.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-6.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-7.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm.h: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-1.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-2.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-3.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-4.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-5.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-6.c: New test. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-7.c: New test.	2023-06-02 09:02:21 +08:00
GCC Administrator	847499148e	Daily bump.	2023-06-02 00:17:38 +00:00
Harald Anlauf	ff8f45d20f	Fortran: force error on bad KIND specifier [PR88552] gcc/fortran/ChangeLog: PR fortran/88552 * decl.cc (gfc_match_kind_spec): Use error path on missing right parenthesis. (gfc_match_decl_type_spec): Use error return when an error occurred during matching a KIND specifier. gcc/testsuite/ChangeLog: PR fortran/88552 * gfortran.dg/pr88552.f90: New test.	2023-06-01 23:04:30 +02:00
Vineet Gupta	3bb8ebb6ac	testsuite: print any leaking torture options for debugging This was helpful when debugging the recent multilib testsuite failure. gcc/testsuite: * lib/torture-options.exp: print the value of non-empty options: torture_without_loops, torture_with_loops, LTO_TORTURE_OPTIONS. Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>	2023-06-01 12:36:00 -07:00
Vineet Gupta	8dde92fd8b	testsuite: Unbork multilib setups using -march flags (RISC-V) RISC-V multilib testing is currently busted with follow splat all over: \| Schedule of variations: \| riscv-sim/-march=rv64imafdc/-mabi=lp64d/-mcmodel=medlow \| riscv-sim/-march=rv32imafdc/-mabi=ilp32d/-mcmodel=medlow \| riscv-sim/-march=rv32imac/-mabi=ilp32/-mcmodel=medlow \| riscv-sim/-march=rv64imac/-mabi=lp64/-mcmodel=medlow ... ... \| ERROR: tcl error code NONE \| ERROR: torture-init: torture_without_loops is not empty as expected causing insane amount of false failures. \| ========= Summary of gcc testsuite ========= \| \| # of unexpected case / # of unique unexpected case \| \| gcc \| g++ \| gfortran \| \| rv64imafdc/ lp64d/ medlow \| 5421 / 4 \| 1 / 1 \| 6 / 1 \| \| rv32imafdc/ ilp32d/ medlow \| 5422 / 5 \| 3 / 2 \| 6 / 1 \| \| rv32imac/ ilp32/ medlow \| 391 / 5 \| 3 / 2 \| 43 / 8 \| \| rv64imac/ lp64/ medlow \| 5422 / 5 \| 1 / 1 \| 43 / 8 \| The error splat itself is from recent test harness improvements for stricter checks for torture-{init,finish} pairing. But the real issue is a latent bug from 2009: commit `3dd1415dc8`, ("i386-prefetch.exp: Skip tests when multilib flags contain -march") which added an "early exit" condition to i386-prefetch.exp which could potentially cause an unpaired torture-{init,finish}. The early exit only happens in a multlib setup using -march in flags which is what RISC-V happens to use, hence the reason this was only seen on RISC-V multilib testing. Moving the early exit outside of torture-{init,finish} bracket reinstates RISC-V testing. \| rv64imafdc/ lp64d/ medlow \| 3 / 2 \| 1 / 1 \| 6 / 1 \| \| rv32imafdc/ ilp32d/ medlow \| 4 / 3 \| 3 / 2 \| 6 / 1 \| \| rv32imac/ ilp32/ medlow \| 3 / 2 \| 3 / 2 \| 43 / 8 \| \| rv64imac/ lp64/ medlow \| 5 / 4 \| 1 / 1 \| 43 / 8 \| gcc/testsuite: * gcc.misc-tests/i386-prefetch.exp: Move early return outside the torture-{init,finish} Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>	2023-06-01 12:35:46 -07:00
Jason Merrill	5fccebdbd9	doc: improve docs for -pedantic{,-errors} Recent discussion of -Wimplicit led me to want to clarify this section of the documentation, and mark which diagnostics other than -Wpedantic are affected by -pedantic-errors. gcc/ChangeLog: * doc/invoke.texi (-Wpedantic): Improve clarity.	2023-06-01 15:24:32 -04:00
David Edelsohn	ed54532e3d	testsuite: Skip powerpc tests on AIX. AIX does not support -mstrict-align. pr109566.c had skip directive in wrong order for DejaGNU. * gcc.target/powerpc/pr100106-sa.c: Skip on AIX. * gcc.target/powerpc/pr109566.c: Skip on AIX. Signed-off-by: David Edelsohn <dje.gcc@gmail.com>	2023-06-01 13:16:12 -04:00
Jonathan Wakely	f8403c4304	libstdc++: Fix PSTL test that fails in C++20 This test fails in C++20 and later due to a warning: warning: C++20 says that these are ambiguous, even though the second is reversed: note: candidate 1: 'bool MyClass::operator==(const MyClass&)' note: candidate 2: 'bool MyClass::operator==(const MyClass&)' (reversed) note: try making the operator a 'const' member function FAIL: 26_numerics/pstl/numeric_ops/transform_reduce.cc (test for excess errors) libstdc++-v3/ChangeLog: * testsuite/26_numerics/pstl/numeric_ops/transform_reduce.cc: Add const to equality operator.	2023-06-01 16:54:24 +01:00
Jonathan Wakely	fe94f8b7e0	libstdc++: Do not use std::expected::value() in monadic ops (LWG 3938) The monadic operations in std::expected always check has_value() so we can avoid the execptional path in value() and the assertions in error() by accessing _M_val and _M_unex directly. This means that the monadic operations no longer require _M_unex to be copyable so that it can be thrown from value(), as modified by LWG 3938. This also fixes two incorrect uses of std::move in transform(F&&)& and transform(F&&) const& which I found while making these changes. Now that move-only error types are supported, it's possible to properly test the constraints that LWG 3877 added to and_then and transform. The lwg3877.cc test now does that. libstdc++-v3/ChangeLog: * include/std/expected (expected::and_then, expected::or_else) (expected::transform_error): Use _M_val and _M_unex instead of calling value() and error(), as per LWG 3938. (expected::transform): Likewise. Remove incorrect std::move calls from lvalue overloads. (expected<void, E>::and_then, expected<void, E>::or_else) (expected<void, E>::transform): Use _M_unex instead of calling error(). * testsuite/20_util/expected/lwg3877.cc: Add checks for and_then and transform, and for std::expected<void, E>. * testsuite/20_util/expected/lwg3938.cc: New test.	2023-06-01 16:06:15 +01:00
Jonathan Wakely	b7b255e77a	libstdc++: Fix code size regressions in std::vector [PR110060] My r14-1452-gfb409a15d9babc change to add optimization hints to std::vector causes regressions because it makes std::vector::size() and std::vector::capacity() too big to inline. That's the opposite of what I wanted, so revert the changes to those functions. To achieve the original aim of optimizing vec.assign(vec.size(), x) we can add a local optimization hint to _M_fill_assign, so that it doesn't affect all other uses of size() and capacity(). Additionally, add the same hint to the _M_assign_aux overload for forward iterators and add that to the testcase. It would be nice to similarly optimize: if (vec1.size() == vec2.size()) vec1 = vec2; but adding hints to operator=(const vector&) doesn't help. Presumably the relationships between the two sizes and two capacities are too complex to track effectively. libstdc++-v3/ChangeLog: PR libstdc++/110060 * include/bits/stl_vector.h (_Vector_base::_M_invariant): Remove. (vector::size, vector::capacity): Remove calls to _M_invariant. * include/bits/vector.tcc (vector::_M_fill_assign): Add optimization hint to reallocating path. (vector::_M_assign_aux(FwdIter, FwdIter, forward_iterator_tag)): Likewise. * testsuite/23_containers/vector/capacity/invariant.cc: Moved to... * testsuite/23_containers/vector/modifiers/assign/no_realloc.cc: ...here. Check assign(FwdIter, FwdIter) too. * testsuite/23_containers/vector/types/1.cc: Revert addition of -Wno-stringop-overread option.	2023-06-01 16:06:15 +01:00
Jonathan Wakely	8cbaf679a3	libstdc++: Document removal of implicit allocator rebinding extensions Traditionally libstdc++ allowed containers and strings to be instantiated with allocator's that have the wrong value type, implicitly rebinding the allocator to the container's value type. Since C++20 that has been explicitly ill-formed, so the extension is no longer supported in strict modes (e.g. -std=c++17) and in C++20 and later. libstdc++-v3/ChangeLog: * doc/xml/manual/evolution.xml: Document removal of implicit allocator rebinding extensions in strict mode and for C++20. * doc/html/*: Regenerate.	2023-06-01 16:04:26 +01:00
Uros Bizjak	dec7aaabe9	cse: Change return type of predicate functions from int to bool Also change some function arguments to bool and remove one instance of always zero function argument. gcc/ChangeLog: * rtl.h (exp_equiv_p): Change return type from int to bool. * cse.cc (mention_regs): Change return type from int to bool and adjust function body accordingly. (exp_equiv_p): Ditto. (insert_regs): Ditto. Change "modified" function argument to bool and update usage accordingly. (record_jump_cond): Remove always zero "reversed_nonequality" function argument and update usage accordingly. (fold_rtx): Change "changed" variable to bool. (record_jump_equiv): Remove unneeded "reversed_nonequality" variable. (is_dead_reg): Change return type from int to bool.	2023-06-01 16:21:57 +02:00
Takayuki 'January June' Suwa	fe3ce08610	xtensa: Add 'adddi3' and 'subdi3' insn patterns More optimized than the default RTL generation. gcc/ChangeLog: * config/xtensa/xtensa.md (adddi3, subdi3): New RTL generation patterns implemented according to the instruc- tion idioms described in the Xtensa ISA reference manual (p. 600).	2023-06-01 07:13:45 -07:00
Roger Sayle	3635e8c67e	PR target/109973: CCZmode and CCCmode variants of [v]ptest on x86. This is my proposed minimal fix for PR target/109973 (hopefully suitable for backporting) that follows Jakub Jelinek's suggestion that we introduce CCZmode and CCCmode variants of ptest and vptest, so that the i386 backend treats [v]ptest instructions similarly to testl instructions; using different CCmodes to indicate which condition flags are desired, and then relying on the RTL cmpelim pass to eliminate redundant tests. This conveniently matches Intel's intrinsics, that provide different functions for retrieving different flags, _mm_testz_si128 tests the Z flag, _mm_testc_si128 tests the carry flag. Currently we use the same instruction (pattern) for both, and unfortunately the ptest<mode>_and optimization is only valid when the ptest/vptest instruction is used to set/test the Z flag. The downside, as predicted by Jakub, is that GCC's cmpelim pass is currently COMPARE-centric and not able to merge the ptests from expressions such as _mm256_testc_si256 (a, b) + _mm256_testz_si256 (a, b), which is a known issue, PR target/80040. 2023-06-01 Roger Sayle <roger@nextmovesoftware.com> Uros Bizjak <ubizjak@gmail.com> gcc/ChangeLog PR target/109973 config/i386/i386-builtin.def (__builtin_ia32_ptestz128): Use new CODE_for_sse4_1_ptestzv2di. (__builtin_ia32_ptestc128): Use new CODE_for_sse4_1_ptestcv2di. (__builtin_ia32_ptestz256): Use new CODE_for_avx_ptestzv4di. (__builtin_ia32_ptestc256): Use new CODE_for_avx_ptestcv4di. * config/i386/i386-expand.cc (ix86_expand_branch): Use CCZmode when expanding UNSPEC_PTEST to compare against zero. * config/i386/i386-features.cc (scalar_chain::convert_compare): Likewise generate CCZmode UNSPEC_PTESTs when converting comparisons. (general_scalar_chain::convert_insn): Use CCZmode for COMPARE result. (timode_scalar_chain::convert_insn): Use CCZmode for COMPARE result. * config/i386/i386-protos.h (ix86_match_ptest_ccmode): Prototype. * config/i386/i386.cc (ix86_match_ptest_ccmode): New predicate to check for suitable matching modes for the UNSPEC_PTEST pattern. * config/i386/sse.md (define_split): When splitting UNSPEC_MOVMSK to UNSPEC_PTEST, preserve the FLAG_REG mode as CCZ. (<sse4_1>_ptest<mode>): Add asterisk to hide define_insn. Remove ":CC" mode of FLAGS_REG, instead use ix86_match_ptest_ccmode. (<sse4_1>_ptestz<mode>): New define_expand to specify CCZ. (<sse4_1>_ptestc<mode>): New define_expand to specify CCC. (<sse4_1>_ptest<mode>): A define_expand using CC to preserve the current behavior. (ptest<mode>_and): Specify CCZ to only perform this optimization when only the Z flag is required. gcc/testsuite/ChangeLog PR target/109973 * gcc.target/i386/pr109973-1.c: New test case. * gcc.target/i386/pr109973-2.c: Likewise.	2023-06-01 15:11:25 +01:00
Jason Merrill	5d9c911907	libstdc++: optimize EH phase 2 In the ABI's two-phase EH model, first we walk the stack looking for a handler, then we walk the stack running cleanups until we reach that handler. In the cleanup phase, we shouldn't redundantly check the handlers along the way, e.g. when walking through g(): void f() { throw 42; } void g() { try { f(); } catch (void ) { } } int main() { try { g(); } catch (int) { } } libstdc++-v3/ChangeLog: libsupc++/eh_personality.cc (PERSONALITY_FUNCTION): Don't check handlers in the cleanup phase.	2023-06-01 08:49:20 -04:00
Jonathan Wakely	eeb9270496	doc: Fix description of x86 -m32 option [PR109954] This option does not imply -march=i386 so it's incorrect to say it generates code that will run on "any i386 system". gcc/ChangeLog: PR target/109954 * doc/invoke.texi (x86 Options): Fix description of -m32 option.	2023-06-01 12:01:09 +01:00
Matthias Kretz	2fbbaa77c8	libstdc++: Fix condition for supported SIMD types on ARMv8 Signed-off-by: Matthias Kretz <m.kretz@gsi.de> libstdc++-v3/ChangeLog: PR libstdc++/110050 * include/experimental/bits/simd.h (__vectorized_sizeof): With __have_neon_a32 only single-precision float works (in addition to integers).	2023-06-01 10:45:10 +02:00
Kyrylo Tkachov	12e71b593e	aarch64: Add =r,m and =m,r alternatives to 64-bit vector move patterns We can use the X registers to load and store 64-bit vector modes, we just need to add the alternatives to the mov patterns. This straightforward patch does that and for the pair variants too. For the testcase in the code we now generate the optimal assembly without any superfluous GP<->SIMD moves. Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf. gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_simd_mov<VDMOV:mode>): Add =r,m and =r,m alternatives. (load_pair<DREG:mode><DREG2:mode>): Likewise. (vec_store_pair<DREG:mode><DREG2:mode>): Likewise. gcc/testsuite/ChangeLog: gcc.target/aarch64/xreg-vec-modes_1.c: New test.	2023-06-01 09:37:06 +01:00

1 2 3 4 5 ...

201440 commits