RISC-V: Support gather_load/scatter_store RVV auto-vectorization

This patch fully supports gather_load/scatter_store:
1. Support single-rgroup on both RV32/RV64.
2. Support indexed element widths equal to or smaller than Pmode.
3. Support VLA SLP with gather/scatter.
4. Fully tested all gather/scatter with LMUL = M1/M2/M4/M8, both VLA and VLS.
5. Fix a bug in handling (subreg:SI (const_poly_int:DI)).
6. Fix a bug in vec_perm, which is used by gather/scatter SLP.
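A sketch of the kind of loop point 2 enables (the function name is hypothetical, not one of the added tests): the offset elements are 16-bit while Pmode on RV64 is 64-bit, so the offsets are widened before address generation.

```c
#include <stdint.h>

/* Point 2 above: the index vector's element width (16 bits here) is
   smaller than Pmode on RV64 (64 bits).  Offsets narrower than XLEN
   are zero-extended before being added to the base address.  */
void
gather_narrow_idx (int64_t *restrict dst, const int64_t *restrict src,
		   const uint16_t *restrict idx, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = src[idx[i]];
}
```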

All kinds of GATHER/SCATTER are normalized into LEN_MASK_*.
We fully support these four kinds of gather/scatter:
1. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with dummy length and dummy mask (Full vector).
2. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with dummy length and real mask.
3. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with real length and dummy mask.
4. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with real length and real mask.
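As an illustration, case 4 (real length and real mask) arises from a conditional indexed load with a scalar loop tail; a minimal sketch with a hypothetical function name:

```c
/* Case 4 above: the `if` supplies a real mask and the loop tail a
   real length, so the vectorizer emits LEN_MASK_GATHER_LOAD with
   both operands live.  */
void
cond_gather (int *restrict dst, const int *restrict src,
	     const int *restrict idx, const int *restrict cond, int n)
{
  for (int i = 0; i < n; i++)
    if (cond[i])
      dst[i] = src[idx[i]];
}
```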

Based on the discussions with Richards, we don't lower strided load/store to vlse/vsse in the RISC-V backend.
Instead, we leave it to the middle-end to handle that.
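For reference, a strided access of the kind now left to the middle end looks like this (hypothetical function name):

```c
/* A stride-2 load.  Per the note above, the RISC-V backend does not
   lower this to vlse itself; the middle end is expected to recognize
   the strided form (otherwise it goes through the gather path with a
   computed offset vector).  */
void
strided_load (int *restrict dst, const int *restrict src, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = src[2 * i];
}
```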

Regression tests all pass. OK for trunk?

gcc/ChangeLog:

	* config/riscv/autovec.md
	(len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>): New pattern.
	(len_mask_gather_load<VNX2_QHSD:mode><VNX2_QHSDI:mode>): Ditto.
	(len_mask_gather_load<VNX4_QHSD:mode><VNX4_QHSDI:mode>): Ditto.
	(len_mask_gather_load<VNX8_QHSD:mode><VNX8_QHSDI:mode>): Ditto.
	(len_mask_gather_load<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.
	(len_mask_gather_load<VNX32_QHS:mode><VNX32_QHSI:mode>): Ditto.
	(len_mask_gather_load<VNX64_QH:mode><VNX64_QHI:mode>): Ditto.
	(len_mask_gather_load<mode><mode>): Ditto.
	(len_mask_scatter_store<VNX1_QHSD:mode><VNX1_QHSDI:mode>): Ditto.
	(len_mask_scatter_store<VNX2_QHSD:mode><VNX2_QHSDI:mode>): Ditto.
	(len_mask_scatter_store<VNX4_QHSD:mode><VNX4_QHSDI:mode>): Ditto.
	(len_mask_scatter_store<VNX8_QHSD:mode><VNX8_QHSDI:mode>): Ditto.
	(len_mask_scatter_store<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.
	(len_mask_scatter_store<VNX32_QHS:mode><VNX32_QHSI:mode>): Ditto.
	(len_mask_scatter_store<VNX64_QH:mode><VNX64_QHI:mode>): Ditto.
	(len_mask_scatter_store<mode><mode>): Ditto.
	* config/riscv/predicates.md (const_1_operand): New predicate.
	(vector_gs_scale_operand_16): Ditto.
	(vector_gs_scale_operand_32): Ditto.
	(vector_gs_scale_operand_64): Ditto.
	(vector_gs_extension_operand): Ditto.
	(vector_gs_scale_operand_16_rv32): Ditto.
	(vector_gs_scale_operand_32_rv32): Ditto.
	* config/riscv/riscv-protos.h (enum insn_type): Add gather/scatter.
	(expand_gather_scatter): New function.
	* config/riscv/riscv-v.cc (gen_const_vector_dup): Add gather/scatter.
	(emit_vlmax_masked_store_insn): New function.
	(emit_nonvlmax_masked_store_insn): Ditto.
	(modulo_sel_indices): Ditto.
	(expand_vec_perm): Fix SLP for gather/scatter.
	(prepare_gather_scatter): New function.
	(expand_gather_scatter): Ditto.
	* config/riscv/riscv.cc (riscv_legitimize_move): Fix handling of
	(subreg:SI (const_poly_int:DI)).
	* config/riscv/vector-iterators.md: Add gather/scatter.
	* config/riscv/vector.md (vec_duplicate<mode>): Use "@" instead.
	(@vec_duplicate<mode>): Ditto.
	(@pred_indexed_<order>store<VNX16_QHS:mode><VNX16_QHSDI:mode>):
	Fix name.
	(@pred_indexed_<order>store<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/rvv.exp: Add gather/scatter tests.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c:
	New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c:
	New test.
Ju-Zhe Zhong 2023-07-12 17:38:49 +08:00, committed by Pan Li
commit f048af2aa3 (parent 15939bae35)
102 changed files with 4987 additions and 62 deletions

gcc/config/riscv/autovec.md

@@ -57,6 +57,262 @@
}
)
;; =========================================================================
;; == Gather Load
;; =========================================================================
(define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
[(match_operand:VNX1_QHSD 0 "register_operand")
(match_operand 1 "pmode_reg_or_0_operand")
(match_operand:VNX1_QHSDI 2 "register_operand")
(match_operand 3 "<VNX1_QHSD:gs_extension>")
(match_operand 4 "<VNX1_QHSD:gs_scale>")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VNX1_QHSD:VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, true);
DONE;
})
(define_expand "len_mask_gather_load<VNX2_QHSD:mode><VNX2_QHSDI:mode>"
[(match_operand:VNX2_QHSD 0 "register_operand")
(match_operand 1 "pmode_reg_or_0_operand")
(match_operand:VNX2_QHSDI 2 "register_operand")
(match_operand 3 "<VNX2_QHSD:gs_extension>")
(match_operand 4 "<VNX2_QHSD:gs_scale>")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VNX2_QHSD:VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, true);
DONE;
})
(define_expand "len_mask_gather_load<VNX4_QHSD:mode><VNX4_QHSDI:mode>"
[(match_operand:VNX4_QHSD 0 "register_operand")
(match_operand 1 "pmode_reg_or_0_operand")
(match_operand:VNX4_QHSDI 2 "register_operand")
(match_operand 3 "<VNX4_QHSD:gs_extension>")
(match_operand 4 "<VNX4_QHSD:gs_scale>")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VNX4_QHSD:VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, true);
DONE;
})
(define_expand "len_mask_gather_load<VNX8_QHSD:mode><VNX8_QHSDI:mode>"
[(match_operand:VNX8_QHSD 0 "register_operand")
(match_operand 1 "pmode_reg_or_0_operand")
(match_operand:VNX8_QHSDI 2 "register_operand")
(match_operand 3 "<VNX8_QHSD:gs_extension>")
(match_operand 4 "<VNX8_QHSD:gs_scale>")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VNX8_QHSD:VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, true);
DONE;
})
(define_expand "len_mask_gather_load<VNX16_QHSD:mode><VNX16_QHSDI:mode>"
[(match_operand:VNX16_QHSD 0 "register_operand")
(match_operand 1 "pmode_reg_or_0_operand")
(match_operand:VNX16_QHSDI 2 "register_operand")
(match_operand 3 "<VNX16_QHSD:gs_extension>")
(match_operand 4 "<VNX16_QHSD:gs_scale>")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VNX16_QHSD:VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, true);
DONE;
})
(define_expand "len_mask_gather_load<VNX32_QHS:mode><VNX32_QHSI:mode>"
[(match_operand:VNX32_QHS 0 "register_operand")
(match_operand 1 "pmode_reg_or_0_operand")
(match_operand:VNX32_QHSI 2 "register_operand")
(match_operand 3 "<VNX32_QHS:gs_extension>")
(match_operand 4 "<VNX32_QHS:gs_scale>")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VNX32_QHS:VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, true);
DONE;
})
(define_expand "len_mask_gather_load<VNX64_QH:mode><VNX64_QHI:mode>"
[(match_operand:VNX64_QH 0 "register_operand")
(match_operand 1 "pmode_reg_or_0_operand")
(match_operand:VNX64_QHI 2 "register_operand")
(match_operand 3 "<VNX64_QH:gs_extension>")
(match_operand 4 "<VNX64_QH:gs_scale>")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VNX64_QH:VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, true);
DONE;
})
;; When SEW = 8 and LMUL = 8, we can't find any index mode with
;; a larger SEW.  Since RVV indexed load/store zero-extend offsets
;; implicitly and do not support scaling, we should only allow
;; operands[3] and operands[4] to be const_1_operand.
(define_expand "len_mask_gather_load<mode><mode>"
[(match_operand:VNX128_Q 0 "register_operand")
(match_operand 1 "pmode_reg_or_0_operand")
(match_operand:VNX128_Q 2 "register_operand")
(match_operand 3 "const_1_operand")
(match_operand 4 "const_1_operand")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, true);
DONE;
})
;; =========================================================================
;; == Scatter Store
;; =========================================================================
(define_expand "len_mask_scatter_store<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
[(match_operand 0 "pmode_reg_or_0_operand")
(match_operand:VNX1_QHSDI 1 "register_operand")
(match_operand 2 "<VNX1_QHSD:gs_extension>")
(match_operand 3 "<VNX1_QHSD:gs_scale>")
(match_operand:VNX1_QHSD 4 "register_operand")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VNX1_QHSD:VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, false);
DONE;
})
(define_expand "len_mask_scatter_store<VNX2_QHSD:mode><VNX2_QHSDI:mode>"
[(match_operand 0 "pmode_reg_or_0_operand")
(match_operand:VNX2_QHSDI 1 "register_operand")
(match_operand 2 "<VNX2_QHSD:gs_extension>")
(match_operand 3 "<VNX2_QHSD:gs_scale>")
(match_operand:VNX2_QHSD 4 "register_operand")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VNX2_QHSD:VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, false);
DONE;
})
(define_expand "len_mask_scatter_store<VNX4_QHSD:mode><VNX4_QHSDI:mode>"
[(match_operand 0 "pmode_reg_or_0_operand")
(match_operand:VNX4_QHSDI 1 "register_operand")
(match_operand 2 "<VNX4_QHSD:gs_extension>")
(match_operand 3 "<VNX4_QHSD:gs_scale>")
(match_operand:VNX4_QHSD 4 "register_operand")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VNX4_QHSD:VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, false);
DONE;
})
(define_expand "len_mask_scatter_store<VNX8_QHSD:mode><VNX8_QHSDI:mode>"
[(match_operand 0 "pmode_reg_or_0_operand")
(match_operand:VNX8_QHSDI 1 "register_operand")
(match_operand 2 "<VNX8_QHSD:gs_extension>")
(match_operand 3 "<VNX8_QHSD:gs_scale>")
(match_operand:VNX8_QHSD 4 "register_operand")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VNX8_QHSD:VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, false);
DONE;
})
(define_expand "len_mask_scatter_store<VNX16_QHSD:mode><VNX16_QHSDI:mode>"
[(match_operand 0 "pmode_reg_or_0_operand")
(match_operand:VNX16_QHSDI 1 "register_operand")
(match_operand 2 "<VNX16_QHSD:gs_extension>")
(match_operand 3 "<VNX16_QHSD:gs_scale>")
(match_operand:VNX16_QHSD 4 "register_operand")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VNX16_QHSD:VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, false);
DONE;
})
(define_expand "len_mask_scatter_store<VNX32_QHS:mode><VNX32_QHSI:mode>"
[(match_operand 0 "pmode_reg_or_0_operand")
(match_operand:VNX32_QHSI 1 "register_operand")
(match_operand 2 "<VNX32_QHS:gs_extension>")
(match_operand 3 "<VNX32_QHS:gs_scale>")
(match_operand:VNX32_QHS 4 "register_operand")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VNX32_QHS:VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, false);
DONE;
})
(define_expand "len_mask_scatter_store<VNX64_QH:mode><VNX64_QHI:mode>"
[(match_operand 0 "pmode_reg_or_0_operand")
(match_operand:VNX64_QHI 1 "register_operand")
(match_operand 2 "<VNX64_QH:gs_extension>")
(match_operand 3 "<VNX64_QH:gs_scale>")
(match_operand:VNX64_QH 4 "register_operand")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VNX64_QH:VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, false);
DONE;
})
;; When SEW = 8 and LMUL = 8, we can't find any index mode with
;; a larger SEW.  Since RVV indexed load/store zero-extend offsets
;; implicitly and do not support scaling, we should only allow
;; operands[3] and operands[4] to be const_1_operand.
(define_expand "len_mask_scatter_store<mode><mode>"
[(match_operand 0 "pmode_reg_or_0_operand")
(match_operand:VNX128_Q 1 "register_operand")
(match_operand 2 "const_1_operand")
(match_operand 3 "const_1_operand")
(match_operand:VNX128_Q 4 "register_operand")
(match_operand 5 "autovec_length_operand")
(match_operand 6 "const_0_operand")
(match_operand:<VM> 7 "vector_mask_operand")]
"TARGET_VECTOR"
{
riscv_vector::expand_gather_scatter (operands, false);
DONE;
})
;; =========================================================================
;; == Vector creation
;; =========================================================================

gcc/config/riscv/predicates.md

@@ -61,6 +61,10 @@
(and (match_code "const_int,const_wide_int,const_vector")
(match_test "op == CONST0_RTX (GET_MODE (op))")))
(define_predicate "const_1_operand"
(and (match_code "const_int,const_wide_int,const_vector")
(match_test "op == CONST1_RTX (GET_MODE (op))")))
(define_predicate "reg_or_0_operand"
(ior (match_operand 0 "const_0_operand")
(match_operand 0 "register_operand")))
@@ -341,6 +345,33 @@
(ior (match_operand 0 "register_operand")
(match_code "const_vector")))
(define_predicate "vector_gs_scale_operand_16"
(and (match_code "const_int")
(match_test "INTVAL (op) == 1 || INTVAL (op) == 2")))
(define_predicate "vector_gs_scale_operand_32"
(and (match_code "const_int")
(match_test "INTVAL (op) == 1 || INTVAL (op) == 4")))
(define_predicate "vector_gs_scale_operand_64"
(and (match_code "const_int")
(match_test "INTVAL (op) == 1 || (INTVAL (op) == 8 && Pmode == DImode)")))
(define_predicate "vector_gs_extension_operand"
(ior (match_operand 0 "const_1_operand")
(and (match_operand 0 "const_0_operand")
(match_test "Pmode == SImode"))))
(define_predicate "vector_gs_scale_operand_16_rv32"
(and (match_code "const_int")
(match_test "INTVAL (op) == 1
|| (INTVAL (op) == 2 && Pmode == SImode)")))
(define_predicate "vector_gs_scale_operand_32_rv32"
(and (match_code "const_int")
(match_test "INTVAL (op) == 1
|| (INTVAL (op) == 4 && Pmode == SImode)")))
(define_predicate "ltge_operator"
(match_code "lt,ltu,ge,geu"))
@@ -376,7 +407,7 @@
|| rtx_equal_p (op, CONST0_RTX (GET_MODE (op))))
&& maybe_gt (GET_MODE_BITSIZE (GET_MODE (op)), GET_MODE_BITSIZE (Pmode)))")
(ior (match_test "rtx_equal_p (op, CONST0_RTX (GET_MODE (op)))")
(ior (match_operand 0 "const_int_operand")
(ior (match_code "const_int,const_poly_int")
(ior (match_operand 0 "register_operand")
(match_test "satisfies_constraint_Wdm (op)"))))))

gcc/config/riscv/riscv-protos.h

@@ -195,6 +195,8 @@ enum insn_type
RVV_SCALAR_MOV_OP = 4, /* +1 for VUNDEF according to vector.md. */
RVV_SLIDE_OP = 4, /* Dest, VUNDEF, source and offset. */
RVV_COMPRESS_OP = 4,
RVV_GATHER_M_OP = 5,
RVV_SCATTER_M_OP = 4,
};
enum vlmul_type
{
@@ -303,6 +305,7 @@ void expand_vec_init (rtx, rtx);
void expand_vec_perm (rtx, rtx, rtx, rtx);
void expand_select_vl (rtx *);
void expand_load_store (rtx *, bool);
void expand_gather_scatter (rtx *, bool);
/* Rounding mode bitfield for fixed point VXRM. */
enum fixed_point_rounding_mode

gcc/config/riscv/riscv-v.cc

@@ -556,16 +556,23 @@ const_vec_all_in_range_p (rtx vec, poly_int64 minval, poly_int64 maxval)
return true;
}
/* Return a const_int vector of VAL.
This function also exists in aarch64, we may unify it in middle-end in the
future. */
/* Return a const vector of VAL. The VAL can be either const_int or
const_poly_int. */
static rtx
gen_const_vector_dup (machine_mode mode, poly_int64 val)
{
rtx c = gen_int_mode (val, GET_MODE_INNER (mode));
return gen_const_vec_duplicate (mode, c);
scalar_mode smode = GET_MODE_INNER (mode);
rtx c = gen_int_mode (val, smode);
if (!val.is_constant () && GET_MODE_SIZE (smode) > GET_MODE_SIZE (Pmode))
{
/* When VAL is const_poly_int value, we need to explicitly broadcast
it into a vector using RVV broadcast instruction. */
rtx dup = gen_reg_rtx (mode);
emit_insn (gen_vec_duplicate (mode, dup, c));
return dup;
}
return gen_const_vec_duplicate (mode, c);
}
/* Emit a vlmax vsetvl instruction. This should only be used when
@@ -901,6 +908,39 @@ emit_nonvlmax_masked_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
e.emit_insn ((enum insn_code) icode, ops);
}
/* This function emits a VLMAX masked store instruction. */
static void
emit_vlmax_masked_store_insn (unsigned icode, int op_num, rtx *ops)
{
machine_mode dest_mode = GET_MODE (ops[0]);
machine_mode mask_mode = get_mask_mode (dest_mode).require ();
insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
/*HAS_DEST_P*/ false,
/*FULLY_UNMASKED_P*/ false,
/*USE_REAL_MERGE_P*/ true,
/*HAS_AVL_P*/ true,
/*VLMAX_P*/ true, dest_mode,
mask_mode);
e.emit_insn ((enum insn_code) icode, ops);
}
/* This function emits a non-VLMAX masked store instruction. */
static void
emit_nonvlmax_masked_store_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
{
machine_mode dest_mode = GET_MODE (ops[0]);
machine_mode mask_mode = get_mask_mode (dest_mode).require ();
insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
/*HAS_DEST_P*/ false,
/*FULLY_UNMASKED_P*/ false,
/*USE_REAL_MERGE_P*/ true,
/*HAS_AVL_P*/ true,
/*VLMAX_P*/ false, dest_mode,
mask_mode);
e.set_vl (avl);
e.emit_insn ((enum insn_code) icode, ops);
}
/* This function emits a masked instruction. */
void
emit_vlmax_masked_mu_insn (unsigned icode, int op_num, rtx *ops)
@@ -1194,7 +1234,6 @@ static void
expand_const_vector (rtx target, rtx src)
{
machine_mode mode = GET_MODE (target);
scalar_mode elt_mode = GET_MODE_INNER (mode);
if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
{
rtx elt;
@@ -1219,7 +1258,6 @@ expand_const_vector (rtx target, rtx src)
}
else
{
elt = force_reg (elt_mode, elt);
rtx ops[] = {tmp, elt};
emit_vlmax_insn (code_for_pred_broadcast (mode), RVV_UNOP, ops);
}
@@ -2488,6 +2526,25 @@ expand_vec_cmp_float (rtx target, rtx_code code, rtx op0, rtx op1,
return false;
}
/* Modulo all SEL indices to ensure they are all in range of [0, MAX_SEL]. */
static rtx
modulo_sel_indices (rtx sel, poly_uint64 max_sel)
{
rtx sel_mod;
machine_mode sel_mode = GET_MODE (sel);
poly_uint64 nunits = GET_MODE_NUNITS (sel_mode);
/* If SEL is variable-length CONST_VECTOR, we don't need to modulo it. */
if (!nunits.is_constant () && CONST_VECTOR_P (sel))
sel_mod = sel;
else
{
rtx mod = gen_const_vector_dup (sel_mode, max_sel);
sel_mod
= expand_simple_binop (sel_mode, AND, sel, mod, NULL, 0, OPTAB_DIRECT);
}
return sel_mod;
}
/* Implement vec_perm<mode>. */
void
@@ -2501,41 +2558,44 @@ expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
index is in range of [0, nunits - 1]. A single vrgather instructions is
enough. Since we will use vrgatherei16.vv for variable-length vector,
it is never out of range and we don't need to modulo the index. */
if (!nunits.is_constant () || const_vec_all_in_range_p (sel, 0, nunits - 1))
if (nunits.is_constant () && const_vec_all_in_range_p (sel, 0, nunits - 1))
{
emit_vlmax_gather_insn (target, op0, sel);
return;
}
/* Check if the two values vectors are the same. */
if (rtx_equal_p (op0, op1) || const_vec_duplicate_p (sel))
/* Check if all the indices are the same. */
rtx elt;
if (const_vec_duplicate_p (sel, &elt))
{
/* Note: vec_perm indices are supposed to wrap when they go beyond the
size of the two value vectors, i.e. the upper bits of the indices
are effectively ignored. RVV vrgather instead produces 0 for any
out-of-range indices, so we need to modulo all the vec_perm indices
to ensure they are all in range of [0, nunits - 1]. */
rtx max_sel = gen_const_vector_dup (sel_mode, nunits - 1);
rtx sel_mod = expand_simple_binop (sel_mode, AND, sel, max_sel, NULL, 0,
OPTAB_DIRECT);
emit_vlmax_gather_insn (target, op1, sel_mod);
poly_uint64 value = rtx_to_poly_int64 (elt);
rtx op = op0;
if (maybe_gt (value, nunits - 1))
{
sel = gen_const_vector_dup (sel_mode, value - nunits);
op = op1;
}
emit_vlmax_gather_insn (target, op, sel);
}
/* Note: vec_perm indices are supposed to wrap when they go beyond the
size of the two value vectors, i.e. the upper bits of the indices
are effectively ignored. RVV vrgather instead produces 0 for any
out-of-range indices, so we need to modulo all the vec_perm indices
to ensure they are all in range of [0, nunits - 1] when op0 == op1
or all in range of [0, 2 * nunits - 1] when op0 != op1. */
rtx sel_mod
= modulo_sel_indices (sel,
rtx_equal_p (op0, op1) ? nunits - 1 : 2 * nunits - 1);
/* Check if the two value vectors are the same. */
if (rtx_equal_p (op0, op1))
{
emit_vlmax_gather_insn (target, op0, sel_mod);
return;
}
rtx sel_mod = sel;
rtx max_sel = gen_const_vector_dup (sel_mode, 2 * nunits - 1);
/* We don't need to modulo the indices for a VLA vector,
since we guarantee beforehand that they aren't out of range. */
if (nunits.is_constant ())
{
/* Note: vec_perm indices are supposed to wrap when they go beyond the
size of the two value vectors, i.e. the upper bits of the indices
are effectively ignored. RVV vrgather instead produces 0 for any
out-of-range indices, so we need to modulo all the vec_perm indices
to ensure they are all in range of [0, 2 * nunits - 1]. */
sel_mod = expand_simple_binop (sel_mode, AND, sel, max_sel, NULL, 0,
OPTAB_DIRECT);
}
/* This following sequence is handling the case that:
__builtin_shufflevector (vec1, vec2, index...), the index can be any
@@ -3007,6 +3067,7 @@ expand_load_store (rtx *ops, bool is_load)
}
}
/* Return true if the operation is a floating-point operation that needs FRM. */
static bool
needs_fp_rounding (rtx_code code, machine_mode mode)
@@ -3047,4 +3108,163 @@ expand_cond_len_binop (rtx_code code, rtx *ops)
gcc_unreachable ();
}
/* Prepare insn_code for gather_load/scatter_store according to
the vector mode and index mode. */
static insn_code
prepare_gather_scatter (machine_mode vec_mode, machine_mode idx_mode,
bool is_load)
{
if (!is_load)
return code_for_pred_indexed_store (UNSPEC_UNORDERED, vec_mode, idx_mode);
else
{
unsigned src_eew_bitsize = GET_MODE_BITSIZE (GET_MODE_INNER (idx_mode));
unsigned dst_eew_bitsize = GET_MODE_BITSIZE (GET_MODE_INNER (vec_mode));
if (dst_eew_bitsize == src_eew_bitsize)
return code_for_pred_indexed_load_same_eew (UNSPEC_UNORDERED, vec_mode);
else if (dst_eew_bitsize > src_eew_bitsize)
{
unsigned factor = dst_eew_bitsize / src_eew_bitsize;
switch (factor)
{
case 2:
return code_for_pred_indexed_load_x2_greater_eew (
UNSPEC_UNORDERED, vec_mode);
case 4:
return code_for_pred_indexed_load_x4_greater_eew (
UNSPEC_UNORDERED, vec_mode);
case 8:
return code_for_pred_indexed_load_x8_greater_eew (
UNSPEC_UNORDERED, vec_mode);
default:
gcc_unreachable ();
}
}
else
{
unsigned factor = src_eew_bitsize / dst_eew_bitsize;
switch (factor)
{
case 2:
return code_for_pred_indexed_load_x2_smaller_eew (
UNSPEC_UNORDERED, vec_mode);
case 4:
return code_for_pred_indexed_load_x4_smaller_eew (
UNSPEC_UNORDERED, vec_mode);
case 8:
return code_for_pred_indexed_load_x8_smaller_eew (
UNSPEC_UNORDERED, vec_mode);
default:
gcc_unreachable ();
}
}
}
}

/* Expand LEN_MASK_{GATHER_LOAD,SCATTER_STORE}.  */
void
expand_gather_scatter (rtx *ops, bool is_load)
{
  rtx ptr, vec_offset, vec_reg, len, mask;
  bool zero_extend_p;
  int scale_log2;
  if (is_load)
    {
      vec_reg = ops[0];
      ptr = ops[1];
      vec_offset = ops[2];
      zero_extend_p = INTVAL (ops[3]);
      scale_log2 = exact_log2 (INTVAL (ops[4]));
      len = ops[5];
      mask = ops[7];
    }
  else
    {
      vec_reg = ops[4];
      ptr = ops[0];
      vec_offset = ops[1];
      zero_extend_p = INTVAL (ops[2]);
      scale_log2 = exact_log2 (INTVAL (ops[3]));
      len = ops[5];
      mask = ops[7];
    }
  machine_mode vec_mode = GET_MODE (vec_reg);
  machine_mode idx_mode = GET_MODE (vec_offset);
  scalar_mode inner_vec_mode = GET_MODE_INNER (vec_mode);
  scalar_mode inner_idx_mode = GET_MODE_INNER (idx_mode);
  unsigned inner_vsize = GET_MODE_BITSIZE (inner_vec_mode);
  unsigned inner_offsize = GET_MODE_BITSIZE (inner_idx_mode);
  poly_int64 nunits = GET_MODE_NUNITS (vec_mode);
  poly_int64 value;
  bool is_vlmax = poly_int_rtx_p (len, &value) && known_eq (value, nunits);
  if (inner_offsize < inner_vsize)
    {
      /* 7.2. Vector Load/Store Addressing Modes.
         If the vector offset elements are narrower than XLEN, they are
         zero-extended to XLEN before adding to the ptr effective address.  If
         the vector offset elements are wider than XLEN, the least-significant
         XLEN bits are used in the address calculation.  An implementation must
         raise an illegal instruction exception if the EEW is not supported for
         offset elements.

         The RVV spec only covers the scale_log2 == 0 case.  */
      if (!zero_extend_p || (zero_extend_p && scale_log2 != 0))
        {
          if (zero_extend_p)
            inner_idx_mode
              = int_mode_for_size (inner_offsize * 2, 0).require ();
          else
            inner_idx_mode = int_mode_for_size (BITS_PER_WORD, 0).require ();
          machine_mode new_idx_mode
            = get_vector_mode (inner_idx_mode, nunits).require ();
          rtx tmp = gen_reg_rtx (new_idx_mode);
          emit_insn (gen_extend_insn (tmp, vec_offset, new_idx_mode, idx_mode,
                                      zero_extend_p ? true : false));
          vec_offset = tmp;
          idx_mode = new_idx_mode;
        }
    }
  if (scale_log2 != 0)
    {
      rtx tmp = expand_binop (idx_mode, ashl_optab, vec_offset,
                              gen_int_mode (scale_log2, Pmode), NULL_RTX, 0,
                              OPTAB_DIRECT);
      vec_offset = tmp;
    }
  insn_code icode = prepare_gather_scatter (vec_mode, idx_mode, is_load);
  if (is_vlmax)
    {
      if (is_load)
        {
          rtx load_ops[]
            = {vec_reg, mask, RVV_VUNDEF (vec_mode), ptr, vec_offset};
          emit_vlmax_masked_insn (icode, RVV_GATHER_M_OP, load_ops);
        }
      else
        {
          rtx store_ops[] = {mask, ptr, vec_offset, vec_reg};
          emit_vlmax_masked_store_insn (icode, RVV_SCATTER_M_OP, store_ops);
        }
    }
  else
    {
      if (is_load)
        {
          rtx load_ops[]
            = {vec_reg, mask, RVV_VUNDEF (vec_mode), ptr, vec_offset};
          emit_nonvlmax_masked_insn (icode, RVV_GATHER_M_OP, load_ops, len);
        }
      else
        {
          rtx store_ops[] = {mask, ptr, vec_offset, vec_reg};
          emit_nonvlmax_masked_store_insn (icode, RVV_SCATTER_M_OP, store_ops,
                                           len);
        }
    }
}
} // namespace riscv_vector
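For readers unfamiliar with the LEN_MASK_GATHER_LOAD semantics this expander lowers, a scalar reference model may help. The sketch below is hypothetical illustration code, not part of the patch: `ref_len_mask_gather_load` and its parameter layout are invented. It mirrors the key steps above: the offset element is implicitly widened to the address width, shifted left by log2(scale) to form a byte offset, and only elements that are both below the length and active in the mask are loaded; all other destination elements are left untouched in this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar reference for LEN_MASK_GATHER_LOAD on int32_t data
   with uint16_t offsets.  Offsets are zero-extended (the unsigned-offset
   case), scaled by 1 << scale_log2 into byte offsets, and only the first
   LEN elements with an active mask bit are written.  */
static void
ref_len_mask_gather_load (int32_t *dest, const int32_t *base,
                          const uint16_t *offsets, int scale_log2,
                          const uint8_t *mask, int len, int nunits)
{
  for (int i = 0; i < nunits; i++)
    if (i < len && mask[i])
      {
        /* uint16_t offset is implicitly zero-extended to the address
           width, then scaled to a byte offset, as in section 7.2 of
           the RVV spec.  */
        uintptr_t byte_off = (uintptr_t) offsets[i] << scale_log2;
        memcpy (&dest[i], (const char *) base + byte_off, sizeof dest[i]);
      }
}
```

With scale_log2 == 2 (4-byte elements), an offset of 3 selects `base[3]`; masked-off elements and elements past the length keep their previous destination values.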

@@ -2037,7 +2037,14 @@ riscv_legitimize_poly_move (machine_mode mode, rtx dest, rtx tmp, rtx src)
(m, n) = base * magn + constant.
This calculation doesn't need div operation. */
emit_move_insn (tmp, gen_int_mode (BYTES_PER_RISCV_VECTOR, mode));
if (known_le (GET_MODE_SIZE (mode), GET_MODE_SIZE (Pmode)))
emit_move_insn (tmp, gen_int_mode (BYTES_PER_RISCV_VECTOR, mode));
else
{
emit_move_insn (gen_highpart (Pmode, tmp), CONST0_RTX (Pmode));
emit_move_insn (gen_lowpart (Pmode, tmp),
gen_int_mode (BYTES_PER_RISCV_VECTOR, Pmode));
}
if (BYTES_PER_RISCV_VECTOR.is_constant ())
{
@@ -2144,7 +2151,7 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx src)
return false;
}
if (satisfies_constraint_vp (src))
if (satisfies_constraint_vp (src) && GET_MODE (src) == Pmode)
return false;
if (GET_MODE_SIZE (mode).to_constant () < GET_MODE_SIZE (Pmode))

@@ -115,6 +115,9 @@
(define_mode_iterator VEEWEXT2 [
(VNx1HI "TARGET_MIN_VLEN < 128") VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN > 32") (VNx64HI "TARGET_MIN_VLEN >= 128")
(VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_VECTOR_ELEN_FP_16") (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
(VNx8HF "TARGET_VECTOR_ELEN_FP_16") (VNx16HF "TARGET_VECTOR_ELEN_FP_16") (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
(VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
(VNx1SI "TARGET_MIN_VLEN < 128") VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128")
(VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128") (VNx2DI "TARGET_VECTOR_ELEN_64")
(VNx4DI "TARGET_VECTOR_ELEN_64") (VNx8DI "TARGET_VECTOR_ELEN_64") (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
@@ -161,6 +164,8 @@
(define_mode_iterator VEEWTRUNC2 [
(VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI (VNx64QI "TARGET_MIN_VLEN >= 128")
(VNx1HI "TARGET_MIN_VLEN < 128") VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN >= 128")
(VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_VECTOR_ELEN_FP_16") (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
(VNx8HF "TARGET_VECTOR_ELEN_FP_16") (VNx16HF "TARGET_VECTOR_ELEN_FP_16") (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
(VNx1SI "TARGET_MIN_VLEN < 128") VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN >= 128")
(VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
(VNx2SF "TARGET_VECTOR_ELEN_FP_32")
@@ -172,6 +177,8 @@
(define_mode_iterator VEEWTRUNC4 [
(VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI VNx4QI VNx8QI VNx16QI (VNx32QI "TARGET_MIN_VLEN >= 128")
(VNx1HI "TARGET_MIN_VLEN < 128") VNx2HI VNx4HI VNx8HI (VNx16HI "TARGET_MIN_VLEN >= 128")
(VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_VECTOR_ELEN_FP_16") (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
(VNx8HF "TARGET_VECTOR_ELEN_FP_16") (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
])
(define_mode_iterator VEEWTRUNC8 [
@@ -362,46 +369,67 @@
])
(define_mode_iterator VNX1_QHSD [
(VNx1QI "TARGET_MIN_VLEN < 128") (VNx1HI "TARGET_MIN_VLEN < 128") (VNx1SI "TARGET_MIN_VLEN < 128")
(VNx1QI "TARGET_MIN_VLEN < 128")
(VNx1HI "TARGET_MIN_VLEN < 128")
(VNx1SI "TARGET_MIN_VLEN < 128")
(VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
(VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
(VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
(VNx1DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN < 128")
])
(define_mode_iterator VNX2_QHSD [
VNx2QI VNx2HI VNx2SI
VNx2QI
VNx2HI
VNx2SI
(VNx2DI "TARGET_VECTOR_ELEN_64")
(VNx2HF "TARGET_VECTOR_ELEN_FP_16")
(VNx2SF "TARGET_VECTOR_ELEN_FP_32")
(VNx2DF "TARGET_VECTOR_ELEN_FP_64")
])
(define_mode_iterator VNX4_QHSD [
VNx4QI VNx4HI VNx4SI
VNx4QI
VNx4HI
VNx4SI
(VNx4DI "TARGET_VECTOR_ELEN_64")
(VNx4HF "TARGET_VECTOR_ELEN_FP_16")
(VNx4SF "TARGET_VECTOR_ELEN_FP_32")
(VNx4DF "TARGET_VECTOR_ELEN_FP_64")
])
(define_mode_iterator VNX8_QHSD [
VNx8QI VNx8HI VNx8SI
VNx8QI
VNx8HI
VNx8SI
(VNx8DI "TARGET_VECTOR_ELEN_64")
(VNx8HF "TARGET_VECTOR_ELEN_FP_16")
(VNx8SF "TARGET_VECTOR_ELEN_FP_32")
(VNx8DF "TARGET_VECTOR_ELEN_FP_64")
])
(define_mode_iterator VNX16_QHS [
VNx16QI VNx16HI (VNx16SI "TARGET_MIN_VLEN > 32")
(define_mode_iterator VNX16_QHSD [
VNx16QI
VNx16HI
(VNx16SI "TARGET_MIN_VLEN > 32")
(VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
(VNx16HF "TARGET_VECTOR_ELEN_FP_16")
(VNx16SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
(VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128") (VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
(VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
])
(define_mode_iterator VNX32_QHS [
VNx32QI (VNx32HI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128") (VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
VNx32QI
(VNx32HI "TARGET_MIN_VLEN > 32")
(VNx32SI "TARGET_MIN_VLEN >= 128")
(VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
(VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
])
(define_mode_iterator VNX64_QH [
(VNx64QI "TARGET_MIN_VLEN > 32")
(VNx64HI "TARGET_MIN_VLEN >= 128")
(VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
])
(define_mode_iterator VNX128_Q [
@@ -409,35 +437,49 @@
])
(define_mode_iterator VNX1_QHSDI [
(VNx1QI "TARGET_MIN_VLEN < 128") (VNx1HI "TARGET_MIN_VLEN < 128") (VNx1SI "TARGET_MIN_VLEN < 128")
(VNx1DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
(VNx1QI "TARGET_MIN_VLEN < 128")
(VNx1HI "TARGET_MIN_VLEN < 128")
(VNx1SI "TARGET_MIN_VLEN < 128")
(VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128 && TARGET_64BIT")
])
(define_mode_iterator VNX2_QHSDI [
VNx2QI VNx2HI VNx2SI
(VNx2DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64")
VNx2QI
VNx2HI
VNx2SI
(VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
])
(define_mode_iterator VNX4_QHSDI [
VNx4QI VNx4HI VNx4SI
(VNx4DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64")
VNx4QI
VNx4HI
VNx4SI
(VNx4DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
])
(define_mode_iterator VNX8_QHSDI [
VNx8QI VNx8HI VNx8SI
(VNx8DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64")
VNx8QI
VNx8HI
VNx8SI
(VNx8DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
])
(define_mode_iterator VNX16_QHSDI [
VNx16QI VNx16HI (VNx16SI "TARGET_MIN_VLEN > 32") (VNx16DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
VNx16QI
VNx16HI
(VNx16SI "TARGET_MIN_VLEN > 32")
(VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128 && TARGET_64BIT")
])
(define_mode_iterator VNX32_QHSI [
VNx32QI (VNx32HI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128")
VNx32QI
(VNx32HI "TARGET_MIN_VLEN > 32")
(VNx32SI "TARGET_MIN_VLEN >= 128")
])
(define_mode_iterator VNX64_QHI [
VNx64QI (VNx64HI "TARGET_MIN_VLEN >= 128")
(VNx64QI "TARGET_MIN_VLEN > 32")
(VNx64HI "TARGET_MIN_VLEN >= 128")
])
(define_mode_iterator V_WHOLE [
@@ -1393,6 +1435,8 @@
(define_mode_attr VINDEX_DOUBLE_TRUNC [
(VNx1HI "VNx1QI") (VNx2HI "VNx2QI") (VNx4HI "VNx4QI") (VNx8HI "VNx8QI")
(VNx16HI "VNx16QI") (VNx32HI "VNx32QI") (VNx64HI "VNx64QI")
(VNx1HF "VNx1QI") (VNx2HF "VNx2QI") (VNx4HF "VNx4QI") (VNx8HF "VNx8QI")
(VNx16HF "VNx16QI") (VNx32HF "VNx32QI") (VNx64HF "VNx64QI")
(VNx1SI "VNx1HI") (VNx2SI "VNx2HI") (VNx4SI "VNx4HI") (VNx8SI "VNx8HI")
(VNx16SI "VNx16HI") (VNx32SI "VNx32HI")
(VNx1SF "VNx1HI") (VNx2SF "VNx2HI") (VNx4SF "VNx4HI") (VNx8SF "VNx8HI")
@@ -1420,6 +1464,7 @@
(define_mode_attr VINDEX_DOUBLE_EXT [
(VNx1QI "VNx1HI") (VNx2QI "VNx2HI") (VNx4QI "VNx4HI") (VNx8QI "VNx8HI") (VNx16QI "VNx16HI") (VNx32QI "VNx32HI") (VNx64QI "VNx64HI")
(VNx1HI "VNx1SI") (VNx2HI "VNx2SI") (VNx4HI "VNx4SI") (VNx8HI "VNx8SI") (VNx16HI "VNx16SI") (VNx32HI "VNx32SI")
(VNx1HF "VNx1SI") (VNx2HF "VNx2SI") (VNx4HF "VNx4SI") (VNx8HF "VNx8SI") (VNx16HF "VNx16SI") (VNx32HF "VNx32SI")
(VNx1SI "VNx1DI") (VNx2SI "VNx2DI") (VNx4SI "VNx4DI") (VNx8SI "VNx8DI") (VNx16SI "VNx16DI")
(VNx1SF "VNx1DI") (VNx2SF "VNx2DI") (VNx4SF "VNx4DI") (VNx8SF "VNx8DI") (VNx16SF "VNx16DI")
])
@@ -1427,6 +1472,7 @@
(define_mode_attr VINDEX_QUAD_EXT [
(VNx1QI "VNx1SI") (VNx2QI "VNx2SI") (VNx4QI "VNx4SI") (VNx8QI "VNx8SI") (VNx16QI "VNx16SI") (VNx32QI "VNx32SI")
(VNx1HI "VNx1DI") (VNx2HI "VNx2DI") (VNx4HI "VNx4DI") (VNx8HI "VNx8DI") (VNx16HI "VNx16DI")
(VNx1HF "VNx1DI") (VNx2HF "VNx2DI") (VNx4HF "VNx4DI") (VNx8HF "VNx8DI") (VNx16HF "VNx16DI")
])
(define_mode_attr VINDEX_OCT_EXT [
@@ -1471,6 +1517,40 @@
(VNx4DI "VNx8BI") (VNx8DI "VNx16BI") (VNx16DI "VNx32BI")
])
(define_mode_attr gs_extension [
(VNx1QI "immediate_operand") (VNx2QI "immediate_operand") (VNx4QI "immediate_operand") (VNx8QI "immediate_operand") (VNx16QI "immediate_operand")
(VNx32QI "vector_gs_extension_operand") (VNx64QI "const_1_operand")
(VNx1HI "immediate_operand") (VNx2HI "immediate_operand") (VNx4HI "immediate_operand") (VNx8HI "immediate_operand") (VNx16HI "immediate_operand")
(VNx32HI "vector_gs_extension_operand") (VNx64HI "const_1_operand")
(VNx1SI "immediate_operand") (VNx2SI "immediate_operand") (VNx4SI "immediate_operand") (VNx8SI "immediate_operand") (VNx16SI "immediate_operand")
(VNx32SI "vector_gs_extension_operand")
(VNx1DI "immediate_operand") (VNx2DI "immediate_operand") (VNx4DI "immediate_operand") (VNx8DI "immediate_operand") (VNx16DI "immediate_operand")
(VNx1HF "immediate_operand") (VNx2HF "immediate_operand") (VNx4HF "immediate_operand") (VNx8HF "immediate_operand") (VNx16HF "immediate_operand")
(VNx32HF "vector_gs_extension_operand") (VNx64HF "const_1_operand")
(VNx1SF "immediate_operand") (VNx2SF "immediate_operand") (VNx4SF "immediate_operand") (VNx8SF "immediate_operand") (VNx16SF "immediate_operand")
(VNx32SF "vector_gs_extension_operand")
(VNx1DF "immediate_operand") (VNx2DF "immediate_operand") (VNx4DF "immediate_operand") (VNx8DF "immediate_operand") (VNx16DF "immediate_operand")
])
(define_mode_attr gs_scale [
(VNx1QI "const_1_operand") (VNx2QI "const_1_operand") (VNx4QI "const_1_operand") (VNx8QI "const_1_operand")
(VNx16QI "const_1_operand") (VNx32QI "const_1_operand") (VNx64QI "const_1_operand")
(VNx1HI "vector_gs_scale_operand_16") (VNx2HI "vector_gs_scale_operand_16") (VNx4HI "vector_gs_scale_operand_16") (VNx8HI "vector_gs_scale_operand_16")
(VNx16HI "vector_gs_scale_operand_16") (VNx32HI "vector_gs_scale_operand_16_rv32") (VNx64HI "const_1_operand")
(VNx1SI "vector_gs_scale_operand_32") (VNx2SI "vector_gs_scale_operand_32") (VNx4SI "vector_gs_scale_operand_32") (VNx8SI "vector_gs_scale_operand_32")
(VNx16SI "vector_gs_scale_operand_32") (VNx32SI "vector_gs_scale_operand_32_rv32")
(VNx1DI "vector_gs_scale_operand_64") (VNx2DI "vector_gs_scale_operand_64") (VNx4DI "vector_gs_scale_operand_64") (VNx8DI "vector_gs_scale_operand_64")
(VNx16DI "vector_gs_scale_operand_64")
(VNx1HF "vector_gs_scale_operand_16") (VNx2HF "vector_gs_scale_operand_16") (VNx4HF "vector_gs_scale_operand_16") (VNx8HF "vector_gs_scale_operand_16")
(VNx16HF "vector_gs_scale_operand_16") (VNx32HF "vector_gs_scale_operand_16_rv32") (VNx64HF "const_1_operand")
(VNx1SF "vector_gs_scale_operand_32") (VNx2SF "vector_gs_scale_operand_32") (VNx4SF "vector_gs_scale_operand_32") (VNx8SF "vector_gs_scale_operand_32")
(VNx16SF "vector_gs_scale_operand_32") (VNx32SF "vector_gs_scale_operand_32_rv32")
(VNx1DF "vector_gs_scale_operand_64") (VNx2DF "vector_gs_scale_operand_64") (VNx4DF "vector_gs_scale_operand_64") (VNx8DF "vector_gs_scale_operand_64")
(VNx16DF "vector_gs_scale_operand_64")
])
(define_int_iterator WREDUC [UNSPEC_WREDUC_SUM UNSPEC_WREDUC_USUM])
(define_int_iterator ORDER [UNSPEC_ORDERED UNSPEC_UNORDERED])

@@ -818,7 +818,7 @@
;; This pattern only handles duplicates of non-constant inputs.
;; Constant vectors go through the movm pattern instead.
;; So "direct_broadcast_operand" can only be mem or reg, no CONSTANT.
(define_expand "vec_duplicate<mode>"
(define_expand "@vec_duplicate<mode>"
[(set (match_operand:V 0 "register_operand")
(vec_duplicate:V
(match_operand:<VEL> 1 "direct_broadcast_operand")))]
@@ -1357,8 +1357,16 @@
}
}
else if (GET_MODE_BITSIZE (<VEL>mode) > GET_MODE_BITSIZE (Pmode)
&& immediate_operand (operands[3], Pmode))
operands[3] = gen_rtx_SIGN_EXTEND (<VEL>mode, force_reg (Pmode, operands[3]));
&& (immediate_operand (operands[3], Pmode)
|| (CONST_POLY_INT_P (operands[3])
&& known_ge (rtx_to_poly_int64 (operands[3]), 0U)
&& known_le (rtx_to_poly_int64 (operands[3]), GET_MODE_SIZE (<MODE>mode)))))
{
rtx tmp = gen_reg_rtx (Pmode);
poly_int64 value = rtx_to_poly_int64 (operands[3]);
emit_move_insn (tmp, gen_int_mode (value, Pmode));
operands[3] = gen_rtx_SIGN_EXTEND (<VEL>mode, tmp);
}
else
operands[3] = force_reg (<VEL>mode, operands[3]);
})
@@ -1387,7 +1395,8 @@
vlse<sew>.v\t%0,%3,zero
vmv.s.x\t%0,%3
vmv.s.x\t%0,%3"
"register_operand (operands[3], <VEL>mode)
"(register_operand (operands[3], <VEL>mode)
|| CONST_POLY_INT_P (operands[3]))
&& GET_MODE_BITSIZE (<VEL>mode) > GET_MODE_BITSIZE (Pmode)"
[(set (match_dup 0)
(if_then_else:VI (unspec:<VM> [(match_dup 1) (match_dup 4)
@@ -1397,6 +1406,12 @@
(match_dup 2)))]
{
gcc_assert (can_create_pseudo_p ());
if (CONST_POLY_INT_P (operands[3]))
{
rtx tmp = gen_reg_rtx (<VEL>mode);
emit_move_insn (tmp, operands[3]);
operands[3] = tmp;
}
rtx m = assign_stack_local (<VEL>mode, GET_MODE_SIZE (<VEL>mode),
GET_MODE_ALIGNMENT (<VEL>mode));
m = validize_mem (m);
@@ -1483,6 +1498,7 @@
(match_operand 5 "vector_length_operand" " rK, rK, rK")
(match_operand 6 "const_int_operand" " i, i, i")
(match_operand 7 "const_int_operand" " i, i, i")
(match_operand 8 "const_int_operand" " i, i, i")
(reg:SI VL_REGNUM)
(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
(unspec:V
@@ -1738,7 +1754,7 @@
[(set_attr "type" "vst<order>x")
(set_attr "mode" "<VNX8_QHSD:MODE>")])
(define_insn "@pred_indexed_<order>store<VNX16_QHS:mode><VNX16_QHSDI:mode>"
(define_insn "@pred_indexed_<order>store<VNX16_QHSD:mode><VNX16_QHSDI:mode>"
[(set (mem:BLK (scratch))
(unspec:BLK
[(unspec:<VM>
@@ -1749,11 +1765,11 @@
(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
(match_operand 1 "pmode_reg_or_0_operand" " rJ")
(match_operand:VNX16_QHSDI 2 "register_operand" " vr")
(match_operand:VNX16_QHS 3 "register_operand" " vr")] ORDER))]
(match_operand:VNX16_QHSD 3 "register_operand" " vr")] ORDER))]
"TARGET_VECTOR"
"vs<order>xei<VNX16_QHSDI:sew>.v\t%3,(%z1),%2%p0"
[(set_attr "type" "vst<order>x")
(set_attr "mode" "<VNX16_QHS:MODE>")])
(set_attr "mode" "<VNX16_QHSD:MODE>")])
(define_insn "@pred_indexed_<order>store<VNX32_QHS:mode><VNX32_QHSI:mode>"
[(set (mem:BLK (scratch))

@@ -0,0 +1,38 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 uint8_t
#define INDEX16 uint16_t
#define INDEX32 uint32_t
#define INDEX64 uint64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */

@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX64 int64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 64) \
T (uint8_t, 64) \
T (int16_t, 64) \
T (uint16_t, 64) \
T (_Float16, 64) \
T (int32_t, 64) \
T (uint32_t, 64) \
T (float, 64) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */

@@ -0,0 +1,32 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define TEST_LOOP(DATA_TYPE) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict *src) \
{ \
for (int i = 0; i < 128; ++i) \
dest[i] += *src[i]; \
}
#define TEST_ALL(T) \
T (int8_t) \
T (uint8_t) \
T (int16_t) \
T (uint16_t) \
T (_Float16) \
T (int32_t) \
T (uint32_t) \
T (float) \
T (int64_t) \
T (uint64_t) \
T (double)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */

@@ -0,0 +1,112 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-vect-cost-model -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define TEST_LOOP(DATA_TYPE, INDEX_TYPE) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE##_##INDEX_TYPE (DATA_TYPE *restrict y, DATA_TYPE *restrict x, \
INDEX_TYPE *restrict index) \
{ \
for (int i = 0; i < 100; ++i) \
{ \
y[i * 2] = x[index[i * 2]] + 1; \
y[i * 2 + 1] = x[index[i * 2 + 1]] + 2; \
} \
}
TEST_LOOP (int8_t, int8_t)
TEST_LOOP (uint8_t, int8_t)
TEST_LOOP (int16_t, int8_t)
TEST_LOOP (uint16_t, int8_t)
TEST_LOOP (int32_t, int8_t)
TEST_LOOP (uint32_t, int8_t)
TEST_LOOP (int64_t, int8_t)
TEST_LOOP (uint64_t, int8_t)
TEST_LOOP (_Float16, int8_t)
TEST_LOOP (float, int8_t)
TEST_LOOP (double, int8_t)
TEST_LOOP (int8_t, int16_t)
TEST_LOOP (uint8_t, int16_t)
TEST_LOOP (int16_t, int16_t)
TEST_LOOP (uint16_t, int16_t)
TEST_LOOP (int32_t, int16_t)
TEST_LOOP (uint32_t, int16_t)
TEST_LOOP (int64_t, int16_t)
TEST_LOOP (uint64_t, int16_t)
TEST_LOOP (_Float16, int16_t)
TEST_LOOP (float, int16_t)
TEST_LOOP (double, int16_t)
TEST_LOOP (int8_t, int32_t)
TEST_LOOP (uint8_t, int32_t)
TEST_LOOP (int16_t, int32_t)
TEST_LOOP (uint16_t, int32_t)
TEST_LOOP (int32_t, int32_t)
TEST_LOOP (uint32_t, int32_t)
TEST_LOOP (int64_t, int32_t)
TEST_LOOP (uint64_t, int32_t)
TEST_LOOP (_Float16, int32_t)
TEST_LOOP (float, int32_t)
TEST_LOOP (double, int32_t)
TEST_LOOP (int8_t, int64_t)
TEST_LOOP (uint8_t, int64_t)
TEST_LOOP (int16_t, int64_t)
TEST_LOOP (uint16_t, int64_t)
TEST_LOOP (int32_t, int64_t)
TEST_LOOP (uint32_t, int64_t)
TEST_LOOP (int64_t, int64_t)
TEST_LOOP (uint64_t, int64_t)
TEST_LOOP (_Float16, int64_t)
TEST_LOOP (float, int64_t)
TEST_LOOP (double, int64_t)
TEST_LOOP (int8_t, uint8_t)
TEST_LOOP (uint8_t, uint8_t)
TEST_LOOP (int16_t, uint8_t)
TEST_LOOP (uint16_t, uint8_t)
TEST_LOOP (int32_t, uint8_t)
TEST_LOOP (uint32_t, uint8_t)
TEST_LOOP (int64_t, uint8_t)
TEST_LOOP (uint64_t, uint8_t)
TEST_LOOP (_Float16, uint8_t)
TEST_LOOP (float, uint8_t)
TEST_LOOP (double, uint8_t)
TEST_LOOP (int8_t, uint16_t)
TEST_LOOP (uint8_t, uint16_t)
TEST_LOOP (int16_t, uint16_t)
TEST_LOOP (uint16_t, uint16_t)
TEST_LOOP (int32_t, uint16_t)
TEST_LOOP (uint32_t, uint16_t)
TEST_LOOP (int64_t, uint16_t)
TEST_LOOP (uint64_t, uint16_t)
TEST_LOOP (_Float16, uint16_t)
TEST_LOOP (float, uint16_t)
TEST_LOOP (double, uint16_t)
TEST_LOOP (int8_t, uint32_t)
TEST_LOOP (uint8_t, uint32_t)
TEST_LOOP (int16_t, uint32_t)
TEST_LOOP (uint16_t, uint32_t)
TEST_LOOP (int32_t, uint32_t)
TEST_LOOP (uint32_t, uint32_t)
TEST_LOOP (int64_t, uint32_t)
TEST_LOOP (uint64_t, uint32_t)
TEST_LOOP (_Float16, uint32_t)
TEST_LOOP (float, uint32_t)
TEST_LOOP (double, uint32_t)
TEST_LOOP (int8_t, uint64_t)
TEST_LOOP (uint8_t, uint64_t)
TEST_LOOP (int16_t, uint64_t)
TEST_LOOP (uint16_t, uint64_t)
TEST_LOOP (int32_t, uint64_t)
TEST_LOOP (uint32_t, uint64_t)
TEST_LOOP (int64_t, uint64_t)
TEST_LOOP (uint64_t, uint64_t)
TEST_LOOP (_Float16, uint64_t)
TEST_LOOP (float, uint64_t)
TEST_LOOP (double, uint64_t)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 88 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-assembler-not "vluxei64\.v" } } */
/* { dg-final { scan-assembler-not "vsuxei64\.v" } } */

@@ -0,0 +1,38 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */

@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 uint8_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 8) \
T (uint16_t, 8) \
T (_Float16, 8) \
T (int32_t, 8) \
T (uint32_t, 8) \
T (float, 8) \
T (int64_t, 8) \
T (uint64_t, 8) \
T (double, 8)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */

@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 int8_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 8) \
T (uint16_t, 8) \
T (_Float16, 8) \
T (int32_t, 8) \
T (uint32_t, 8) \
T (float, 8) \
T (int64_t, 8) \
T (uint64_t, 8) \
T (double, 8)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */

@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX16 uint16_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 16) \
T (uint8_t, 16) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 16) \
T (uint32_t, 16) \
T (float, 16) \
T (int64_t, 16) \
T (uint64_t, 16) \
T (double, 16)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */

@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX16 int16_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 16) \
T (uint8_t, 16) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 16) \
T (uint32_t, 16) \
T (float, 16) \
T (int64_t, 16) \
T (uint64_t, 16) \
T (double, 16)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */

@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX32 uint32_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 32) \
T (uint8_t, 32) \
T (int16_t, 32) \
T (uint16_t, 32) \
T (_Float16, 32) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 32) \
T (uint64_t, 32) \
T (double, 32)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */

@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX32 int32_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 32) \
T (uint8_t, 32) \
T (int16_t, 32) \
T (uint16_t, 32) \
T (_Float16, 32) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 32) \
T (uint64_t, 32) \
T (double, 32)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */

@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX64 uint64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 64) \
T (uint8_t, 64) \
T (int16_t, 64) \
T (uint16_t, 64) \
T (_Float16, 64) \
T (int32_t, 64) \
T (uint32_t, 64) \
T (float, 64) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */

@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */
#include "gather_load-1.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}

@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */
#include "gather_load-10.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
RUN_LOOP (int8_t, 64)
RUN_LOOP (uint8_t, 64)
RUN_LOOP (int16_t, 64)
RUN_LOOP (uint16_t, 64)
RUN_LOOP (_Float16, 64)
RUN_LOOP (int32_t, 64)
RUN_LOOP (uint32_t, 64)
RUN_LOOP (float, 64)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}


@@ -0,0 +1,39 @@
/* { dg-do run { target { riscv_vector } } } */
#include "gather_load-11.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE *src_##DATA_TYPE[128]; \
DATA_TYPE src2_##DATA_TYPE[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
src_##DATA_TYPE[i] = src2_##DATA_TYPE + i; \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] + src_##DATA_TYPE[i][0]));
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}


@@ -0,0 +1,124 @@
/* { dg-do run { target { riscv_vector } } } */
#include "gather_load-12.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, INDEX_TYPE) \
DATA_TYPE dest_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
DATA_TYPE src_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
INDEX_TYPE index_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
for (int i = 0; i < 202; i++) \
{ \
src_##DATA_TYPE##_##INDEX_TYPE[i] \
= (DATA_TYPE) ((i * 19 + 735) & (sizeof (DATA_TYPE) * 7 - 1)); \
index_##DATA_TYPE##_##INDEX_TYPE[i] = (i * 7) % (55); \
} \
f_##DATA_TYPE##_##INDEX_TYPE (dest_##DATA_TYPE##_##INDEX_TYPE, \
src_##DATA_TYPE##_##INDEX_TYPE, \
index_##DATA_TYPE##_##INDEX_TYPE); \
for (int i = 0; i < 100; i++) \
{ \
assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2] \
== (src_##DATA_TYPE##_##INDEX_TYPE \
[index_##DATA_TYPE##_##INDEX_TYPE[i * 2]] \
+ 1)); \
assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1] \
== (src_##DATA_TYPE##_##INDEX_TYPE \
[index_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]] \
+ 2)); \
}
RUN_LOOP (int8_t, int8_t)
RUN_LOOP (uint8_t, int8_t)
RUN_LOOP (int16_t, int8_t)
RUN_LOOP (uint16_t, int8_t)
RUN_LOOP (int32_t, int8_t)
RUN_LOOP (uint32_t, int8_t)
RUN_LOOP (int64_t, int8_t)
RUN_LOOP (uint64_t, int8_t)
RUN_LOOP (_Float16, int8_t)
RUN_LOOP (float, int8_t)
RUN_LOOP (double, int8_t)
RUN_LOOP (int8_t, int16_t)
RUN_LOOP (uint8_t, int16_t)
RUN_LOOP (int16_t, int16_t)
RUN_LOOP (uint16_t, int16_t)
RUN_LOOP (int32_t, int16_t)
RUN_LOOP (uint32_t, int16_t)
RUN_LOOP (int64_t, int16_t)
RUN_LOOP (uint64_t, int16_t)
RUN_LOOP (_Float16, int16_t)
RUN_LOOP (float, int16_t)
RUN_LOOP (double, int16_t)
RUN_LOOP (int8_t, int32_t)
RUN_LOOP (uint8_t, int32_t)
RUN_LOOP (int16_t, int32_t)
RUN_LOOP (uint16_t, int32_t)
RUN_LOOP (int32_t, int32_t)
RUN_LOOP (uint32_t, int32_t)
RUN_LOOP (int64_t, int32_t)
RUN_LOOP (uint64_t, int32_t)
RUN_LOOP (_Float16, int32_t)
RUN_LOOP (float, int32_t)
RUN_LOOP (double, int32_t)
RUN_LOOP (int8_t, int64_t)
RUN_LOOP (uint8_t, int64_t)
RUN_LOOP (int16_t, int64_t)
RUN_LOOP (uint16_t, int64_t)
RUN_LOOP (int32_t, int64_t)
RUN_LOOP (uint32_t, int64_t)
RUN_LOOP (int64_t, int64_t)
RUN_LOOP (uint64_t, int64_t)
RUN_LOOP (_Float16, int64_t)
RUN_LOOP (float, int64_t)
RUN_LOOP (double, int64_t)
RUN_LOOP (int8_t, uint8_t)
RUN_LOOP (uint8_t, uint8_t)
RUN_LOOP (int16_t, uint8_t)
RUN_LOOP (uint16_t, uint8_t)
RUN_LOOP (int32_t, uint8_t)
RUN_LOOP (uint32_t, uint8_t)
RUN_LOOP (int64_t, uint8_t)
RUN_LOOP (uint64_t, uint8_t)
RUN_LOOP (_Float16, uint8_t)
RUN_LOOP (float, uint8_t)
RUN_LOOP (double, uint8_t)
RUN_LOOP (int8_t, uint16_t)
RUN_LOOP (uint8_t, uint16_t)
RUN_LOOP (int16_t, uint16_t)
RUN_LOOP (uint16_t, uint16_t)
RUN_LOOP (int32_t, uint16_t)
RUN_LOOP (uint32_t, uint16_t)
RUN_LOOP (int64_t, uint16_t)
RUN_LOOP (uint64_t, uint16_t)
RUN_LOOP (_Float16, uint16_t)
RUN_LOOP (float, uint16_t)
RUN_LOOP (double, uint16_t)
RUN_LOOP (int8_t, uint32_t)
RUN_LOOP (uint8_t, uint32_t)
RUN_LOOP (int16_t, uint32_t)
RUN_LOOP (uint16_t, uint32_t)
RUN_LOOP (int32_t, uint32_t)
RUN_LOOP (uint32_t, uint32_t)
RUN_LOOP (int64_t, uint32_t)
RUN_LOOP (uint64_t, uint32_t)
RUN_LOOP (_Float16, uint32_t)
RUN_LOOP (float, uint32_t)
RUN_LOOP (double, uint32_t)
RUN_LOOP (int8_t, uint64_t)
RUN_LOOP (uint8_t, uint64_t)
RUN_LOOP (int16_t, uint64_t)
RUN_LOOP (uint16_t, uint64_t)
RUN_LOOP (int32_t, uint64_t)
RUN_LOOP (uint32_t, uint64_t)
RUN_LOOP (int64_t, uint64_t)
RUN_LOOP (uint64_t, uint64_t)
RUN_LOOP (_Float16, uint64_t)
RUN_LOOP (float, uint64_t)
RUN_LOOP (double, uint64_t)
return 0;
}


@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */
#include "gather_load-2.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}


@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */
#include "gather_load-3.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 8)
RUN_LOOP (uint16_t, 8)
RUN_LOOP (_Float16, 8)
RUN_LOOP (int32_t, 8)
RUN_LOOP (uint32_t, 8)
RUN_LOOP (float, 8)
RUN_LOOP (int64_t, 8)
RUN_LOOP (uint64_t, 8)
RUN_LOOP (double, 8)
return 0;
}


@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */
#include "gather_load-4.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 8)
RUN_LOOP (uint16_t, 8)
RUN_LOOP (_Float16, 8)
RUN_LOOP (int32_t, 8)
RUN_LOOP (uint32_t, 8)
RUN_LOOP (float, 8)
RUN_LOOP (int64_t, 8)
RUN_LOOP (uint64_t, 8)
RUN_LOOP (double, 8)
return 0;
}


@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */
#include "gather_load-5.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
RUN_LOOP (int8_t, 16)
RUN_LOOP (uint8_t, 16)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 16)
RUN_LOOP (uint32_t, 16)
RUN_LOOP (float, 16)
RUN_LOOP (int64_t, 16)
RUN_LOOP (uint64_t, 16)
RUN_LOOP (double, 16)
return 0;
}


@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */
#include "gather_load-6.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
RUN_LOOP (int8_t, 16)
RUN_LOOP (uint8_t, 16)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 16)
RUN_LOOP (uint32_t, 16)
RUN_LOOP (float, 16)
RUN_LOOP (int64_t, 16)
RUN_LOOP (uint64_t, 16)
RUN_LOOP (double, 16)
return 0;
}


@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */
#include "gather_load-7.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
RUN_LOOP (int8_t, 32)
RUN_LOOP (uint8_t, 32)
RUN_LOOP (int16_t, 32)
RUN_LOOP (uint16_t, 32)
RUN_LOOP (_Float16, 32)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 32)
RUN_LOOP (uint64_t, 32)
RUN_LOOP (double, 32)
return 0;
}


@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */
#include "gather_load-8.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
RUN_LOOP (int8_t, 32)
RUN_LOOP (uint8_t, 32)
RUN_LOOP (int16_t, 32)
RUN_LOOP (uint16_t, 32)
RUN_LOOP (_Float16, 32)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 32)
RUN_LOOP (uint64_t, 32)
RUN_LOOP (double, 32)
return 0;
}


@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */
#include "gather_load-9.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
RUN_LOOP (int8_t, 64)
RUN_LOOP (uint8_t, 64)
RUN_LOOP (int16_t, 64)
RUN_LOOP (uint16_t, 64)
RUN_LOOP (_Float16, 64)
RUN_LOOP (int32_t, 64)
RUN_LOOP (uint32_t, 64)
RUN_LOOP (float, 64)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}


@@ -0,0 +1,39 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 uint8_t
#define INDEX16 uint16_t
#define INDEX32 uint32_t
#define INDEX64 uint64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */


@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX64 int64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 64) \
T (uint8_t, 64) \
T (int16_t, 64) \
T (uint16_t, 64) \
T (_Float16, 64) \
T (int32_t, 64) \
T (uint32_t, 64) \
T (float, 64) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */


@@ -0,0 +1,116 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-vect-cost-model -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define TEST_LOOP(DATA_TYPE, INDEX_TYPE) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE##_##INDEX_TYPE (DATA_TYPE *restrict y, DATA_TYPE *restrict x, \
INDEX_TYPE *restrict index, \
INDEX_TYPE *restrict cond) \
{ \
for (int i = 0; i < 100; ++i) \
{ \
if (cond[i * 2]) \
y[i * 2] = x[index[i * 2]] + 1; \
if (cond[i * 2 + 1]) \
y[i * 2 + 1] = x[index[i * 2 + 1]] + 2; \
} \
}
TEST_LOOP (int8_t, int8_t)
TEST_LOOP (uint8_t, int8_t)
TEST_LOOP (int16_t, int8_t)
TEST_LOOP (uint16_t, int8_t)
TEST_LOOP (int32_t, int8_t)
TEST_LOOP (uint32_t, int8_t)
TEST_LOOP (int64_t, int8_t)
TEST_LOOP (uint64_t, int8_t)
TEST_LOOP (_Float16, int8_t)
TEST_LOOP (float, int8_t)
TEST_LOOP (double, int8_t)
TEST_LOOP (int8_t, int16_t)
TEST_LOOP (uint8_t, int16_t)
TEST_LOOP (int16_t, int16_t)
TEST_LOOP (uint16_t, int16_t)
TEST_LOOP (int32_t, int16_t)
TEST_LOOP (uint32_t, int16_t)
TEST_LOOP (int64_t, int16_t)
TEST_LOOP (uint64_t, int16_t)
TEST_LOOP (_Float16, int16_t)
TEST_LOOP (float, int16_t)
TEST_LOOP (double, int16_t)
TEST_LOOP (int8_t, int32_t)
TEST_LOOP (uint8_t, int32_t)
TEST_LOOP (int16_t, int32_t)
TEST_LOOP (uint16_t, int32_t)
TEST_LOOP (int32_t, int32_t)
TEST_LOOP (uint32_t, int32_t)
TEST_LOOP (int64_t, int32_t)
TEST_LOOP (uint64_t, int32_t)
TEST_LOOP (_Float16, int32_t)
TEST_LOOP (float, int32_t)
TEST_LOOP (double, int32_t)
TEST_LOOP (int8_t, int64_t)
TEST_LOOP (uint8_t, int64_t)
TEST_LOOP (int16_t, int64_t)
TEST_LOOP (uint16_t, int64_t)
TEST_LOOP (int32_t, int64_t)
TEST_LOOP (uint32_t, int64_t)
TEST_LOOP (int64_t, int64_t)
TEST_LOOP (uint64_t, int64_t)
TEST_LOOP (_Float16, int64_t)
TEST_LOOP (float, int64_t)
TEST_LOOP (double, int64_t)
TEST_LOOP (int8_t, uint8_t)
TEST_LOOP (uint8_t, uint8_t)
TEST_LOOP (int16_t, uint8_t)
TEST_LOOP (uint16_t, uint8_t)
TEST_LOOP (int32_t, uint8_t)
TEST_LOOP (uint32_t, uint8_t)
TEST_LOOP (int64_t, uint8_t)
TEST_LOOP (uint64_t, uint8_t)
TEST_LOOP (_Float16, uint8_t)
TEST_LOOP (float, uint8_t)
TEST_LOOP (double, uint8_t)
TEST_LOOP (int8_t, uint16_t)
TEST_LOOP (uint8_t, uint16_t)
TEST_LOOP (int16_t, uint16_t)
TEST_LOOP (uint16_t, uint16_t)
TEST_LOOP (int32_t, uint16_t)
TEST_LOOP (uint32_t, uint16_t)
TEST_LOOP (int64_t, uint16_t)
TEST_LOOP (uint64_t, uint16_t)
TEST_LOOP (_Float16, uint16_t)
TEST_LOOP (float, uint16_t)
TEST_LOOP (double, uint16_t)
TEST_LOOP (int8_t, uint32_t)
TEST_LOOP (uint8_t, uint32_t)
TEST_LOOP (int16_t, uint32_t)
TEST_LOOP (uint16_t, uint32_t)
TEST_LOOP (int32_t, uint32_t)
TEST_LOOP (uint32_t, uint32_t)
TEST_LOOP (int64_t, uint32_t)
TEST_LOOP (uint64_t, uint32_t)
TEST_LOOP (_Float16, uint32_t)
TEST_LOOP (float, uint32_t)
TEST_LOOP (double, uint32_t)
TEST_LOOP (int8_t, uint64_t)
TEST_LOOP (uint8_t, uint64_t)
TEST_LOOP (int16_t, uint64_t)
TEST_LOOP (uint16_t, uint64_t)
TEST_LOOP (int32_t, uint64_t)
TEST_LOOP (uint32_t, uint64_t)
TEST_LOOP (int64_t, uint64_t)
TEST_LOOP (uint64_t, uint64_t)
TEST_LOOP (_Float16, uint64_t)
TEST_LOOP (float, uint64_t)
TEST_LOOP (double, uint64_t)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 88 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-assembler-not "vluxei64\.v" } } */
/* { dg-final { scan-assembler-not "vsuxei64\.v" } } */
/* { dg-final { scan-assembler-not {vlse64\.v\s+v[0-9]+,\s*0\([a-x0-9]+\),\s*zero} } } */


@@ -0,0 +1,39 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */


@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 uint8_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 8) \
T (uint16_t, 8) \
T (_Float16, 8) \
T (int32_t, 8) \
T (uint32_t, 8) \
T (float, 8) \
T (int64_t, 8) \
T (uint64_t, 8) \
T (double, 8)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */


@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 int8_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 8) \
T (uint16_t, 8) \
T (_Float16, 8) \
T (int32_t, 8) \
T (uint32_t, 8) \
T (float, 8) \
T (int64_t, 8) \
T (uint64_t, 8) \
T (double, 8)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */


@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX16 uint16_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 16) \
T (uint8_t, 16) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 16) \
T (uint32_t, 16) \
T (float, 16) \
T (int64_t, 16) \
T (uint64_t, 16) \
T (double, 16)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */


@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX16 int16_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 16) \
T (uint8_t, 16) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 16) \
T (uint32_t, 16) \
T (float, 16) \
T (int64_t, 16) \
T (uint64_t, 16) \
T (double, 16)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */


@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX32 uint32_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 32) \
T (uint8_t, 32) \
T (int16_t, 32) \
T (uint16_t, 32) \
T (_Float16, 32) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 32) \
T (uint64_t, 32) \
T (double, 32)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */


@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX32 int32_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 32) \
T (uint8_t, 32) \
T (int16_t, 32) \
T (uint16_t, 32) \
T (_Float16, 32) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 32) \
T (uint64_t, 32) \
T (double, 32)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */


@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX64 uint64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[i] += src[indices[i]]; \
}
#define TEST_ALL(T) \
T (int8_t, 64) \
T (uint8_t, 64) \
T (int16_t, 64) \
T (uint16_t, 64) \
T (_Float16, 64) \
T (int32_t, 64) \
T (uint32_t, 64) \
T (float, 64) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */


@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_gather_load-1.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
DATA_TYPE src_##DATA_TYPE[128] = {0}; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
else \
assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
}
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}


@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_gather_load-10.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
DATA_TYPE src_##DATA_TYPE[128] = {0}; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
else \
assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
}
RUN_LOOP (int8_t, 64)
RUN_LOOP (uint8_t, 64)
RUN_LOOP (int16_t, 64)
RUN_LOOP (uint16_t, 64)
RUN_LOOP (_Float16, 64)
RUN_LOOP (int32_t, 64)
RUN_LOOP (uint32_t, 64)
RUN_LOOP (float, 64)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}


@@ -0,0 +1,140 @@
/* { dg-do run { target { riscv_vector } } } */
/* { dg-additional-options "-mcmodel=medany" } */
#include "mask_gather_load-11.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, INDEX_TYPE) \
DATA_TYPE dest_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
DATA_TYPE dest2_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
DATA_TYPE src_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
INDEX_TYPE index_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
INDEX_TYPE cond_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
for (int i = 0; i < 202; i++) \
{ \
src_##DATA_TYPE##_##INDEX_TYPE[i] \
= (DATA_TYPE) ((i * 19 + 735) & (sizeof (DATA_TYPE) * 7 - 1)); \
dest_##DATA_TYPE##_##INDEX_TYPE[i] \
= (DATA_TYPE) ((i * 7 + 666) & (sizeof (DATA_TYPE) * 5 - 1)); \
dest2_##DATA_TYPE##_##INDEX_TYPE[i] \
= (DATA_TYPE) ((i * 7 + 666) & (sizeof (DATA_TYPE) * 5 - 1)); \
index_##DATA_TYPE##_##INDEX_TYPE[i] = (i * 7) % (55); \
cond_##DATA_TYPE##_##INDEX_TYPE[i] = (INDEX_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE##_##INDEX_TYPE (dest_##DATA_TYPE##_##INDEX_TYPE, \
src_##DATA_TYPE##_##INDEX_TYPE, \
index_##DATA_TYPE##_##INDEX_TYPE, \
cond_##DATA_TYPE##_##INDEX_TYPE); \
for (int i = 0; i < 100; i++) \
{ \
if (cond_##DATA_TYPE##_##INDEX_TYPE[i * 2]) \
assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2] \
== (src_##DATA_TYPE##_##INDEX_TYPE \
[index_##DATA_TYPE##_##INDEX_TYPE[i * 2]] \
+ 1)); \
else \
assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2] \
== dest2_##DATA_TYPE##_##INDEX_TYPE[i * 2]); \
if (cond_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]) \
assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1] \
== (src_##DATA_TYPE##_##INDEX_TYPE \
[index_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]] \
+ 2)); \
else \
assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1] \
== dest2_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]); \
}
RUN_LOOP (int8_t, int8_t)
RUN_LOOP (uint8_t, int8_t)
RUN_LOOP (int16_t, int8_t)
RUN_LOOP (uint16_t, int8_t)
RUN_LOOP (int32_t, int8_t)
RUN_LOOP (uint32_t, int8_t)
RUN_LOOP (int64_t, int8_t)
RUN_LOOP (uint64_t, int8_t)
RUN_LOOP (_Float16, int8_t)
RUN_LOOP (float, int8_t)
RUN_LOOP (double, int8_t)
RUN_LOOP (int8_t, int16_t)
RUN_LOOP (uint8_t, int16_t)
RUN_LOOP (int16_t, int16_t)
RUN_LOOP (uint16_t, int16_t)
RUN_LOOP (int32_t, int16_t)
RUN_LOOP (uint32_t, int16_t)
RUN_LOOP (int64_t, int16_t)
RUN_LOOP (uint64_t, int16_t)
RUN_LOOP (_Float16, int16_t)
RUN_LOOP (float, int16_t)
RUN_LOOP (double, int16_t)
RUN_LOOP (int8_t, int32_t)
RUN_LOOP (uint8_t, int32_t)
RUN_LOOP (int16_t, int32_t)
RUN_LOOP (uint16_t, int32_t)
RUN_LOOP (int32_t, int32_t)
RUN_LOOP (uint32_t, int32_t)
RUN_LOOP (int64_t, int32_t)
RUN_LOOP (uint64_t, int32_t)
RUN_LOOP (_Float16, int32_t)
RUN_LOOP (float, int32_t)
RUN_LOOP (double, int32_t)
RUN_LOOP (int8_t, int64_t)
RUN_LOOP (uint8_t, int64_t)
RUN_LOOP (int16_t, int64_t)
RUN_LOOP (uint16_t, int64_t)
RUN_LOOP (int32_t, int64_t)
RUN_LOOP (uint32_t, int64_t)
RUN_LOOP (int64_t, int64_t)
RUN_LOOP (uint64_t, int64_t)
RUN_LOOP (_Float16, int64_t)
RUN_LOOP (float, int64_t)
RUN_LOOP (double, int64_t)
RUN_LOOP (int8_t, uint8_t)
RUN_LOOP (uint8_t, uint8_t)
RUN_LOOP (int16_t, uint8_t)
RUN_LOOP (uint16_t, uint8_t)
RUN_LOOP (int32_t, uint8_t)
RUN_LOOP (uint32_t, uint8_t)
RUN_LOOP (int64_t, uint8_t)
RUN_LOOP (uint64_t, uint8_t)
RUN_LOOP (_Float16, uint8_t)
RUN_LOOP (float, uint8_t)
RUN_LOOP (double, uint8_t)
RUN_LOOP (int8_t, uint16_t)
RUN_LOOP (uint8_t, uint16_t)
RUN_LOOP (int16_t, uint16_t)
RUN_LOOP (uint16_t, uint16_t)
RUN_LOOP (int32_t, uint16_t)
RUN_LOOP (uint32_t, uint16_t)
RUN_LOOP (int64_t, uint16_t)
RUN_LOOP (uint64_t, uint16_t)
RUN_LOOP (_Float16, uint16_t)
RUN_LOOP (float, uint16_t)
RUN_LOOP (double, uint16_t)
RUN_LOOP (int8_t, uint32_t)
RUN_LOOP (uint8_t, uint32_t)
RUN_LOOP (int16_t, uint32_t)
RUN_LOOP (uint16_t, uint32_t)
RUN_LOOP (int32_t, uint32_t)
RUN_LOOP (uint32_t, uint32_t)
RUN_LOOP (int64_t, uint32_t)
RUN_LOOP (uint64_t, uint32_t)
RUN_LOOP (_Float16, uint32_t)
RUN_LOOP (float, uint32_t)
RUN_LOOP (double, uint32_t)
RUN_LOOP (int8_t, uint64_t)
RUN_LOOP (uint8_t, uint64_t)
RUN_LOOP (int16_t, uint64_t)
RUN_LOOP (uint16_t, uint64_t)
RUN_LOOP (int32_t, uint64_t)
RUN_LOOP (uint32_t, uint64_t)
RUN_LOOP (int64_t, uint64_t)
RUN_LOOP (uint64_t, uint64_t)
RUN_LOOP (_Float16, uint64_t)
RUN_LOOP (float, uint64_t)
RUN_LOOP (double, uint64_t)
return 0;
}

@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_gather_load-2.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
DATA_TYPE src_##DATA_TYPE[128] = {0}; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
else \
assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
}
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}

@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_gather_load-3.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
DATA_TYPE src_##DATA_TYPE[128] = {0}; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
else \
assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
}
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 8)
RUN_LOOP (uint16_t, 8)
RUN_LOOP (_Float16, 8)
RUN_LOOP (int32_t, 8)
RUN_LOOP (uint32_t, 8)
RUN_LOOP (float, 8)
RUN_LOOP (int64_t, 8)
RUN_LOOP (uint64_t, 8)
RUN_LOOP (double, 8)
return 0;
}

@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_gather_load-4.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
DATA_TYPE src_##DATA_TYPE[128] = {0}; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
else \
assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
}
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 8)
RUN_LOOP (uint16_t, 8)
RUN_LOOP (_Float16, 8)
RUN_LOOP (int32_t, 8)
RUN_LOOP (uint32_t, 8)
RUN_LOOP (float, 8)
RUN_LOOP (int64_t, 8)
RUN_LOOP (uint64_t, 8)
RUN_LOOP (double, 8)
return 0;
}

@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_gather_load-5.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
DATA_TYPE src_##DATA_TYPE[128] = {0}; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
else \
assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
}
RUN_LOOP (int8_t, 16)
RUN_LOOP (uint8_t, 16)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 16)
RUN_LOOP (uint32_t, 16)
RUN_LOOP (float, 16)
RUN_LOOP (int64_t, 16)
RUN_LOOP (uint64_t, 16)
RUN_LOOP (double, 16)
return 0;
}

@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_gather_load-6.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
DATA_TYPE src_##DATA_TYPE[128] = {0}; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
else \
assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
}
RUN_LOOP (int8_t, 16)
RUN_LOOP (uint8_t, 16)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 16)
RUN_LOOP (uint32_t, 16)
RUN_LOOP (float, 16)
RUN_LOOP (int64_t, 16)
RUN_LOOP (uint64_t, 16)
RUN_LOOP (double, 16)
return 0;
}

@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
/* { dg-additional-options "-mcmodel=medany" } */
#include "mask_gather_load-7.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
DATA_TYPE src_##DATA_TYPE[128] = {0}; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
else \
assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
}
RUN_LOOP (int8_t, 32)
RUN_LOOP (uint8_t, 32)
RUN_LOOP (int16_t, 32)
RUN_LOOP (uint16_t, 32)
RUN_LOOP (_Float16, 32)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 32)
RUN_LOOP (uint64_t, 32)
RUN_LOOP (double, 32)
return 0;
}

@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
/* { dg-additional-options "-mcmodel=medany" } */
#include "mask_gather_load-8.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
DATA_TYPE src_##DATA_TYPE[128] = {0}; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
else \
assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
}
RUN_LOOP (int8_t, 32)
RUN_LOOP (uint8_t, 32)
RUN_LOOP (int16_t, 32)
RUN_LOOP (uint16_t, 32)
RUN_LOOP (_Float16, 32)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 32)
RUN_LOOP (uint64_t, 32)
RUN_LOOP (double, 32)
return 0;
}

@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_gather_load-9.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
DATA_TYPE src_##DATA_TYPE[128] = {0}; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[i] \
== (dest2_##DATA_TYPE[i] \
+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
else \
assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
}
RUN_LOOP (int8_t, 64)
RUN_LOOP (uint8_t, 64)
RUN_LOOP (int16_t, 64)
RUN_LOOP (uint16_t, 64)
RUN_LOOP (_Float16, 64)
RUN_LOOP (int32_t, 64)
RUN_LOOP (uint32_t, 64)
RUN_LOOP (float, 64)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}

@@ -0,0 +1,39 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 uint8_t
#define INDEX16 uint16_t
#define INDEX32 uint32_t
#define INDEX64 uint64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */

@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX64 int64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 64) \
T (uint8_t, 64) \
T (int16_t, 64) \
T (uint16_t, 64) \
T (_Float16, 64) \
T (int32_t, 64) \
T (uint32_t, 64) \
T (float, 64) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */

@@ -0,0 +1,39 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */

@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 uint8_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 8) \
T (uint16_t, 8) \
T (_Float16, 8) \
T (int32_t, 8) \
T (uint32_t, 8) \
T (float, 8) \
T (int64_t, 8) \
T (uint64_t, 8) \
T (double, 8)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */

@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 int8_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 8) \
T (uint16_t, 8) \
T (_Float16, 8) \
T (int32_t, 8) \
T (uint32_t, 8) \
T (float, 8) \
T (int64_t, 8) \
T (uint64_t, 8) \
T (double, 8)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */

@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX16 uint16_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 16) \
T (uint8_t, 16) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 16) \
T (uint32_t, 16) \
T (float, 16) \
T (int64_t, 16) \
T (uint64_t, 16) \
T (double, 16)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */

@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX16 int16_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 16) \
T (uint8_t, 16) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 16) \
T (uint32_t, 16) \
T (float, 16) \
T (int64_t, 16) \
T (uint64_t, 16) \
T (double, 16)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */

@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX32 uint32_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 32) \
T (uint8_t, 32) \
T (int16_t, 32) \
T (uint16_t, 32) \
T (_Float16, 32) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 32) \
T (uint64_t, 32) \
T (double, 32)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */

@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX32 int32_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 32) \
T (uint8_t, 32) \
T (int16_t, 32) \
T (uint16_t, 32) \
T (_Float16, 32) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 32) \
T (uint64_t, 32) \
T (double, 32)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */

@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX64 uint64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
{ \
for (int i = 0; i < 128; ++i) \
if (cond[i]) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 64) \
T (uint8_t, 64) \
T (int16_t, 64) \
T (uint16_t, 64) \
T (_Float16, 64) \
T (int32_t, 64) \
T (uint32_t, 64) \
T (float, 64) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */

@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_scatter_store-1.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1)); \
else \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
}
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}

@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_scatter_store-10.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1)); \
else \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
}
RUN_LOOP (int8_t, 64)
RUN_LOOP (uint8_t, 64)
RUN_LOOP (int16_t, 64)
RUN_LOOP (uint16_t, 64)
RUN_LOOP (_Float16, 64)
RUN_LOOP (int32_t, 64)
RUN_LOOP (uint32_t, 64)
RUN_LOOP (float, 64)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}

@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_scatter_store-2.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1)); \
else \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
}
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}

@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_scatter_store-3.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1)); \
else \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
}
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 8)
RUN_LOOP (uint16_t, 8)
RUN_LOOP (_Float16, 8)
RUN_LOOP (int32_t, 8)
RUN_LOOP (uint32_t, 8)
RUN_LOOP (float, 8)
RUN_LOOP (int64_t, 8)
RUN_LOOP (uint64_t, 8)
RUN_LOOP (double, 8)
return 0;
}

@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_scatter_store-4.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1)); \
else \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
}
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 8)
RUN_LOOP (uint16_t, 8)
RUN_LOOP (_Float16, 8)
RUN_LOOP (int32_t, 8)
RUN_LOOP (uint32_t, 8)
RUN_LOOP (float, 8)
RUN_LOOP (int64_t, 8)
RUN_LOOP (uint64_t, 8)
RUN_LOOP (double, 8)
return 0;
}

@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_scatter_store-5.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1)); \
else \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
}
RUN_LOOP (int8_t, 16)
RUN_LOOP (uint8_t, 16)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 16)
RUN_LOOP (uint32_t, 16)
RUN_LOOP (float, 16)
RUN_LOOP (int64_t, 16)
RUN_LOOP (uint64_t, 16)
RUN_LOOP (double, 16)
return 0;
}


@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_scatter_store-6.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1)); \
else \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
}
RUN_LOOP (int8_t, 16)
RUN_LOOP (uint8_t, 16)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 16)
RUN_LOOP (uint32_t, 16)
RUN_LOOP (float, 16)
RUN_LOOP (int64_t, 16)
RUN_LOOP (uint64_t, 16)
RUN_LOOP (double, 16)
return 0;
}


@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
/* { dg-additional-options "-mcmodel=medany" } */
#include "mask_scatter_store-7.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1)); \
else \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
}
RUN_LOOP (int8_t, 32)
RUN_LOOP (uint8_t, 32)
RUN_LOOP (int16_t, 32)
RUN_LOOP (uint16_t, 32)
RUN_LOOP (_Float16, 32)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 32)
RUN_LOOP (uint64_t, 32)
RUN_LOOP (double, 32)
return 0;
}


@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_scatter_store-8.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1)); \
else \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
}
RUN_LOOP (int8_t, 32)
RUN_LOOP (uint8_t, 32)
RUN_LOOP (int16_t, 32)
RUN_LOOP (uint16_t, 32)
RUN_LOOP (_Float16, 32)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 32)
RUN_LOOP (uint64_t, 32)
RUN_LOOP (double, 32)
return 0;
}


@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
#include "mask_scatter_store-9.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
{ \
if (cond_##DATA_TYPE##_##BITS[i]) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1)); \
else \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
}
RUN_LOOP (int8_t, 64)
RUN_LOOP (uint8_t, 64)
RUN_LOOP (int16_t, 64)
RUN_LOOP (uint16_t, 64)
RUN_LOOP (_Float16, 64)
RUN_LOOP (int32_t, 64)
RUN_LOOP (uint32_t, 64)
RUN_LOOP (float, 64)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}


@@ -0,0 +1,38 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 uint8_t
#define INDEX16 uint16_t
#define INDEX32 uint32_t
#define INDEX64 uint64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
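As a reference for what the `TEST_LOOP` above asks the vectorizer to do, here is the same masked scatter-store pattern in plain scalar C. This is an illustration of the semantics that get normalized into `.LEN_MASK_SCATTER_STORE`, not code from the patch; the function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar form of the masked scatter the tests exercise: for every
   active lane (cond[i] nonzero), store src[i] + 1 at the indexed
   position dest[indices[i]].  Inactive lanes leave dest untouched,
   which is exactly what the run tests assert against dest2.  */
static void
scalar_mask_scatter_store (int32_t *dest, const int32_t *src,
                           const uint8_t *indices, const uint8_t *cond,
                           size_t n)
{
  for (size_t i = 0; i < n; ++i)
    if (cond[i])
      dest[indices[i]] = src[i] + 1;
}
```

When vectorized, the condition becomes the mask operand and the loop trip count becomes the length operand of the combined len+mask internal function.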


@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX64 int64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 64) \
T (uint8_t, 64) \
T (int16_t, 64) \
T (uint16_t, 64) \
T (_Float16, 64) \
T (int32_t, 64) \
T (uint32_t, 64) \
T (float, 64) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */


@@ -0,0 +1,38 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */


@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 uint8_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 8) \
T (uint16_t, 8) \
T (_Float16, 8) \
T (int32_t, 8) \
T (uint32_t, 8) \
T (float, 8) \
T (int64_t, 8) \
T (uint64_t, 8) \
T (double, 8)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */


@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX8 int8_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 8) \
T (uint8_t, 8) \
T (int16_t, 8) \
T (uint16_t, 8) \
T (_Float16, 8) \
T (int32_t, 8) \
T (uint32_t, 8) \
T (float, 8) \
T (int64_t, 8) \
T (uint64_t, 8) \
T (double, 8)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */


@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX16 uint16_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 16) \
T (uint8_t, 16) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 16) \
T (uint32_t, 16) \
T (float, 16) \
T (int64_t, 16) \
T (uint64_t, 16) \
T (double, 16)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */


@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX16 int16_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 16) \
T (uint8_t, 16) \
T (int16_t, 16) \
T (uint16_t, 16) \
T (_Float16, 16) \
T (int32_t, 16) \
T (uint32_t, 16) \
T (float, 16) \
T (int64_t, 16) \
T (uint64_t, 16) \
T (double, 16)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */


@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX32 uint32_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 32) \
T (uint8_t, 32) \
T (int16_t, 32) \
T (uint16_t, 32) \
T (_Float16, 32) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 32) \
T (uint64_t, 32) \
T (double, 32)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */


@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX32 int32_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 32) \
T (uint8_t, 32) \
T (int16_t, 32) \
T (uint16_t, 32) \
T (_Float16, 32) \
T (int32_t, 32) \
T (uint32_t, 32) \
T (float, 32) \
T (int64_t, 32) \
T (uint64_t, 32) \
T (double, 32)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */


@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
#include <stdint-gcc.h>
#define INDEX64 uint64_t
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS *restrict indices) \
{ \
for (int i = 0; i < 128; ++i) \
dest[indices[i]] = src[i] + 1; \
}
#define TEST_ALL(T) \
T (int8_t, 64) \
T (uint8_t, 64) \
T (int16_t, 64) \
T (uint16_t, 64) \
T (_Float16, 64) \
T (int32_t, 64) \
T (uint32_t, 64) \
T (float, 64) \
T (int64_t, 64) \
T (uint64_t, 64) \
T (double, 64)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */


@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */
#include "scatter_store-1.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1));
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}


@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */
#include "scatter_store-10.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1));
RUN_LOOP (int8_t, 64)
RUN_LOOP (uint8_t, 64)
RUN_LOOP (int16_t, 64)
RUN_LOOP (uint16_t, 64)
RUN_LOOP (_Float16, 64)
RUN_LOOP (int32_t, 64)
RUN_LOOP (uint32_t, 64)
RUN_LOOP (float, 64)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}


@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */
/* { dg-additional-options "-mcmodel=medany" } */
#include "scatter_store-2.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1));
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}


@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */
#include "scatter_store-3.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1));
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 8)
RUN_LOOP (uint16_t, 8)
RUN_LOOP (_Float16, 8)
RUN_LOOP (int32_t, 8)
RUN_LOOP (uint32_t, 8)
RUN_LOOP (float, 8)
RUN_LOOP (int64_t, 8)
RUN_LOOP (uint64_t, 8)
RUN_LOOP (double, 8)
return 0;
}


@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */
#include "scatter_store-4.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1));
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 8)
RUN_LOOP (uint16_t, 8)
RUN_LOOP (_Float16, 8)
RUN_LOOP (int32_t, 8)
RUN_LOOP (uint32_t, 8)
RUN_LOOP (float, 8)
RUN_LOOP (int64_t, 8)
RUN_LOOP (uint64_t, 8)
RUN_LOOP (double, 8)
return 0;
}


@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */
#include "scatter_store-5.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1));
RUN_LOOP (int8_t, 16)
RUN_LOOP (uint8_t, 16)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 16)
RUN_LOOP (uint32_t, 16)
RUN_LOOP (float, 16)
RUN_LOOP (int64_t, 16)
RUN_LOOP (uint64_t, 16)
RUN_LOOP (double, 16)
return 0;
}


@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */
#include "scatter_store-6.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1));
RUN_LOOP (int8_t, 16)
RUN_LOOP (uint8_t, 16)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 16)
RUN_LOOP (uint32_t, 16)
RUN_LOOP (float, 16)
RUN_LOOP (int64_t, 16)
RUN_LOOP (uint64_t, 16)
RUN_LOOP (double, 16)
return 0;
}


@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */
#include "scatter_store-7.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1));
RUN_LOOP (int8_t, 32)
RUN_LOOP (uint8_t, 32)
RUN_LOOP (int16_t, 32)
RUN_LOOP (uint16_t, 32)
RUN_LOOP (_Float16, 32)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 32)
RUN_LOOP (uint64_t, 32)
RUN_LOOP (double, 32)
return 0;
}


@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */
#include "scatter_store-8.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1));
RUN_LOOP (int8_t, 32)
RUN_LOOP (uint8_t, 32)
RUN_LOOP (int16_t, 32)
RUN_LOOP (uint16_t, 32)
RUN_LOOP (_Float16, 32)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 32)
RUN_LOOP (uint64_t, 32)
RUN_LOOP (double, 32)
return 0;
}


@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */
/* { dg-additional-options "-mcmodel=medany" } */
#include "scatter_store-9.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE[128]; \
DATA_TYPE dest2_##DATA_TYPE[128]; \
DATA_TYPE src_##DATA_TYPE[128]; \
INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
for (int i = 0; i < 128; i++) \
{ \
dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
} \
f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
indices_##DATA_TYPE##_##BITS); \
for (int i = 0; i < 128; i++) \
assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
== (src_##DATA_TYPE[i] + 1));
RUN_LOOP (int8_t, 64)
RUN_LOOP (uint8_t, 64)
RUN_LOOP (int16_t, 64)
RUN_LOOP (uint16_t, 64)
RUN_LOOP (_Float16, 64)
RUN_LOOP (int32_t, 64)
RUN_LOOP (uint32_t, 64)
RUN_LOOP (float, 64)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}


@@ -0,0 +1,45 @@
/* { dg-do compile } */
/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
#include <stdint-gcc.h>
#ifndef INDEX8
#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t
#endif
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS stride, INDEX##BITS n) \
{ \
for (INDEX##BITS i = 0; i < n; ++i) \
dest[i] += src[i * stride]; \
}
#define TEST_TYPE(T, DATA_TYPE) \
T (DATA_TYPE, 8) \
T (DATA_TYPE, 16) \
T (DATA_TYPE, 32) \
T (DATA_TYPE, 64)
#define TEST_ALL(T) \
TEST_TYPE (T, int8_t) \
TEST_TYPE (T, uint8_t) \
TEST_TYPE (T, int16_t) \
TEST_TYPE (T, uint16_t) \
TEST_TYPE (T, _Float16) \
TEST_TYPE (T, int32_t) \
TEST_TYPE (T, uint32_t) \
TEST_TYPE (T, float) \
TEST_TYPE (T, int64_t) \
TEST_TYPE (T, uint64_t) \
TEST_TYPE (T, double)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times " \.LEN_MASK_GATHER_LOAD" 66 "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "optimized" } } */


@@ -0,0 +1,45 @@
/* { dg-do compile } */
/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
#include <stdint-gcc.h>
#ifndef INDEX8
#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t
#endif
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS stride, INDEX##BITS n) \
{ \
for (INDEX##BITS i = 0; i < (BITS + 13); ++i) \
dest[i] += src[i * (BITS - 3)]; \
}
#define TEST_TYPE(T, DATA_TYPE) \
T (DATA_TYPE, 8) \
T (DATA_TYPE, 16) \
T (DATA_TYPE, 32) \
T (DATA_TYPE, 64)
#define TEST_ALL(T) \
TEST_TYPE (T, int8_t) \
TEST_TYPE (T, uint8_t) \
TEST_TYPE (T, int16_t) \
TEST_TYPE (T, uint16_t) \
TEST_TYPE (T, _Float16) \
TEST_TYPE (T, int32_t) \
TEST_TYPE (T, uint32_t) \
TEST_TYPE (T, float) \
TEST_TYPE (T, int64_t) \
TEST_TYPE (T, uint64_t) \
TEST_TYPE (T, double)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times " \.LEN_MASK_GATHER_LOAD" 46 "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "optimized" } } */
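For reference, the strided access in `TEST_LOOP` above (`dest[i] += src[i * stride]`) has the following scalar semantics. Per the cover letter, the backend does not lower this to vlse/vsse; the middle end instead recognizes it as a `.LEN_MASK_GATHER_LOAD`. This sketch is illustrative only; the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar form of the strided-load loop: each iteration i reads src at
   a constant element stride, so the loads form a gather with indices
   0, stride, 2*stride, ... which the vectorizer expresses as a
   len+mask gather rather than a target-specific strided load.  */
static void
scalar_strided_add (int32_t *dest, const int32_t *src,
                    ptrdiff_t stride, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    dest[i] += src[i * stride];
}
```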


@@ -0,0 +1,84 @@
/* { dg-do run { target { riscv_vector } } } */
#include "strided_load-1.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3); \
INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13); \
for (INDEX##BITS i = 0; \
i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++) \
{ \
dest_##DATA_TYPE##_##BITS[i] \
= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE##_##BITS[i] \
= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE##_##BITS[i] \
= (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
} \
f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
stride_##DATA_TYPE##_##BITS, \
n_##DATA_TYPE##_##BITS); \
for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++) \
{ \
assert ( \
dest_##DATA_TYPE##_##BITS[i] \
== (dest2_##DATA_TYPE##_##BITS[i] \
+ src_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS])); \
}
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 8)
RUN_LOOP (uint16_t, 8)
RUN_LOOP (_Float16, 8)
RUN_LOOP (int32_t, 8)
RUN_LOOP (uint32_t, 8)
RUN_LOOP (float, 8)
RUN_LOOP (int64_t, 8)
RUN_LOOP (uint64_t, 8)
RUN_LOOP (double, 8)
RUN_LOOP (int8_t, 16)
RUN_LOOP (uint8_t, 16)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 16)
RUN_LOOP (uint32_t, 16)
RUN_LOOP (float, 16)
RUN_LOOP (int64_t, 16)
RUN_LOOP (uint64_t, 16)
RUN_LOOP (double, 16)
RUN_LOOP (int8_t, 32)
RUN_LOOP (uint8_t, 32)
RUN_LOOP (int16_t, 32)
RUN_LOOP (uint16_t, 32)
RUN_LOOP (_Float16, 32)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 32)
RUN_LOOP (uint64_t, 32)
RUN_LOOP (double, 32)
RUN_LOOP (int8_t, 64)
RUN_LOOP (uint8_t, 64)
RUN_LOOP (int16_t, 64)
RUN_LOOP (uint16_t, 64)
RUN_LOOP (_Float16, 64)
RUN_LOOP (int32_t, 64)
RUN_LOOP (uint32_t, 64)
RUN_LOOP (float, 64)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}

@@ -0,0 +1,84 @@
/* { dg-do run { target { riscv_vector } } } */
#include "strided_load-2.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3); \
INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13); \
for (INDEX##BITS i = 0; \
i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++) \
{ \
dest_##DATA_TYPE##_##BITS[i] \
= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE##_##BITS[i] \
= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE##_##BITS[i] \
= (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
} \
f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
stride_##DATA_TYPE##_##BITS, \
n_##DATA_TYPE##_##BITS); \
for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++) \
{ \
assert ( \
dest_##DATA_TYPE##_##BITS[i] \
== (dest2_##DATA_TYPE##_##BITS[i] \
+ src_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS])); \
}
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 8)
RUN_LOOP (uint16_t, 8)
RUN_LOOP (_Float16, 8)
RUN_LOOP (int32_t, 8)
RUN_LOOP (uint32_t, 8)
RUN_LOOP (float, 8)
RUN_LOOP (int64_t, 8)
RUN_LOOP (uint64_t, 8)
RUN_LOOP (double, 8)
RUN_LOOP (int8_t, 16)
RUN_LOOP (uint8_t, 16)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 16)
RUN_LOOP (uint32_t, 16)
RUN_LOOP (float, 16)
RUN_LOOP (int64_t, 16)
RUN_LOOP (uint64_t, 16)
RUN_LOOP (double, 16)
RUN_LOOP (int8_t, 32)
RUN_LOOP (uint8_t, 32)
RUN_LOOP (int16_t, 32)
RUN_LOOP (uint16_t, 32)
RUN_LOOP (_Float16, 32)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 32)
RUN_LOOP (uint64_t, 32)
RUN_LOOP (double, 32)
RUN_LOOP (int8_t, 64)
RUN_LOOP (uint8_t, 64)
RUN_LOOP (int16_t, 64)
RUN_LOOP (uint16_t, 64)
RUN_LOOP (_Float16, 64)
RUN_LOOP (int32_t, 64)
RUN_LOOP (uint32_t, 64)
RUN_LOOP (float, 64)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}

@@ -0,0 +1,45 @@
/* { dg-do compile } */
/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
#include <stdint-gcc.h>
#ifndef INDEX8
#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t
#endif
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS stride, INDEX##BITS n) \
{ \
for (INDEX##BITS i = 0; i < n; ++i) \
dest[i * stride] = src[i] + BITS; \
}
#define TEST_TYPE(T, DATA_TYPE) \
T (DATA_TYPE, 8) \
T (DATA_TYPE, 16) \
T (DATA_TYPE, 32) \
T (DATA_TYPE, 64)
#define TEST_ALL(T) \
TEST_TYPE (T, int8_t) \
TEST_TYPE (T, uint8_t) \
TEST_TYPE (T, int16_t) \
TEST_TYPE (T, uint16_t) \
TEST_TYPE (T, _Float16) \
TEST_TYPE (T, int32_t) \
TEST_TYPE (T, uint32_t) \
TEST_TYPE (T, float) \
TEST_TYPE (T, int64_t) \
TEST_TYPE (T, uint64_t) \
TEST_TYPE (T, double)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times " \.LEN_MASK_SCATTER_STORE" 66 "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "optimized" } } */

@@ -0,0 +1,45 @@
/* { dg-do compile } */
/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
#include <stdint-gcc.h>
#ifndef INDEX8
#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t
#endif
#define TEST_LOOP(DATA_TYPE, BITS) \
void __attribute__ ((noinline, noclone)) \
f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
INDEX##BITS stride, INDEX##BITS n) \
{ \
for (INDEX##BITS i = 0; i < n; ++i) \
dest[i * (BITS - 3)] = src[i] + BITS; \
}
#define TEST_TYPE(T, DATA_TYPE) \
T (DATA_TYPE, 8) \
T (DATA_TYPE, 16) \
T (DATA_TYPE, 32) \
T (DATA_TYPE, 64)
#define TEST_ALL(T) \
TEST_TYPE (T, int8_t) \
TEST_TYPE (T, uint8_t) \
TEST_TYPE (T, int16_t) \
TEST_TYPE (T, uint16_t) \
TEST_TYPE (T, _Float16) \
TEST_TYPE (T, int32_t) \
TEST_TYPE (T, uint32_t) \
TEST_TYPE (T, float) \
TEST_TYPE (T, int64_t) \
TEST_TYPE (T, uint64_t) \
TEST_TYPE (T, double)
TEST_ALL (TEST_LOOP)
/* { dg-final { scan-tree-dump-times " \.LEN_MASK_SCATTER_STORE" 44 "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "optimized" } } */

@@ -0,0 +1,82 @@
/* { dg-do run { target { riscv_vector } } } */
#include "strided_store-1.c"
#include <assert.h>
int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3); \
INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13); \
for (INDEX##BITS i = 0; \
i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++) \
{ \
dest_##DATA_TYPE##_##BITS[i] \
= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
dest2_##DATA_TYPE##_##BITS[i] \
= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
src_##DATA_TYPE##_##BITS[i] \
= (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
} \
f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
stride_##DATA_TYPE##_##BITS, \
n_##DATA_TYPE##_##BITS); \
for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++) \
{ \
assert (dest_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS] \
== (src_##DATA_TYPE##_##BITS[i] + BITS)); \
}
RUN_LOOP (int8_t, 8)
RUN_LOOP (uint8_t, 8)
RUN_LOOP (int16_t, 8)
RUN_LOOP (uint16_t, 8)
RUN_LOOP (_Float16, 8)
RUN_LOOP (int32_t, 8)
RUN_LOOP (uint32_t, 8)
RUN_LOOP (float, 8)
RUN_LOOP (int64_t, 8)
RUN_LOOP (uint64_t, 8)
RUN_LOOP (double, 8)
RUN_LOOP (int8_t, 16)
RUN_LOOP (uint8_t, 16)
RUN_LOOP (int16_t, 16)
RUN_LOOP (uint16_t, 16)
RUN_LOOP (_Float16, 16)
RUN_LOOP (int32_t, 16)
RUN_LOOP (uint32_t, 16)
RUN_LOOP (float, 16)
RUN_LOOP (int64_t, 16)
RUN_LOOP (uint64_t, 16)
RUN_LOOP (double, 16)
RUN_LOOP (int8_t, 32)
RUN_LOOP (uint8_t, 32)
RUN_LOOP (int16_t, 32)
RUN_LOOP (uint16_t, 32)
RUN_LOOP (_Float16, 32)
RUN_LOOP (int32_t, 32)
RUN_LOOP (uint32_t, 32)
RUN_LOOP (float, 32)
RUN_LOOP (int64_t, 32)
RUN_LOOP (uint64_t, 32)
RUN_LOOP (double, 32)
RUN_LOOP (int8_t, 64)
RUN_LOOP (uint8_t, 64)
RUN_LOOP (int16_t, 64)
RUN_LOOP (uint16_t, 64)
RUN_LOOP (_Float16, 64)
RUN_LOOP (int32_t, 64)
RUN_LOOP (uint32_t, 64)
RUN_LOOP (float, 64)
RUN_LOOP (int64_t, 64)
RUN_LOOP (uint64_t, 64)
RUN_LOOP (double, 64)
return 0;
}
