RISC-V: Support gather_load/scatter_store RVV auto-vectorization
This patch fully supports gather_load/scatter_store:
1. Support single-rgroup on both RV32/RV64.
2. Support indexed element widths equal to or smaller than Pmode.
3. Support VLA SLP with gather/scatter.
4. Fully tested all gather/scatter with LMUL = M1/M2/M4/M8, both VLA and VLS.
5. Fix a bug in handling (subreg:SI (const_poly_int:DI)).
6. Fix a bug in vec_perm, which is used by gather/scatter SLP.

All kinds of GATHER/SCATTER are normalized into LEN_MASK_*. We fully support these 4 kinds of gather/scatter:
1. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with dummy length and dummy mask (full vector).
2. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with dummy length and real mask.
3. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with real length and dummy mask.
4. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with real length and real mask.

Based on the discussions with Richards, we don't lower vlse/vsse in the RISC-V backend for strided load/store. Instead, we leave that to the middle-end.

Regression tests passed. Ok for trunk?

gcc/ChangeLog:

	* config/riscv/autovec.md (len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>): New pattern.
	(len_mask_gather_load<VNX2_QHSD:mode><VNX2_QHSDI:mode>): Ditto.
	(len_mask_gather_load<VNX4_QHSD:mode><VNX4_QHSDI:mode>): Ditto.
	(len_mask_gather_load<VNX8_QHSD:mode><VNX8_QHSDI:mode>): Ditto.
	(len_mask_gather_load<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.
	(len_mask_gather_load<VNX32_QHS:mode><VNX32_QHSI:mode>): Ditto.
	(len_mask_gather_load<VNX64_QH:mode><VNX64_QHI:mode>): Ditto.
	(len_mask_gather_load<mode><mode>): Ditto.
	(len_mask_scatter_store<VNX1_QHSD:mode><VNX1_QHSDI:mode>): Ditto.
	(len_mask_scatter_store<VNX2_QHSD:mode><VNX2_QHSDI:mode>): Ditto.
	(len_mask_scatter_store<VNX4_QHSD:mode><VNX4_QHSDI:mode>): Ditto.
	(len_mask_scatter_store<VNX8_QHSD:mode><VNX8_QHSDI:mode>): Ditto.
	(len_mask_scatter_store<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.
	(len_mask_scatter_store<VNX32_QHS:mode><VNX32_QHSI:mode>): Ditto.
	(len_mask_scatter_store<VNX64_QH:mode><VNX64_QHI:mode>): Ditto.
	(len_mask_scatter_store<mode><mode>): Ditto.
	* config/riscv/predicates.md (const_1_operand): New predicate.
	(vector_gs_scale_operand_16): Ditto.
	(vector_gs_scale_operand_32): Ditto.
	(vector_gs_scale_operand_64): Ditto.
	(vector_gs_extension_operand): Ditto.
	(vector_gs_scale_operand_16_rv32): Ditto.
	(vector_gs_scale_operand_32_rv32): Ditto.
	* config/riscv/riscv-protos.h (enum insn_type): Add gather/scatter.
	(expand_gather_scatter): New function.
	* config/riscv/riscv-v.cc (gen_const_vector_dup): Add gather/scatter.
	(emit_vlmax_masked_store_insn): New function.
	(emit_nonvlmax_masked_store_insn): Ditto.
	(modulo_sel_indices): Ditto.
	(expand_vec_perm): Fix SLP for gather/scatter.
	(prepare_gather_scatter): New function.
	(expand_gather_scatter): Ditto.
	* config/riscv/riscv.cc (riscv_legitimize_move): Fix bug of
	(subreg:SI (DI CONST_POLY_INT)).
	* config/riscv/vector-iterators.md: Add gather/scatter.
	* config/riscv/vector.md (vec_duplicate<mode>): Use "@" instead.
	(@vec_duplicate<mode>): Ditto.
	(@pred_indexed_<order>store<VNX16_QHS:mode><VNX16_QHSDI:mode>): Fix name.
	(@pred_indexed_<order>store<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/rvv.exp: Add gather/scatter tests.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c: New test.
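For illustration only (these functions are hypothetical sketches modeled on the shape of the new tests, not copied from them): with RVV auto-vectorization enabled (e.g. -march=rv64gcv -O3), loops like the following are recognized as LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE and lowered to RVV indexed loads/stores:

```c
#include <stddef.h>

/* Gather: load src at positions given by idx.  The vectorizer can turn
   this loop into a LEN_MASK_GATHER_LOAD, lowered to an RVV indexed load.  */
void
gather_load (int *restrict dst, const int *restrict src,
             const int *restrict idx, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = src[idx[i]];
}

/* Scatter: store src at positions given by idx; the mirror image,
   lowered to an RVV indexed store.  */
void
scatter_store (int *restrict dst, const int *restrict src,
               const int *restrict idx, int n)
{
  for (int i = 0; i < n; i++)
    dst[idx[i]] = src[i];
}
```

The scalar semantics are identical whether or not the loops vectorize; the patch only changes the generated code.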
parent 15939bae35
commit f048af2aa3
102 changed files with 4987 additions and 62 deletions
@@ -57,6 +57,262 @@
}
)

;; =========================================================================
;; == Gather Load
;; =========================================================================

(define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
  [(match_operand:VNX1_QHSD 0 "register_operand")
   (match_operand 1 "pmode_reg_or_0_operand")
   (match_operand:VNX1_QHSDI 2 "register_operand")
   (match_operand 3 "<VNX1_QHSD:gs_extension>")
   (match_operand 4 "<VNX1_QHSD:gs_scale>")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VNX1_QHSD:VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, true);
  DONE;
})

(define_expand "len_mask_gather_load<VNX2_QHSD:mode><VNX2_QHSDI:mode>"
  [(match_operand:VNX2_QHSD 0 "register_operand")
   (match_operand 1 "pmode_reg_or_0_operand")
   (match_operand:VNX2_QHSDI 2 "register_operand")
   (match_operand 3 "<VNX2_QHSD:gs_extension>")
   (match_operand 4 "<VNX2_QHSD:gs_scale>")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VNX2_QHSD:VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, true);
  DONE;
})

(define_expand "len_mask_gather_load<VNX4_QHSD:mode><VNX4_QHSDI:mode>"
  [(match_operand:VNX4_QHSD 0 "register_operand")
   (match_operand 1 "pmode_reg_or_0_operand")
   (match_operand:VNX4_QHSDI 2 "register_operand")
   (match_operand 3 "<VNX4_QHSD:gs_extension>")
   (match_operand 4 "<VNX4_QHSD:gs_scale>")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VNX4_QHSD:VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, true);
  DONE;
})

(define_expand "len_mask_gather_load<VNX8_QHSD:mode><VNX8_QHSDI:mode>"
  [(match_operand:VNX8_QHSD 0 "register_operand")
   (match_operand 1 "pmode_reg_or_0_operand")
   (match_operand:VNX8_QHSDI 2 "register_operand")
   (match_operand 3 "<VNX8_QHSD:gs_extension>")
   (match_operand 4 "<VNX8_QHSD:gs_scale>")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VNX8_QHSD:VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, true);
  DONE;
})

(define_expand "len_mask_gather_load<VNX16_QHSD:mode><VNX16_QHSDI:mode>"
  [(match_operand:VNX16_QHSD 0 "register_operand")
   (match_operand 1 "pmode_reg_or_0_operand")
   (match_operand:VNX16_QHSDI 2 "register_operand")
   (match_operand 3 "<VNX16_QHSD:gs_extension>")
   (match_operand 4 "<VNX16_QHSD:gs_scale>")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VNX16_QHSD:VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, true);
  DONE;
})

(define_expand "len_mask_gather_load<VNX32_QHS:mode><VNX32_QHSI:mode>"
  [(match_operand:VNX32_QHS 0 "register_operand")
   (match_operand 1 "pmode_reg_or_0_operand")
   (match_operand:VNX32_QHSI 2 "register_operand")
   (match_operand 3 "<VNX32_QHS:gs_extension>")
   (match_operand 4 "<VNX32_QHS:gs_scale>")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VNX32_QHS:VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, true);
  DONE;
})

(define_expand "len_mask_gather_load<VNX64_QH:mode><VNX64_QHI:mode>"
  [(match_operand:VNX64_QH 0 "register_operand")
   (match_operand 1 "pmode_reg_or_0_operand")
   (match_operand:VNX64_QHI 2 "register_operand")
   (match_operand 3 "<VNX64_QH:gs_extension>")
   (match_operand 4 "<VNX64_QH:gs_scale>")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VNX64_QH:VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, true);
  DONE;
})

;; When SEW = 8 and LMUL = 8, we can't find any index mode with
;; larger SEW.  Since RVV indexed load/store supports zero extension
;; implicitly and does not support scaling, we should only allow
;; operands[3] and operands[4] to be const_1_operand.
(define_expand "len_mask_gather_load<mode><mode>"
  [(match_operand:VNX128_Q 0 "register_operand")
   (match_operand 1 "pmode_reg_or_0_operand")
   (match_operand:VNX128_Q 2 "register_operand")
   (match_operand 3 "const_1_operand")
   (match_operand 4 "const_1_operand")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, true);
  DONE;
})

;; =========================================================================
;; == Scatter Store
;; =========================================================================

(define_expand "len_mask_scatter_store<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
  [(match_operand 0 "pmode_reg_or_0_operand")
   (match_operand:VNX1_QHSDI 1 "register_operand")
   (match_operand 2 "<VNX1_QHSD:gs_extension>")
   (match_operand 3 "<VNX1_QHSD:gs_scale>")
   (match_operand:VNX1_QHSD 4 "register_operand")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VNX1_QHSD:VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, false);
  DONE;
})

(define_expand "len_mask_scatter_store<VNX2_QHSD:mode><VNX2_QHSDI:mode>"
  [(match_operand 0 "pmode_reg_or_0_operand")
   (match_operand:VNX2_QHSDI 1 "register_operand")
   (match_operand 2 "<VNX2_QHSD:gs_extension>")
   (match_operand 3 "<VNX2_QHSD:gs_scale>")
   (match_operand:VNX2_QHSD 4 "register_operand")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VNX2_QHSD:VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, false);
  DONE;
})

(define_expand "len_mask_scatter_store<VNX4_QHSD:mode><VNX4_QHSDI:mode>"
  [(match_operand 0 "pmode_reg_or_0_operand")
   (match_operand:VNX4_QHSDI 1 "register_operand")
   (match_operand 2 "<VNX4_QHSD:gs_extension>")
   (match_operand 3 "<VNX4_QHSD:gs_scale>")
   (match_operand:VNX4_QHSD 4 "register_operand")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VNX4_QHSD:VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, false);
  DONE;
})

(define_expand "len_mask_scatter_store<VNX8_QHSD:mode><VNX8_QHSDI:mode>"
  [(match_operand 0 "pmode_reg_or_0_operand")
   (match_operand:VNX8_QHSDI 1 "register_operand")
   (match_operand 2 "<VNX8_QHSD:gs_extension>")
   (match_operand 3 "<VNX8_QHSD:gs_scale>")
   (match_operand:VNX8_QHSD 4 "register_operand")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VNX8_QHSD:VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, false);
  DONE;
})

(define_expand "len_mask_scatter_store<VNX16_QHSD:mode><VNX16_QHSDI:mode>"
  [(match_operand 0 "pmode_reg_or_0_operand")
   (match_operand:VNX16_QHSDI 1 "register_operand")
   (match_operand 2 "<VNX16_QHSD:gs_extension>")
   (match_operand 3 "<VNX16_QHSD:gs_scale>")
   (match_operand:VNX16_QHSD 4 "register_operand")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VNX16_QHSD:VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, false);
  DONE;
})

(define_expand "len_mask_scatter_store<VNX32_QHS:mode><VNX32_QHSI:mode>"
  [(match_operand 0 "pmode_reg_or_0_operand")
   (match_operand:VNX32_QHSI 1 "register_operand")
   (match_operand 2 "<VNX32_QHS:gs_extension>")
   (match_operand 3 "<VNX32_QHS:gs_scale>")
   (match_operand:VNX32_QHS 4 "register_operand")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VNX32_QHS:VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, false);
  DONE;
})

(define_expand "len_mask_scatter_store<VNX64_QH:mode><VNX64_QHI:mode>"
  [(match_operand 0 "pmode_reg_or_0_operand")
   (match_operand:VNX64_QHI 1 "register_operand")
   (match_operand 2 "<VNX64_QH:gs_extension>")
   (match_operand 3 "<VNX64_QH:gs_scale>")
   (match_operand:VNX64_QH 4 "register_operand")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VNX64_QH:VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, false);
  DONE;
})

;; When SEW = 8 and LMUL = 8, we can't find any index mode with
;; larger SEW.  Since RVV indexed load/store supports zero extension
;; implicitly and does not support scaling, we should only allow
;; operands[2] and operands[3] to be const_1_operand.
(define_expand "len_mask_scatter_store<mode><mode>"
  [(match_operand 0 "pmode_reg_or_0_operand")
   (match_operand:VNX128_Q 1 "register_operand")
   (match_operand 2 "const_1_operand")
   (match_operand 3 "const_1_operand")
   (match_operand:VNX128_Q 4 "register_operand")
   (match_operand 5 "autovec_length_operand")
   (match_operand 6 "const_0_operand")
   (match_operand:<VM> 7 "vector_mask_operand")]
  "TARGET_VECTOR"
{
  riscv_vector::expand_gather_scatter (operands, false);
  DONE;
})

;; =========================================================================
;; == Vector creation
;; =========================================================================
@@ -61,6 +61,10 @@
  (and (match_code "const_int,const_wide_int,const_vector")
       (match_test "op == CONST0_RTX (GET_MODE (op))")))

(define_predicate "const_1_operand"
  (and (match_code "const_int,const_wide_int,const_vector")
       (match_test "op == CONST1_RTX (GET_MODE (op))")))

(define_predicate "reg_or_0_operand"
  (ior (match_operand 0 "const_0_operand")
       (match_operand 0 "register_operand")))

@@ -341,6 +345,33 @@
  (ior (match_operand 0 "register_operand")
       (match_code "const_vector")))

(define_predicate "vector_gs_scale_operand_16"
  (and (match_code "const_int")
       (match_test "INTVAL (op) == 1 || INTVAL (op) == 2")))

(define_predicate "vector_gs_scale_operand_32"
  (and (match_code "const_int")
       (match_test "INTVAL (op) == 1 || INTVAL (op) == 4")))

(define_predicate "vector_gs_scale_operand_64"
  (and (match_code "const_int")
       (match_test "INTVAL (op) == 1 || (INTVAL (op) == 8 && Pmode == DImode)")))

(define_predicate "vector_gs_extension_operand"
  (ior (match_operand 0 "const_1_operand")
       (and (match_operand 0 "const_0_operand")
            (match_test "Pmode == SImode"))))

(define_predicate "vector_gs_scale_operand_16_rv32"
  (and (match_code "const_int")
       (match_test "INTVAL (op) == 1
                    || (INTVAL (op) == 2 && Pmode == SImode)")))

(define_predicate "vector_gs_scale_operand_32_rv32"
  (and (match_code "const_int")
       (match_test "INTVAL (op) == 1
                    || (INTVAL (op) == 4 && Pmode == SImode)")))

(define_predicate "ltge_operator"
  (match_code "lt,ltu,ge,geu"))

@@ -376,7 +407,7 @@
      || rtx_equal_p (op, CONST0_RTX (GET_MODE (op))))
     && maybe_gt (GET_MODE_BITSIZE (GET_MODE (op)), GET_MODE_BITSIZE (Pmode)))")
  (ior (match_test "rtx_equal_p (op, CONST0_RTX (GET_MODE (op)))")
       (ior (match_operand 0 "const_int_operand")
            (ior (match_code "const_int,const_poly_int")
                 (ior (match_operand 0 "register_operand")
                      (match_test "satisfies_constraint_Wdm (op)"))))))
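A hypothetical scalar sketch (not the patch's code) of what the vector_gs_scale_operand_64 predicate above accepts: a scale of 1 is always legal, while a scale of 8 (the 64-bit element size) is legal only when Pmode is DImode, i.e. on RV64:

```c
#include <stdbool.h>

/* Mirror of the predicate's condition:
   INTVAL (op) == 1 || (INTVAL (op) == 8 && Pmode == DImode).  */
static bool
gs_scale_64_ok (long scale, bool pmode_is_dimode)
{
  return scale == 1 || (scale == 8 && pmode_is_dimode);
}
```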
@@ -195,6 +195,8 @@ enum insn_type
  RVV_SCALAR_MOV_OP = 4, /* +1 for VUNDEF according to vector.md.  */
  RVV_SLIDE_OP = 4,      /* Dest, VUNDEF, source and offset.  */
  RVV_COMPRESS_OP = 4,
  RVV_GATHER_M_OP = 5,
  RVV_SCATTER_M_OP = 4,
};
enum vlmul_type
{

@@ -303,6 +305,7 @@ void expand_vec_init (rtx, rtx);
void expand_vec_perm (rtx, rtx, rtx, rtx);
void expand_select_vl (rtx *);
void expand_load_store (rtx *, bool);
void expand_gather_scatter (rtx *, bool);

/* Rounding mode bitfield for fixed point VXRM.  */
enum fixed_point_rounding_mode
@@ -556,16 +556,23 @@ const_vec_all_in_range_p (rtx vec, poly_int64 minval, poly_int64 maxval)
  return true;
}

/* Return a const_int vector of VAL.

   This function also exists in aarch64, we may unify it in middle-end in the
   future.  */
/* Return a const vector of VAL.  The VAL can be either const_int or
   const_poly_int.  */

static rtx
gen_const_vector_dup (machine_mode mode, poly_int64 val)
{
  rtx c = gen_int_mode (val, GET_MODE_INNER (mode));
  return gen_const_vec_duplicate (mode, c);
  scalar_mode smode = GET_MODE_INNER (mode);
  rtx c = gen_int_mode (val, smode);
  if (!val.is_constant () && GET_MODE_SIZE (smode) > GET_MODE_SIZE (Pmode))
    {
      /* When VAL is a const_poly_int value, we need to explicitly broadcast
	 it into a vector using an RVV broadcast instruction.  */
      rtx dup = gen_reg_rtx (mode);
      emit_insn (gen_vec_duplicate (mode, dup, c));
      return dup;
    }
  return gen_const_vec_duplicate (mode, c);
}

/* Emit a vlmax vsetvl instruction.  This should only be used when
@@ -901,6 +908,39 @@ emit_nonvlmax_masked_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
  e.emit_insn ((enum insn_code) icode, ops);
}

/* This function emits a VLMAX masked store instruction.  */
static void
emit_vlmax_masked_store_insn (unsigned icode, int op_num, rtx *ops)
{
  machine_mode dest_mode = GET_MODE (ops[0]);
  machine_mode mask_mode = get_mask_mode (dest_mode).require ();
  insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
					  /*HAS_DEST_P*/ false,
					  /*FULLY_UNMASKED_P*/ false,
					  /*USE_REAL_MERGE_P*/ true,
					  /*HAS_AVL_P*/ true,
					  /*VLMAX_P*/ true, dest_mode,
					  mask_mode);
  e.emit_insn ((enum insn_code) icode, ops);
}

/* This function emits a non-VLMAX masked store instruction.  */
static void
emit_nonvlmax_masked_store_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
{
  machine_mode dest_mode = GET_MODE (ops[0]);
  machine_mode mask_mode = get_mask_mode (dest_mode).require ();
  insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
					  /*HAS_DEST_P*/ false,
					  /*FULLY_UNMASKED_P*/ false,
					  /*USE_REAL_MERGE_P*/ true,
					  /*HAS_AVL_P*/ true,
					  /*VLMAX_P*/ false, dest_mode,
					  mask_mode);
  e.set_vl (avl);
  e.emit_insn ((enum insn_code) icode, ops);
}

/* This function emits a masked instruction.  */
void
emit_vlmax_masked_mu_insn (unsigned icode, int op_num, rtx *ops)
@@ -1194,7 +1234,6 @@ static void
expand_const_vector (rtx target, rtx src)
{
  machine_mode mode = GET_MODE (target);
  scalar_mode elt_mode = GET_MODE_INNER (mode);
  if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
    {
      rtx elt;

@@ -1219,7 +1258,6 @@ expand_const_vector (rtx target, rtx src)
    }
  else
    {
      elt = force_reg (elt_mode, elt);
      rtx ops[] = {tmp, elt};
      emit_vlmax_insn (code_for_pred_broadcast (mode), RVV_UNOP, ops);
    }
@@ -2488,6 +2526,25 @@ expand_vec_cmp_float (rtx target, rtx_code code, rtx op0, rtx op1,
  return false;
}

/* Modulo all SEL indices to ensure they are all in range of [0, MAX_SEL].  */
static rtx
modulo_sel_indices (rtx sel, poly_uint64 max_sel)
{
  rtx sel_mod;
  machine_mode sel_mode = GET_MODE (sel);
  poly_uint64 nunits = GET_MODE_NUNITS (sel_mode);
  /* If SEL is a variable-length CONST_VECTOR, we don't need to modulo it.  */
  if (!nunits.is_constant () && CONST_VECTOR_P (sel))
    sel_mod = sel;
  else
    {
      rtx mod = gen_const_vector_dup (sel_mode, max_sel);
      sel_mod
	= expand_simple_binop (sel_mode, AND, sel, mod, NULL, 0, OPTAB_DIRECT);
    }
  return sel_mod;
}

/* Implement vec_perm<mode>.  */

void
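The wrap semantics that modulo_sel_indices enforces can be sketched in scalar terms (a hypothetical illustration, not the patch's code; it assumes the element count is a power of two so the AND implements the modulo, which holds for RVV vector modes):

```c
/* vec_perm indices wrap modulo the number of selectable elements:
   nunits when both inputs are the same vector, 2 * nunits when they
   differ.  For power-of-two nunits, AND with (max - 1) is that modulo,
   matching the vector-wide AND modulo_sel_indices emits.  */
static unsigned
wrap_sel_index (unsigned sel, unsigned nunits, int two_inputs)
{
  unsigned max = two_inputs ? 2 * nunits : nunits;
  return sel & (max - 1);
}
```

Without this wrap, RVV vrgather would return 0 for any out-of-range index instead of selecting a wrapped element.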
@@ -2501,41 +2558,44 @@ expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
     index is in range of [0, nunits - 1].  A single vrgather instruction is
     enough.  Since we will use vrgatherei16.vv for variable-length vectors,
     it is never out of range and we don't need to modulo the index.  */
  if (!nunits.is_constant () || const_vec_all_in_range_p (sel, 0, nunits - 1))
  if (nunits.is_constant () && const_vec_all_in_range_p (sel, 0, nunits - 1))
    {
      emit_vlmax_gather_insn (target, op0, sel);
      return;
    }

  /* Check if the two value vectors are the same.  */
  if (rtx_equal_p (op0, op1) || const_vec_duplicate_p (sel))
  /* Check if all the indices are the same.  */
  rtx elt;
  if (const_vec_duplicate_p (sel, &elt))
    {
      /* Note: vec_perm indices are supposed to wrap when they go beyond the
	 size of the two value vectors, i.e. the upper bits of the indices
	 are effectively ignored.  RVV vrgather instead produces 0 for any
	 out-of-range indices, so we need to modulo all the vec_perm indices
	 to ensure they are all in range of [0, nunits - 1].  */
      rtx max_sel = gen_const_vector_dup (sel_mode, nunits - 1);
      rtx sel_mod = expand_simple_binop (sel_mode, AND, sel, max_sel, NULL, 0,
					 OPTAB_DIRECT);
      emit_vlmax_gather_insn (target, op1, sel_mod);
      poly_uint64 value = rtx_to_poly_int64 (elt);
      rtx op = op0;
      if (maybe_gt (value, nunits - 1))
	{
	  sel = gen_const_vector_dup (sel_mode, value - nunits);
	  op = op1;
	}
      emit_vlmax_gather_insn (target, op, sel);
    }

  /* Note: vec_perm indices are supposed to wrap when they go beyond the
     size of the two value vectors, i.e. the upper bits of the indices
     are effectively ignored.  RVV vrgather instead produces 0 for any
     out-of-range indices, so we need to modulo all the vec_perm indices
     to ensure they are all in range of [0, nunits - 1] when op0 == op1
     or all in range of [0, 2 * nunits - 1] when op0 != op1.  */
  rtx sel_mod
    = modulo_sel_indices (sel,
			  rtx_equal_p (op0, op1) ? nunits - 1 : 2 * nunits - 1);

  /* Check if the two value vectors are the same.  */
  if (rtx_equal_p (op0, op1))
    {
      emit_vlmax_gather_insn (target, op0, sel_mod);
      return;
    }

  rtx sel_mod = sel;
  rtx max_sel = gen_const_vector_dup (sel_mode, 2 * nunits - 1);
  /* We don't need to modulo indices for a VLA vector,
     since we should guarantee they aren't out of range beforehand.  */
  if (nunits.is_constant ())
    {
      /* Note: vec_perm indices are supposed to wrap when they go beyond the
	 size of the two value vectors, i.e. the upper bits of the indices
	 are effectively ignored.  RVV vrgather instead produces 0 for any
	 out-of-range indices, so we need to modulo all the vec_perm indices
	 to ensure they are all in range of [0, 2 * nunits - 1].  */
      sel_mod = expand_simple_binop (sel_mode, AND, sel, max_sel, NULL, 0,
				     OPTAB_DIRECT);
    }

  /* The following sequence handles the case that:
     __builtin_shufflevector (vec1, vec2, index...), the index can be any
@@ -3007,6 +3067,7 @@ expand_load_store (rtx *ops, bool is_load)
    }
}


/* Return true if the operation is a floating-point operation that needs FRM.  */
static bool
needs_fp_rounding (rtx_code code, machine_mode mode)

@@ -3047,4 +3108,163 @@ expand_cond_len_binop (rtx_code code, rtx *ops)
  gcc_unreachable ();
}

/* Prepare insn_code for gather_load/scatter_store according to
   the vector mode and index mode.  */
static insn_code
prepare_gather_scatter (machine_mode vec_mode, machine_mode idx_mode,
                        bool is_load)
{
  if (!is_load)
    return code_for_pred_indexed_store (UNSPEC_UNORDERED, vec_mode, idx_mode);
  else
    {
      unsigned src_eew_bitsize = GET_MODE_BITSIZE (GET_MODE_INNER (idx_mode));
      unsigned dst_eew_bitsize = GET_MODE_BITSIZE (GET_MODE_INNER (vec_mode));
      if (dst_eew_bitsize == src_eew_bitsize)
        return code_for_pred_indexed_load_same_eew (UNSPEC_UNORDERED, vec_mode);
      else if (dst_eew_bitsize > src_eew_bitsize)
        {
          unsigned factor = dst_eew_bitsize / src_eew_bitsize;
          switch (factor)
            {
            case 2:
              return code_for_pred_indexed_load_x2_greater_eew (
                UNSPEC_UNORDERED, vec_mode);
            case 4:
              return code_for_pred_indexed_load_x4_greater_eew (
                UNSPEC_UNORDERED, vec_mode);
            case 8:
              return code_for_pred_indexed_load_x8_greater_eew (
                UNSPEC_UNORDERED, vec_mode);
            default:
              gcc_unreachable ();
            }
        }
      else
        {
          unsigned factor = src_eew_bitsize / dst_eew_bitsize;
          switch (factor)
            {
            case 2:
              return code_for_pred_indexed_load_x2_smaller_eew (
                UNSPEC_UNORDERED, vec_mode);
            case 4:
              return code_for_pred_indexed_load_x4_smaller_eew (
                UNSPEC_UNORDERED, vec_mode);
            case 8:
              return code_for_pred_indexed_load_x8_smaller_eew (
                UNSPEC_UNORDERED, vec_mode);
            default:
              gcc_unreachable ();
            }
        }
    }
}

/* Expand LEN_MASK_{GATHER_LOAD,SCATTER_STORE}.  */
void
expand_gather_scatter (rtx *ops, bool is_load)
{
  rtx ptr, vec_offset, vec_reg, len, mask;
  bool zero_extend_p;
  int scale_log2;
  if (is_load)
    {
      vec_reg = ops[0];
      ptr = ops[1];
      vec_offset = ops[2];
      zero_extend_p = INTVAL (ops[3]);
      scale_log2 = exact_log2 (INTVAL (ops[4]));
      len = ops[5];
      mask = ops[7];
    }
  else
    {
      vec_reg = ops[4];
      ptr = ops[0];
      vec_offset = ops[1];
      zero_extend_p = INTVAL (ops[2]);
      scale_log2 = exact_log2 (INTVAL (ops[3]));
      len = ops[5];
      mask = ops[7];
    }

  machine_mode vec_mode = GET_MODE (vec_reg);
  machine_mode idx_mode = GET_MODE (vec_offset);
  scalar_mode inner_vec_mode = GET_MODE_INNER (vec_mode);
  scalar_mode inner_idx_mode = GET_MODE_INNER (idx_mode);
  unsigned inner_vsize = GET_MODE_BITSIZE (inner_vec_mode);
  unsigned inner_offsize = GET_MODE_BITSIZE (inner_idx_mode);
  poly_int64 nunits = GET_MODE_NUNITS (vec_mode);
  poly_int64 value;
  bool is_vlmax = poly_int_rtx_p (len, &value) && known_eq (value, nunits);

  if (inner_offsize < inner_vsize)
    {
      /* 7.2. Vector Load/Store Addressing Modes.
         If the vector offset elements are narrower than XLEN, they are
         zero-extended to XLEN before adding to the ptr effective address.  If
         the vector offset elements are wider than XLEN, the least-significant
         XLEN bits are used in the address calculation.  An implementation must
         raise an illegal instruction exception if the EEW is not supported for
         offset elements.

         The RVV spec only refers to the scale_log2 == 0 case.  */
      if (!zero_extend_p || (zero_extend_p && scale_log2 != 0))
        {
          if (zero_extend_p)
            inner_idx_mode
              = int_mode_for_size (inner_offsize * 2, 0).require ();
          else
            inner_idx_mode = int_mode_for_size (BITS_PER_WORD, 0).require ();
          machine_mode new_idx_mode
            = get_vector_mode (inner_idx_mode, nunits).require ();
          rtx tmp = gen_reg_rtx (new_idx_mode);
          emit_insn (gen_extend_insn (tmp, vec_offset, new_idx_mode, idx_mode,
                                      zero_extend_p ? true : false));
          vec_offset = tmp;
          idx_mode = new_idx_mode;
        }
    }

  if (scale_log2 != 0)
    {
      rtx tmp = expand_binop (idx_mode, ashl_optab, vec_offset,
                              gen_int_mode (scale_log2, Pmode), NULL_RTX, 0,
                              OPTAB_DIRECT);
      vec_offset = tmp;
    }

  insn_code icode = prepare_gather_scatter (vec_mode, idx_mode, is_load);
  if (is_vlmax)
    {
      if (is_load)
        {
          rtx load_ops[]
            = {vec_reg, mask, RVV_VUNDEF (vec_mode), ptr, vec_offset};
          emit_vlmax_masked_insn (icode, RVV_GATHER_M_OP, load_ops);
        }
      else
        {
          rtx store_ops[] = {mask, ptr, vec_offset, vec_reg};
          emit_vlmax_masked_store_insn (icode, RVV_SCATTER_M_OP, store_ops);
        }
    }
  else
    {
      if (is_load)
        {
          rtx load_ops[]
            = {vec_reg, mask, RVV_VUNDEF (vec_mode), ptr, vec_offset};
          emit_nonvlmax_masked_insn (icode, RVV_GATHER_M_OP, load_ops, len);
        }
      else
        {
          rtx store_ops[] = {mask, ptr, vec_offset, vec_reg};
          emit_nonvlmax_masked_store_insn (icode, RVV_SCATTER_M_OP, store_ops,
                                           len);
        }
    }
}

} // namespace riscv_vector
@@ -2037,7 +2037,14 @@ riscv_legitimize_poly_move (machine_mode mode, rtx dest, rtx tmp, rtx src)
     (m, n) = base * magn + constant.
     This calculation doesn't need div operation.  */

  emit_move_insn (tmp, gen_int_mode (BYTES_PER_RISCV_VECTOR, mode));
  if (known_le (GET_MODE_SIZE (mode), GET_MODE_SIZE (Pmode)))
    emit_move_insn (tmp, gen_int_mode (BYTES_PER_RISCV_VECTOR, mode));
  else
    {
      emit_move_insn (gen_highpart (Pmode, tmp), CONST0_RTX (Pmode));
      emit_move_insn (gen_lowpart (Pmode, tmp),
                      gen_int_mode (BYTES_PER_RISCV_VECTOR, Pmode));
    }

  if (BYTES_PER_RISCV_VECTOR.is_constant ())
    {

@@ -2144,7 +2151,7 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx src)
      return false;
    }

  if (satisfies_constraint_vp (src))
  if (satisfies_constraint_vp (src) && GET_MODE (src) == Pmode)
    return false;

  if (GET_MODE_SIZE (mode).to_constant () < GET_MODE_SIZE (Pmode))
@@ -115,6 +115,9 @@

(define_mode_iterator VEEWEXT2 [
  (VNx1HI "TARGET_MIN_VLEN < 128") VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN > 32") (VNx64HI "TARGET_MIN_VLEN >= 128")
  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_VECTOR_ELEN_FP_16") (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
  (VNx8HF "TARGET_VECTOR_ELEN_FP_16") (VNx16HF "TARGET_VECTOR_ELEN_FP_16") (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
  (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
  (VNx1SI "TARGET_MIN_VLEN < 128") VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128")
  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128") (VNx2DI "TARGET_VECTOR_ELEN_64")
  (VNx4DI "TARGET_VECTOR_ELEN_64") (VNx8DI "TARGET_VECTOR_ELEN_64") (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")

@@ -161,6 +164,8 @@
(define_mode_iterator VEEWTRUNC2 [
  (VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI (VNx64QI "TARGET_MIN_VLEN >= 128")
  (VNx1HI "TARGET_MIN_VLEN < 128") VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN >= 128")
  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_VECTOR_ELEN_FP_16") (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
  (VNx8HF "TARGET_VECTOR_ELEN_FP_16") (VNx16HF "TARGET_VECTOR_ELEN_FP_16") (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
  (VNx1SI "TARGET_MIN_VLEN < 128") VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN >= 128")
  (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
  (VNx2SF "TARGET_VECTOR_ELEN_FP_32")

@@ -172,6 +177,8 @@
(define_mode_iterator VEEWTRUNC4 [
  (VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI VNx4QI VNx8QI VNx16QI (VNx32QI "TARGET_MIN_VLEN >= 128")
  (VNx1HI "TARGET_MIN_VLEN < 128") VNx2HI VNx4HI VNx8HI (VNx16HI "TARGET_MIN_VLEN >= 128")
  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_VECTOR_ELEN_FP_16") (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
  (VNx8HF "TARGET_VECTOR_ELEN_FP_16") (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
])

(define_mode_iterator VEEWTRUNC8 [
@@ -362,46 +369,67 @@
])

(define_mode_iterator VNX1_QHSD [
  (VNx1QI "TARGET_MIN_VLEN < 128") (VNx1HI "TARGET_MIN_VLEN < 128") (VNx1SI "TARGET_MIN_VLEN < 128")
  (VNx1QI "TARGET_MIN_VLEN < 128")
  (VNx1HI "TARGET_MIN_VLEN < 128")
  (VNx1SI "TARGET_MIN_VLEN < 128")
  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
  (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
  (VNx1DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN < 128")
])

(define_mode_iterator VNX2_QHSD [
  VNx2QI VNx2HI VNx2SI
  VNx2QI
  VNx2HI
  VNx2SI
  (VNx2DI "TARGET_VECTOR_ELEN_64")
  (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
  (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
  (VNx2DF "TARGET_VECTOR_ELEN_FP_64")
])

(define_mode_iterator VNX4_QHSD [
  VNx4QI VNx4HI VNx4SI
  VNx4QI
  VNx4HI
  VNx4SI
  (VNx4DI "TARGET_VECTOR_ELEN_64")
  (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
  (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
  (VNx4DF "TARGET_VECTOR_ELEN_FP_64")
])

(define_mode_iterator VNX8_QHSD [
  VNx8QI VNx8HI VNx8SI
  VNx8QI
  VNx8HI
  VNx8SI
  (VNx8DI "TARGET_VECTOR_ELEN_64")
  (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
  (VNx8SF "TARGET_VECTOR_ELEN_FP_32")
  (VNx8DF "TARGET_VECTOR_ELEN_FP_64")
])

(define_mode_iterator VNX16_QHS [
  VNx16QI VNx16HI (VNx16SI "TARGET_MIN_VLEN > 32")
(define_mode_iterator VNX16_QHSD [
  VNx16QI
  VNx16HI
  (VNx16SI "TARGET_MIN_VLEN > 32")
  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
  (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
  (VNx16SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128") (VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
  (VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
])

(define_mode_iterator VNX32_QHS [
  VNx32QI (VNx32HI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128") (VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
  VNx32QI
  (VNx32HI "TARGET_MIN_VLEN > 32")
  (VNx32SI "TARGET_MIN_VLEN >= 128")
  (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
  (VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
])

(define_mode_iterator VNX64_QH [
  (VNx64QI "TARGET_MIN_VLEN > 32")
  (VNx64HI "TARGET_MIN_VLEN >= 128")
  (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
])

(define_mode_iterator VNX128_Q [
@@ -409,35 +437,49 @@
])

(define_mode_iterator VNX1_QHSDI [
  (VNx1QI "TARGET_MIN_VLEN < 128") (VNx1HI "TARGET_MIN_VLEN < 128") (VNx1SI "TARGET_MIN_VLEN < 128")
  (VNx1DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
  (VNx1QI "TARGET_MIN_VLEN < 128")
  (VNx1HI "TARGET_MIN_VLEN < 128")
  (VNx1SI "TARGET_MIN_VLEN < 128")
  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128 && TARGET_64BIT")
])

(define_mode_iterator VNX2_QHSDI [
  VNx2QI VNx2HI VNx2SI
  (VNx2DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64")
  VNx2QI
  VNx2HI
  VNx2SI
  (VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
])

(define_mode_iterator VNX4_QHSDI [
  VNx4QI VNx4HI VNx4SI
  (VNx4DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64")
  VNx4QI
  VNx4HI
  VNx4SI
  (VNx4DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
])

(define_mode_iterator VNX8_QHSDI [
  VNx8QI VNx8HI VNx8SI
  (VNx8DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64")
  VNx8QI
  VNx8HI
  VNx8SI
  (VNx8DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
])

(define_mode_iterator VNX16_QHSDI [
  VNx16QI VNx16HI (VNx16SI "TARGET_MIN_VLEN > 32") (VNx16DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
  VNx16QI
  VNx16HI
  (VNx16SI "TARGET_MIN_VLEN > 32")
  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128 && TARGET_64BIT")
])

(define_mode_iterator VNX32_QHSI [
  VNx32QI (VNx32HI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128")
  VNx32QI
  (VNx32HI "TARGET_MIN_VLEN > 32")
  (VNx32SI "TARGET_MIN_VLEN >= 128")
])

(define_mode_iterator VNX64_QHI [
  VNx64QI (VNx64HI "TARGET_MIN_VLEN >= 128")
  (VNx64QI "TARGET_MIN_VLEN > 32")
  (VNx64HI "TARGET_MIN_VLEN >= 128")
])

(define_mode_iterator V_WHOLE [
@@ -1393,6 +1435,8 @@
(define_mode_attr VINDEX_DOUBLE_TRUNC [
  (VNx1HI "VNx1QI") (VNx2HI "VNx2QI") (VNx4HI "VNx4QI") (VNx8HI "VNx8QI")
  (VNx16HI "VNx16QI") (VNx32HI "VNx32QI") (VNx64HI "VNx64QI")
  (VNx1HF "VNx1QI") (VNx2HF "VNx2QI") (VNx4HF "VNx4QI") (VNx8HF "VNx8QI")
  (VNx16HF "VNx16QI") (VNx32HF "VNx32QI") (VNx64HF "VNx64QI")
  (VNx1SI "VNx1HI") (VNx2SI "VNx2HI") (VNx4SI "VNx4HI") (VNx8SI "VNx8HI")
  (VNx16SI "VNx16HI") (VNx32SI "VNx32HI")
  (VNx1SF "VNx1HI") (VNx2SF "VNx2HI") (VNx4SF "VNx4HI") (VNx8SF "VNx8HI")

@@ -1420,6 +1464,7 @@
(define_mode_attr VINDEX_DOUBLE_EXT [
  (VNx1QI "VNx1HI") (VNx2QI "VNx2HI") (VNx4QI "VNx4HI") (VNx8QI "VNx8HI") (VNx16QI "VNx16HI") (VNx32QI "VNx32HI") (VNx64QI "VNx64HI")
  (VNx1HI "VNx1SI") (VNx2HI "VNx2SI") (VNx4HI "VNx4SI") (VNx8HI "VNx8SI") (VNx16HI "VNx16SI") (VNx32HI "VNx32SI")
  (VNx1HF "VNx1SI") (VNx2HF "VNx2SI") (VNx4HF "VNx4SI") (VNx8HF "VNx8SI") (VNx16HF "VNx16SI") (VNx32HF "VNx32SI")
  (VNx1SI "VNx1DI") (VNx2SI "VNx2DI") (VNx4SI "VNx4DI") (VNx8SI "VNx8DI") (VNx16SI "VNx16DI")
  (VNx1SF "VNx1DI") (VNx2SF "VNx2DI") (VNx4SF "VNx4DI") (VNx8SF "VNx8DI") (VNx16SF "VNx16DI")
])

@@ -1427,6 +1472,7 @@
(define_mode_attr VINDEX_QUAD_EXT [
  (VNx1QI "VNx1SI") (VNx2QI "VNx2SI") (VNx4QI "VNx4SI") (VNx8QI "VNx8SI") (VNx16QI "VNx16SI") (VNx32QI "VNx32SI")
  (VNx1HI "VNx1DI") (VNx2HI "VNx2DI") (VNx4HI "VNx4DI") (VNx8HI "VNx8DI") (VNx16HI "VNx16DI")
  (VNx1HF "VNx1DI") (VNx2HF "VNx2DI") (VNx4HF "VNx4DI") (VNx8HF "VNx8DI") (VNx16HF "VNx16DI")
])

(define_mode_attr VINDEX_OCT_EXT [

@@ -1471,6 +1517,40 @@
  (VNx4DI "VNx8BI") (VNx8DI "VNx16BI") (VNx16DI "VNx32BI")
])

(define_mode_attr gs_extension [
  (VNx1QI "immediate_operand") (VNx2QI "immediate_operand") (VNx4QI "immediate_operand") (VNx8QI "immediate_operand") (VNx16QI "immediate_operand")
  (VNx32QI "vector_gs_extension_operand") (VNx64QI "const_1_operand")
  (VNx1HI "immediate_operand") (VNx2HI "immediate_operand") (VNx4HI "immediate_operand") (VNx8HI "immediate_operand") (VNx16HI "immediate_operand")
  (VNx32HI "vector_gs_extension_operand") (VNx64HI "const_1_operand")
  (VNx1SI "immediate_operand") (VNx2SI "immediate_operand") (VNx4SI "immediate_operand") (VNx8SI "immediate_operand") (VNx16SI "immediate_operand")
  (VNx32SI "vector_gs_extension_operand")
  (VNx1DI "immediate_operand") (VNx2DI "immediate_operand") (VNx4DI "immediate_operand") (VNx8DI "immediate_operand") (VNx16DI "immediate_operand")

  (VNx1HF "immediate_operand") (VNx2HF "immediate_operand") (VNx4HF "immediate_operand") (VNx8HF "immediate_operand") (VNx16HF "immediate_operand")
  (VNx32HF "vector_gs_extension_operand") (VNx64HF "const_1_operand")
  (VNx1SF "immediate_operand") (VNx2SF "immediate_operand") (VNx4SF "immediate_operand") (VNx8SF "immediate_operand") (VNx16SF "immediate_operand")
  (VNx32SF "vector_gs_extension_operand")
  (VNx1DF "immediate_operand") (VNx2DF "immediate_operand") (VNx4DF "immediate_operand") (VNx8DF "immediate_operand") (VNx16DF "immediate_operand")
])

(define_mode_attr gs_scale [
  (VNx1QI "const_1_operand") (VNx2QI "const_1_operand") (VNx4QI "const_1_operand") (VNx8QI "const_1_operand")
  (VNx16QI "const_1_operand") (VNx32QI "const_1_operand") (VNx64QI "const_1_operand")
  (VNx1HI "vector_gs_scale_operand_16") (VNx2HI "vector_gs_scale_operand_16") (VNx4HI "vector_gs_scale_operand_16") (VNx8HI "vector_gs_scale_operand_16")
  (VNx16HI "vector_gs_scale_operand_16") (VNx32HI "vector_gs_scale_operand_16_rv32") (VNx64HI "const_1_operand")
  (VNx1SI "vector_gs_scale_operand_32") (VNx2SI "vector_gs_scale_operand_32") (VNx4SI "vector_gs_scale_operand_32") (VNx8SI "vector_gs_scale_operand_32")
  (VNx16SI "vector_gs_scale_operand_32") (VNx32SI "vector_gs_scale_operand_32_rv32")
  (VNx1DI "vector_gs_scale_operand_64") (VNx2DI "vector_gs_scale_operand_64") (VNx4DI "vector_gs_scale_operand_64") (VNx8DI "vector_gs_scale_operand_64")
  (VNx16DI "vector_gs_scale_operand_64")

  (VNx1HF "vector_gs_scale_operand_16") (VNx2HF "vector_gs_scale_operand_16") (VNx4HF "vector_gs_scale_operand_16") (VNx8HF "vector_gs_scale_operand_16")
  (VNx16HF "vector_gs_scale_operand_16") (VNx32HF "vector_gs_scale_operand_16_rv32") (VNx64HF "const_1_operand")
  (VNx1SF "vector_gs_scale_operand_32") (VNx2SF "vector_gs_scale_operand_32") (VNx4SF "vector_gs_scale_operand_32") (VNx8SF "vector_gs_scale_operand_32")
  (VNx16SF "vector_gs_scale_operand_32") (VNx32SF "vector_gs_scale_operand_32_rv32")
  (VNx1DF "vector_gs_scale_operand_64") (VNx2DF "vector_gs_scale_operand_64") (VNx4DF "vector_gs_scale_operand_64") (VNx8DF "vector_gs_scale_operand_64")
  (VNx16DF "vector_gs_scale_operand_64")
])

(define_int_iterator WREDUC [UNSPEC_WREDUC_SUM UNSPEC_WREDUC_USUM])

(define_int_iterator ORDER [UNSPEC_ORDERED UNSPEC_UNORDERED])
@@ -818,7 +818,7 @@
;; This pattern only handles duplicates of non-constant inputs.
;; Constant vectors go through the movm pattern instead.
;; So "direct_broadcast_operand" can only be mem or reg, no CONSTANT.
(define_expand "vec_duplicate<mode>"
(define_expand "@vec_duplicate<mode>"
  [(set (match_operand:V 0 "register_operand")
        (vec_duplicate:V
          (match_operand:<VEL> 1 "direct_broadcast_operand")))]

@@ -1357,8 +1357,16 @@
        }
    }
  else if (GET_MODE_BITSIZE (<VEL>mode) > GET_MODE_BITSIZE (Pmode)
           && immediate_operand (operands[3], Pmode))
    operands[3] = gen_rtx_SIGN_EXTEND (<VEL>mode, force_reg (Pmode, operands[3]));
           && (immediate_operand (operands[3], Pmode)
               || (CONST_POLY_INT_P (operands[3])
                   && known_ge (rtx_to_poly_int64 (operands[3]), 0U)
                   && known_le (rtx_to_poly_int64 (operands[3]), GET_MODE_SIZE (<MODE>mode)))))
    {
      rtx tmp = gen_reg_rtx (Pmode);
      poly_int64 value = rtx_to_poly_int64 (operands[3]);
      emit_move_insn (tmp, gen_int_mode (value, Pmode));
      operands[3] = gen_rtx_SIGN_EXTEND (<VEL>mode, tmp);
    }
  else
    operands[3] = force_reg (<VEL>mode, operands[3]);
})

@@ -1387,7 +1395,8 @@
   vlse<sew>.v\t%0,%3,zero
   vmv.s.x\t%0,%3
   vmv.s.x\t%0,%3"
  "register_operand (operands[3], <VEL>mode)
  "(register_operand (operands[3], <VEL>mode)
    || CONST_POLY_INT_P (operands[3]))
   && GET_MODE_BITSIZE (<VEL>mode) > GET_MODE_BITSIZE (Pmode)"
  [(set (match_dup 0)
        (if_then_else:VI (unspec:<VM> [(match_dup 1) (match_dup 4)

@@ -1397,6 +1406,12 @@
                (match_dup 2)))]
{
  gcc_assert (can_create_pseudo_p ());
  if (CONST_POLY_INT_P (operands[3]))
    {
      rtx tmp = gen_reg_rtx (<VEL>mode);
      emit_move_insn (tmp, operands[3]);
      operands[3] = tmp;
    }
  rtx m = assign_stack_local (<VEL>mode, GET_MODE_SIZE (<VEL>mode),
                              GET_MODE_ALIGNMENT (<VEL>mode));
  m = validize_mem (m);

@@ -1483,6 +1498,7 @@
     (match_operand 5 "vector_length_operand" " rK, rK, rK")
     (match_operand 6 "const_int_operand" " i, i, i")
     (match_operand 7 "const_int_operand" " i, i, i")
     (match_operand 8 "const_int_operand" " i, i, i")
     (reg:SI VL_REGNUM)
     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
  (unspec:V

@@ -1738,7 +1754,7 @@
  [(set_attr "type" "vst<order>x")
   (set_attr "mode" "<VNX8_QHSD:MODE>")])

(define_insn "@pred_indexed_<order>store<VNX16_QHS:mode><VNX16_QHSDI:mode>"
(define_insn "@pred_indexed_<order>store<VNX16_QHSD:mode><VNX16_QHSDI:mode>"
  [(set (mem:BLK (scratch))
        (unspec:BLK
          [(unspec:<VM>

@@ -1749,11 +1765,11 @@
             (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
          (match_operand 1 "pmode_reg_or_0_operand" " rJ")
          (match_operand:VNX16_QHSDI 2 "register_operand" " vr")
          (match_operand:VNX16_QHS 3 "register_operand" " vr")] ORDER))]
          (match_operand:VNX16_QHSD 3 "register_operand" " vr")] ORDER))]
  "TARGET_VECTOR"
  "vs<order>xei<VNX16_QHSDI:sew>.v\t%3,(%z1),%2%p0"
  [(set_attr "type" "vst<order>x")
   (set_attr "mode" "<VNX16_QHS:MODE>")])
   (set_attr "mode" "<VNX16_QHSD:MODE>")])

(define_insn "@pred_indexed_<order>store<VNX32_QHS:mode><VNX32_QHSI:mode>"
  [(set (mem:BLK (scratch))
@@ -0,0 +1,38 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 uint8_t
#define INDEX16 uint16_t
#define INDEX32 uint32_t
#define INDEX64 uint64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX64 int64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 64) \
  T (uint8_t, 64) \
  T (int16_t, 64) \
  T (uint16_t, 64) \
  T (_Float16, 64) \
  T (int32_t, 64) \
  T (uint32_t, 64) \
  T (float, 64) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,32 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define TEST_LOOP(DATA_TYPE) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict *src) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[i] += *src[i]; \
  }

#define TEST_ALL(T) \
  T (int8_t) \
  T (uint8_t) \
  T (int16_t) \
  T (uint16_t) \
  T (_Float16) \
  T (int32_t) \
  T (uint32_t) \
  T (float) \
  T (int64_t) \
  T (uint64_t) \
  T (double)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,112 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-vect-cost-model -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define TEST_LOOP(DATA_TYPE, INDEX_TYPE) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE##_##INDEX_TYPE (DATA_TYPE *restrict y, DATA_TYPE *restrict x, \
                                INDEX_TYPE *restrict index) \
  { \
    for (int i = 0; i < 100; ++i) \
      { \
        y[i * 2] = x[index[i * 2]] + 1; \
        y[i * 2 + 1] = x[index[i * 2 + 1]] + 2; \
      } \
  }

TEST_LOOP (int8_t, int8_t)
TEST_LOOP (uint8_t, int8_t)
TEST_LOOP (int16_t, int8_t)
TEST_LOOP (uint16_t, int8_t)
TEST_LOOP (int32_t, int8_t)
TEST_LOOP (uint32_t, int8_t)
TEST_LOOP (int64_t, int8_t)
TEST_LOOP (uint64_t, int8_t)
TEST_LOOP (_Float16, int8_t)
TEST_LOOP (float, int8_t)
TEST_LOOP (double, int8_t)
TEST_LOOP (int8_t, int16_t)
TEST_LOOP (uint8_t, int16_t)
TEST_LOOP (int16_t, int16_t)
TEST_LOOP (uint16_t, int16_t)
TEST_LOOP (int32_t, int16_t)
TEST_LOOP (uint32_t, int16_t)
TEST_LOOP (int64_t, int16_t)
TEST_LOOP (uint64_t, int16_t)
TEST_LOOP (_Float16, int16_t)
TEST_LOOP (float, int16_t)
TEST_LOOP (double, int16_t)
TEST_LOOP (int8_t, int32_t)
TEST_LOOP (uint8_t, int32_t)
TEST_LOOP (int16_t, int32_t)
TEST_LOOP (uint16_t, int32_t)
TEST_LOOP (int32_t, int32_t)
TEST_LOOP (uint32_t, int32_t)
TEST_LOOP (int64_t, int32_t)
TEST_LOOP (uint64_t, int32_t)
TEST_LOOP (_Float16, int32_t)
TEST_LOOP (float, int32_t)
TEST_LOOP (double, int32_t)
TEST_LOOP (int8_t, int64_t)
TEST_LOOP (uint8_t, int64_t)
TEST_LOOP (int16_t, int64_t)
TEST_LOOP (uint16_t, int64_t)
TEST_LOOP (int32_t, int64_t)
TEST_LOOP (uint32_t, int64_t)
TEST_LOOP (int64_t, int64_t)
TEST_LOOP (uint64_t, int64_t)
TEST_LOOP (_Float16, int64_t)
TEST_LOOP (float, int64_t)
TEST_LOOP (double, int64_t)
TEST_LOOP (int8_t, uint8_t)
TEST_LOOP (uint8_t, uint8_t)
TEST_LOOP (int16_t, uint8_t)
TEST_LOOP (uint16_t, uint8_t)
TEST_LOOP (int32_t, uint8_t)
TEST_LOOP (uint32_t, uint8_t)
TEST_LOOP (int64_t, uint8_t)
TEST_LOOP (uint64_t, uint8_t)
TEST_LOOP (_Float16, uint8_t)
TEST_LOOP (float, uint8_t)
TEST_LOOP (double, uint8_t)
TEST_LOOP (int8_t, uint16_t)
TEST_LOOP (uint8_t, uint16_t)
TEST_LOOP (int16_t, uint16_t)
TEST_LOOP (uint16_t, uint16_t)
TEST_LOOP (int32_t, uint16_t)
TEST_LOOP (uint32_t, uint16_t)
TEST_LOOP (int64_t, uint16_t)
TEST_LOOP (uint64_t, uint16_t)
TEST_LOOP (_Float16, uint16_t)
TEST_LOOP (float, uint16_t)
TEST_LOOP (double, uint16_t)
TEST_LOOP (int8_t, uint32_t)
TEST_LOOP (uint8_t, uint32_t)
TEST_LOOP (int16_t, uint32_t)
TEST_LOOP (uint16_t, uint32_t)
TEST_LOOP (int32_t, uint32_t)
TEST_LOOP (uint32_t, uint32_t)
TEST_LOOP (int64_t, uint32_t)
TEST_LOOP (uint64_t, uint32_t)
TEST_LOOP (_Float16, uint32_t)
TEST_LOOP (float, uint32_t)
TEST_LOOP (double, uint32_t)
TEST_LOOP (int8_t, uint64_t)
TEST_LOOP (uint8_t, uint64_t)
TEST_LOOP (int16_t, uint64_t)
TEST_LOOP (uint16_t, uint64_t)
TEST_LOOP (int32_t, uint64_t)
TEST_LOOP (uint32_t, uint64_t)
TEST_LOOP (int64_t, uint64_t)
TEST_LOOP (uint64_t, uint64_t)
TEST_LOOP (_Float16, uint64_t)
TEST_LOOP (float, uint64_t)
TEST_LOOP (double, uint64_t)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 88 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-assembler-not "vluxei64\.v" } } */
/* { dg-final { scan-assembler-not "vsuxei64\.v" } } */
@@ -0,0 +1,38 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 uint8_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 8) \
  T (uint16_t, 8) \
  T (_Float16, 8) \
  T (int32_t, 8) \
  T (uint32_t, 8) \
  T (float, 8) \
  T (int64_t, 8) \
  T (uint64_t, 8) \
  T (double, 8)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 int8_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 8) \
  T (uint16_t, 8) \
  T (_Float16, 8) \
  T (int32_t, 8) \
  T (uint32_t, 8) \
  T (float, 8) \
  T (int64_t, 8) \
  T (uint64_t, 8) \
  T (double, 8)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX16 uint16_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 16) \
  T (uint8_t, 16) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 16) \
  T (uint32_t, 16) \
  T (float, 16) \
  T (int64_t, 16) \
  T (uint64_t, 16) \
  T (double, 16)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX16 int16_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 16) \
  T (uint8_t, 16) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 16) \
  T (uint32_t, 16) \
  T (float, 16) \
  T (int64_t, 16) \
  T (uint64_t, 16) \
  T (double, 16)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX32 uint32_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 32) \
  T (uint8_t, 32) \
  T (int16_t, 32) \
  T (uint16_t, 32) \
  T (_Float16, 32) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 32) \
  T (uint64_t, 32) \
  T (double, 32)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX32 int32_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 32) \
  T (uint8_t, 32) \
  T (int16_t, 32) \
  T (uint16_t, 32) \
  T (_Float16, 32) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 32) \
  T (uint64_t, 32) \
  T (double, 32)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX64 uint64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 64) \
  T (uint8_t, 64) \
  T (int16_t, 64) \
  T (uint16_t, 64) \
  T (_Float16, 64) \
  T (int32_t, 64) \
  T (uint32_t, 64) \
  T (float, 64) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */


#include "gather_load-1.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[i] \
            == (dest2_##DATA_TYPE[i] \
                + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */


#include "gather_load-10.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[i] \
            == (dest2_##DATA_TYPE[i] \
                + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));

  RUN_LOOP (int8_t, 64)
  RUN_LOOP (uint8_t, 64)
  RUN_LOOP (int16_t, 64)
  RUN_LOOP (uint16_t, 64)
  RUN_LOOP (_Float16, 64)
  RUN_LOOP (int32_t, 64)
  RUN_LOOP (uint32_t, 64)
  RUN_LOOP (float, 64)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,39 @@
/* { dg-do run { target { riscv_vector } } } */


#include "gather_load-11.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE *src_##DATA_TYPE[128]; \
  DATA_TYPE src2_##DATA_TYPE[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      src_##DATA_TYPE[i] = src2_##DATA_TYPE + i; \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[i] \
            == (dest2_##DATA_TYPE[i] + src_##DATA_TYPE[i][0]));

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,124 @@
/* { dg-do run { target { riscv_vector } } } */


#include "gather_load-12.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, INDEX_TYPE) \
  DATA_TYPE dest_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
  DATA_TYPE src_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
  INDEX_TYPE index_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
  for (int i = 0; i < 202; i++) \
    { \
      src_##DATA_TYPE##_##INDEX_TYPE[i] \
        = (DATA_TYPE) ((i * 19 + 735) & (sizeof (DATA_TYPE) * 7 - 1)); \
      index_##DATA_TYPE##_##INDEX_TYPE[i] = (i * 7) % (55); \
    } \
  f_##DATA_TYPE##_##INDEX_TYPE (dest_##DATA_TYPE##_##INDEX_TYPE, \
                                src_##DATA_TYPE##_##INDEX_TYPE, \
                                index_##DATA_TYPE##_##INDEX_TYPE); \
  for (int i = 0; i < 100; i++) \
    { \
      assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2] \
              == (src_##DATA_TYPE##_##INDEX_TYPE \
                    [index_##DATA_TYPE##_##INDEX_TYPE[i * 2]] \
                  + 1)); \
      assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1] \
              == (src_##DATA_TYPE##_##INDEX_TYPE \
                    [index_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]] \
                  + 2)); \
    }

  RUN_LOOP (int8_t, int8_t)
  RUN_LOOP (uint8_t, int8_t)
  RUN_LOOP (int16_t, int8_t)
  RUN_LOOP (uint16_t, int8_t)
  RUN_LOOP (int32_t, int8_t)
  RUN_LOOP (uint32_t, int8_t)
  RUN_LOOP (int64_t, int8_t)
  RUN_LOOP (uint64_t, int8_t)
  RUN_LOOP (_Float16, int8_t)
  RUN_LOOP (float, int8_t)
  RUN_LOOP (double, int8_t)
  RUN_LOOP (int8_t, int16_t)
  RUN_LOOP (uint8_t, int16_t)
  RUN_LOOP (int16_t, int16_t)
  RUN_LOOP (uint16_t, int16_t)
  RUN_LOOP (int32_t, int16_t)
  RUN_LOOP (uint32_t, int16_t)
  RUN_LOOP (int64_t, int16_t)
  RUN_LOOP (uint64_t, int16_t)
  RUN_LOOP (_Float16, int16_t)
  RUN_LOOP (float, int16_t)
  RUN_LOOP (double, int16_t)
  RUN_LOOP (int8_t, int32_t)
  RUN_LOOP (uint8_t, int32_t)
  RUN_LOOP (int16_t, int32_t)
  RUN_LOOP (uint16_t, int32_t)
  RUN_LOOP (int32_t, int32_t)
  RUN_LOOP (uint32_t, int32_t)
  RUN_LOOP (int64_t, int32_t)
  RUN_LOOP (uint64_t, int32_t)
  RUN_LOOP (_Float16, int32_t)
  RUN_LOOP (float, int32_t)
  RUN_LOOP (double, int32_t)
  RUN_LOOP (int8_t, int64_t)
  RUN_LOOP (uint8_t, int64_t)
  RUN_LOOP (int16_t, int64_t)
  RUN_LOOP (uint16_t, int64_t)
  RUN_LOOP (int32_t, int64_t)
  RUN_LOOP (uint32_t, int64_t)
  RUN_LOOP (int64_t, int64_t)
  RUN_LOOP (uint64_t, int64_t)
  RUN_LOOP (_Float16, int64_t)
  RUN_LOOP (float, int64_t)
  RUN_LOOP (double, int64_t)
  RUN_LOOP (int8_t, uint8_t)
  RUN_LOOP (uint8_t, uint8_t)
  RUN_LOOP (int16_t, uint8_t)
  RUN_LOOP (uint16_t, uint8_t)
  RUN_LOOP (int32_t, uint8_t)
  RUN_LOOP (uint32_t, uint8_t)
  RUN_LOOP (int64_t, uint8_t)
  RUN_LOOP (uint64_t, uint8_t)
  RUN_LOOP (_Float16, uint8_t)
  RUN_LOOP (float, uint8_t)
  RUN_LOOP (double, uint8_t)
  RUN_LOOP (int8_t, uint16_t)
  RUN_LOOP (uint8_t, uint16_t)
  RUN_LOOP (int16_t, uint16_t)
  RUN_LOOP (uint16_t, uint16_t)
  RUN_LOOP (int32_t, uint16_t)
  RUN_LOOP (uint32_t, uint16_t)
  RUN_LOOP (int64_t, uint16_t)
  RUN_LOOP (uint64_t, uint16_t)
  RUN_LOOP (_Float16, uint16_t)
  RUN_LOOP (float, uint16_t)
  RUN_LOOP (double, uint16_t)
  RUN_LOOP (int8_t, uint32_t)
  RUN_LOOP (uint8_t, uint32_t)
  RUN_LOOP (int16_t, uint32_t)
  RUN_LOOP (uint16_t, uint32_t)
  RUN_LOOP (int32_t, uint32_t)
  RUN_LOOP (uint32_t, uint32_t)
  RUN_LOOP (int64_t, uint32_t)
  RUN_LOOP (uint64_t, uint32_t)
  RUN_LOOP (_Float16, uint32_t)
  RUN_LOOP (float, uint32_t)
  RUN_LOOP (double, uint32_t)
  RUN_LOOP (int8_t, uint64_t)
  RUN_LOOP (uint8_t, uint64_t)
  RUN_LOOP (int16_t, uint64_t)
  RUN_LOOP (uint16_t, uint64_t)
  RUN_LOOP (int32_t, uint64_t)
  RUN_LOOP (uint32_t, uint64_t)
  RUN_LOOP (int64_t, uint64_t)
  RUN_LOOP (uint64_t, uint64_t)
  RUN_LOOP (_Float16, uint64_t)
  RUN_LOOP (float, uint64_t)
  RUN_LOOP (double, uint64_t)
  return 0;
}
@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */


#include "gather_load-2.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[i] \
            == (dest2_##DATA_TYPE[i] \
                + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */


#include "gather_load-3.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[i] \
            == (dest2_##DATA_TYPE[i] \
                + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 8)
  RUN_LOOP (uint16_t, 8)
  RUN_LOOP (_Float16, 8)
  RUN_LOOP (int32_t, 8)
  RUN_LOOP (uint32_t, 8)
  RUN_LOOP (float, 8)
  RUN_LOOP (int64_t, 8)
  RUN_LOOP (uint64_t, 8)
  RUN_LOOP (double, 8)
  return 0;
}
@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */


#include "gather_load-4.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[i] \
            == (dest2_##DATA_TYPE[i] \
                + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 8)
  RUN_LOOP (uint16_t, 8)
  RUN_LOOP (_Float16, 8)
  RUN_LOOP (int32_t, 8)
  RUN_LOOP (uint32_t, 8)
  RUN_LOOP (float, 8)
  RUN_LOOP (int64_t, 8)
  RUN_LOOP (uint64_t, 8)
  RUN_LOOP (double, 8)
  return 0;
}
@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */


#include "gather_load-5.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[i] \
            == (dest2_##DATA_TYPE[i] \
                + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));

  RUN_LOOP (int8_t, 16)
  RUN_LOOP (uint8_t, 16)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 16)
  RUN_LOOP (uint32_t, 16)
  RUN_LOOP (float, 16)
  RUN_LOOP (int64_t, 16)
  RUN_LOOP (uint64_t, 16)
  RUN_LOOP (double, 16)
  return 0;
}
@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */


#include "gather_load-6.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[i] \
            == (dest2_##DATA_TYPE[i] \
                + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));

  RUN_LOOP (int8_t, 16)
  RUN_LOOP (uint8_t, 16)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 16)
  RUN_LOOP (uint32_t, 16)
  RUN_LOOP (float, 16)
  RUN_LOOP (int64_t, 16)
  RUN_LOOP (uint64_t, 16)
  RUN_LOOP (double, 16)
  return 0;
}
@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */


#include "gather_load-7.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[i] \
            == (dest2_##DATA_TYPE[i] \
                + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));

  RUN_LOOP (int8_t, 32)
  RUN_LOOP (uint8_t, 32)
  RUN_LOOP (int16_t, 32)
  RUN_LOOP (uint16_t, 32)
  RUN_LOOP (_Float16, 32)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 32)
  RUN_LOOP (uint64_t, 32)
  RUN_LOOP (double, 32)
  return 0;
}
@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */


#include "gather_load-8.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[i] \
            == (dest2_##DATA_TYPE[i] \
                + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));

  RUN_LOOP (int8_t, 32)
  RUN_LOOP (uint8_t, 32)
  RUN_LOOP (int16_t, 32)
  RUN_LOOP (uint16_t, 32)
  RUN_LOOP (_Float16, 32)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 32)
  RUN_LOOP (uint64_t, 32)
  RUN_LOOP (double, 32)
  return 0;
}
@@ -0,0 +1,41 @@
/* { dg-do run { target { riscv_vector } } } */


#include "gather_load-9.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[i] \
            == (dest2_##DATA_TYPE[i] \
                + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));

  RUN_LOOP (int8_t, 64)
  RUN_LOOP (uint8_t, 64)
  RUN_LOOP (int16_t, 64)
  RUN_LOOP (uint16_t, 64)
  RUN_LOOP (_Float16, 64)
  RUN_LOOP (int32_t, 64)
  RUN_LOOP (uint32_t, 64)
  RUN_LOOP (float, 64)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,39 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 uint8_t
#define INDEX16 uint16_t
#define INDEX32 uint32_t
#define INDEX64 uint64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX64 int64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 64) \
  T (uint8_t, 64) \
  T (int16_t, 64) \
  T (uint16_t, 64) \
  T (_Float16, 64) \
  T (int32_t, 64) \
  T (uint32_t, 64) \
  T (float, 64) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,116 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-vect-cost-model -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define TEST_LOOP(DATA_TYPE, INDEX_TYPE) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE##_##INDEX_TYPE (DATA_TYPE *restrict y, DATA_TYPE *restrict x, \
                                INDEX_TYPE *restrict index, \
                                INDEX_TYPE *restrict cond) \
  { \
    for (int i = 0; i < 100; ++i) \
      { \
        if (cond[i * 2]) \
          y[i * 2] = x[index[i * 2]] + 1; \
        if (cond[i * 2 + 1]) \
          y[i * 2 + 1] = x[index[i * 2 + 1]] + 2; \
      } \
  }

TEST_LOOP (int8_t, int8_t)
TEST_LOOP (uint8_t, int8_t)
TEST_LOOP (int16_t, int8_t)
TEST_LOOP (uint16_t, int8_t)
TEST_LOOP (int32_t, int8_t)
TEST_LOOP (uint32_t, int8_t)
TEST_LOOP (int64_t, int8_t)
TEST_LOOP (uint64_t, int8_t)
TEST_LOOP (_Float16, int8_t)
TEST_LOOP (float, int8_t)
TEST_LOOP (double, int8_t)
TEST_LOOP (int8_t, int16_t)
TEST_LOOP (uint8_t, int16_t)
TEST_LOOP (int16_t, int16_t)
TEST_LOOP (uint16_t, int16_t)
TEST_LOOP (int32_t, int16_t)
TEST_LOOP (uint32_t, int16_t)
TEST_LOOP (int64_t, int16_t)
TEST_LOOP (uint64_t, int16_t)
TEST_LOOP (_Float16, int16_t)
TEST_LOOP (float, int16_t)
TEST_LOOP (double, int16_t)
TEST_LOOP (int8_t, int32_t)
TEST_LOOP (uint8_t, int32_t)
TEST_LOOP (int16_t, int32_t)
TEST_LOOP (uint16_t, int32_t)
TEST_LOOP (int32_t, int32_t)
TEST_LOOP (uint32_t, int32_t)
TEST_LOOP (int64_t, int32_t)
TEST_LOOP (uint64_t, int32_t)
TEST_LOOP (_Float16, int32_t)
TEST_LOOP (float, int32_t)
TEST_LOOP (double, int32_t)
TEST_LOOP (int8_t, int64_t)
TEST_LOOP (uint8_t, int64_t)
TEST_LOOP (int16_t, int64_t)
TEST_LOOP (uint16_t, int64_t)
TEST_LOOP (int32_t, int64_t)
TEST_LOOP (uint32_t, int64_t)
TEST_LOOP (int64_t, int64_t)
TEST_LOOP (uint64_t, int64_t)
TEST_LOOP (_Float16, int64_t)
TEST_LOOP (float, int64_t)
TEST_LOOP (double, int64_t)
TEST_LOOP (int8_t, uint8_t)
TEST_LOOP (uint8_t, uint8_t)
TEST_LOOP (int16_t, uint8_t)
TEST_LOOP (uint16_t, uint8_t)
TEST_LOOP (int32_t, uint8_t)
TEST_LOOP (uint32_t, uint8_t)
TEST_LOOP (int64_t, uint8_t)
TEST_LOOP (uint64_t, uint8_t)
TEST_LOOP (_Float16, uint8_t)
TEST_LOOP (float, uint8_t)
TEST_LOOP (double, uint8_t)
TEST_LOOP (int8_t, uint16_t)
TEST_LOOP (uint8_t, uint16_t)
TEST_LOOP (int16_t, uint16_t)
TEST_LOOP (uint16_t, uint16_t)
TEST_LOOP (int32_t, uint16_t)
TEST_LOOP (uint32_t, uint16_t)
TEST_LOOP (int64_t, uint16_t)
TEST_LOOP (uint64_t, uint16_t)
TEST_LOOP (_Float16, uint16_t)
TEST_LOOP (float, uint16_t)
TEST_LOOP (double, uint16_t)
TEST_LOOP (int8_t, uint32_t)
TEST_LOOP (uint8_t, uint32_t)
TEST_LOOP (int16_t, uint32_t)
TEST_LOOP (uint16_t, uint32_t)
TEST_LOOP (int32_t, uint32_t)
TEST_LOOP (uint32_t, uint32_t)
TEST_LOOP (int64_t, uint32_t)
TEST_LOOP (uint64_t, uint32_t)
TEST_LOOP (_Float16, uint32_t)
TEST_LOOP (float, uint32_t)
TEST_LOOP (double, uint32_t)
TEST_LOOP (int8_t, uint64_t)
TEST_LOOP (uint8_t, uint64_t)
TEST_LOOP (int16_t, uint64_t)
TEST_LOOP (uint16_t, uint64_t)
TEST_LOOP (int32_t, uint64_t)
TEST_LOOP (uint32_t, uint64_t)
TEST_LOOP (int64_t, uint64_t)
TEST_LOOP (uint64_t, uint64_t)
TEST_LOOP (_Float16, uint64_t)
TEST_LOOP (float, uint64_t)
TEST_LOOP (double, uint64_t)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 88 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-assembler-not "vluxei64\.v" } } */
/* { dg-final { scan-assembler-not "vsuxei64\.v" } } */
/* { dg-final { scan-assembler-not {vlse64\.v\s+v[0-9]+,\s*0\([a-x0-9]+\),\s*zero} } } */
@@ -0,0 +1,39 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 uint8_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 8) \
  T (uint16_t, 8) \
  T (_Float16, 8) \
  T (int32_t, 8) \
  T (uint32_t, 8) \
  T (float, 8) \
  T (int64_t, 8) \
  T (uint64_t, 8) \
  T (double, 8)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 int8_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 8) \
  T (uint16_t, 8) \
  T (_Float16, 8) \
  T (int32_t, 8) \
  T (uint32_t, 8) \
  T (float, 8) \
  T (int64_t, 8) \
  T (uint64_t, 8) \
  T (double, 8)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX16 uint16_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 16) \
  T (uint8_t, 16) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 16) \
  T (uint32_t, 16) \
  T (float, 16) \
  T (int64_t, 16) \
  T (uint64_t, 16) \
  T (double, 16)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX16 int16_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 16) \
  T (uint8_t, 16) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 16) \
  T (uint32_t, 16) \
  T (float, 16) \
  T (int64_t, 16) \
  T (uint64_t, 16) \
  T (double, 16)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX32 uint32_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 32) \
  T (uint8_t, 32) \
  T (int16_t, 32) \
  T (uint16_t, 32) \
  T (_Float16, 32) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 32) \
  T (uint64_t, 32) \
  T (double, 32)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX32 int32_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 32) \
  T (uint8_t, 32) \
  T (int16_t, 32) \
  T (uint16_t, 32) \
  T (_Float16, 32) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 32) \
  T (uint64_t, 32) \
  T (double, 32)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fno-schedule-insns -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX64 uint64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[i] += src[indices[i]]; \
  }

#define TEST_ALL(T) \
  T (int8_t, 64) \
  T (uint8_t, 64) \
  T (int16_t, 64) \
  T (uint16_t, 64) \
  T (_Float16, 64) \
  T (int32_t, 64) \
  T (uint32_t, 64) \
  T (float, 64) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */


#include "mask_gather_load-1.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
  DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
  DATA_TYPE src_##DATA_TYPE[128] = {0}; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[i] \
                == (dest2_##DATA_TYPE[i] \
                    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
      else \
        assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
    }

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */


#include "mask_gather_load-10.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
  DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
  DATA_TYPE src_##DATA_TYPE[128] = {0}; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[i] \
                == (dest2_##DATA_TYPE[i] \
                    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
      else \
        assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
    }

  RUN_LOOP (int8_t, 64)
  RUN_LOOP (uint8_t, 64)
  RUN_LOOP (int16_t, 64)
  RUN_LOOP (uint16_t, 64)
  RUN_LOOP (_Float16, 64)
  RUN_LOOP (int32_t, 64)
  RUN_LOOP (uint32_t, 64)
  RUN_LOOP (float, 64)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,140 @@
/* { dg-do run { target { riscv_vector } } } */
/* { dg-additional-options "-mcmodel=medany" } */

#include "mask_gather_load-11.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, INDEX_TYPE) \
  DATA_TYPE dest_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
  DATA_TYPE dest2_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
  DATA_TYPE src_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
  INDEX_TYPE index_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
  INDEX_TYPE cond_##DATA_TYPE##_##INDEX_TYPE[202] = {0}; \
  for (int i = 0; i < 202; i++) \
    { \
      src_##DATA_TYPE##_##INDEX_TYPE[i] \
        = (DATA_TYPE) ((i * 19 + 735) & (sizeof (DATA_TYPE) * 7 - 1)); \
      dest_##DATA_TYPE##_##INDEX_TYPE[i] \
        = (DATA_TYPE) ((i * 7 + 666) & (sizeof (DATA_TYPE) * 5 - 1)); \
      dest2_##DATA_TYPE##_##INDEX_TYPE[i] \
        = (DATA_TYPE) ((i * 7 + 666) & (sizeof (DATA_TYPE) * 5 - 1)); \
      index_##DATA_TYPE##_##INDEX_TYPE[i] = (i * 7) % (55); \
      cond_##DATA_TYPE##_##INDEX_TYPE[i] = (INDEX_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE##_##INDEX_TYPE (dest_##DATA_TYPE##_##INDEX_TYPE, \
                                src_##DATA_TYPE##_##INDEX_TYPE, \
                                index_##DATA_TYPE##_##INDEX_TYPE, \
                                cond_##DATA_TYPE##_##INDEX_TYPE); \
  for (int i = 0; i < 100; i++) \
    { \
      if (cond_##DATA_TYPE##_##INDEX_TYPE[i * 2]) \
        assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2] \
                == (src_##DATA_TYPE##_##INDEX_TYPE \
                      [index_##DATA_TYPE##_##INDEX_TYPE[i * 2]] \
                    + 1)); \
      else \
        assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2] \
                == dest2_##DATA_TYPE##_##INDEX_TYPE[i * 2]); \
      if (cond_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]) \
        assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1] \
                == (src_##DATA_TYPE##_##INDEX_TYPE \
                      [index_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]] \
                    + 2)); \
      else \
        assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1] \
                == dest2_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]); \
    }

  RUN_LOOP (int8_t, int8_t)
  RUN_LOOP (uint8_t, int8_t)
  RUN_LOOP (int16_t, int8_t)
  RUN_LOOP (uint16_t, int8_t)
  RUN_LOOP (int32_t, int8_t)
  RUN_LOOP (uint32_t, int8_t)
  RUN_LOOP (int64_t, int8_t)
  RUN_LOOP (uint64_t, int8_t)
  RUN_LOOP (_Float16, int8_t)
  RUN_LOOP (float, int8_t)
  RUN_LOOP (double, int8_t)
  RUN_LOOP (int8_t, int16_t)
  RUN_LOOP (uint8_t, int16_t)
  RUN_LOOP (int16_t, int16_t)
  RUN_LOOP (uint16_t, int16_t)
  RUN_LOOP (int32_t, int16_t)
  RUN_LOOP (uint32_t, int16_t)
  RUN_LOOP (int64_t, int16_t)
  RUN_LOOP (uint64_t, int16_t)
  RUN_LOOP (_Float16, int16_t)
  RUN_LOOP (float, int16_t)
  RUN_LOOP (double, int16_t)
  RUN_LOOP (int8_t, int32_t)
  RUN_LOOP (uint8_t, int32_t)
  RUN_LOOP (int16_t, int32_t)
  RUN_LOOP (uint16_t, int32_t)
  RUN_LOOP (int32_t, int32_t)
  RUN_LOOP (uint32_t, int32_t)
  RUN_LOOP (int64_t, int32_t)
  RUN_LOOP (uint64_t, int32_t)
  RUN_LOOP (_Float16, int32_t)
  RUN_LOOP (float, int32_t)
  RUN_LOOP (double, int32_t)
  RUN_LOOP (int8_t, int64_t)
  RUN_LOOP (uint8_t, int64_t)
  RUN_LOOP (int16_t, int64_t)
  RUN_LOOP (uint16_t, int64_t)
  RUN_LOOP (int32_t, int64_t)
  RUN_LOOP (uint32_t, int64_t)
  RUN_LOOP (int64_t, int64_t)
  RUN_LOOP (uint64_t, int64_t)
  RUN_LOOP (_Float16, int64_t)
  RUN_LOOP (float, int64_t)
  RUN_LOOP (double, int64_t)
  RUN_LOOP (int8_t, uint8_t)
  RUN_LOOP (uint8_t, uint8_t)
  RUN_LOOP (int16_t, uint8_t)
  RUN_LOOP (uint16_t, uint8_t)
  RUN_LOOP (int32_t, uint8_t)
  RUN_LOOP (uint32_t, uint8_t)
  RUN_LOOP (int64_t, uint8_t)
  RUN_LOOP (uint64_t, uint8_t)
  RUN_LOOP (_Float16, uint8_t)
  RUN_LOOP (float, uint8_t)
  RUN_LOOP (double, uint8_t)
  RUN_LOOP (int8_t, uint16_t)
  RUN_LOOP (uint8_t, uint16_t)
  RUN_LOOP (int16_t, uint16_t)
  RUN_LOOP (uint16_t, uint16_t)
  RUN_LOOP (int32_t, uint16_t)
  RUN_LOOP (uint32_t, uint16_t)
  RUN_LOOP (int64_t, uint16_t)
  RUN_LOOP (uint64_t, uint16_t)
  RUN_LOOP (_Float16, uint16_t)
  RUN_LOOP (float, uint16_t)
  RUN_LOOP (double, uint16_t)
  RUN_LOOP (int8_t, uint32_t)
  RUN_LOOP (uint8_t, uint32_t)
  RUN_LOOP (int16_t, uint32_t)
  RUN_LOOP (uint16_t, uint32_t)
  RUN_LOOP (int32_t, uint32_t)
  RUN_LOOP (uint32_t, uint32_t)
  RUN_LOOP (int64_t, uint32_t)
  RUN_LOOP (uint64_t, uint32_t)
  RUN_LOOP (_Float16, uint32_t)
  RUN_LOOP (float, uint32_t)
  RUN_LOOP (double, uint32_t)
  RUN_LOOP (int8_t, uint64_t)
  RUN_LOOP (uint8_t, uint64_t)
  RUN_LOOP (int16_t, uint64_t)
  RUN_LOOP (uint16_t, uint64_t)
  RUN_LOOP (int32_t, uint64_t)
  RUN_LOOP (uint32_t, uint64_t)
  RUN_LOOP (int64_t, uint64_t)
  RUN_LOOP (uint64_t, uint64_t)
  RUN_LOOP (_Float16, uint64_t)
  RUN_LOOP (float, uint64_t)
  RUN_LOOP (double, uint64_t)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */


#include "mask_gather_load-2.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
  DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
  DATA_TYPE src_##DATA_TYPE[128] = {0}; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[i] \
                == (dest2_##DATA_TYPE[i] \
                    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
      else \
        assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
    }

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */


#include "mask_gather_load-3.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
  DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
  DATA_TYPE src_##DATA_TYPE[128] = {0}; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[i] \
                == (dest2_##DATA_TYPE[i] \
                    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
      else \
        assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
    }

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 8)
  RUN_LOOP (uint16_t, 8)
  RUN_LOOP (_Float16, 8)
  RUN_LOOP (int32_t, 8)
  RUN_LOOP (uint32_t, 8)
  RUN_LOOP (float, 8)
  RUN_LOOP (int64_t, 8)
  RUN_LOOP (uint64_t, 8)
  RUN_LOOP (double, 8)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */


#include "mask_gather_load-4.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
  DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
  DATA_TYPE src_##DATA_TYPE[128] = {0}; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[i] \
                == (dest2_##DATA_TYPE[i] \
                    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
      else \
        assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
    }

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 8)
  RUN_LOOP (uint16_t, 8)
  RUN_LOOP (_Float16, 8)
  RUN_LOOP (int32_t, 8)
  RUN_LOOP (uint32_t, 8)
  RUN_LOOP (float, 8)
  RUN_LOOP (int64_t, 8)
  RUN_LOOP (uint64_t, 8)
  RUN_LOOP (double, 8)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */


#include "mask_gather_load-5.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
  DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
  DATA_TYPE src_##DATA_TYPE[128] = {0}; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[i] \
                == (dest2_##DATA_TYPE[i] \
                    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
      else \
        assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
    }

  RUN_LOOP (int8_t, 16)
  RUN_LOOP (uint8_t, 16)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 16)
  RUN_LOOP (uint32_t, 16)
  RUN_LOOP (float, 16)
  RUN_LOOP (int64_t, 16)
  RUN_LOOP (uint64_t, 16)
  RUN_LOOP (double, 16)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */


#include "mask_gather_load-6.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
  DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
  DATA_TYPE src_##DATA_TYPE[128] = {0}; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[i] \
                == (dest2_##DATA_TYPE[i] \
                    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
      else \
        assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
    }

  RUN_LOOP (int8_t, 16)
  RUN_LOOP (uint8_t, 16)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 16)
  RUN_LOOP (uint32_t, 16)
  RUN_LOOP (float, 16)
  RUN_LOOP (int64_t, 16)
  RUN_LOOP (uint64_t, 16)
  RUN_LOOP (double, 16)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
/* { dg-additional-options "-mcmodel=medany" } */

#include "mask_gather_load-7.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
  DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
  DATA_TYPE src_##DATA_TYPE[128] = {0}; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[i] \
                == (dest2_##DATA_TYPE[i] \
                    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
      else \
        assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
    }

  RUN_LOOP (int8_t, 32)
  RUN_LOOP (uint8_t, 32)
  RUN_LOOP (int16_t, 32)
  RUN_LOOP (uint16_t, 32)
  RUN_LOOP (_Float16, 32)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 32)
  RUN_LOOP (uint64_t, 32)
  RUN_LOOP (double, 32)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
/* { dg-additional-options "-mcmodel=medany" } */

#include "mask_gather_load-8.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
  DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
  DATA_TYPE src_##DATA_TYPE[128] = {0}; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[i] \
                == (dest2_##DATA_TYPE[i] \
                    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
      else \
        assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
    }

  RUN_LOOP (int8_t, 32)
  RUN_LOOP (uint8_t, 32)
  RUN_LOOP (int16_t, 32)
  RUN_LOOP (uint16_t, 32)
  RUN_LOOP (_Float16, 32)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 32)
  RUN_LOOP (uint64_t, 32)
  RUN_LOOP (double, 32)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */

#include "mask_gather_load-9.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128] = {0}; \
  DATA_TYPE dest2_##DATA_TYPE[128] = {0}; \
  DATA_TYPE src_##DATA_TYPE[128] = {0}; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0}; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[i] \
                == (dest2_##DATA_TYPE[i] \
                    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]])); \
      else \
        assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]); \
    }

  RUN_LOOP (int8_t, 64)
  RUN_LOOP (uint8_t, 64)
  RUN_LOOP (int16_t, 64)
  RUN_LOOP (uint16_t, 64)
  RUN_LOOP (_Float16, 64)
  RUN_LOOP (int32_t, 64)
  RUN_LOOP (uint32_t, 64)
  RUN_LOOP (float, 64)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,39 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 uint8_t
#define INDEX16 uint16_t
#define INDEX32 uint32_t
#define INDEX64 uint64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX64 int64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 64) \
  T (uint8_t, 64) \
  T (int16_t, 64) \
  T (uint16_t, 64) \
  T (_Float16, 64) \
  T (int32_t, 64) \
  T (uint32_t, 64) \
  T (float, 64) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,39 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 uint8_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 8) \
  T (uint16_t, 8) \
  T (_Float16, 8) \
  T (int32_t, 8) \
  T (uint32_t, 8) \
  T (float, 8) \
  T (int64_t, 8) \
  T (uint64_t, 8) \
  T (double, 8)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 int8_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 8) \
  T (uint16_t, 8) \
  T (_Float16, 8) \
  T (int32_t, 8) \
  T (uint32_t, 8) \
  T (float, 8) \
  T (int64_t, 8) \
  T (uint64_t, 8) \
  T (double, 8)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX16 uint16_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 16) \
  T (uint8_t, 16) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 16) \
  T (uint32_t, 16) \
  T (float, 16) \
  T (int64_t, 16) \
  T (uint64_t, 16) \
  T (double, 16)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX16 int16_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 16) \
  T (uint8_t, 16) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 16) \
  T (uint32_t, 16) \
  T (float, 16) \
  T (int64_t, 16) \
  T (uint64_t, 16) \
  T (double, 16)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX32 uint32_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 32) \
  T (uint8_t, 32) \
  T (int16_t, 32) \
  T (uint16_t, 32) \
  T (_Float16, 32) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 32) \
  T (uint64_t, 32) \
  T (double, 32)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX32 int32_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 32) \
  T (uint8_t, 32) \
  T (int16_t, 32) \
  T (uint16_t, 32) \
  T (_Float16, 32) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 32) \
  T (uint64_t, 32) \
  T (double, 32)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,36 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX64 uint64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond) \
  { \
    for (int i = 0; i < 128; ++i) \
      if (cond[i]) \
        dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 64) \
  T (uint8_t, 64) \
  T (int16_t, 64) \
  T (uint16_t, 64) \
  T (_Float16, 64) \
  T (int32_t, 64) \
  T (uint32_t, 64) \
  T (float, 64) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */

#include "mask_scatter_store-1.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == (src_##DATA_TYPE[i] + 1)); \
      else \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
    }

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */

#include "mask_scatter_store-10.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == (src_##DATA_TYPE[i] + 1)); \
      else \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
    }

  RUN_LOOP (int8_t, 64)
  RUN_LOOP (uint8_t, 64)
  RUN_LOOP (int16_t, 64)
  RUN_LOOP (uint16_t, 64)
  RUN_LOOP (_Float16, 64)
  RUN_LOOP (int32_t, 64)
  RUN_LOOP (uint32_t, 64)
  RUN_LOOP (float, 64)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */

#include "mask_scatter_store-2.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == (src_##DATA_TYPE[i] + 1)); \
      else \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
    }

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */

#include "mask_scatter_store-3.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == (src_##DATA_TYPE[i] + 1)); \
      else \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
    }

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 8)
  RUN_LOOP (uint16_t, 8)
  RUN_LOOP (_Float16, 8)
  RUN_LOOP (int32_t, 8)
  RUN_LOOP (uint32_t, 8)
  RUN_LOOP (float, 8)
  RUN_LOOP (int64_t, 8)
  RUN_LOOP (uint64_t, 8)
  RUN_LOOP (double, 8)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */

#include "mask_scatter_store-4.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == (src_##DATA_TYPE[i] + 1)); \
      else \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
    }

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 8)
  RUN_LOOP (uint16_t, 8)
  RUN_LOOP (_Float16, 8)
  RUN_LOOP (int32_t, 8)
  RUN_LOOP (uint32_t, 8)
  RUN_LOOP (float, 8)
  RUN_LOOP (int64_t, 8)
  RUN_LOOP (uint64_t, 8)
  RUN_LOOP (double, 8)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */

#include "mask_scatter_store-5.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == (src_##DATA_TYPE[i] + 1)); \
      else \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
    }

  RUN_LOOP (int8_t, 16)
  RUN_LOOP (uint8_t, 16)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 16)
  RUN_LOOP (uint32_t, 16)
  RUN_LOOP (float, 16)
  RUN_LOOP (int64_t, 16)
  RUN_LOOP (uint64_t, 16)
  RUN_LOOP (double, 16)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */

#include "mask_scatter_store-6.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == (src_##DATA_TYPE[i] + 1)); \
      else \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
    }

  RUN_LOOP (int8_t, 16)
  RUN_LOOP (uint8_t, 16)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 16)
  RUN_LOOP (uint32_t, 16)
  RUN_LOOP (float, 16)
  RUN_LOOP (int64_t, 16)
  RUN_LOOP (uint64_t, 16)
  RUN_LOOP (double, 16)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */
/* { dg-additional-options "-mcmodel=medany" } */

#include "mask_scatter_store-7.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == (src_##DATA_TYPE[i] + 1)); \
      else \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
    }

  RUN_LOOP (int8_t, 32)
  RUN_LOOP (uint8_t, 32)
  RUN_LOOP (int16_t, 32)
  RUN_LOOP (uint16_t, 32)
  RUN_LOOP (_Float16, 32)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 32)
  RUN_LOOP (uint64_t, 32)
  RUN_LOOP (double, 32)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */

#include "mask_scatter_store-8.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == (src_##DATA_TYPE[i] + 1)); \
      else \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
    }

  RUN_LOOP (int8_t, 32)
  RUN_LOOP (uint8_t, 32)
  RUN_LOOP (int16_t, 32)
  RUN_LOOP (uint16_t, 32)
  RUN_LOOP (_Float16, 32)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 32)
  RUN_LOOP (uint64_t, 32)
  RUN_LOOP (double, 32)
  return 0;
}
@@ -0,0 +1,48 @@
/* { dg-do run { target { riscv_vector } } } */

#include "mask_scatter_store-9.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0}; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    { \
      if (cond_##DATA_TYPE##_##BITS[i]) \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == (src_##DATA_TYPE[i] + 1)); \
      else \
        assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
                == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]); \
    }

  RUN_LOOP (int8_t, 64)
  RUN_LOOP (uint8_t, 64)
  RUN_LOOP (int16_t, 64)
  RUN_LOOP (uint16_t, 64)
  RUN_LOOP (_Float16, 64)
  RUN_LOOP (int32_t, 64)
  RUN_LOOP (uint32_t, 64)
  RUN_LOOP (float, 64)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,38 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 uint8_t
#define INDEX16 uint16_t
#define INDEX32 uint32_t
#define INDEX64 uint64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX64 int64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 64) \
  T (uint8_t, 64) \
  T (int16_t, 64) \
  T (uint16_t, 64) \
  T (_Float16, 64) \
  T (int32_t, 64) \
  T (uint32_t, 64) \
  T (float, 64) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,38 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 uint8_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 8) \
  T (uint16_t, 8) \
  T (_Float16, 8) \
  T (int32_t, 8) \
  T (uint32_t, 8) \
  T (float, 8) \
  T (int64_t, 8) \
  T (uint64_t, 8) \
  T (double, 8)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX8 int8_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 8) \
  T (uint8_t, 8) \
  T (int16_t, 8) \
  T (uint16_t, 8) \
  T (_Float16, 8) \
  T (int32_t, 8) \
  T (uint32_t, 8) \
  T (float, 8) \
  T (int64_t, 8) \
  T (uint64_t, 8) \
  T (double, 8)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX16 uint16_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 16) \
  T (uint8_t, 16) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 16) \
  T (uint32_t, 16) \
  T (float, 16) \
  T (int64_t, 16) \
  T (uint64_t, 16) \
  T (double, 16)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX16 int16_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 16) \
  T (uint8_t, 16) \
  T (int16_t, 16) \
  T (uint16_t, 16) \
  T (_Float16, 16) \
  T (int32_t, 16) \
  T (uint32_t, 16) \
  T (float, 16) \
  T (int64_t, 16) \
  T (uint64_t, 16) \
  T (double, 16)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX32 uint32_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 32) \
  T (uint8_t, 32) \
  T (int16_t, 32) \
  T (uint16_t, 32) \
  T (_Float16, 32) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 32) \
  T (uint64_t, 32) \
  T (double, 32)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX32 int32_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 32) \
  T (uint8_t, 32) \
  T (int16_t, 32) \
  T (uint16_t, 32) \
  T (_Float16, 32) \
  T (int32_t, 32) \
  T (uint32_t, 32) \
  T (float, 32) \
  T (int64_t, 32) \
  T (uint64_t, 32) \
  T (double, 32)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,35 @@
/* { dg-do compile } */
/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */

#include <stdint-gcc.h>

#define INDEX64 uint64_t

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                 INDEX##BITS *restrict indices) \
  { \
    for (int i = 0; i < 128; ++i) \
      dest[indices[i]] = src[i] + 1; \
  }

#define TEST_ALL(T) \
  T (int8_t, 64) \
  T (uint8_t, 64) \
  T (int16_t, 64) \
  T (uint16_t, 64) \
  T (_Float16, 64) \
  T (int32_t, 64) \
  T (uint32_t, 64) \
  T (float, 64) \
  T (int64_t, 64) \
  T (uint64_t, 64) \
  T (double, 64)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */

#include "scatter_store-1.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
            == (src_##DATA_TYPE[i] + 1));

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */

#include "scatter_store-10.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
            == (src_##DATA_TYPE[i] + 1));

  RUN_LOOP (int8_t, 64)
  RUN_LOOP (uint8_t, 64)
  RUN_LOOP (int16_t, 64)
  RUN_LOOP (uint16_t, 64)
  RUN_LOOP (_Float16, 64)
  RUN_LOOP (int32_t, 64)
  RUN_LOOP (uint32_t, 64)
  RUN_LOOP (float, 64)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */
/* { dg-additional-options "-mcmodel=medany" } */

#include "scatter_store-2.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
            == (src_##DATA_TYPE[i] + 1));

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */

#include "scatter_store-3.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
            == (src_##DATA_TYPE[i] + 1));

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 8)
  RUN_LOOP (uint16_t, 8)
  RUN_LOOP (_Float16, 8)
  RUN_LOOP (int32_t, 8)
  RUN_LOOP (uint32_t, 8)
  RUN_LOOP (float, 8)
  RUN_LOOP (int64_t, 8)
  RUN_LOOP (uint64_t, 8)
  RUN_LOOP (double, 8)
  return 0;
}
@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */

#include "scatter_store-4.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
            == (src_##DATA_TYPE[i] + 1));

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 8)
  RUN_LOOP (uint16_t, 8)
  RUN_LOOP (_Float16, 8)
  RUN_LOOP (int32_t, 8)
  RUN_LOOP (uint32_t, 8)
  RUN_LOOP (float, 8)
  RUN_LOOP (int64_t, 8)
  RUN_LOOP (uint64_t, 8)
  RUN_LOOP (double, 8)
  return 0;
}
@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */

#include "scatter_store-5.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
            == (src_##DATA_TYPE[i] + 1));

  RUN_LOOP (int8_t, 16)
  RUN_LOOP (uint8_t, 16)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 16)
  RUN_LOOP (uint32_t, 16)
  RUN_LOOP (float, 16)
  RUN_LOOP (int64_t, 16)
  RUN_LOOP (uint64_t, 16)
  RUN_LOOP (double, 16)
  return 0;
}
@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */

#include "scatter_store-6.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
            == (src_##DATA_TYPE[i] + 1));

  RUN_LOOP (int8_t, 16)
  RUN_LOOP (uint8_t, 16)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 16)
  RUN_LOOP (uint32_t, 16)
  RUN_LOOP (float, 16)
  RUN_LOOP (int64_t, 16)
  RUN_LOOP (uint64_t, 16)
  RUN_LOOP (double, 16)
  return 0;
}
@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */

#include "scatter_store-7.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
            == (src_##DATA_TYPE[i] + 1));

  RUN_LOOP (int8_t, 32)
  RUN_LOOP (uint8_t, 32)
  RUN_LOOP (int16_t, 32)
  RUN_LOOP (uint16_t, 32)
  RUN_LOOP (_Float16, 32)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 32)
  RUN_LOOP (uint64_t, 32)
  RUN_LOOP (double, 32)
  return 0;
}
@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */

#include "scatter_store-8.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
            == (src_##DATA_TYPE[i] + 1));

  RUN_LOOP (int8_t, 32)
  RUN_LOOP (uint8_t, 32)
  RUN_LOOP (int16_t, 32)
  RUN_LOOP (uint16_t, 32)
  RUN_LOOP (_Float16, 32)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 32)
  RUN_LOOP (uint64_t, 32)
  RUN_LOOP (double, 32)
  return 0;
}
@@ -0,0 +1,40 @@
/* { dg-do run { target { riscv_vector } } } */
/* { dg-additional-options "-mcmodel=medany" } */

#include "scatter_store-9.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE[128]; \
  DATA_TYPE dest2_##DATA_TYPE[128]; \
  DATA_TYPE src_##DATA_TYPE[128]; \
  INDEX##BITS indices_##DATA_TYPE##_##BITS[128]; \
  for (int i = 0; i < 128; i++) \
    { \
      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128); \
    } \
  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE, \
                 indices_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < 128; i++) \
    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]] \
            == (src_##DATA_TYPE[i] + 1));

  RUN_LOOP (int8_t, 64)
  RUN_LOOP (uint8_t, 64)
  RUN_LOOP (int16_t, 64)
  RUN_LOOP (uint16_t, 64)
  RUN_LOOP (_Float16, 64)
  RUN_LOOP (int32_t, 64)
  RUN_LOOP (uint32_t, 64)
  RUN_LOOP (float, 64)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,45 @@
/* { dg-do compile } */
/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */

#include <stdint-gcc.h>

#ifndef INDEX8
#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t
#endif

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                          INDEX##BITS stride, INDEX##BITS n) \
  { \
    for (INDEX##BITS i = 0; i < n; ++i) \
      dest[i] += src[i * stride]; \
  }

#define TEST_TYPE(T, DATA_TYPE) \
  T (DATA_TYPE, 8) \
  T (DATA_TYPE, 16) \
  T (DATA_TYPE, 32) \
  T (DATA_TYPE, 64)

#define TEST_ALL(T) \
  TEST_TYPE (T, int8_t) \
  TEST_TYPE (T, uint8_t) \
  TEST_TYPE (T, int16_t) \
  TEST_TYPE (T, uint16_t) \
  TEST_TYPE (T, _Float16) \
  TEST_TYPE (T, int32_t) \
  TEST_TYPE (T, uint32_t) \
  TEST_TYPE (T, float) \
  TEST_TYPE (T, int64_t) \
  TEST_TYPE (T, uint64_t) \
  TEST_TYPE (T, double)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times " \.LEN_MASK_GATHER_LOAD" 66 "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "optimized" } } */
@@ -0,0 +1,45 @@
/* { dg-do compile } */
/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */

#include <stdint-gcc.h>

#ifndef INDEX8
#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t
#endif

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
                          INDEX##BITS stride, INDEX##BITS n) \
  { \
    for (INDEX##BITS i = 0; i < (BITS + 13); ++i) \
      dest[i] += src[i * (BITS - 3)]; \
  }

#define TEST_TYPE(T, DATA_TYPE) \
  T (DATA_TYPE, 8) \
  T (DATA_TYPE, 16) \
  T (DATA_TYPE, 32) \
  T (DATA_TYPE, 64)

#define TEST_ALL(T) \
  TEST_TYPE (T, int8_t) \
  TEST_TYPE (T, uint8_t) \
  TEST_TYPE (T, int16_t) \
  TEST_TYPE (T, uint16_t) \
  TEST_TYPE (T, _Float16) \
  TEST_TYPE (T, int32_t) \
  TEST_TYPE (T, uint32_t) \
  TEST_TYPE (T, float) \
  TEST_TYPE (T, int64_t) \
  TEST_TYPE (T, uint64_t) \
  TEST_TYPE (T, double)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times " \.LEN_MASK_GATHER_LOAD" 46 "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "optimized" } } */
@@ -0,0 +1,84 @@
/* { dg-do run { target { riscv_vector } } } */

#include "strided_load-1.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
  DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
  DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
  INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3); \
  INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13); \
  for (INDEX##BITS i = 0; \
       i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++) \
    { \
      dest_##DATA_TYPE##_##BITS[i] \
	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE##_##BITS[i] \
	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE##_##BITS[i] \
	= (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
    } \
  f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
			  stride_##DATA_TYPE##_##BITS, \
			  n_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++) \
    { \
      assert ( \
	dest_##DATA_TYPE##_##BITS[i] \
	== (dest2_##DATA_TYPE##_##BITS[i] \
	    + src_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS])); \
    }

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 8)
  RUN_LOOP (uint16_t, 8)
  RUN_LOOP (_Float16, 8)
  RUN_LOOP (int32_t, 8)
  RUN_LOOP (uint32_t, 8)
  RUN_LOOP (float, 8)
  RUN_LOOP (int64_t, 8)
  RUN_LOOP (uint64_t, 8)
  RUN_LOOP (double, 8)

  RUN_LOOP (int8_t, 16)
  RUN_LOOP (uint8_t, 16)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 16)
  RUN_LOOP (uint32_t, 16)
  RUN_LOOP (float, 16)
  RUN_LOOP (int64_t, 16)
  RUN_LOOP (uint64_t, 16)
  RUN_LOOP (double, 16)

  RUN_LOOP (int8_t, 32)
  RUN_LOOP (uint8_t, 32)
  RUN_LOOP (int16_t, 32)
  RUN_LOOP (uint16_t, 32)
  RUN_LOOP (_Float16, 32)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 32)
  RUN_LOOP (uint64_t, 32)
  RUN_LOOP (double, 32)

  RUN_LOOP (int8_t, 64)
  RUN_LOOP (uint8_t, 64)
  RUN_LOOP (int16_t, 64)
  RUN_LOOP (uint16_t, 64)
  RUN_LOOP (_Float16, 64)
  RUN_LOOP (int32_t, 64)
  RUN_LOOP (uint32_t, 64)
  RUN_LOOP (float, 64)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,84 @@
/* { dg-do run { target { riscv_vector } } } */

#include "strided_load-2.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
  DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
  DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
  INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3); \
  INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13); \
  for (INDEX##BITS i = 0; \
       i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++) \
    { \
      dest_##DATA_TYPE##_##BITS[i] \
	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE##_##BITS[i] \
	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE##_##BITS[i] \
	= (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
    } \
  f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
			  stride_##DATA_TYPE##_##BITS, \
			  n_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++) \
    { \
      assert ( \
	dest_##DATA_TYPE##_##BITS[i] \
	== (dest2_##DATA_TYPE##_##BITS[i] \
	    + src_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS])); \
    }

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 8)
  RUN_LOOP (uint16_t, 8)
  RUN_LOOP (_Float16, 8)
  RUN_LOOP (int32_t, 8)
  RUN_LOOP (uint32_t, 8)
  RUN_LOOP (float, 8)
  RUN_LOOP (int64_t, 8)
  RUN_LOOP (uint64_t, 8)
  RUN_LOOP (double, 8)

  RUN_LOOP (int8_t, 16)
  RUN_LOOP (uint8_t, 16)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 16)
  RUN_LOOP (uint32_t, 16)
  RUN_LOOP (float, 16)
  RUN_LOOP (int64_t, 16)
  RUN_LOOP (uint64_t, 16)
  RUN_LOOP (double, 16)

  RUN_LOOP (int8_t, 32)
  RUN_LOOP (uint8_t, 32)
  RUN_LOOP (int16_t, 32)
  RUN_LOOP (uint16_t, 32)
  RUN_LOOP (_Float16, 32)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 32)
  RUN_LOOP (uint64_t, 32)
  RUN_LOOP (double, 32)

  RUN_LOOP (int8_t, 64)
  RUN_LOOP (uint8_t, 64)
  RUN_LOOP (int16_t, 64)
  RUN_LOOP (uint16_t, 64)
  RUN_LOOP (_Float16, 64)
  RUN_LOOP (int32_t, 64)
  RUN_LOOP (uint32_t, 64)
  RUN_LOOP (float, 64)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}
@@ -0,0 +1,45 @@
/* { dg-do compile } */
/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */

#include <stdint-gcc.h>

#ifndef INDEX8
#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t
#endif

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
			  INDEX##BITS stride, INDEX##BITS n) \
  { \
    for (INDEX##BITS i = 0; i < n; ++i) \
      dest[i * stride] = src[i] + BITS; \
  }

#define TEST_TYPE(T, DATA_TYPE) \
  T (DATA_TYPE, 8) \
  T (DATA_TYPE, 16) \
  T (DATA_TYPE, 32) \
  T (DATA_TYPE, 64)

#define TEST_ALL(T) \
  TEST_TYPE (T, int8_t) \
  TEST_TYPE (T, uint8_t) \
  TEST_TYPE (T, int16_t) \
  TEST_TYPE (T, uint16_t) \
  TEST_TYPE (T, _Float16) \
  TEST_TYPE (T, int32_t) \
  TEST_TYPE (T, uint32_t) \
  TEST_TYPE (T, float) \
  TEST_TYPE (T, int64_t) \
  TEST_TYPE (T, uint64_t) \
  TEST_TYPE (T, double)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times " \.LEN_MASK_SCATTER_STORE" 66 "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "optimized" } } */
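Again for reference, one instantiation of the store-side `TEST_LOOP` macro above can be hand-expanded like this. This is an illustrative sketch (the test generates these functions via `##` pasting); here the *variable* runtime stride is what makes the store a `LEN_MASK_SCATTER_STORE` candidate:

```c
#include <stdint.h>

/* Hand expansion of TEST_LOOP (uint16_t, 16): INDEX16 substitutes as
   int16_t and BITS as 16, giving a runtime-strided store loop.  */
void __attribute__ ((noinline, noclone))
f_uint16_t_16 (uint16_t *restrict dest, uint16_t *restrict src,
	       int16_t stride, int16_t n)
{
  for (int16_t i = 0; i < n; ++i)
    dest[i * stride] = src[i] + 16;
}
```

Since `stride` is only known at run time, the vectorizer must scatter the `n` contiguous source elements to addresses `dest + i * stride`, which the dg-final directives verify is expressed as `LEN_MASK_SCATTER_STORE` rather than plain `SCATTER_STORE` or `MASK_SCATTER_STORE`.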
@@ -0,0 +1,45 @@
/* { dg-do compile } */
/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */

#include <stdint-gcc.h>

#ifndef INDEX8
#define INDEX8 int8_t
#define INDEX16 int16_t
#define INDEX32 int32_t
#define INDEX64 int64_t
#endif

#define TEST_LOOP(DATA_TYPE, BITS) \
  void __attribute__ ((noinline, noclone)) \
  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src, \
			  INDEX##BITS stride, INDEX##BITS n) \
  { \
    for (INDEX##BITS i = 0; i < n; ++i) \
      dest[i * (BITS - 3)] = src[i] + BITS; \
  }

#define TEST_TYPE(T, DATA_TYPE) \
  T (DATA_TYPE, 8) \
  T (DATA_TYPE, 16) \
  T (DATA_TYPE, 32) \
  T (DATA_TYPE, 64)

#define TEST_ALL(T) \
  TEST_TYPE (T, int8_t) \
  TEST_TYPE (T, uint8_t) \
  TEST_TYPE (T, int16_t) \
  TEST_TYPE (T, uint16_t) \
  TEST_TYPE (T, _Float16) \
  TEST_TYPE (T, int32_t) \
  TEST_TYPE (T, uint32_t) \
  TEST_TYPE (T, float) \
  TEST_TYPE (T, int64_t) \
  TEST_TYPE (T, uint64_t) \
  TEST_TYPE (T, double)

TEST_ALL (TEST_LOOP)

/* { dg-final { scan-tree-dump-times " \.LEN_MASK_SCATTER_STORE" 44 "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "optimized" } } */
/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "optimized" } } */
@@ -0,0 +1,82 @@
/* { dg-do run { target { riscv_vector } } } */

#include "strided_store-1.c"
#include <assert.h>

int
main (void)
{
#define RUN_LOOP(DATA_TYPE, BITS) \
  DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
  DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
  DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)]; \
  INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3); \
  INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13); \
  for (INDEX##BITS i = 0; \
       i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++) \
    { \
      dest_##DATA_TYPE##_##BITS[i] \
	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      dest2_##DATA_TYPE##_##BITS[i] \
	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1)); \
      src_##DATA_TYPE##_##BITS[i] \
	= (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1)); \
    } \
  f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
			  stride_##DATA_TYPE##_##BITS, \
			  n_##DATA_TYPE##_##BITS); \
  for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++) \
    { \
      assert (dest_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS] \
	      == (src_##DATA_TYPE##_##BITS[i] + BITS)); \
    }

  RUN_LOOP (int8_t, 8)
  RUN_LOOP (uint8_t, 8)
  RUN_LOOP (int16_t, 8)
  RUN_LOOP (uint16_t, 8)
  RUN_LOOP (_Float16, 8)
  RUN_LOOP (int32_t, 8)
  RUN_LOOP (uint32_t, 8)
  RUN_LOOP (float, 8)
  RUN_LOOP (int64_t, 8)
  RUN_LOOP (uint64_t, 8)
  RUN_LOOP (double, 8)

  RUN_LOOP (int8_t, 16)
  RUN_LOOP (uint8_t, 16)
  RUN_LOOP (int16_t, 16)
  RUN_LOOP (uint16_t, 16)
  RUN_LOOP (_Float16, 16)
  RUN_LOOP (int32_t, 16)
  RUN_LOOP (uint32_t, 16)
  RUN_LOOP (float, 16)
  RUN_LOOP (int64_t, 16)
  RUN_LOOP (uint64_t, 16)
  RUN_LOOP (double, 16)

  RUN_LOOP (int8_t, 32)
  RUN_LOOP (uint8_t, 32)
  RUN_LOOP (int16_t, 32)
  RUN_LOOP (uint16_t, 32)
  RUN_LOOP (_Float16, 32)
  RUN_LOOP (int32_t, 32)
  RUN_LOOP (uint32_t, 32)
  RUN_LOOP (float, 32)
  RUN_LOOP (int64_t, 32)
  RUN_LOOP (uint64_t, 32)
  RUN_LOOP (double, 32)

  RUN_LOOP (int8_t, 64)
  RUN_LOOP (uint8_t, 64)
  RUN_LOOP (int16_t, 64)
  RUN_LOOP (uint16_t, 64)
  RUN_LOOP (_Float16, 64)
  RUN_LOOP (int32_t, 64)
  RUN_LOOP (uint32_t, 64)
  RUN_LOOP (float, 64)
  RUN_LOOP (int64_t, 64)
  RUN_LOOP (uint64_t, 64)
  RUN_LOOP (double, 64)
  return 0;
}