ARC-V RHX-100 upstream patch series#192
Conversation
MichielDerhaeg
left a comment
There was a problem hiding this comment.
I tried to split up the commits in something sensible. Didn't check whether they can be built individually though.
| case SIGN_EXTRACT: | ||
| if (TARGET_XTHEADBB && outer_code == SET | ||
| if ((TARGET_ARCV_RHX100 || TARGET_XTHEADBB) | ||
| && outer_code == SET |
There was a problem hiding this comment.
FYI, this was added for the bit-extract fusion.
| (define_insn "*zero_extract_fused" | ||
| [(set (match_operand:SI 0 "register_operand" "=r") | ||
| (zero_extract:SI (match_operand:SI 1 "register_operand" "r") | ||
| (match_operand 2 "const_int_operand") | ||
| (match_operand 3 "const_int_operand")))] | ||
| "TARGET_ARCV_RHX100 && !TARGET_64BIT | ||
| && (INTVAL (operands[2]) > 1 || !TARGET_ZBS)" | ||
| { | ||
| int amount = INTVAL (operands[2]); | ||
| int end = INTVAL (operands[3]) + amount; | ||
| operands[2] = GEN_INT (BITS_PER_WORD - end); | ||
| operands[3] = GEN_INT (BITS_PER_WORD - amount); | ||
| return "slli\t%0,%1,%2\n\tsrli\t%0,%0,%3"; | ||
| } | ||
| [(set_attr "type" "alu_fused")] | ||
| ) |
There was a problem hiding this comment.
As far as I can tell, this fusion was never implemented as a define_insn_and_split. Might not be trivial to force these exact instructions after a split.
f61d7aa to
d479345
Compare
00daa51 to
d625af5
Compare
39e1a5e to
b64f328
Compare
24c1ff5 to
45a3bae
Compare
| if (!TARGET_ARCV_RHX100) | ||
| return priority; | ||
|
|
There was a problem hiding this comment.
We already check whether the fusion was enabled in the caller.
There was a problem hiding this comment.
Do we want to check RISCV_FUSE_ARCV or specifically RHX100 ? I know currently it's the same think, but for future purposes.
There was a problem hiding this comment.
I think we want most things to be gated by RISCV_FUSE_ARCV except for the things added by the scheduling patch (that adds reorder2, etc).
The scheduling patch adds things that are really specific about the pipeline definition but everything else is just about the fusion schedule grouping.
| extern bool arcv_can_issue_more_p (int, int); | ||
| extern int arcv_sched_variable_issue (rtx_insn *, int); | ||
| extern void arcv_sched_init (void); | ||
| extern int arcv_sched_reorder2 (rtx_insn **, int *); | ||
| extern int arcv_sched_adjust_priority (rtx_insn *, int); | ||
| extern int arcv_sched_adjust_cost (rtx_insn *, int, int); |
There was a problem hiding this comment.
These need to be added in the scheduling pass.
There was a problem hiding this comment.
I might have messed up during rebasing. Fixing it...
b27a878 to
f86d7b8
Compare
9392bf1 to
f8745dc
Compare
4ab7f9c to
c0cb07a
Compare
c0cb07a to
682b1c3
Compare
64cf146 to
c6c53a7
Compare
fae1aae to
b6f8190
Compare
|
@MichielDerhaeg Take a look at the current fixups |
|
|
||
| return false; | ||
| } | ||
|
|
There was a problem hiding this comment.
Upstream covers ld+lui already.
We can split this up so we just have to add st+lui.
There was a problem hiding this comment.
Ive simplified and removed the function call. I believe its easier to read like this: 5d4f626
| if (riscv_fusion_enabled_p (RISCV_FUSE_MEMOP_LUI)) | ||
| { | ||
| if (riscv_memop_lui_pair_p (prev, curr)) | ||
| { | ||
| if (dump_file) | ||
| fprintf (dump_file, "RISCV_FUSE_MEMOP_LUI (prev, curr)\n"); | ||
| return true; | ||
| } | ||
| if (riscv_memop_lui_pair_p (curr, prev)) | ||
| { | ||
| if (dump_file) | ||
| fprintf (dump_file, "RISCV_FUSE_MEMOP_LUI (curr, prev)\n"); | ||
| return true; | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
If we merge part of this fusion with the upstream version. We can outline the upstream version and add a _REV fusion type for the reversed case.
There was a problem hiding this comment.
I have created a new RISCV_FUSE_LUI_LD_REV alongside the other RISCV_FUSE_LUI_LD See 5ffb20e
But, shouldnt we align with the others? Either adapt this to use (curr, prev) or make it more modular and create new fusion types for the reversed cases. e.g., RISCV_FUSE_STORE_LUI and RISCV_FUSE_STORE_LUI_REV
There was a problem hiding this comment.
I would outline the original fusions, and just reverse the input params for _rev.
|
@exur00 not sure if this happened already, but the next_fusible_insn has to be fixed here as well |
I have added it |
| RISCV_FUSE_ADJACENT_STORE = (1 << 17), | ||
| RISCV_FUSE_LS_UPDATE = (1 << 18), | ||
| RISCV_FUSE_MEMOP_LUI = (1 << 19), | ||
| RISCV_FUSE_STORE_LUI = (1 << 19), |
There was a problem hiding this comment.
I would call it RISCV_FUSE_LUI_ST to align with the old ones.
| volatile int g1, g2, g3; | ||
|
|
||
| void | ||
| fuse_store_lui (volatile int *p, int val) | ||
| { | ||
| *p = val; | ||
| g1 = val; | ||
| p[1] = val; | ||
| g2 = val; | ||
| p[2] = val; | ||
| g3 = val; | ||
| } | ||
|
|
||
| /* { dg-final { scan-rtl-dump "RISCV_FUSE_STORE_LUI \\(prev, curr\\)" "sched2" } } */ | ||
| /* { dg-final { scan-rtl-dump "RISCV_FUSE_STORE_LUI \\(curr, prev\\)" "sched2" } } */ |
There was a problem hiding this comment.
Why the changes here?
There was a problem hiding this comment.
RISCV_FUSE_MEMOP_LUI used to handle both load and store LUI fusion. When we split them apart, the existing test case didnt cover RISCV_FUSE_LUI_ST (previously RISCV_FUSE_STORE_LUI), so I added a test case for it.
| if (GET_MODE (store_src) != GET_MODE (load_dest)) | ||
| fprintf (dump_file, "RISCV_FUSE_LI_STORE (subreg)\n"); | ||
| else | ||
| fprintf (dump_file, "RISCV_FUSE_LI_STORE\n"); | ||
| } |
There was a problem hiding this comment.
What's the point of this?
There was a problem hiding this comment.
This was already in the original code. Just distinguishes in the dump whether the li and store operate on the same mode or not. (e..g, li sets a DImode reg but the store uses it as SImode). Happy to drop the distinction and just print RISCV_FUSE_LI_STORE if you think it's just noise.
There was a problem hiding this comment.
It can go, I think it isn't useful.
This patch adds the fusion patterns for the Synopsys RHX-100. All of these
patterns will be enabled when RISCV_FUSE_ARCV is added to the tune_info.
It prioritizes to double load/store fusion, suppressing the other types until
sched2.
A new arcv.cc file is added to contain the new ARC-V specific functions, and
the preexisting arcv_* functions are moved from riscv.cc to this new file.
gcc/ChangeLog:
* config.gcc: Add arcv.o to extra_objs.
* config/riscv/riscv.cc (riscv_macro_fusion_pair_p): New function.
* config/riscv/t-riscv: Add arcv.o build rule.
* config/riscv/arcv.cc: New file.
Co-authored-by: Artemiy Volkov <artemiyv@acm.org>
Co-authored-by: Michiel Derhaeg <michiel@synopsys.com>
Signed-off-by: Luis Silva <luiss@synopsys.com>
…eries.
This patch implements the TARGET_SCHED_FUSION_PRIORITY hook for the Synopsys
RHX-100 processor to improve instruction scheduling by prioritizing fusible
memory operations.
To take better advantage of double load/store fusion, make use of the
sched_fusion pass that assigns unique "fusion priorities" to load/store
instructions and schedules operations on adjacent addresses together. This
maximizes the probability that loads/stores are fused between each other
rather than with other instructions.
gcc/ChangeLog:
* config/riscv/arcv.cc (arcv_fusion_load_store): New function.
(arcv_sched_fusion_priority): New function.
* config/riscv/riscv.cc (riscv_sched_fusion_priority): New function.
(TARGET_SCHED_FUSION_PRIORITY): Define hook.
Co-authored-by: Artemiy Volkov <artemiyv@acm.org>
Co-authored-by: Michiel Derhaeg <michiel@synopsys.com>
Signed-off-by: Luis Silva <luiss@synopsys.com>
This patch implements instruction scheduling support for the dual-issue Synopsys
RHX-100 processor by adding scheduler hooks and state tracking for the two
execution pipes.
These hooks address microarchitectural details not covered by the pipeline
description.
The riscv_sched_variable_issue () and riscv_sched_reorder2 () hooks work
together to make sure that:
- the critical path and the instruction priorities are respected;
- both pipes are filled (taking advantage of parallel dispatch within the
microarchitectural constraints);
- there is as much fusion going on as possible;
- the existing fusion pairs are not broken up.
riscv_sched_adjust_priority () slightly bumps the priority of load/store pairs.
As a result it becomes easier for riscv_sched_reorder2 () to schedule
instructions in the memory pipe.
gcc/ChangeLog:
* config/riscv/riscv-protos.h (arcv_sched_init): New declaration.
(arcv_sched_reorder2): declaration.
(arcv_sched_adjust_priority): New declaration.
(arcv_sched_adjust_cost): New declaration.
(arcv_can_issue_more_p): New declaration.
(arcv_sched_variable_issue): New declaration.
* config/riscv/arcv.cc (struct arcv_sched_state): New struct.
(arcv_sched_init): New function.
(arcv_next_fusible_insn): New function.
(arcv_sched_reorder2): New function.
(arcv_sched_adjust_priority): New function.
(arcv_sched_adjust_cost): New function.
(arcv_can_issue_more_p): New function.
(arcv_sched_variable_issue): New function.
* config/riscv/riscv.cc (riscv_fusion_enabled_p): Add forward
declaration.
(riscv_sched_init): Add call to arcv_shed_init.
(riscv_sched_variable_issue): Add ARC-V-specific handling.
(riscv_sched_adjust_cost): Add ARC-V-specific cost adjustment and fix
parameter names.
(riscv_sched_adjust_priority): New function.
(riscv_sched_reorder2): New function.
(TARGET_SCHED_ADJUST_PRIORITY): Define hook.
(TARGET_SCHED_REORDER2): Define hook.
* config/riscv/riscv.h (TARGET_ARCV_RHX100): New macro.
Co-authored-by: Artemiy Volkov <artemiyv@acm.org>
Co-authored-by: Michiel Derhaeg <michiel@synopsys.com>
Co-authored-by: Alex Turjan <turjan@synopsys.com>
Signed-off-by: Luis Silva <luiss@synopsys.com>
…act fusion.
This patch adds instruction patterns to support fusion of multiply-add and bit
extraction sequences for the Synopsys RHX-100 processor. This increases the
likelihood that fusible sequences are produced in more situations.
gcc/ChangeLog:
* config/riscv/arcv-rhx100.md (arcv_rhx100_imul_fused): New reservation.
(arcv_rhx100_alu_fused): New reservation.
* config/riscv/iterators.md (is_zero_extract): New code attribute.
* config/riscv/riscv.cc (riscv_rtx_costs): Reduce cost for zero_extract
for RHX-100.
* config/riscv/riscv.md: Add imul_fused and alu_fused type attributes.
(umaddhisi4): New expand.
(madd_split): New insn_and_split.
(madd_split_extended): New insn_and_split.
(*zero_extract_fused): New insn_and_split.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arcv-fusion-limm-condbr.c: New test.
* gcc.target/riscv/arcv-fusion-madd.c: New test.
* gcc.target/riscv/arcv-fusion-xbfu.c: New test.
Co-authored-by: Artemiy Volkov <artemiyv@acm.org>
Co-authored-by: Michiel Derhaeg <michiel@synopsys.com>
Signed-off-by: Luis Silva <luiss@synopsys.com>
1ccf8e0 to
694295e
Compare
No description provided.