Ci flwadd #514
* Move wormhole concentrator outside pod ruche array * merge north and south vcaches traffic extend cid to 2-bit
* clean up branch trace * remote interrupt interface * npc_r * add mret encoding * fcsr load goes to rs2_val * interrupt csr test * add mcsr * interrupt trace test * set pc_init_val by _start; * interrupt remote test * remote interrrupt icache miss * trace interrupt icache miss test * CR * change remote interrupt eva * interrupt trace countdown * printing interrupt taken * interrupt trace float * mret display; debug_p * more test; remote interrupt ptr * interrupt trace jump loop * interrupt trace jump loop icache * interrupt dual source * interrupt trace branch loop icache * interrupt_trace_branch_mispredict_loop * remote load loop with interrupt
* io rtr tag client * set width_p for io rtr tag clients
* wh return fiof * fix
* Added reset_done port to dpi interface * Adds missing inputs to endpoint_to_fifos * Removed pod x/y * Adds CUDA kernel for DMA test * Adds missing file Co-authored-by: Dustin Richmond <dustinar@uw.edu>
* temp commit * fix regression * offset dmem start addr by 8 bytes for interrupt handler * CR * cr * adding .dmem.interrupt section; declaring _interrupt_arr in crt.S; update hello to use interrupt arr using extern * add comment
* Adding trace interrupt test for idiv, fdiv and imul * Adding remote interrupt tests with multiplier and divider * Testing multiple remote interrupts, icache misses in handler and different handler return strategies * Adding a interrupt test regression * Deleting duplicates within the interrupt regression suite without contaminating author history * Using a macro for the start code * Makefile cleanup * More regression makefile cleanup * Modifying macro name * Adding npc mret test * Deleting duplicate files and noting original author in specific files * Adding interrupt tests to the no recurse list * Adding a mini threading test * Directed test exposing the need for PR #463 * Adding a readme * Minor makefile modifications * Adding missing test to regression
* Rewrite Victim Cache Profiler Parser (#428) * Fixed vcache profiler bugs with re-write * fix tile parser to get correct min cycle no. for absolute total cycles calculation when using multiple tags (#480) * Fix header_print_p (for bigblade) Bugs: * Order of operations: Python binds or higher than > * Counting of repeated tags. Previously, only the last tag was counted * Iteration-order aware tag window New Features: * Atomic Misses * Response Stalls * More documentation * Remove deprecated code * Small field name modifications * Fixed issue with mismatched tags error * Fixed issue where TG origin/dim were confused for Device origin/dim Co-authored-by: Emily Furst <eafurst@cs.washington.edu>
* imul mux order * fp_exe flush minimize toggles * add comment * add comment 2
refactor dram hash func
* io rtr * pod row refactor * move bsg_tag out of pod * remove unused port * clean up * move out wh ruche buffers * move out west ruche buffers
* num_clk_ports_p for subarray * fix x -> c * fix * add param in pod row
    // Machine Format:
    // rs1 rs2 rd opcode
    // 0000000_?????_?????_111_?????_0000100
    `define RV32_FLWADD_OP 7'b0000100
// Pulpino supports a post-increment instruction in their compiler
// here: https://github.com/taylor-bsg/pulp-riscv-gcc/blame/master/gcc/config/riscv/riscv.md#L3663
// this can be used as a reference for adding compiler support
// for instructions with sideeffects to addresses.
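For illustration, the bit pattern in the machine-format comment quoted above can be matched with a single `casez` item. This is a minimal sketch under stated assumptions: the module name `flwadd_match_sketch` and its ports are hypothetical and do not appear in the PR, and the pattern is copied verbatim from the quoted comment.

```systemverilog
// Sketch only: a hypothetical standalone matcher, not code from the PR.
module flwadd_match_sketch
  (input  logic [31:0] instruction_i   // raw 32-bit instruction word
   , output logic      is_flwadd_o     // high when the word matches the quoted pattern
  );

  always_comb begin
    casez (instruction_i)
      // Pattern taken from the "Machine Format" comment; '?' marks don't-care bits.
      32'b0000000_?????_?????_111_?????_0000100: is_flwadd_o = 1'b1;
      default:                                   is_flwadd_o = 1'b0;
    endcase
  end

endmodule
```

Matching on the full word keeps the fixed fields and the opcode in one place, which is one possible way to avoid re-checking them inside a per-opcode case branch.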
    // Load & Store
    logic is_load_op; // Op loads data from memory
    logic is_store_op; // Op stores data to memory
    logic is_load_op; // Op is lw or flw
    | (instruction_i.funct7 == `RV32_FCVT_S_F2I_FUN7)); // FCVT.W.S, FCVT.WU.S
    end
    `RV32_FLWADD_OP: begin
      decode_o.write_rd = (instruction_i.funct7 == 7'b0000000) & (instruction_i.funct3 == 3'b111) & (instruction_i.rs1 != '0);
Presumably part of this is redundant, given that we are already in the RV32_FLWADD_OP case?
We can probably just add FLWADD_OP to the standard set of cases at the top of the casez statement?
      decode_o.write_rd = (instruction_i.funct7 == 7'b0000000) & (instruction_i.funct3 == 3'b111) & (instruction_i.rs1 != '0);
    end
    `RV32_SYSTEM: begin
      decode_o.write_rd = (instruction_i.rd != '0); // CSRRW, CSRRS
combine with case above?
    `RV32_BRANCH, `RV32_STORE, `RV32_OP: begin
      decode_o.read_rs2 = 1'b1;
    end
    `RV32_FLWADD_OP: begin

    decode_o.read_frs2 = 1'b0;
    decode_o.read_frs3 = 1'b0;
    decode_o.write_frd = 1'b1;
    decode_o.is_fp_op = 1'b0;
Add a comment explaining why FLWADD is not an fp op.
    assign dmem_v_o = is_local_dmem_addr &
      (exe_decode_i.is_load_op | exe_decode_i.is_store_op |
      exe_decode_i.is_lr_op | exe_decode_i.is_lr_aq_op);
      exe_decode_i.is_lr_op | exe_decode_i.is_lr_aq_op | exe_decode_i.is_flwadd_op);
Now is_load_op includes is_flwadd_op, right?
    assign remote_req_v_o = icache_miss_i |
      ((exe_decode_i.is_load_op | exe_decode_i.is_store_op | exe_decode_i.is_amo_op) & ~is_local_dmem_addr);
      ((exe_decode_i.is_load_op | exe_decode_i.is_store_op | exe_decode_i.is_amo_op | exe_decode_i.is_flwadd_op) & ~is_local_dmem_addr);

    // is_flwadd_op has exe_r.decode.write_rd high will not trigger int_remote_load_in_exe
    wire int_remote_load_in_exe = remote_req_in_exe & exe_r.decode.is_load_op & exe_r.decode.write_rd;
    wire float_remote_load_in_exe = remote_req_in_exe & exe_r.decode.is_load_op & exe_r.decode.write_frd;
    wire float_remote_load_in_exe = remote_req_in_exe & (exe_r.decode.is_load_op | exe_r.decode.is_flwadd_op) & exe_r.decode.write_frd;
is_load_op now includes is_flwadd_op?
wire int_remote_load_in_exe = remote_req_in_exe & exe_r.decode.is_load_op & ~exe_r.decode.is_flwadd_op & exe_r.decode.write_rd;
    |(id_r.decode.read_frs2 & (id_rs2 == exe_r.instruction.rd) & exe_r.decode.write_frd)
    |(id_r.decode.read_frs3 & (id_rs3 == exe_r.instruction.rd) & exe_r.decode.write_frd));
seems like the above is redundant if we include flwadd in local_load_in_exe
taylor-bsg left a comment
High-level feedback: the code is simpler if is_load includes flwadd (which is currently the case, but the code does not reflect this).
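A minimal sketch of how the request and hazard signals discussed in this review might look if is_load_op is taken to include flwadd. Everything here is illustrative: `decode_sketch_s` is a hypothetical cut-down struct rather than the real decode struct, the module and port names are made up, and the `int_remote_load_in_exe` expression follows the one-line suggestion quoted earlier in the review.

```systemverilog
// Sketch only: hypothetical cut-down decode struct, not the project's real one.
typedef struct packed {
  logic is_load_op;    // ASSUMPTION: set for lw, flw and flwadd
  logic is_store_op;
  logic is_amo_op;
  logic is_lr_op;
  logic is_lr_aq_op;
  logic is_flwadd_op;
  logic write_rd;
  logic write_frd;
} decode_sketch_s;

module flwadd_load_sketch
  (input  decode_sketch_s exe_decode_i
   , input  logic is_local_dmem_addr_i
   , input  logic icache_miss_i
   , input  logic remote_req_in_exe_i
   , output logic dmem_v_o
   , output logic remote_req_v_o
   , output logic int_remote_load_in_exe_o
   , output logic float_remote_load_in_exe_o
  );

  // With flwadd folded into is_load_op, no separate is_flwadd_op term is needed here.
  assign dmem_v_o = is_local_dmem_addr_i &
    (exe_decode_i.is_load_op | exe_decode_i.is_store_op |
     exe_decode_i.is_lr_op | exe_decode_i.is_lr_aq_op);

  assign remote_req_v_o = icache_miss_i |
    ((exe_decode_i.is_load_op | exe_decode_i.is_store_op | exe_decode_i.is_amo_op)
     & ~is_local_dmem_addr_i);

  // flwadd also drives write_rd, so it is masked out of the integer remote-load term.
  assign int_remote_load_in_exe_o = remote_req_in_exe_i & exe_decode_i.is_load_op
    & ~exe_decode_i.is_flwadd_op & exe_decode_i.write_rd;

  // The float remote-load term covers flw and flwadd through is_load_op alone.
  assign float_remote_load_in_exe_o = remote_req_in_exe_i & exe_decode_i.is_load_op
    & exe_decode_i.write_frd;

endmodule
```

Under that assumption, only the integer remote-load term needs to mention is_flwadd_op explicitly; the other three expressions reduce to their pre-PR form.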