Ci flwadd #514

Open
tommydcjung wants to merge 82 commits into ci_bigblade from ci_flwadd
@tommydcjung

Contributor

No description provided.

tommydcjung and others added 30 commits January 11, 2021 15:21
* Move wormhole concentrator outside pod ruche array

* merge north and south vcaches traffic
extend cid to 2-bit
* clean up branch trace

* remote interrupt interface

* npc_r

* add mret encoding

* fcsr load goes to rs2_val

* interrupt csr test

* add mcsr

* interrupt trace test

* set pc_init_val by _start;

* interrupt remote test

* remote interrupt icache miss

* trace interrupt icache miss test

* CR

* change remote interrupt eva

* interrupt trace countdown

* printing interrupt taken

* interrupt trace float

* mret display; debug_p

* more test; remote interrupt ptr

* interrupt trace jump loop

* interrupt trace jump loop icache

* interrupt dual source

* interrupt trace branch loop icache

* interrupt_trace_branch_mispredict_loop

* remote load loop with interrupt
* io rtr tag client

* set width_p for io rtr tag clients
* wh return fifo

* fix
* Added reset_done port to dpi interface

* Adds missing inputs to endpoint_to_fifos

* Removed pod x/y

* Adds CUDA kernel for DMA test

* Adds missing file

Co-authored-by: Dustin Richmond <dustinar@uw.edu>
* temp commit

* fix regression

* offset dmem start addr by 8 bytes for interrupt handler

* CR

* cr

* adding .dmem.interrupt section; declaring _interrupt_arr in crt.S; update hello to use interrupt arr using extern

* add comment
* Adding trace interrupt test for idiv, fdiv and imul

* Adding remote interrupt tests with multiplier and divider

* Testing multiple remote interrupts, icache misses in handler and different handler return strategies

* Adding an interrupt test regression

* Deleting duplicates within the interrupt regression suite without contaminating author history

* Using a macro for the start code

* Makefile cleanup

* More regression makefile cleanup

* Modifying macro name

* Adding npc mret test

* Deleting duplicate files and noting original author in specific files

* Adding interrupt tests to the no recurse list

* Adding a mini threading test

* Directed test exposing the need for PR #463

* Adding a readme

* Minor makefile modifications

* Adding missing test to regression
tommydcjung and others added 16 commits March 30, 2021 08:12
* Rewrite Victim Cache Profiler Parser (#428)
* Fixed vcache profiler bugs with re-write
* fix tile parser to get correct min cycle no. for absolute total cycles calculation when using multiple tags (#480)
* Fix header_print_p (for bigblade)

Bugs:
* Order of operations: Python binds or higher than >
* Counting of repeated tags. Previously, only the last tag was counted
* Iteration-order aware tag window

New Features:
* Atomic Misses
* Response Stalls
* More documentation

* Remove deprecated code
* Small field name modifications
* Fixed issue with mismatched tags error
* Fixed issue where TG origin/dim were confused for Device origin/dim


Co-authored-by: Emily Furst <eafurst@cs.washington.edu>
* imul mux order

* fp_exe flush minimize toggles

* add comment

* add comment 2
* io rtr

* pod row refactor

* move bsg_tag out of pod

* remove unused port

* clean up

* move out wh ruche buffers

* move out west ruche buffers
* num_clk_ports_p for subarray

* fix x -> c

* fix

* add param in pod row
@tommydcjung

Contributor Author

// Machine Format:
// rs1 rs2 rd opcode
// 0000000_?????_?????_111_?????_0000100
`define RV32_FLWADD_OP 7'b0000100
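
As a rough illustration of the encoding above, the sketch below matches a 32-bit instruction word against the same bit pattern. The module name flwadd_match and its port names are hypothetical and are not taken from the PR; only the bit pattern comes from the Machine Format comment.

```systemverilog
// Hypothetical helper (not part of the PR): flags words that match the
// bit pattern given in the Machine Format comment.
module flwadd_match (
  input  logic [31:0] instruction_i,
  output logic        is_flwadd_o
);
  always_comb begin
    casez (instruction_i)
      // Top 7 bits 0000000, bits [14:12] = 111, low 7 bits 0000100;
      // the three 5-bit register-index fields are wildcards.
      32'b0000000_?????_?????_111_?????_0000100: is_flwadd_o = 1'b1;
      default:                                   is_flwadd_o = 1'b0;
    endcase
  end
endmodule
```

Matching the whole word in one casez item keeps every constraint of the encoding in a single place, which is one way to avoid re-checking individual fields inside an opcode case.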

Contributor

// Pulpino supports a post-increment instruction in their compiler
// here: https://github.com/taylor-bsg/pulp-riscv-gcc/blame/master/gcc/config/riscv/riscv.md#L3663
// this can be used as a reference for adding compiler support
// for instructions with sideeffects to addresses.

// Load & Store
logic is_load_op; // Op loads data from memory
logic is_store_op; // Op stores data to memory
logic is_load_op; // Op is lw or flw

Contributor

should now add flwadd?

| (instruction_i.funct7 == `RV32_FCVT_S_F2I_FUN7)); // FCVT.W.S, FCVT.WU.S
end
`RV32_FLWADD_OP: begin
decode_o.write_rd = (instruction_i.funct7 == 7'b0000000) & (instruction_i.funct3 == 3'b111) & (instruction_i.rs1 != '0);

Contributor

Presumably part of this is redundant, given that we are already in the RV32_FLWADD_OP case?

Contributor

Probably can just add FLWADD_OP to standard set of cases at the top of the casez statement?

decode_o.write_rd = (instruction_i.funct7 == 7'b0000000) & (instruction_i.funct3 == 3'b111) & (instruction_i.rs1 != '0);
end
`RV32_SYSTEM: begin
decode_o.write_rd = (instruction_i.rd != '0); // CSRRW, CSRRS

Contributor

combine with case above?

`RV32_BRANCH, `RV32_STORE, `RV32_OP: begin
decode_o.read_rs2 = 1'b1;
end
`RV32_FLWADD_OP: begin

Contributor

combine with above?

decode_o.read_frs2 = 1'b0;
decode_o.read_frs3 = 1'b0;
decode_o.write_frd = 1'b1;
decode_o.is_fp_op = 1'b0;

Contributor

comment why FLWADD is not an fp op

Comment thread v/vanilla_bean/lsu.v
assign dmem_v_o = is_local_dmem_addr &
(exe_decode_i.is_load_op | exe_decode_i.is_store_op |
- exe_decode_i.is_lr_op | exe_decode_i.is_lr_aq_op);
+ exe_decode_i.is_lr_op | exe_decode_i.is_lr_aq_op | exe_decode_i.is_flwadd_op);

Contributor

now is_load_op includes is_flwadd_op right?

Comment thread v/vanilla_bean/lsu.v

assign remote_req_v_o = icache_miss_i |
- ((exe_decode_i.is_load_op | exe_decode_i.is_store_op | exe_decode_i.is_amo_op) & ~is_local_dmem_addr);
+ ((exe_decode_i.is_load_op | exe_decode_i.is_store_op | exe_decode_i.is_amo_op | exe_decode_i.is_flwadd_op) & ~is_local_dmem_addr);

Contributor

same comment as above?

// is_flwadd_op has exe_r.decode.write_rd high will not trigger int_remote_load_in_exe
wire int_remote_load_in_exe = remote_req_in_exe & exe_r.decode.is_load_op & exe_r.decode.write_rd;
- wire float_remote_load_in_exe = remote_req_in_exe & exe_r.decode.is_load_op & exe_r.decode.write_frd;
+ wire float_remote_load_in_exe = remote_req_in_exe & (exe_r.decode.is_load_op | exe_r.decode.is_flwadd_op) & exe_r.decode.write_frd;

Contributor

is_load_op now includes is_flwadd_op?

Contributor

wire int_remote_load_in_exe = remote_req_in_exe & exe_r.decode.is_load_op & ~exe_r.decode.is_flwadd_op & exe_r.decode.write_rd;

|(id_r.decode.read_frs2 & (id_rs2 == exe_r.instruction.rd) & exe_r.decode.write_frd)
|(id_r.decode.read_frs3 & (id_rs3 == exe_r.instruction.rd) & exe_r.decode.write_frd));


Contributor

seems like the above is redundant if we include flwadd in local_load_in_exe

@taylor-bsg taylor-bsg left a comment

Contributor

High-level feedback: the code is simpler if is_load includes flwadd (which is currently the case, but the code does not reflect this).
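
A minimal sketch of that simplification, assuming the decoder asserts is_load_op for flwadd. The struct decode_sketch_s and the module lsu_valid_sketch are hypothetical and keep only the fields used here; the two assign statements mirror the pre-change lsu.v lines quoted earlier in this thread.

```systemverilog
// Hypothetical, trimmed-down decode struct: only the fields this sketch needs.
typedef struct packed {
  logic is_load_op;   // assumed to be set for lw, flw, and flwadd
  logic is_store_op;
  logic is_amo_op;
  logic is_lr_op;
  logic is_lr_aq_op;
} decode_sketch_s;

module lsu_valid_sketch (
  input  decode_sketch_s exe_decode_i,
  input  logic           is_local_dmem_addr,
  input  logic           icache_miss_i,
  output logic           dmem_v_o,
  output logic           remote_req_v_o
);
  // No separate is_flwadd_op term: flwadd is already covered by is_load_op.
  assign dmem_v_o = is_local_dmem_addr &
    (exe_decode_i.is_load_op | exe_decode_i.is_store_op |
     exe_decode_i.is_lr_op   | exe_decode_i.is_lr_aq_op);

  assign remote_req_v_o = icache_miss_i |
    ((exe_decode_i.is_load_op | exe_decode_i.is_store_op | exe_decode_i.is_amo_op)
     & ~is_local_dmem_addr);
endmodule
```

Under this assumption, the integer remote-load check would probably still need the ~is_flwadd_op qualifier suggested earlier in the thread, since flwadd can also set write_rd.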

7 participants