Parametric systolic design for matrix multiplication on an M×M 2D torus (wrap-around) mesh using a Cannon-style alignment and shift schedule. Each Processing Element (PE) performs a local MAC while streaming A left and B up across the mesh.
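The alignment and shift schedule described above can be checked against a small software reference model. The following is a minimal sketch in Python, not taken from the RTL: the function name `cannon_matmul` is hypothetical, and it assumes the standard Cannon schedule (row i of A pre-rotated left by i, column j of B pre-rotated up by j, then M rounds of MAC followed by a one-step shift).

```python
def cannon_matmul(A, B):
    """Multiply two MxM matrices with a Cannon-style schedule on a torus.

    Hypothetical reference model; cell (i, j) stands in for one PE.
    """
    M = len(A)
    # Initial alignment: rotate row i of A left by i, column j of B up by j.
    a = [[A[i][(j + i) % M] for j in range(M)] for i in range(M)]
    b = [[B[(i + j) % M][j] for j in range(M)] for i in range(M)]
    acc = [[0] * M for _ in range(M)]

    for _ in range(M):
        # Every PE performs one local multiply-accumulate.
        for i in range(M):
            for j in range(M):
                acc[i][j] += a[i][j] * b[i][j]
        # Stream A one step left and B one step up, wrapping at the edges.
        a = [[a[i][(j + 1) % M] for j in range(M)] for i in range(M)]
        b = [[b[(i + 1) % M][j] for j in range(M)] for i in range(M)]
    return acc
```

A model like this can serve as a golden reference when comparing testbench results, provided the RTL follows the same alignment convention.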
- `PE`: Local registers (`A`, `B`, `ACC`) and a small FSM (`IDLE`, `LOAD`, `COMPUTE`, `DONE`)
- `Top_Mesh`: Generates an M×M torus-connected grid (wrap-around on rows/cols)
- Testbenches: deterministic / random (seedable) / PE trace (cycle-by-cycle internal visibility)
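To make the PE description concrete, here is a behavioural sketch of one PE in Python. Only the register names and the four state names come from the list above. The transition conditions, the `start` input, the `steps` counter, and the `tick` method are assumptions made for illustration and may not match `rtl/PE.v`.

```python
from enum import Enum


class State(Enum):
    IDLE = 0
    LOAD = 1
    COMPUTE = 2
    DONE = 3


class PE:
    """Hypothetical behavioural model of one processing element."""

    def __init__(self, steps):
        self.steps = steps          # assumed: number of MAC cycles (M for an MxM mesh)
        self.state = State.IDLE
        self.a = self.b = self.acc = 0
        self.count = 0

    def tick(self, start=False, a_in=0, b_in=0):
        """Advance one clock cycle; a_in/b_in arrive from the neighbouring PEs."""
        if self.state is State.IDLE:
            if start:
                self.state = State.LOAD
        elif self.state is State.LOAD:
            # Latch the pre-aligned operands and clear the accumulator.
            self.a, self.b, self.acc, self.count = a_in, b_in, 0, 0
            self.state = State.COMPUTE
        elif self.state is State.COMPUTE:
            # MAC on the current operands, then take the shifted-in operands.
            self.acc += self.a * self.b
            self.a, self.b = a_in, b_in
            self.count += 1
            if self.count == self.steps:
                self.state = State.DONE
        return self.state
```

In this sketch the PE stays in `DONE` until it is re-created; whether the real FSM returns to `IDLE` is not stated here.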
rtl/
├── PE.v
└── Top_Mesh.v
tb/
├── tb_cannon.v
├── tb_cannon_rand.v
└── tb_cannon_pe_trace.v
sim/
├── run.do
├── wave.do
├── run_rand.do
├── wave_rand.do
├── run_pe.do
└── wave_pe.do
utilization_report.txt
Run from the repository root:
do sim/run.do
do sim/run_rand.do
do sim/run_rand.do 12345
do sim/run_pe.do
do sim/run_pe.do 12345

Note: Waveform configurations are loaded automatically via `sim/wave*.do`.
Update parameters consistently in rtl/Top_Mesh.v and the selected testbench:
| Parameter | Default | Description |
|---|---|---|
| `M` | 5 | Mesh dimension (M×M grid) |
| `WIDTH` | 16 | Bit-width of each matrix element |
Important: When changing `M` or `WIDTH`, update both `rtl/Top_Mesh.v` and the chosen testbench in `tb/`.
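When choosing `M` and `WIDTH`, it can help to check how many bits an accumulator needs to hold a full dot product without overflow. The sketch below is an illustration only: the helper name `acc_bits_unsigned` is hypothetical, it assumes unsigned elements, and the actual `ACC` width and signedness in the RTL are not stated here.

```python
def acc_bits_unsigned(m, width):
    """Bits needed for a sum of m products of two unsigned width-bit values."""
    max_element = (1 << width) - 1
    max_sum = m * max_element * max_element
    return max_sum.bit_length()


# With the default parameters (M=5, WIDTH=16) this evaluates to 35.
print(acc_bits_unsigned(5, 16))
```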
Simulated in ModelSim/Questa and synthesized in Vivado 2025.1.1. Utilization snapshot: `utilization_report.txt`.
Tip: For deployment, wrap top-level I/O behind a standard interface (e.g., AXI-lite / AXI-stream) as needed.