This repository provides an extended and unified framework for the PUMA Compiler and PUMA Simulator, enabling detailed simulation of in-memory computing (IMC) architectures.
PUMASim-v2 is used in the following works:
- Scalable In-Memory Computing Architecture with Optimal Weight-Data Flow from Off-chip DRAM
- HASTILY: Hardware-Software Co-Design for Accelerating Transformer Inference Leveraging Compute-in-Memory
- PUMA Compiler was originally developed by Izzat El Hajj (UIUC) (https://github.com/illinois-impact/puma-compiler)
- PUMA Simulator was originally developed by Aayush Ankit (Purdue) (https://github.com/Aayush-Ankit/puma-simulator)
This repository extends and integrates both tools to support enhanced architectural modeling, scheduling, and analysis.
PUMASim-v2 consists of two components:
- PUMA Compiler (C++)
  Translates a neural network model (e.g., CNNs or Transformers) into PUMA ISA instruction files for each tile and core.
- PUMA Simulator (Python)
  Executes the generated instruction files to evaluate latency, energy, and area.
Both components are required to run a complete simulation.
This section provides a basic end-to-end workflow.
export LD_LIBRARY_PATH=<YOUR_PUMA_COMPILER_PATH>/src:$LD_LIBRARY_PATH
Example:
export LD_LIBRARY_PATH=/local/scratch/a/user/PUMASim-v2/puma-compiler-v4/src:$LD_LIBRARY_PATH
cd puma-compiler-v4/src
make
cd ../test
make clean
make cnn.test
./cnn.test
This generates multiple .puma instruction files, for example:
cnn-tile0-core0.puma
cnn-tile0-core1.puma
...
Each .puma file contains the PUMA ISA instructions executed by a specific tile and core.
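As a quick sanity check on the compiler output, the generated files can be grouped by tile and core. The following is a minimal sketch, not part of PUMASim-v2: it assumes only the file naming pattern shown in the example above (`<model>-tile<T>-core<C>.puma`), and the names `PUMA_NAME` and `group_by_tile` are hypothetical helpers introduced for illustration.

```python
import re
from collections import defaultdict
from pathlib import Path

# Pattern inferred from the example file names: <model>-tile<T>-core<C>.puma
PUMA_NAME = re.compile(r"^(?P<model>.+)-tile(?P<tile>\d+)-core(?P<core>\d+)\.puma$")


def group_by_tile(filenames):
    """Map each tile index to the sorted list of core indices found in the names."""
    tiles = defaultdict(list)
    for name in filenames:
        match = PUMA_NAME.match(Path(name).name)
        if match is None:
            continue  # ignore files that do not follow the per-core pattern
        tiles[int(match["tile"])].append(int(match["core"]))
    return {tile: sorted(cores) for tile, cores in sorted(tiles.items())}


if __name__ == "__main__":
    # Intended to be run from puma-compiler-v4/test after ./cnn.test has finished.
    found = [path.name for path in Path(".").glob("*.puma")]
    for tile, cores in group_by_tile(found).items():
        print(f"tile {tile}: cores {cores}")
```

A tile with fewer cores than expected, or no output at all, suggests the compile step did not complete.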
Note
- The model name (cnn) is defined in your model source file (cnn.cpp), for example:
  Model model = Model::create("cnn");
- If you have your own model source file (e.g., xyz.cpp), build and run it as follows:
  make xyz.test
  ./xyz.test
- If you hit a segmentation fault, try increasing N_TILES_PER_NODE in src/common.h.
The generated .puma instruction files must be converted into a format readable by the PUMA Simulator.
Before proceeding, make sure the simulator path is correctly set in the following files:
- puma-compiler-v4/test/generate-py.sh
- puma-compiler-v4/test/populate.py
SIMULATOR_PATH = <path_to_puma-simulator-v4>
Then run:
./generate-py.sh
This script will:
- Create a directory named after the model (e.g., cnn/)
- Copy this directory into:
puma-simulator-v4/test/testasm/
Note
The generated directory name depends on the model name you specified.
If errors occur during this step, check for configuration mismatches between:
- puma-compiler-v4/src/common.h
- puma-simulator-v4/include/config.py
In particular, the following parameters must be consistent:
- MVMU Dimension
- Number of MVMUs per core
- Number of cores per tile
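The consistency check described above can be automated with a small script. The sketch below makes several assumptions that should be verified against your checkout: that the relevant values in common.h are integer `#define` lines, that config.py sets them as plain `name = value` integer assignments, and that you supply the matching name pairs yourself. The parameter names in the demo (`MVMU_DIM`, `xbar_size`, `N_CORES_PER_TILE`, `num_core`) are hypothetical placeholders, not confirmed identifiers from either file.

```python
import re


def parse_c_defines(header_text):
    """Collect integer-valued `#define NAME value` lines from C/C++ header text."""
    pattern = re.compile(
        r"^[ \t]*#define[ \t]+(\w+)[ \t]+(\d+)[ \t]*(?://.*)?$", re.MULTILINE
    )
    return {name: int(value) for name, value in pattern.findall(header_text)}


def parse_py_assignments(config_text):
    """Collect integer-valued `name = value` lines from Python config text."""
    pattern = re.compile(
        r"^[ \t]*(\w+)[ \t]*=[ \t]*(\d+)[ \t]*(?:#.*)?$", re.MULTILINE
    )
    return {name: int(value) for name, value in pattern.findall(config_text)}


def find_mismatches(header_text, config_text, name_pairs):
    """Return (header_name, config_name, header_value, config_value) per bad pair.

    A pair is reported when the values differ or when either name is missing
    (missing values appear as None).
    """
    defines = parse_c_defines(header_text)
    settings = parse_py_assignments(config_text)
    mismatches = []
    for header_name, config_name in name_pairs:
        left = defines.get(header_name)
        right = settings.get(config_name)
        if left is None or right is None or left != right:
            mismatches.append((header_name, config_name, left, right))
    return mismatches


if __name__ == "__main__":
    # HYPOTHETICAL names and values, used only to demonstrate the check.
    header = "#define MVMU_DIM 128\n#define N_CORES_PER_TILE 8\n"
    config = "xbar_size = 128\nnum_core = 4  # cores per tile\n"
    pairs = [("MVMU_DIM", "xbar_size"), ("N_CORES_PER_TILE", "num_core")]
    for mismatch in find_mismatches(header, config, pairs):
        print("mismatch:", mismatch)
```

In practice, read the two files with `open(...).read()` and replace the pairs with the actual names for MVMU dimension, MVMUs per core, and cores per tile.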
Navigate to the simulator src directory and run:
cd ../../puma-simulator-v4/src
python dpe.py -n cnn
Here, cnn refers to the directory located in:
puma-simulator-v4/test/testasm/
Simulation results are generated in the following directory:
puma-simulator-v4/test/traces/cnn/
This directory contains detailed traces and statistics for latency, energy, and area.
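Before launching a simulation, it can help to confirm that the converted model directory is where the simulator expects it. This is a minimal sketch based only on the directory layout described above; `simulation_paths` and `inputs_present` are hypothetical helper names, and the simulator root is whatever path you cloned puma-simulator-v4 to.

```python
from pathlib import Path


def simulation_paths(simulator_root, model_name):
    """Return (input_dir, trace_dir) for a model under the layout described above."""
    root = Path(simulator_root)
    input_dir = root / "test" / "testasm" / model_name
    trace_dir = root / "test" / "traces" / model_name
    return input_dir, trace_dir


def inputs_present(simulator_root, model_name):
    """Report whether the converted model directory exists in test/testasm/."""
    input_dir, _ = simulation_paths(simulator_root, model_name)
    return input_dir.is_dir()


if __name__ == "__main__":
    # Assumes the script is run from the directory that contains puma-simulator-v4.
    inputs, traces = simulation_paths("puma-simulator-v4", "cnn")
    print("simulator input:", inputs)
    print("expected traces:", traces)
    if not inputs_present("puma-simulator-v4", "cnn"):
        print("input directory not found; run generate-py.sh first")
```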