This project implements a Greatest Common Divisor (GCD) calculator in Verilog, showcasing two different approaches to solving the same mathematical problem. The project emphasizes digital hardware design principles by comparing:
- Behavioral Verilog model using a while loop and modulus (%) operator
- Fully synthesizable FSM and Datapath architecture based on the subtraction-driven Euclidean algorithm
- C reference implementation used as a software model for verification, testbench comparison, and generating test vectors
While both approaches are functionally correct, this project is primarily built around the FSM + Datapath design, which mirrors the internal workings of a CPU and is suitable for actual hardware implementation.
The main learning goal of this project was to explore the controller + datapath design pattern, which is foundational in modern CPU architecture. The FSM handles sequencing and logic control, while the datapath performs the arithmetic operations.
Additionally, an alternate Verilog implementation using a while loop and % modulus operator is included to compare both hardware feasibility and performance considerations.
This model follows the traditional GCD approach taught in programming courses β repeatedly applying modulus until one operand becomes zero.
while (a != 0 && b != 0) {
if (a > b)
a = a % b;
else
b = b % a;
}
gcd = (a == 0) ? b : a;- Functional in simulation
- Not synthesizable
- Uses
%operator andwhileloop, which do not map directly to logic gates - Module:
GCD_while.v,GCD_while_tb.v
This version is helpful for early-stage functional validation, but not suitable for synthesis or FPGA/ASIC design.
This is the core of the project. It builds a hardware-friendly GCD unit by breaking the design into:
- Datapath:
- Subtractor
- Comparator
- Registers (PIPO)
- MUXes
- Controller FSM:
- Issues control signals (
lda,ldb,sel1,sel2,sel3) - Monitors condition flags (
Lt,Gt,Et) - Determines which operand to update each cycle
- Issues control signals (
This architecture mimics how processors execute arithmetic through datapaths and control units, making it an excellent exercise in VLSI and digital system design.
Alongside the Verilog implementations, this project also includes a C-based GCD calculator.
This serves as a golden reference model for verifying correctness against a predefined test data file.
- Fast and portable β runs in software on any system
- Useful for validating the hardware RTL outputs
- Files:
GCD_using_C.c,gcd_test_data.txt
While both methods compute the GCD correctly, the final design was centered around the subtraction method for hardware feasibility:
| Feature | Subtraction Method | Modulus Method |
|---|---|---|
| Synthesizable | Yes | No |
| Hardware Resource Cost | Low (simple subtractor) | High (requires division circuitry) |
| Performance | Faster in hardware | Slower due to modulo complexity |
| Suitability | Excellent for FSM control | Best for simulation only |
| Final Choice | Chosen | Rejected for synthesis |
In digital hardware, modulus (%) is slower and resource-heavy compared to subtraction. Since both approaches require repetitive operations until convergence, the subtraction method is computationally more efficient in a circuit. Subtraction is just an adder with inversion and carry-in, while modulus requires division hardware or iterative logic, making it costlier and slower.
Thus, I opted to go with the FSM + Datapath design and treated the behavioral model as a learning milestone rather than the final implementation.
The FSM transitions through 6 states:
| State | Function |
|---|---|
| S0 | Load first operand A |
| S1 | Load second operand B |
| S2 | Check for equality or decide subtraction direction |
| S3 | A > B: Subtract B from A |
| S4 | B > A: Subtract A from B |
| S5 | Done β GCD ready |
| Component | Role |
|---|---|
pipo.v |
Registers to hold A and B |
subtractor.v |
Performs A - B or B - A |
comparator.v |
Generates Lt, Gt, Et flags |
mux_2x1.v |
Select paths for operand and result routing |
Example terminal output for FSM + Datapath design (from testbench with inputs 4 and 24):
GCD calculation completed
GCD of the two numbers is: 4
The Controller + Datapath GCD design was synthesized successfully, confirming that the architecture is fully hardware realizable.
A schematic was auto-generated during synthesis, showcasing how the datapath and FSM logic are mapped into gates and registers.
Overview Schematic π Overview Schematic PDF
Detailed Schematic π Detailed Schematic PDF
βββ hdl/
β βββ rtl/
β β βββ datapath.v
β β βββ controller.v
β β βββ pipo.v
β β βββ subtractor.v
β β βββ comparator.v
β β βββ mux_2x1.v
β β βββ GCD_while.v
β β
β βββ tb/
β βββ gcd_tb.v
β βββ GCD_while_tb.v
β
βββ images/
β βββ block_diagram.jpeg
β βββ controller_flowchart.jpeg
β βββ fsm_diagram.jpeg
β βββ datapath_flow.jpeg
β βββ output_subtraction.gif
β βββ output_modulus.gif
β βββ schematic.png
β βββ schematic_1.png
β
βββ GCD_Using_C/
β βββ GCD_using_C.c
β βββ gcd_test_data.txt
β
βββ README.md
βββ LICENSE
| File | Description |
|---|---|
datapath.v |
Main datapath module |
controller.v |
FSM controller logic |
pipo.v |
PIPO register |
subtractor.v |
Arithmetic subtractor |
comparator.v |
Comparison logic |
mux_2x1.v |
2:1 multiplexer |
gcd_tb.v |
Testbench for FSM-based GCD |
GCD_while.v |
Behavioral GCD using while + modulus |
GCD_while_tb.v |
Testbench for GCD_while |
images/ |
Waveform and output screenshots |
*.vcd |
Simulation waveform files |
gcd.c |
C reference model for GCD |
gcd_test_data.txt |
Test vectors for validating the C model |
| Tool | Purpose |
|---|---|
| Icarus Verilog | Compile/simulate Verilog code |
| GTKWave | View simulation waveform dumps (.vcd files) |
| EDA Playground | Online Verilog editor and schematic viewer |
This project demonstrates two distinct ways to compute GCD in Verilog β one focused on hardware synthesis and the other on algorithmic clarity:
- The FSM-based subtraction model is modular, synthesizable, and suited for real hardware.
- The while + modulus approach offers quick simulation modeling but is not synthesizable and impractical in hardware.
Through this, I gained a deeper understanding of:
- FSM and datapath partitioning
- Hardware implementation vs behavioral modeling
- Trade-offs in algorithm selection based on speed and synthesizability
Open for educational and personal use under the MIT License








