Skip to content

Hardware Build And Deployment

Linus Jungemann edited this page Sep 25, 2025 · 1 revision

image

A model where all layers have been converted to either HLS or RTL layers can be processed by FINN+ to build a bitfile and driver targeting a Zynq or Alveo system or to generate a Vivado IP Integrator (IPI) design with AXI stream (FIFO) in-out interfaces, which can be integrated onto any Xilinx FPGA as part of a larger system.

Hardware Build

Internally, the hardware build consists of the following steps:

  1. DMA and DWC node insertion
  2. Partitioning for floorplanning
  3. FIFO insertion and IP generation
  4. Vivado/Vitis project generation and synthesis
  5. Driver generation

DMA and DWC Node Insertion

At this step, FINN+ will insert custom hardware-oriented ONNX nodes into the graph. These are DMA engines for moving data into and out of the accelerator (from DRAM), and data width converters between consecutive nodes where required. These steps are accomplished by calling the finn.transformation.fpgadataflow.insert_iodma.InsertIODMA and finn.transformation.fpgadataflow.insert_dwc.InsertDWC transformations, respectively.

Partitioning for Floorplanning

FINN+ will now partition the graph into several StreamingDataflowPartitions. This capability is most to facilitate floorplanning for future FINN+ releases and does not alter the functioning of the model itself. Each DMA node will be placed into its own partition. If no partition number attributes are specified, all the regular network nodes will become a single partition. This is accomplished by the finn.transformation.fpgadataflow.floorplan.Floorplan and finn.transformation.fpgadataflow.create_dataflow_partition.CreateDataflowPartition transformations.

Tip

For Vitis, each partition will be compiled as a separate kernel, and linked together afterwards. For Zynq, each partition will become an IP block.

FIFO Insertion and IP Generation

FINN+ will descend into each partition and insert FIFO nodes between streaming nodes, where FIFO depths dictated by the node attributes, using the finn.transformation.fpgadataflow.insert_fifo.InsertFIFO transformation. Afterwards, IP blocks will be created for each partition, which in turn contain the IP blocks for HLS layers and RTL modules for RTL layers stitched together. The layer-level IP blocks for HLS layers are generated by Vitis HLS, using a sequence of finn.transformation.fpgadataflow.prepare_ip.PrepareIP and finn.transformation.fpgadataflow.hlssynth_ip.HLSSynthIP transformations. For RTL layers calling finn.transformation.fpgadataflow.prepare_ip.PrepareIP will fill out the RTL wrapper files and store all files belonging to the RTL module in a folder.

The top-level IP blocks are generated in Vivado IPI, using the finn.transformation.fpgadataflow.create_stitched_ip.CreateStitchedIP transformation.

Vivado/Vitis Project Generation and Synthesis

The final step in the hardware build flow is to generate a Vivado (for Zynq) or Vitis (for Alveo) project, and run synthesis to generate a bitfile. This is done using the MakeZYNQProject transformation for Zynq, and the VitisLink transformation for Alveo.

Driver Generation

To rapidly test the generated design on PYNQ platforms, FINN+ is capable of generating a Python driver for the given design. This driver packs/unpacks the input/output tensors in the expected format, then uses PYNQ APIs to initiate data movement and transfer back the results to the host CPU. The generation of the driver is done by transformation pass finn.transformation.fpgadataflow.make_driver.MakePYNQDriver.

Additionally, a C++ driver can be generated for high-performance applications on Alveo platforms. This driver uses XRT APIs to manage data movement and kernel execution. The generation of the C++ driver is done by transformation pass finn.transformation.fpgadataflow.make_driver.MakeCPPDriver.

Deployment

The bitfile and the driver file(s) can be copied to the PYNQ board and be executed there. For more information see the description in the end2end_example Jupyter notebooks.

Clone this wiki locally