finn.transformation

This page contains the complete API reference for all modules in the finn.transformation package.

finn.transformation.fpgadataflow.annotate_cycles
finn.transformation.fpgadataflow.annotate_resources
finn.transformation.fpgadataflow.attention
finn.transformation.fpgadataflow.attention_heads
finn.transformation.fpgadataflow.cleanup
finn.transformation.fpgadataflow.compile_cppsim
finn.transformation.fpgadataflow.convert_to_hw_layers
finn.transformation.fpgadataflow.create_dataflow_partition
finn.transformation.fpgadataflow.create_stitched_ip
finn.transformation.fpgadataflow.derive_characteristic
finn.transformation.fpgadataflow.externalize_params
finn.transformation.fpgadataflow.floorplan
finn.transformation.fpgadataflow.hlssynth_ip
finn.transformation.fpgadataflow.infer_pixel_padding_deconv
finn.transformation.fpgadataflow.insert_dwc
finn.transformation.fpgadataflow.insert_fifo
finn.transformation.fpgadataflow.insert_hook
finn.transformation.fpgadataflow.insert_iodma
finn.transformation.fpgadataflow.insert_tlastmarker
finn.transformation.fpgadataflow.instrumentation
finn.transformation.fpgadataflow.loop_rolling
finn.transformation.fpgadataflow.make_driver
finn.transformation.fpgadataflow.make_zynq_proj
finn.transformation.fpgadataflow.minimize_accumulator_width
finn.transformation.fpgadataflow.minimize_weight_bit_width
finn.transformation.fpgadataflow.prepare_cppsim
finn.transformation.fpgadataflow.prepare_ip
finn.transformation.fpgadataflow.prepare_rtlsim
finn.transformation.fpgadataflow.raise_scalar_to_rank1
finn.transformation.fpgadataflow.replace_verilog_relpaths
finn.transformation.fpgadataflow.set_exec_mode
finn.transformation.fpgadataflow.set_fifo_depths
finn.transformation.fpgadataflow.set_folding
finn.transformation.fpgadataflow.set_loop_boundary
finn.transformation.fpgadataflow.specialize_layers
finn.transformation.fpgadataflow.synth_ooc
finn.transformation.fpgadataflow.templates
finn.transformation.fpgadataflow.transpose_decomposition
finn.transformation.fpgadataflow.vitis_build
finn.transformation.fpgadataflow.vivado_power_estimation
finn.transformation.general
finn.transformation.move_reshape
finn.transformation.qonnx.convert_qonnx_to_finn
finn.transformation.qonnx.fold_quant_weights
finn.transformation.qonnx.infer_quant_avg_pool_2d
finn.transformation.qonnx.qonnx_activation_handlers
finn.transformation.qonnx.quant_act_to_multithreshold
finn.transformation.squeeze
finn.transformation.streamline
finn.transformation.streamline.absorb
finn.transformation.streamline.collapse_repeated
finn.transformation.streamline.extract_norm_scale_bias
finn.transformation.streamline.reorder
finn.transformation.streamline.round_thresholds
finn.transformation.streamline.sign_to_thres
finn.transformation.util

finn.transformation.fpgadataflow.annotate_cycles

Annotate the estimate of clock cycles per sample taken by each fpgadataflow node as an attribute on the node.

AnnotateCycles Objects

class AnnotateCycles(Transformation)

Annotate the estimate of clock cycles per sample taken by each fpgadataflow node as an attribute on the node.

init

def __init__() -> None

Constructs the AnnotateCycles transformation.

apply

def apply(model: ModelWrapper) -> tuple[ModelWrapper, Literal[False]]

Annotate the estimate of clock cycles per sample taken for each layer.

finn.transformation.fpgadataflow.annotate_resources

AnnotateResources Objects

class AnnotateResources(Transformation)

Annotate the amount of FPGA resources taken by each fpgadataflow node as an attribute on the node, depending on the mode parameter:

'estimate' -- use the analytical estimation model
'hls' -- use results from the HLS synthesis report
'synth' -- use post-synthesis (Vivado or Vitis) report

No annotations can be provided unless the relevant transformation for the chosen mode (e.g. HLSSynthIP for hls) was previously run.

finn.transformation.fpgadataflow.attention

Transformations for converting scaled dot-product attention patterns to FINN hardware layers.

This module provides transformations to detect and convert multi-node ONNX patterns representing scaled dot-product attention into single FINN custom hardware operations, enabling efficient FPGA implementation of transformer attention mechanisms.

InferScaledDotProductAttention Objects

class InferScaledDotProductAttention(Transformation)

Convert the operator pattern corresponding to scaled dot-product attention to the hardware custom operator node.

apply

def apply(model: ModelWrapper)

Apply the transform to a whole model graph.

AbsorbMultiThresholdIntoScaledDotProductAttention Objects

class AbsorbMultiThresholdIntoScaledDotProductAttention(Transformation)

Absorb a MultiThreshold into ScaledDotProductAttention if there is not already an activation included.

apply

def apply(model: ModelWrapper)

Apply the transform to a whole model graph.

finn.transformation.fpgadataflow.attention_heads

Transformations for multi-head attention patterns in FPGA dataflow.

InferMultiHeads Objects

class InferMultiHeads(Transformation)

Infer multi-head attention patterns and convert to custom operators.

Converts Reshape and Transpose patterns to SplitMultiHeads and MergeMultiHeads hardware custom operators.

apply

def apply(model: ModelWrapper)

Apply the transformation to infer multi-head patterns in the model graph.

MoveSplitMultiHeadsPastMultiThreshold Objects

class MoveSplitMultiHeadsPastMultiThreshold(Transformation)

Move SplitMultiHeads operation past MultiThreshold operation.

Required as a precondition for unrolling attention heads.

apply

def apply(model: ModelWrapper)

Apply the transformation to move SplitMultiHeads past MultiThreshold.

MoveMergeMultiHeadsPastMultiThreshold Objects

class MoveMergeMultiHeadsPastMultiThreshold(Transformation)

Move MergeMultiHeads operation past MultiThreshold operation.

Avoids merging excessively large streams and potentially allows absorbing thresholds into the attention operator.

apply

def apply(model: ModelWrapper) -> tuple[ModelWrapper, bool]

Apply the transformation to move MergeMultiHeads past MultiThreshold.

is_multi_head_attention

def is_multi_head_attention(node: NodeProto, model: ModelWrapper) -> bool

Detect if a node is part of a multi-head attention pattern.

Returns True if the node is a ScaledDotProductAttention with proper SplitMultiHeads and MergeMultiHeads operations.

UnrollMultiHeadAttention Objects

class UnrollMultiHeadAttention(Transformation)

Unroll multiple attention heads for parallel implementation.

Transforms the ONNX graph to implement attention heads in parallel.

apply

def apply(model: ModelWrapper) -> tuple[ModelWrapper, bool]

Apply the transformation to unroll multi-head attention.

finn.transformation.fpgadataflow.cleanup

CleanUp Objects

class CleanUp(Transformation)

Remove any generated files for fpgadataflow nodes.

finn.transformation.fpgadataflow.compile_cppsim

CompileCppSim Objects

class CompileCppSim(NodeLocalTransformation)

For every node: compile C++ code in node attribute "code_gen_dir_cppsim" and save path to executables in node attribute "executable_path". All nodes in the graph must have the fpgadataflow backend attribute.

To use these executables, exec_mode must be set to "cppsim" (using the SetExecMode transformation) and the model has to be executed using execute_onnx() from finn.core.onnx_exec.

num_workers (int or None) number of parallel workers, see documentation in NodeLocalTransformation for more details.

finn.transformation.fpgadataflow.convert_to_hw_layers

Transformations to map ONNX operators to FINN hardware layers.

InferConvInpGen Objects

class InferConvInpGen(Transformation)

Convert Im2Col layers to ConvolutionInputGenerator layers.

init

def __init__()

Initialize the transformation.

apply

def apply(model)

Apply the transformation to infer ConvolutionInputGenerator layers.

InferFMPadding Objects

class InferFMPadding(Transformation)

Convert Pad layers to FMPadding layers.

apply

def apply(model: ModelWrapper)

Apply the transformation to the entire model graph.

InferThresholdingLayer Objects

class InferThresholdingLayer(Transformation)

Convert any MultiThreshold into a standalone thresholding layer.

init

def __init__()

Initialize the transformation.

apply

def apply(model)

Apply the transformation to infer standalone thresholding layers.

InferRequantLayer Objects

class InferRequantLayer(Transformation)

Convert MultiThreshold or Quant nodes to Requant.

For MultiThreshold nodes where all channels have uniform (equal-step) thresholds, the comparison-based threshold lookup can be replaced with a simpler requantization operation:

output = clip(round(input * scale + bias), min, max)

where:

scale = 1.0 / step_size
bias = 0.5 - first_threshold / step_size

For Quant nodes with scale=1 and zeropt=0 (after ExtractQuantScaleZeroPt), the operation simplifies to:

output = clip(round(input), min, max)

which is Requant with scale=1 and bias=0.

This transformation is optional and provides an alternative implementation to InferThresholdingLayer. The Requant node can then be specialized to either HLS or RTL backend.
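As a rough illustration of the equivalence described above, the following pure-Python sketch compares a threshold count against the requantization formula for one set of uniform thresholds. It assumes that MultiThreshold counts the thresholds an input meets or exceeds and that the output range runs from 0 to the number of thresholds; the helper names are hypothetical and not part of FINN. The inputs are chosen away from the threshold values, so the tie-breaking behaviour of round() does not matter here.

```python
def multithreshold(x, thresholds):
    """Count how many thresholds the input meets or exceeds (assumed semantics)."""
    return sum(1 for t in thresholds if x >= t)


def requant(x, scale, bias, out_min, out_max):
    """Evaluate clip(round(x * scale + bias), out_min, out_max)."""
    return max(out_min, min(out_max, round(x * scale + bias)))


# Uniform (equal-step) thresholds: first_threshold + k * step_size
first_threshold, step_size, num_thresholds = -3.0, 2.0, 4
thresholds = [first_threshold + k * step_size for k in range(num_thresholds)]

# Parameters as given in the formula above
scale = 1.0 / step_size
bias = 0.5 - first_threshold / step_size

# Inputs that do not coincide with any threshold value
inputs = [-10.0, -4.0, -2.0, 0.0, 2.0, 4.0, 10.0]
by_threshold = [multithreshold(x, thresholds) for x in inputs]
by_requant = [requant(x, scale, bias, 0, num_thresholds) for x in inputs]

print(by_threshold)
print(by_requant)
```

Both lists agree for these inputs. Inputs that fall exactly on a threshold depend on the rounding convention, which this sketch does not attempt to model.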

InferUpsample Objects

class InferUpsample(Transformation)

Convert Upsample and Resize nodes to UpsampleNearestNeighbour nodes.

apply

def apply(model)

Apply the transformation to infer UpsampleNearestNeighbour nodes.

InferAddStreamsLayer Objects

class InferAddStreamsLayer(Transformation)

DEPRECATED: This transformation is deprecated and now redirects to InferElementwiseBinaryOperation.

AddStreams functionality is now covered by ElementwiseAdd operations (with both inputs as streaming). This wrapper is kept for backward compatibility.

The ElementwiseBinary operations provide the same functionality with additional features like broadcasting support and more operation types.

InferDuplicateStreamsLayer Objects

class InferDuplicateStreamsLayer(Transformation)

Insert a DuplicateStreams HW layer for any tensor with fanout >= 2.

apply

def apply(model)

Apply the transformation to insert DuplicateStreams HW layers where needed.
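The fanout criterion can be pictured with a toy graph. The sketch below is not FINN code: the list-of-dicts graph and the function name are hypothetical, and it assumes fanout means the number of distinct nodes consuming a tensor.

```python
from collections import Counter


def tensors_needing_duplication(nodes):
    """Return tensor names consumed by two or more nodes (fanout >= 2)."""
    fanout = Counter()
    for node in nodes:
        # Count each consuming node once per tensor
        for tensor_name in set(node["inputs"]):
            fanout[tensor_name] += 1
    return sorted(name for name, count in fanout.items() if count >= 2)


# Toy graph: tensor "a" feeds two branches that are later added together
toy_graph = [
    {"name": "conv", "inputs": ["x"], "outputs": ["a"]},
    {"name": "branch0", "inputs": ["a"], "outputs": ["b"]},
    {"name": "branch1", "inputs": ["a"], "outputs": ["c"]},
    {"name": "add", "inputs": ["b", "c"], "outputs": ["y"]},
]

print(tensors_needing_duplication(toy_graph))
```

In this toy graph only tensor "a" has two consumers, so it is the only place where a duplication node would be needed.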

InferChannelwiseLinearLayer Objects

class InferChannelwiseLinearLayer(Transformation)

DEPRECATED: This transformation is deprecated and now redirects to InferElementwiseBinaryOperation.

ChannelwiseOp functionality is now covered by ElementwiseBinary operations (Add/Mul with const mode). This wrapper is kept for backward compatibility.

The ElementwiseBinary operations provide the same functionality with additional features like broadcasting support and more operation types.

InferLabelSelectLayer Objects

class InferLabelSelectLayer(Transformation)

Convert any TopK into a LabelSelect HW layer.

apply

def apply(model)

Apply transformation to convert TopK nodes to LabelSelect hardware layers.

This transformation identifies TopK operations and converts them to FINN's custom LabelSelect nodes for hardware acceleration.

InferGlobalAccPoolLayer Objects

class InferGlobalAccPoolLayer(Transformation)

Convert any GlobalAveragePool into a GlobalAccPool HW layer and a scalar Mul.

apply

def apply(model)

Apply transformation to infer GlobalAccPool hardware layers.

InferPool Objects

class InferPool(Transformation)

If kernel_shape > strides, replace Pool layer with Im2col + pool combination.

When kernel_shape > strides, replaces Pool layer with Im2col followed by pool (with kernel_shape == strides), plus Transpose layers to keep the original data layout.

apply

def apply(model)

Apply transformation to convert Pool operations with kernel_shape > strides.

InferPoolFromReduce Objects

class InferPoolFromReduce(Transformation)

Infer pooling hardware from lowered pooling, i.e., Im2Col+Reduce.

apply

def apply(model: ModelWrapper)

Apply transformation to convert lowered pooling to hardware.

InferLookupLayer Objects

class InferLookupLayer(Transformation)

Convert Gather nodes with constant op0 into Lookup HW layers.

apply

def apply(model)

Apply transformation to convert Gather operations to Lookup hardware layers.

This transformation identifies Gather operations with constant first operand and converts them to FINN's custom Lookup nodes for hardware acceleration.

InferConcatLayer Objects

class InferConcatLayer(Transformation)

Convert suitable Concat nodes (operating on last/-1 axis) into StreamingConcat HW layers.

apply

def apply(model)

Apply transformation to convert Concat operations to StreamingConcat hardware layers.

This transformation identifies Concat operations operating on the last axis and converts them to FINN's custom StreamingConcat nodes.

InferSplitLayer Objects

class InferSplitLayer(Transformation)

Convert suitable Split nodes (operating on last/-1 axis) into StreamingSplit HW layers.

apply

def apply(model)

Apply transformation to convert Split operations to StreamingSplit hardware layers.

This transformation identifies Split operations operating on the last axis and converts them to FINN's custom StreamingSplit nodes.

InferStreamingEltwise Objects

class InferStreamingEltwise(Transformation)

DEPRECATED: This transformation is deprecated and now redirects to InferElementwiseBinaryOperation.

StreamingEltwise functionality is now covered by ElementwiseSub and ElementwiseAbsDiff operations (with both inputs as streaming). This wrapper is kept for backward compatibility.

The ElementwiseBinary operations provide the same functionality with additional features like broadcasting support and more operation types.

InferBinaryMatrixVectorActivation Objects

class InferBinaryMatrixVectorActivation(Transformation)

Convert XnorPopcountMatMul layers to MatrixVectorActivation layers.

Any immediately following MultiThreshold layers will also be absorbed into the MVTU.

init

def __init__()

Initialize the transformation.

apply

def apply(model)

Apply transformation to convert XnorPopcountMatMul to MVAU nodes.

This transformation identifies XnorPopcountMatMul operations and converts them to FINN's custom MVAU (Matrix Vector Activation Unit) nodes, potentially absorbing following MultiThreshold layers.

InferQuantizedMatrixVectorActivation Objects

class InferQuantizedMatrixVectorActivation(Transformation)

Convert MatMul layers with quantized inputs and weights to MatrixVectorActivation layers.

apply

def apply(model)

Apply transformation to convert MatMul to MVAU nodes.

InferVectorVectorActivation Objects

class InferVectorVectorActivation(Transformation)

Convert MatMul layers to VectorVectorActivation layers for depthwise convolutions.

Converts MatMul layers with quantized inputs and weights to VectorVectorActivation layers, if the sparsity annotation of the weight matrix indicates that the MatMul layer belongs to a depthwise convolution. Any immediately following MultiThreshold layers will also be absorbed into the VVAU.

init

def __init__()

Initialize the transformation.

apply

def apply(model)

Apply transformation to convert MatMul to VVAU nodes for depthwise convolutions.

InferHWSoftmax Objects

class InferHWSoftmax(Transformation)

Infers a regular softmax node without merging the multithreshold and setting the softmax to perform the quantisation.

init

def __init__()

Infers a regular softmax node without merging the multithreshold and setting the softmax to perform the quantisation.

apply

def apply(model)

Apply the transformation.

skip_first_node_transpose

def skip_first_node_transpose(model, node)

Default filter for InferShuffle: skip Transpose if it's the first node in the graph. This is useful for image classification networks where the first transpose converts NCHW to NHWC layout for data preprocessing.

InferShuffle Objects

class InferShuffle(Transformation)

Find Transpose layers, optionally surrounded by Reshape layers, and convert them into a Shuffle operator.

apply

def apply(model)

Apply the transformation.

lift_to_rank1

def lift_to_rank1(name: str, model: ModelWrapper)

Lift scalar to rank-1 tensor.

Converts scalar tensors (shape []) to rank-1 tensors with a single element (shape [1]).

InferElementwiseBinaryOperation Objects

class InferElementwiseBinaryOperation(Transformation)

Convert supported elementwise binary operations to their FINN custom operation.

reject_output_dequant

@staticmethod
def reject_output_dequant(model: ModelWrapper, node: NodeProto)

Filter function to filter out the last elementwise Mul operation.

Typically filters output de-quantization operations which should happen off-chip.

init

def __init__(_filter=None)

Initialize the transformation method with an optional filter function.

apply

def apply(model: ModelWrapper)

Apply the transform to convert elementwise binary operations to FINN custom ops.

InferReLUAsElementwiseMax Objects

class InferReLUAsElementwiseMax(Transformation)

Converts ReLU into ElementwiseMaximum(in, 0).

reject_unsupported_dtypes

@staticmethod
def reject_unsupported_dtypes(model: ModelWrapper, node: NodeProto)

Filter function to filter out any operation involving any floating-point tensor.

init

def __init__(_filter=reject_unsupported_dtypes)

Initializes the transformation method with an optional filter function.

apply

def apply(model: ModelWrapper)

Apply the transformation.

InferLayerNorm Objects

class InferLayerNorm(Transformation)

Convert LayerNorm into HW, only norming over channel dim. This transform is adapted from Brainsmith InferLayerNorm.

apply

def apply(model)

Apply the transformation.

elements_are_consecutive

def elements_are_consecutive(indices)

Are elements consecutive (max diff. 1 between all adjacent elements)?
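One possible reading of this check, written as a standalone sketch: it assumes "consecutive" means each element is exactly one larger than its predecessor, which may differ from the actual implementation. The function name is hypothetical.

```python
def elements_are_consecutive_sketch(indices):
    """Return True if every adjacent pair of elements differs by exactly 1."""
    values = list(indices)
    return all(b - a == 1 for a, b in zip(values, values[1:]))


print(elements_are_consecutive_sketch([3, 4, 5, 6]))
print(elements_are_consecutive_sketch([3, 5, 6]))
```

Under this reading, a Gather whose indices form such a run selects one contiguous slice, which is the kind of pattern a crop can express.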

InferCrop Objects

class InferCrop(Transformation)

Find Gather layers that can be converted into a Crop layer and replace them with a Crop layer.

init

def __init__()

Find Gather layers that can be converted into a Crop layer and replace them with a Crop layer.

apply

def apply(model)

Apply the transformation.

InferSqueeze Objects

class InferSqueeze(Transformation)

Converts the Squeeze operation to the corresponding FINN custom operation.

apply

def apply(model: ModelWrapper)

Apply the transform to convert Squeeze operations to FINN custom ops.

InferUnsqueeze Objects

class InferUnsqueeze(Transformation)

Convert the Unsqueeze operation to the corresponding FINN custom operation.

apply

def apply(model: ModelWrapper)

Apply the transform to convert Unsqueeze operations to FINN custom ops.

InferReshape Objects

class InferReshape(Transformation)

Converts ONNX Reshape operator to the corresponding HWCustomOp.

apply

def apply(model: ModelWrapper) -> tuple[ModelWrapper, bool]

Apply the transform to convert Reshape operations to hardware.

finn.transformation.fpgadataflow.create_dataflow_partition

CreateDataflowPartition Objects

class CreateDataflowPartition(Transformation)

Split a graph into two graphs; one which contains non-FINN-dataflow nodes and a StreamingDataflowPartition node, and another which only contains FINN dataflow nodes. The StreamingDataflowPartition has a model attribute that indicates the filename for the second graph that only contains dataflow nodes. No action is taken if there are no dataflow nodes.

finn.transformation.fpgadataflow.create_stitched_ip

Transformation to create stitched IP from dataflow graph components.

is_external_input

def is_external_input(model, node, i)

Determine whether input i of node should be made external.

True only if input is unconnected and has no initializer. Only exception is second input of FC layers when mem_mode is external.

is_external_output

def is_external_output(model, node, i)

Determine whether output i of node should be made external.

CreateStitchedIP Objects

class CreateStitchedIP(Transformation)

Create a Vivado IP Block Design project from all the generated IPs of a graph. All nodes in the graph must have the fpgadataflow backend attribute, and the PrepareIP transformation must have been previously run on the graph. The resulting block design is also packaged as IP. The transformation gets the fpgapart as a string.

Outcome if successful: sets the vivado_stitch_proj attribute in the ONNX ModelProto's metadata_props field, with the created project dir as the value. A make_project.tcl script is also placed under the same folder, which is called to instantiate the per-layer IPs and stitch them together. The packaged block design IP can be found under the ip subdirectory.

init

def __init__(fpgapart,
             clk_ns,
             ip_name="finn_design",
             vitis=False,
             signature=[])

Initialize CreateStitchedIP transformation with FPGA part and clock settings.

is_double_pumped

def is_double_pumped(node)

Check if node uses double pumped computation.

connect_clk_rst

def connect_clk_rst(node)

Connect clock and reset signals for the node.

connect_axi

def connect_axi(node, model)

Connect AXI interfaces for the node.

connect_m_axis_external

def connect_m_axis_external(node, idx=None)

Connect master AXI stream interfaces as external ports.

connect_s_axis_external

def connect_s_axis_external(node, idx=None)

Connect slave AXI stream interfaces as external ports.

connect_ap_none_external

def connect_ap_none_external(node)

Connect ap_none interfaces as external ports.

insert_signature

def insert_signature(checksum_count)

Insert signature block for design identification.

apply

def apply(model)

Apply the CreateStitchedIP transformation to the model.

finn.transformation.fpgadataflow.derive_characteristic

DeriveCharacteristic Objects

class DeriveCharacteristic(NodeLocalTransformation)

For each node in the graph, run rtlsim to obtain the i/o characteristic function for FIFO sizing and set the attribute. It is assumed that the PrepareRTLSim transformation was already called on the graph.

This transformation performs rtlsim for each node, so it will run for some time (minutes to hours depending on configuration).

period (int) desired period over which the characteristic function will be derived.
num_workers (int or None) number of parallel workers, see documentation in NodeLocalTransformation for more details.

DeriveFIFOSizes Objects

class DeriveFIFOSizes(NodeLocalTransformation)

Prerequisite: DeriveCharacteristic already called on graph. For each node in the graph, use the accumulated I/O characteristic function to perform FIFO sizing, setting the in/outFIFODepths attributes of HLSCustomOp nodes.

num_workers (int or None) number of parallel workers, see documentation in NodeLocalTransformation for more details.

finn.transformation.fpgadataflow.externalize_params

ExternalizeParams Objects

class ExternalizeParams(Transformation)

Create top-level graph inputs for IODMAs serving layers where weights are marked as external using mem_mode="external".

finn.transformation.fpgadataflow.floorplan

Floorplan Objects

class Floorplan(Transformation)

Perform Floorplanning of the dataflow design:

floorplan: path to a JSON containing a dictionary with SLR assignments for each node in the ONNX graph. Must be parse-able by the ApplyConfig transform.

The transform applies the properties in the supplied JSON, then:

Separates DMAs into their own partition IDs.
If not explicitly assigned, assigns DWCs to SLRs to minimize the SLLs required.
If not explicitly assigned, assigns FIFOs to the SLR of the upstream node.

finn.transformation.fpgadataflow.hlssynth_ip

HLSSynthIP Objects

class HLSSynthIP(NodeLocalTransformation)

For each HLS node: generate an IP block from the code in the folder referenced by the node attribute "code_gen_dir_ipgen" and save the path of the generated project in the node attribute "ipgen_path". All nodes in the graph must have the fpgadataflow backend attribute. Any nodes that already have an ipgen_path attribute pointing to a valid path will be skipped.

This transformation calls Vitis HLS for synthesis, so it will run for some time (minutes to hours depending on configuration).

num_workers (int or None) number of parallel workers, see documentation in NodeLocalTransformation for more details.

finn.transformation.fpgadataflow.infer_pixel_padding_deconv

InferPixelPaddingDeconv Objects

class InferPixelPaddingDeconv(Transformation)

Lower and convert ConvTranspose (NCHW) nodes to FMPadding_Pixel + Im2Col + MatMul (NHWC), surrounded by Transpose nodes.

Note: this transformation produces a mix of HW layers and non-HW layers. To implement the result on an FPGA, the Im2Col and MatMul nodes need to be converted to HW layers after applying this transformation, and the resulting Transpose nodes need to be streamlined. See the deconv test case under tests/fpgadataflow for an example.

finn.transformation.fpgadataflow.insert_dwc

InsertDWC Objects

class InsertDWC(Transformation)

Add data width converters between layers where necessary.

finn.transformation.fpgadataflow.insert_fifo

InsertFIFO Objects

class InsertFIFO(Transformation)

Insert FIFOs at the beginning and end of the graph as well as between fpgadataflow nodes.

Takes the depth setting from the surrounding nodes by reading the node attribute 'outFIFODepths' of the previous node and the node attribute 'inFIFODepths' of the subsequent node. The max() of these two values sets the FIFO depth.

Constructor arguments:

max_qsrl_depth: FIFOs deeper than this will use Vivado IP instead of Verilog FIFOs (Q_srl.v)
vivado_ram_style: the StreamingFIFO.ram_style attribute to be used for large FIFOs implemented by Vivado
create_shallow_fifos: Normally, shallow-depth (<=2) FIFOs won't be created since HLS streaming interfaces already have a degree of buffering. Override with this parameter.

The other node attributes necessary to create a FIFO node are taken from the node the FIFO node is inserted after: 'folded_shape' and 'dtype'
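The depth rule and the constructor arguments described above can be summarized in a small standalone sketch. This is not the FINN implementation: the function name, the returned dictionary, and the implementation labels are hypothetical, and the real transformation works on per-port depth lists rather than single integers.

```python
def plan_fifo(out_fifo_depth, in_fifo_depth, max_qsrl_depth,
              create_shallow_fifos=False):
    """Decide whether a FIFO is inserted between two nodes, and how deep.

    out_fifo_depth: depth requested by the producer ('outFIFODepths' entry)
    in_fifo_depth:  depth requested by the consumer ('inFIFODepths' entry)
    """
    # The larger of the two requested depths sets the FIFO depth
    depth = max(out_fifo_depth, in_fifo_depth)

    # Shallow FIFOs (depth <= 2) are skipped unless explicitly requested
    if depth <= 2 and not create_shallow_fifos:
        return None

    # FIFOs deeper than max_qsrl_depth use Vivado IP instead of Q_srl.v
    implementation = "Vivado IP" if depth > max_qsrl_depth else "Q_srl.v"
    return {"depth": depth, "implementation": implementation}


print(plan_fifo(32, 128, max_qsrl_depth=256))
print(plan_fifo(512, 8, max_qsrl_depth=256))
print(plan_fifo(2, 1, max_qsrl_depth=256))
```

The last call returns None because the resulting depth is shallow and create_shallow_fifos is left at its default.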

finn.transformation.fpgadataflow.insert_hook

InsertHook Objects

class InsertHook(Transformation)

Insert a hook layer after each layer that has the node attribute 'output_hook' specified.

finn.transformation.fpgadataflow.insert_iodma

InsertIODMA Objects

class InsertIODMA(Transformation)

Insert DMA nodes on inputs and outputs, or as specified by filters in the constructor.

get_mem_init

def get_mem_init(weights, pe, simd)

Returns a matrix ready for pack_innermost_dim_as_hex_string (finn.util.data_packing) with reverse=False, which yields the memory init file packed little-endian. That is, get_mem_init returns the following layout, with elements written as elem(pe,simd):

addr = 0: [(pe-1,simd-1),(pe-1,simd-2),...(0,1),(0,0)]
addr = 1: [(pe-1,simd*2-1),...(0,simd+1),(0,simd)]

finn.transformation.fpgadataflow.insert_tlastmarker

InsertTLastMarker Objects

class InsertTLastMarker(Transformation)

Ensure that the graph is started/terminated with a TLastMarker_hls node, inserting one if necessary. Use constructor args to determine type of TLastMarker to be inserted. More information available on the TLastMarker documentation.

finn.transformation.fpgadataflow.instrumentation

Transformations for generating and simulating instrumentation IP.

collect_ip_dirs

def collect_ip_dirs(model, ipstitch_path)

Collect list of all IP directories required by the design.

GenerateInstrumentationIP Objects

class GenerateInstrumentationIP(Transformation)

Generate instrumentation IP for performance monitoring.

init

def __init__(fpga_part, clk_period_ns, avg_n=64, format="ip")

Initialize instrumentation IP generation with FPGA part and clock settings.

apply

def apply(model)

Generate instrumentation IP core.

PrepareInstrumentationSim Objects

class PrepareInstrumentationSim(Transformation)

Prepare simulation environment for instrumentation.

init

def __init__(fpga_part)

Initialize instrumentation simulation preparation.

apply

def apply(model)

Prepare scripts for simulating instrumentation IP.

RunInstrumentationSim Objects

class RunInstrumentationSim(Transformation)

Run instrumentation simulation to collect performance data.

init

def __init__()

Initialize instrumentation simulation runner.

apply

def apply(model)

Run instrumentation simulation script.

finn.transformation.fpgadataflow.loop_rolling

get_constant_from_value

def get_constant_from_value(value)

Get the constant value of a tensor.

same_values

def same_values(inputs)

Check if all inputs have the same constant value.

LoopRolling Objects

class LoopRolling(Transformation)

Boilerplate Transformation for loop rolling in fpgadataflow.

finn.transformation.fpgadataflow.make_driver

Create C++ and PYNQ drivers for FINN-generated accelerators.

update_bitfile_path_after_copy

def update_bitfile_path_after_copy(bitfile_path: str, json_path: str) -> None

Update the xclbinPath in the JSON configuration to point to the new bitfile location.

Arguments:

json_path str - Path to the JSON configuration file
bitfile_path str - New path to the bitfile (.xclbin)

MakeCPPDriver Objects

class MakeCPPDriver(Transformation)

Create C++ code to correctly interface the generated accelerator, including data packing/unpacking. Should be called after conversion to HLS layers, folding and the creation of dataflow partitions for correct operation.

platform: has to be "alveo", otherwise an error is thrown.

Outcome if successful: sets the cpp_driver_dir attribute in the ONNX ModelProto's metadata_props field, with the created driver dir as the value. Runtime-writable weights are not yet supported.

resolve_dt_name

def resolve_dt_name(s: str) -> str

Resolve datatype name for C++ driver code generation.

Arguments:

s - Datatype string to resolve

Returns:

Resolved C++ datatype name

Raises:

FINNInternalError - If datatype is unknown

init

def __init__(platform: str, version: str, host_mem: str)

Initialize MakeCPPDriver transformation.

Arguments:

platform - Target platform (must be "alveo")
version - Version of finn-cpp-driver to use ("latest" or commit hash)
host_mem - Memory type (FpgaMemoryType.HOST_MEM or FpgaMemoryType.DEVICE_MEM)

Raises:

FINNUserError - If platform is not "alveo"

apply

def apply(model: ModelWrapper) -> Tuple[ModelWrapper, bool]

Apply the MakeCPPDriver transformation to generate C++ driver code.

Arguments:

model - ONNX model wrapper

Returns:

Tuple of (modified model, transformation success flag)

MakePYNQDriver Objects

class MakePYNQDriver(Transformation)

Create PYNQ Python code to correctly interface the generated accelerator, including data packing/unpacking. Should be called after conversion to HLS layers, folding and the creation of dataflow partitions for correct operation.

platform: one of ["zynq-iodma", "alveo"]

Outcome if successful: sets the pynq_driver_dir attribute in the ONNX ModelProto's metadata_props field, with the created driver dir as the value. If any layers use runtime-writable parameters, those will be gathered under the runtime_weights/ subfolder of the pynq_driver_dir.

init

def __init__(platform,
             driver_type,
             clk_period_ns=None,
             validation_datset=None,
             experiment_info=None,
             board=None)

Initialize PYNQ driver generation.

Arguments:

platform - Target platform, one of ["zynq-iodma", "alveo"].
driver_type - Type/name of the driver to generate (e.g. "FINNDMAOverlay", "FINNDMAInstrumentationOverlay").
clk_period_ns - Clock period in nanoseconds used for performance calculations.
validation_datset - Validation dataset path or identifier.
experiment_info - Path to a JSON file containing experiment metadata.

apply

def apply(model)

Apply the MakePYNQDriver transformation.

Creates a PYNQ Python driver package for interfacing with the generated accelerator, including data packing/unpacking and runtime weight handling.

Arguments:

model - The ONNX model to generate a driver for.

Returns:

Tuple of (modified model, False) indicating transformation applied.

finn.transformation.fpgadataflow.make_zynq_proj

Transformation to create Zynq Vivado projects for FINN dataflow designs.

collect_ip_dirs

def collect_ip_dirs(model, ipstitch_path)

Collect list of all IP directories required by the design.

MakeZYNQProject Objects

class MakeZYNQProject(Transformation)

Create a Vivado overlay project (including the shell infrastructure) from the already-stitched IP block for this graph. All nodes in the graph must have the fpgadataflow backend attribute, and the CreateStitchedIP transformation must have been previously run on the graph. This is functionally equivalent to MakePYNQProject but does not use PYNQ infrastructure and instead creates a fully custom block design. However, this transform requires DMAs in the accelerator design.

Outcome if successful: sets the vivado_pynq_proj attribute in the ONNX ModelProto's metadata_props field, with the created project dir as the value.

init

def __init__(platform,
             period_ns,
             enable_debug=False,
             enable_finn_switch=False,
             live_fifo_sizing=False)

Initialize MakeZYNQProject with platform settings.

apply

def apply(model)

Apply the transformation to create a Zynq project.

ZynqBuild Objects

class ZynqBuild(Transformation)

Best-effort attempt at building the accelerator for Zynq. It assumes the model has only fpgadataflow nodes.

init

def __init__(platform,
             period_ns,
             enable_debug=False,
             enable_instrumentation=False,
             instrumentation_no_dma=False,
             instrumentation_avg_n=64,
             live_fifo_sizing=False,
             partition_model_dir=None)

Initialize ZynqBuild with platform and build settings.

apply

def apply(model)

Apply the ZynqBuild transformation to create a complete Zynq accelerator.

finn.transformation.fpgadataflow.minimize_accumulator_width

MinimizeAccumulatorWidth Objects

class MinimizeAccumulatorWidth(Transformation)

For relevant nodes, call the accumulator width minimization functions to save on resources. May alter tensor DataType for certain nodes if they produce an accumulator as result.

finn.transformation.fpgadataflow.minimize_weight_bit_width

MinimizeWeightBitWidth Objects

class MinimizeWeightBitWidth(Transformation)

For relevant nodes, call the weight bit width minimization functions to save on resources. May alter the tensor weightDataType if the node does not have runtime-writable weights.

finn.transformation.fpgadataflow.prepare_cppsim

PrepareCppSim Objects

class PrepareCppSim(Transformation)

Call the custom implementation of each node to generate its code, and create a folder that contains all the generated files. All nodes in the graph must have the fpgadataflow backend attribute.

Outcome if successful: the node attribute "code_gen_dir_cppsim" contains the path to the folder holding the generated C++ code, which can be used to simulate the node using cppsim. The subsequent transformation is CompileCppSim.

finn.transformation.fpgadataflow.prepare_ip

PrepareIP Objects

class PrepareIP(Transformation)

Call the custom implementation of each node to generate its code, and create a folder that contains all the generated files. All nodes in the graph must have the fpgadataflow backend attribute. The transformation takes two additional arguments:

fpgapart (string)
clk in ns (int)

Any nodes that already have a code_gen_dir_ipgen attribute pointing to a valid path will be skipped.

Outcome if successful: the node attribute "code_gen_dir_ipgen" contains the path to a folder that contains:

For HLS layers: generated C++ code that can be used to generate a Vivado IP block. The necessary subsequent transformation is HLSSynthIP.
For RTL layers: filled-in Verilog template files that can be instantiated as modules during IP stitching.

finn.transformation.fpgadataflow.prepare_rtlsim

PrepareRTLSim Objects

class PrepareRTLSim(NodeLocalTransformation)

For a graph with generated RTL sources (after HLSSynthIP), create an emulation library for each node to prepare for rtlsim execution and set the rtlsim_so property to the path to the generated emulation library.

To use these libraries, exec_mode must be set to "rtlsim" (using SetExecMode) and the model has to be executed using execute_onnx() from finn.core.onnx_exec.

num_workers (int or None): number of parallel workers; see the documentation of NodeLocalTransformation for more details.

finn.transformation.fpgadataflow.raise_scalar_to_rank1

RaiseScalarToRank1 Objects

class RaiseScalarToRank1(Transformation)

Lift all scalar tensors in the model to rank-1 tensors.

Scalars in ONNX are represented with an empty shape. Downstream FINN transformations often expect tensors to have at least rank 1. This transformation scans all tensors that have shape information attached and ensures scalars are reshaped to have shape [1] while keeping any initializer data consistent.

finn.transformation.fpgadataflow.replace_verilog_relpaths

ReplaceVerilogRelPaths Objects

class ReplaceVerilogRelPaths(Transformation)

Convert ./ relative file paths to absolute ones for generated Verilog.
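A minimal sketch of the idea, assuming the relative paths appear as quoted string literals that begin with `./`. The helper name `absolutize_relpaths` and the sample `$readmemh` line are hypothetical and only illustrate the rewrite; they are not taken from FINN.

```python
import os


def absolutize_relpaths(verilog_src, base_dir):
    """Rewrite string literals starting with "./" to start with base_dir.

    Hypothetical helper for illustration only.
    """
    abs_prefix = os.path.abspath(base_dir) + "/"
    # Only touch paths that open a quoted string, e.g. "./memblock_0.dat"
    return verilog_src.replace('"./', '"' + abs_prefix)


src = '$readmemh("./memblock_0.dat", mem);'
rewritten = absolutize_relpaths(src, "/tmp/finn_node")
print(rewritten)
```

Matching on the opening quote keeps the sketch from touching unrelated occurrences of `./` elsewhere in the source text.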

finn.transformation.fpgadataflow.set_exec_mode

SetExecMode Objects

class SetExecMode(Transformation)

Set the exec_mode attribute in all fpgadataflow nodes to specify which kind of execution should be used ("cppsim" or "rtlsim"). Note that RTL components do not support cppsim. When cppsim is selected for an RTL component, the execution of its parent HW op is used by default.

finn.transformation.fpgadataflow.set_fifo_depths

Transformations for inserting and sizing FIFOs in FINN dataflow graphs.

reset_implementation

def reset_implementation(node: "HWCustomOp") -> None

Reset IP generation attributes of a node to trigger re-synthesis.

set_signal

def set_signal(sim: _SimProtocol, keyw: str, value: int) -> None

Set the first simulation input signal whose name contains keyw to value.

get_signal

def get_signal(sim: _SimProtocol, keyw: str) -> int | None

Return the value of the first simulation output signal whose name contains keyw.

optimize_depth

def optimize_depth(depth: int) -> int

Round depth to avoid resource-inefficient FIFO sizes.

RemoveShallowFIFOs Objects

class RemoveShallowFIFOs(Transformation)

Remove zero-depth FIFOs. The threshold used to be 2 instead of 0, but with the increasing number of FINN RTL components, 2-depth FIFOs are still important for decoupling.

init

def __init__(shallow_threshold: int = 0) -> None

Initialize RemoveShallowFIFOs with the given depth threshold.

apply

def apply(model: ModelWrapper) -> tuple[ModelWrapper, Literal[False]]

Remove FIFOs at or below the shallow threshold depth.

CapConvolutionFIFODepths Objects

class CapConvolutionFIFODepths(Transformation)

Make the size of FIFOs for convolution layers smaller where possible.

Will be automatically called from InsertAndSetFIFODepths if the appropriate constructor flag is set.

Constructor arguments:

max_qsrl_depth: FIFOs deeper than this will use Vivado IP instead of Verilog FIFOs (Q_srl.v)

Assumed input graph properties:

all nodes are fpgadataflow nodes
FIFOs inserted with InsertAndSetFIFODepths

Output:

graph with smaller-depth FIFOs for convolutions

Background: The simulation-based rtlsim_exec tends to overestimate the required depth of FIFOs between the ConvolutionInputGenerator (here called SWG) and the MatrixVectorActivation (here called MVAU). As the SWG has an internal buffer of 1 image row, we use this as a rule of thumb to set FIFO depth to be no larger than 1 row.

init

def __init__(max_qsrl_depth: int = 256) -> None

Initialize CapConvolutionFIFODepths with the given maximum SRL FIFO depth.

apply

def apply(model: ModelWrapper) -> tuple[ModelWrapper, Literal[False]]

Cap FIFO depths between ConvolutionInputGenerator and MVAU nodes.

xsi_fifosim

def xsi_fifosim(model: ModelWrapper,
                n_inferences: int,
                max_iters: float | None = None,
                throttle_cycles: int = 0) -> dict[str, int]

Create an XSI model of the stitched IP and use a simple C++ driver to drive the input stream. Useful for FIFO sizing, latency and throughput measurement. If max_iters is None, use the default liveness threshold instead. throttle_cycles can be used for throttling the input stream every time a frame is finished.

InsertAndSetFIFODepths Objects

class InsertAndSetFIFODepths(Transformation)

Insert appropriate-depth StreamingFIFOs through RTLSim that preserve throughput in the created accelerator.

Constructor arguments:

clk_ns: clock period (used for IP preparation)
max_qsrl_depth: FIFOs deeper than this will use Vivado IP instead of Verilog FIFOs (Q_srl.v)
max_depth: how deep the "max"-sized FIFOs initially inserted will be. If set to None, use the tensor size as the depth
swg_exception: call CapConvolutionFIFODepths to make convolution FIFOs smaller where appropriate
vivado_ram_style: the StreamingFIFO.ram_style attribute to be used for large FIFOs implemented by Vivado afterwards
fifosim_input_throttle: use input throttling based on dataflow analysis while doing simulation-based FIFO sizing

Assumed input graph properties:

all nodes are fpgadataflow nodes
no FIFOs inserted (inFIFODepths/outFIFODepths attrs will be ignored)

Output:

graph with appropriate-depth FIFOs inserted

Background: Even with all FINN HLS fpgadataflow layers appropriately parallelized, it is necessary to insert FIFOs between them to prevent stalls due to bursty behavior. The sizes of those FIFOs are hard to predict analytically, so we do the following:

insert deep (=tensor size) FIFOs between all fpgadataflow nodes
create stitched design
run through rtlsim with stream of multiple random input images (to fill pipeline)
keep track of observed maximum occupancy for each FIFO during rtlsim
when the simulation has finished, update each FIFO depth to the maximum observed occupancy and set the inFIFODepths/outFIFODepths attrs to that depth as well
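The occupancy-tracking idea in the steps above can be illustrated with a toy cycle-level model. This is only a sketch: the function name `max_fifo_occupancy` and the per-cycle producer/consumer patterns are hypothetical stand-ins for what rtlsim observes on the stitched design, not FINN code.

```python
def max_fifo_occupancy(produced_per_cycle, consumed_per_cycle):
    """Return the peak number of elements held by an unbounded FIFO.

    produced_per_cycle[i]: elements the upstream node writes in cycle i
    consumed_per_cycle[i]: elements the downstream node can read in cycle i
    """
    occupancy = 0
    peak = 0
    for produced, wanted in zip(produced_per_cycle, consumed_per_cycle):
        occupancy += produced                # writes land first
        peak = max(peak, occupancy)          # track the observed maximum
        occupancy -= min(wanted, occupancy)  # reads cannot underflow
    return peak


# Bursty producer (4 elements every 4th cycle), steady consumer (1 per cycle)
bursty = [4, 0, 0, 0] * 3
steady = [1] * len(bursty)
fifo_depth = max_fifo_occupancy(bursty, steady)
print(fifo_depth)
```

The "deep" FIFO is modelled as unbounded so that it never stalls the producer; the peak it reaches is then the depth that would have been sufficient for this input pattern.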

init

def __init__(fpgapart: str,
             clk_ns: float = 10.0,
             max_qsrl_depth: int = 256,
             max_depth: int | None = None,
             swg_exception: bool = False,
             vivado_ram_style: str = "auto",
             fifosim_input_throttle: bool = True,
             cfg_n_inferences: int = 2) -> None

Initialize InsertAndSetFIFODepths with synthesis and simulation parameters.

apply

def apply(model: ModelWrapper) -> tuple[ModelWrapper, Literal[False]]

Insert and size StreamingFIFOs using RTL simulation.

get_fifo_split_configs

def get_fifo_split_configs(
        depth: int,
        max_qsrl_depth: int = 256,
        max_vivado_depth: int = 32768) -> list[tuple[int, str]]

Break a non-power-of-two FIFO depth into several smaller depths.

SplitLargeFIFOs Objects

class SplitLargeFIFOs(Transformation)

Split large FIFOs before implementation, for two reasons:

impl_style="vivado" supports a max depth of 32k. Any larger FIFOs must be implemented as a sequence of smaller FIFOs.
impl_style="vivado" requires power-of-two depths, which is normally handled by rounding up to the nearest power of two. So a FIFO of size 8196 normally gets rounded up to a depth of 16384 and wastes a lot of resources. Here, instead, we split it into two FIFOs of depths 8192 and 4.
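The splitting idea can be sketched as a greedy power-of-two decomposition. The function `split_fifo_depth` below is a hypothetical illustration, not the logic of get_fifo_split_configs: it assumes max_vivado_depth is itself a power of two, decomposes the depth all the way down to its binary representation, and ignores the choice of impl_style for each chunk.

```python
def split_fifo_depth(depth, max_vivado_depth=32768):
    """Split depth into power-of-two chunks of at most max_vivado_depth.

    Hypothetical sketch; assumes max_vivado_depth is a power of two.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    chunks = []
    remaining = depth
    while remaining > 0:
        # Largest power of two that still fits into the remainder
        chunk = 1 << (remaining.bit_length() - 1)
        # Respect the maximum depth of a single FIFO
        chunk = min(chunk, max_vivado_depth)
        chunks.append(chunk)
        remaining -= chunk
    return chunks


print(split_fifo_depth(8196))
print(split_fifo_depth(70000))
```

For the depth of 8196 used in the text, this yields the two chunks 8192 and 4, so no capacity is wasted on rounding up to 16384.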

init

def __init__(max_qsrl_depth: int = 256, max_vivado_depth: int = 32768) -> None

Initialize SplitLargeFIFOs with maximum FIFO depth constraints.

apply

def apply(model: ModelWrapper) -> tuple[ModelWrapper, Literal[False]]

Split large FIFOs into chains of smaller power-of-two FIFOs.

finn.transformation.fpgadataflow.set_folding

Automatically sets folding, i.e., parallelism attributes for all FINN operators.

divisors

def divisors(num: int) -> Generator[int, Any, None]

Yield divisors of num.

common_divisors

def common_divisors(numbers: list[int]) -> np.ndarray

Return common divisors of the list of numbers.
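As a rough illustration of what these two helpers compute, here is a pure-Python sketch. The names `divisors_sketch` and `common_divisors_sketch` are hypothetical, and the second function returns a plain list rather than the np.ndarray named in the signature above.

```python
def divisors_sketch(num):
    """Yield all positive divisors of num in increasing order."""
    for candidate in range(1, num + 1):
        if num % candidate == 0:
            yield candidate


def common_divisors_sketch(numbers):
    """Return the divisors shared by every entry of numbers."""
    if not numbers:
        return []
    # A common divisor must divide the smallest number, so start there
    smallest = min(numbers)
    return [d for d in divisors_sketch(smallest)
            if all(n % d == 0 for n in numbers)]


print(list(divisors_sketch(12)))
print(common_divisors_sketch([12, 18]))
```

Divisors matter for folding because a parallelism attribute such as PE or SIMD typically has to divide the corresponding layer dimension evenly.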

SetFolding Objects

class SetFolding(Transformation)

Attempt to set parallelism attributes in all nodes to meet a specific target expressed as cycles per frame target_cycles_per_frame. For each HLSCustomOp node type, the attribute may vary but is typically one of {PE, SIMD}, and has a certain allowed-maximum value and divisibility constraints, which SetFolding will take into account. Note that the algorithm implemented by SetFolding is very simple and it is often possible to hand-tune the returned parallelism configuration for better results.

In the returned model, each node's cycles_estimate attribute will be set to its estimated number of cycles.

If two_pass_relaxation is enabled, SetFolding will internally run a second time if the target cycles from the first pass could not be achieved, instead using the achievable target (which may be constrained by a single node) to obtain a balanced pipeline.

Notable exceptions and special behavior:

When folding dense convolution/FC compute engines ("MVAU"/MatrixVectorActivation), which have two attributes (PE and SIMD):

first increases SIMD while weight stream width per PE is <= mvau_wwidth_max (configurable in the SetFolding initializer, defaults to 36)
then increases PE until the target is met or max PE reached

When folding depthwise convolutions ("VVAU"/VectorVectorActivation) or spatial reduction ops (Pool_Batch):

the producer of the node is expected to be a ConvolutionInputGenerator with depthwise=1, whose SIMD value will be set equal to the PE value of its consumer node
the VVAU also supports SIMD ("input window") parallelism next to PE ("channels"), but current ConvInpGen limitations require PE to be fully unfolded before SIMD is increased

init

def __init__(target_cycles_per_frame: int = 1000,
             mvau_wwidth_max: int = 36,
             two_pass_relaxation: bool = True) -> None

Initialize the folding target and constraints.

optimize_attribute_val

def optimize_attribute_val(node_inst: HWCustomOp, max_val: int,
                           attr_name: str) -> None

Optimize the folding attribute until the target cycles are met.
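A minimal sketch of such a search, under stated assumptions: candidate values are the divisors of max_val tried in increasing order, the first value whose cycle estimate is at or below the target wins, and max_val is used when no candidate meets the target. The function `pick_parallelism` and the `cycles_for` callback are hypothetical; the actual method operates on a node instance and its attributes.

```python
def pick_parallelism(max_val, cycles_for, target_cycles):
    """Return the smallest divisor of max_val that meets target_cycles.

    cycles_for: hypothetical callback mapping a parallelism value to an
    estimated cycle count. Falls back to max_val if the target is not met.
    """
    for val in range(1, max_val + 1):
        if max_val % val != 0:
            continue  # only divisors are legal folding factors here
        if cycles_for(val) <= target_cycles:
            return val
    return max_val


def toy_cycles(pe):
    """Toy cost model: 4096 operations shared evenly between pe units."""
    return 4096 // pe


print(pick_parallelism(64, toy_cycles, 1000))
```

Trying the smallest values first keeps resource usage low, since more parallelism is only added until the cycle target is reached.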

apply

def apply(model: "ModelWrapper") -> tuple[ModelWrapper, Literal[False]]

Apply SetFolding to all supported nodes in the model.

finn.transformation.fpgadataflow.set_loop_boundary

SetLoopBoundary Objects

class SetLoopBoundary(Transformation)

Sets metadata attributes on the nodes within a defined node or tensor range in an ONNX model.

Arguments:

node_metadata: Dictionary containing metadata attributes to set on the nodes.
node_range: Tuple containing start and end node names (start_node, end_node).
tensor_range: Tuple containing start and end tensor names (start_tensor, end_tensor).

finn.transformation.fpgadataflow.specialize_layers

Transformations for specializing FINN layers to HLS or RTL implementations.

This module provides functionality to automatically select and specialize FINN dataflow layers to their optimal hardware implementation variants (HLS or RTL) based on FPGA target, layer constraints, and user preferences.

SpecializeLayers Objects

class SpecializeLayers(Transformation)

Specialize all layers to either HLS or RTL variants.

init

def __init__(fpgapart)

Initialize the SpecializeLayers transformation.

Arguments:

fpgapart - Target FPGA part string for implementation selection

apply

def apply(model)

Apply layer specialization transformation to model.

Converts all dataflow layers to their optimal HLS or RTL implementation variants based on target FPGA and layer constraints.

finn.transformation.fpgadataflow.synth_ooc

Transformation for out-of-context Vivado synthesis on stitched IP designs.

is_hls_float_op

def is_hls_float_op(node: NodeProto, model: ModelWrapper) -> bool

Check if a node is an HLS operator with floating-point inputs.

SynthOutOfContext Objects

class SynthOutOfContext(Transformation)

Run out-of-context Vivado synthesis on a stitched IP design.

init

def __init__(part: str,
             clk_period_ns: float,
             clk_name: str = "ap_clk") -> None

Initialize the SynthOutOfContext transformation.

Arguments:

part - Target FPGA part for synthesis
clk_period_ns - Clock period in nanoseconds
clk_name - Clock signal name (default: "ap_clk")

apply

def apply(model: ModelWrapper) -> tuple[ModelWrapper, bool]

Apply out-of-context synthesis transformation to the model.

finn.transformation.fpgadataflow.templates

Template strings for FPGA dataflow build scripts.

finn.transformation.fpgadataflow.transpose_decomposition

shuffle_perfect_loopnest_coeffs

def shuffle_perfect_loopnest_coeffs(shape: tuple[int],
                                    perm: tuple[int]) -> tuple[int]

Given an input shape and permutation matrix, calculate the coefficients for the perfect loop nest for HLS generation.

apply_inner_shuffle_operation

def apply_inner_shuffle_operation(perm: List[int],
                                  shape: List[int] = None,
                                  simd: int = 1) -> List[int]

Apply inner_shuffle operation: swap the last two positions (..., a, b) -> (..., b, a)

apply_outer_shuffle_operation

def apply_outer_shuffle_operation(perm: List[int],
                                  i: int,
                                  j: int,
                                  shape: List[int] = None,
                                  simd: int = 1) -> Optional[List[int]]

Apply outer_shuffle operation: swap positions i and j. Constraint: cannot move the very last dimension.
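The two primitive moves can be sketched on plain Python lists. The helpers `inner_shuffle` and `outer_shuffle` below are hypothetical simplifications: they ignore the shape and simd parameters of the documented functions and only show the index manipulation.

```python
def inner_shuffle(perm):
    """Swap the last two positions: (..., a, b) -> (..., b, a)."""
    if len(perm) < 2:
        return None  # nothing to swap
    out = list(perm)
    out[-1], out[-2] = out[-2], out[-1]
    return out


def outer_shuffle(perm, i, j):
    """Swap positions i and j; neither may be the last position."""
    last = len(perm) - 1
    if i == j or not (0 <= i < last) or not (0 <= j < last):
        return None  # would be a no-op or would move the last dimension
    out = list(perm)
    out[i], out[j] = out[j], out[i]
    return out


print(inner_shuffle([0, 1, 2]))
print(outer_shuffle([0, 1, 2], 0, 1))
print(outer_shuffle([0, 1, 2], 0, 2))
```

Returning None for a disallowed outer swap mirrors the Optional return type in the signature above.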

get_all_possible_moves

def get_all_possible_moves(
        perm: List[int],
        shape: List[int] = None,
        simd: int = 1
) -> List[Tuple[List[int], str, Optional[Tuple[int, int]]]]

Get all possible moves from current permutation. Returns list of (new_permutation, operation_type, operation_params) tuples.

Each outer_shuffle move represents a single pairwise swap that doesn't involve the last dimension. Complex permutations are built by chaining multiple such operations.

operation_type is either 'inner_shuffle' or 'outer_shuffle'. operation_params is None for inner_shuffle and (i, j) for outer_shuffle.

is_valid_hardware_permutation

def is_valid_hardware_permutation(perm_array: List[int]) -> bool

Check if a permutation array represents a valid hardware operation. Valid operations are:

inner_shuffle: swap last two elements
outer_shuffle: any permutation that doesn't move the last element
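The two rules can be written down as a short check. This is a sketch under assumptions: `is_valid_hw_perm` is a hypothetical name, the permutation is read relative to the identity [0, 1, ..., n-1], and the identity itself is counted as valid because it does not move the last element.

```python
def is_valid_hw_perm(perm):
    """Check the inner_shuffle / outer_shuffle rules on a permutation."""
    n = len(perm)
    if n == 0 or sorted(perm) != list(range(n)):
        return False  # not a permutation of 0..n-1
    # outer_shuffle: the last element stays where it is
    keeps_last = perm[-1] == n - 1
    # inner_shuffle: identity except that the last two elements are swapped
    is_inner = n >= 2 and list(perm) == list(range(n - 2)) + [n - 1, n - 2]
    return keeps_last or is_inner


print(is_valid_hw_perm([0, 2, 1]))
print(is_valid_hw_perm([1, 0, 2]))
print(is_valid_hw_perm([2, 0, 1]))
```

A permutation such as [2, 0, 1] satisfies neither rule, which is the case the decomposition functions in this module are meant to handle.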

find_minimal_operation_sequence

def find_minimal_operation_sequence(
        start_perm: List[int],
        target_perm: List[int],
        shape: List[int] = None,
        simd: int = 1
) -> Optional[List[Tuple[str, Optional[Tuple[int, int]]]]]

Find minimal sequence of operations to transform start_perm into target_perm. Uses BFS to find shortest path, ensuring all intermediate permutations are hardware-valid. Returns list of (operation_type, operation_params) tuples.

TODO: We want this to be cost based and include a buffer size cost model.
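A breadth-first search over the two move types can be sketched as follows. The names `neighbours` and `shortest_sequence` are hypothetical, the shape and simd parameters are left out, and every move is treated as having unit cost, which matches the TODO note that no cost model is applied yet.

```python
from collections import deque


def neighbours(perm):
    """Return (next_perm, (operation_type, operation_params)) pairs."""
    n = len(perm)
    moves = []
    if n >= 2:
        p = list(perm)
        p[-1], p[-2] = p[-2], p[-1]
        moves.append((tuple(p), ("inner_shuffle", None)))
    # Pairwise swaps that leave the last position untouched
    for i in range(n - 1):
        for j in range(i + 1, n - 1):
            p = list(perm)
            p[i], p[j] = p[j], p[i]
            moves.append((tuple(p), ("outer_shuffle", (i, j))))
    return moves


def shortest_sequence(start_perm, target_perm):
    """BFS for the shortest list of operations from start to target."""
    start, target = tuple(start_perm), tuple(target_perm)
    if sorted(start) != sorted(target):
        return None  # not permutations of the same elements
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        perm, ops = queue.popleft()
        if perm == target:
            return ops
        for nxt, op in neighbours(perm):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, ops + [op]))
    return None


print(shortest_sequence([0, 1, 2], [2, 0, 1]))
```

Because BFS explores states in order of distance from the start, the first time the target is dequeued the accumulated operation list is a shortest one.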

convert_operations_to_permutations

def convert_operations_to_permutations(start_perm: List[int],
                                       operations: List[Tuple[
                                           str, Optional[Tuple[int, int]]]],
                                       shape: List[int] = None,
                                       simd: int = 1) -> List[List[int]]

Convert a sequence of operations to a list of permutation arrays. Each permutation represents the transformation for that step.

can_be_single_operation

def can_be_single_operation(
        target_perm: List[int],
        shape: List[int] = None,
        simd: int = 1) -> Optional[Tuple[str, Optional[Tuple[int, int]]]]

Check if the target permutation can be achieved with a single operation, i.e. no decomposition is required. Returns (operation_type, operation_params) or None if not possible.

decompose_transpose_with_constraints

def decompose_transpose_with_constraints(
        target_perm: List[int],
        shape: List[int] = None,
        simd: int = 1) -> Tuple[List[List[int]], List[str]]

Decompose a target permutation into a sequence of hardware-constrained operations.

inner_shuffle: swaps the last two dimensions
outer_shuffle: can implement any permutation that doesn't move the last dimension (may require multiple steps)

Returns (permutations, operation_types).

permutations: list of permutation arrays for each step
operation_types: list of operation types ('inner_shuffle' or 'outer_shuffle') for each step

ShuffleDecomposition Objects

class ShuffleDecomposition(Transformation)

Transformation that decomposes Shuffle nodes into a chain of Shuffle ops that can map to InnerShuffle and OuterShuffle nodes.

InferInnerOuterShuffles Objects

class InferInnerOuterShuffles(Transformation)

Infers Inner and Outer Shuffles from Shuffle operators. This should run after the ShuffleDecomposition transformation.

finn.transformation.fpgadataflow.vitis_build

Transformation to build FINN dataflow designs for Alveo using Vitis.

CreateVitisXO Objects

class CreateVitisXO(Transformation)

Create a Vitis object file from a stitched FINN IP.

Outcome if successful: sets the vitis_xo attribute in the ONNX ModelProto's metadata_props field with the name of the object file as value. The object file can be found under the ip subdirectory.

init

def __init__(ip_name="finn_design")

Initialize CreateVitisXO transformation.

apply

def apply(model)

Apply CreateVitisXO transformation to create Vitis object file.

VitisLink Objects

class VitisLink(Transformation)

Create an XCLBIN with Vitis.

Outcome if successful: sets the bitfile attribute in the ONNX ModelProto's metadata_props field with the XCLBIN full path as value.

init

def __init__(platform,
             f_mhz=200,
             strategy=VitisOptStrategy.PERFORMANCE,
             enable_debug=False,
             fpga_memory_type="default")

Initialize VitisLink transformation with platform and build settings.

apply

def apply(model)

Apply VitisLink transformation to create XCLBIN.

VitisBuild Objects

class VitisBuild(Transformation)

Best-effort attempt at building the accelerator with Vitis.

It assumes the model has only fpgadataflow nodes.

Arguments:

fpga_part: string identifying the target FPGA
period_ns: target clock period
platform: target Alveo platform, one of ["U50", "U200", "U250", "U280"]
strategy: Vitis optimization strategy
enable_debug: add Chipscope to all AXI interfaces
floorplan_file: path to a JSON containing a dictionary with SLR assignments for each node in the ONNX graph. Must be parse-able by the ApplyConfig transform.
enable_link: enable linking kernels (.xo files), otherwise just synthesize them independently.
fpga_memory_type: Specify whether Host or FPGA memory such as DDR/HBM should be used

init

def __init__(fpga_part,
             period_ns,
             platform,
             strategy=VitisOptStrategy.PERFORMANCE,
             enable_debug=False,
             floorplan_file=None,
             enable_link=True,
             partition_model_dir=None,
             fpga_memory_type=FpgaMemoryType.DEFAULT)

Initialize VitisBuild transformation with FPGA and build settings.

apply

def apply(model)

Apply VitisBuild transformation to create complete Vitis accelerator.

finn.transformation.fpgadataflow.vivado_power_estimation

VivadoPowerEstimation Objects

class VivadoPowerEstimation(Transformation)

Run Vivado power estimation on the stitched IP after OOC synthesis.

simulate_switching_activity: if False, use a fixed set of toggle rates and static probabilities. If True, additionally simulate the switching activity of the design for power estimation.

finn.transformation.general

Generally applicable transformations.

ApplyConfig Objects

class ApplyConfig(Transformation)

Applies node properties (attributes) from either a config dict or its JSON representation given as a filename. The JSON file can specify default values for particular op_types, as well as values for nodes with particular names. Example dict:

{
    # set kernel_size = 3 for all nodes with op_type=Im2Col
    "Defaults": {"kernel_size": [3, ["Im2Col"]]},
    # set kernel_size = 7 for the particular node with name Im2Col_0
    "Im2Col_0": {"kernel_size": 7}
}
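How such a dict might resolve to per-node attributes can be sketched in a few lines. The helper `resolve_node_attrs` is hypothetical and is not the ApplyConfig implementation; in particular, it assumes that a value given for a named node takes precedence over a matching entry in "Defaults".

```python
def resolve_node_attrs(config, node_name, op_type):
    """Return the attributes the config assigns to one node.

    Hypothetical sketch: op_type defaults first, then per-node values.
    """
    attrs = {}
    for attr_name, (value, op_types) in config.get("Defaults", {}).items():
        if op_type in op_types:
            attrs[attr_name] = value
    # Assumption: entries keyed by node name override the defaults
    attrs.update(config.get(node_name, {}))
    return attrs


config = {
    "Defaults": {"kernel_size": [3, ["Im2Col"]]},
    "Im2Col_0": {"kernel_size": 7},
}

print(resolve_node_attrs(config, "Im2Col_0", "Im2Col"))
print(resolve_node_attrs(config, "Im2Col_1", "Im2Col"))
```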

init

def __init__(
        config: Path | str | dict,
        node_filter: Callable[[NodeProto], bool] = lambda _: True) -> None

Initialize ApplyConfig with a config, given either as a dict or as the path to a JSON file.

configure_network

def configure_network(target: GraphProto | ModelWrapper, model_config: dict,
                      subgraph_hier: str | None) -> None

Configure network - target can be a GraphProto or ModelWrapper. If it's a ModelWrapper, get the graph.

apply

def apply(model: ModelWrapper) -> tuple[ModelWrapper, bool]

Apply the config to the model.

finn.transformation.move_reshape

RemoveCNVtoFCFlatten Objects

class RemoveCNVtoFCFlatten(Transformation)

Removes a flatten node if it is between two fpgadataflow nodes. For an NHWC-Conv to FC transition, the preceding transpose is absorbed. The flatten operation can also be implemented by a reshape node.

finn.transformation.qonnx.convert_qonnx_to_finn

ConvertQONNXtoFINN Objects

class ConvertQONNXtoFINN(Transformation)

Converts QONNX dialect to FINN ONNX dialect.

First the weights are converted using the FoldQuantWeights transformation, then the ConvertQuantActToMultiThreshold transformation is used to convert the activations. If incompatibilities are found, a ValueError or RuntimeError is raised.

The optional keyword argument filter_function presents a way to control which Quant and BipolarQuant nodes in the activation path are converted to MultiThreshold nodes. A warning will be emitted when a Quant node is not converted to a MultiThreshold node.

Arguments:

filter_function: Each candidate Quant and BipolarQuant node is first evaluated by this function. If the function returns False, then the node is not converted to a MultiThreshold node. The function is given the model and candidate node as parameters. By default, a filter function is used that disables the conversion of Quant nodes with a bit width larger than 8. Defaults to: default_filter_function_generator(max_multithreshold_bit_width=8)

finn.transformation.qonnx.fold_quant_weights

FoldQuantWeights Objects

class FoldQuantWeights(Transformation)

Merges Quant nodes that are used as weights into the initializer of the weight tensor.

finn.transformation.qonnx.infer_quant_avg_pool_2d

AvgPoolAndTruncToQuantAvgPool Objects

class AvgPoolAndTruncToQuantAvgPool(Transformation)

Convert a section of nodes matching the pattern AveragePool -> Mul (scalar) -> Trunc to the FINN op QuantAvgPool2d.

AvgPoolAndTruncv1ToQuantAvgPool Objects

class AvgPoolAndTruncv1ToQuantAvgPool(Transformation)

Convert a section of nodes matching the pattern AveragePool -> Mul (scalar) -> Trunc (v1) to the FINN op sequence Div -> QuantAvgPool2d -> Mul.

AvgPoolAndTruncv2ToQuantAvgPool Objects

class AvgPoolAndTruncv2ToQuantAvgPool(Transformation)

Convert a section of nodes matching the pattern AveragePool -> Trunc (v2) to the FINN op sequence Div -> QuantAvgPool2d -> Mul.

finn.transformation.qonnx.qonnx_activation_handlers

QuantActBaseHandler Objects

class QuantActBaseHandler(ABC)

Base class for converting quantized activation expressed in the QONNX dialect to the FINN ONNX dialect.

Arguments:

model (qonnx.core.modelwrapper.ModelWrapper): The model on which this handler should operate.
quant_node: The Quant node which a given handler should replace.
quant_node_index (int): The index of the Quant node in the given model.

init

def __init__(model: ModelWrapper, quant_node, quant_node_index: int)

Base class constructor

valid_predecessor_op_types

@classmethod
def valid_predecessor_op_types()

Defines which op types the preceding node is allowed to have for this type of activation.

calculate_node_parameters

def calculate_node_parameters()

Calculate all parameters required for replacing the QONNX style activation with a FINN style one.

replace_quant_node

def replace_quant_node()

Replace the given QONNX style activation with a FINN style one.

QuantReluHandler Objects

class QuantReluHandler(QuantActBaseHandler)

Class for converting a quantized relu operation expressed in the QONNX dialect to the FINN ONNX dialect.

QuantIdentityHandler Objects

class QuantIdentityHandler(QuantActBaseHandler)

Class for converting a quantized identity operation expressed in the QONNX dialect to the FINN ONNX dialect. This handler also takes care of quantized HardTanh activations, because these are equivalent to quantized identity activations.

finn.transformation.qonnx.quant_act_to_multithreshold

default_filter_function_generator

def default_filter_function_generator(max_multithreshold_bit_width=8)

This function generates the default filter function for the ConvertQuantActToMultiThreshold transformation. By default, the returned function disables the conversion of Quant nodes with a bit width above 8 bits.

This function generator can be used as a template to write custom filter functions.
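A custom filter function following the same generator pattern might look like the sketch below. Everything here is illustrative: `make_bitwidth_filter` and `StubModel` are hypothetical, the stub stands in for a ModelWrapper, and the sketch assumes the Quant node carries its bit width as an initializer on its fourth input. The only contract taken from the text above is that the filter receives the model and the candidate node and returns False to skip conversion.

```python
from types import SimpleNamespace


def make_bitwidth_filter(max_bit_width):
    """Build a filter accepting nodes of at most max_bit_width bits."""

    def filter_function(model, q_node):
        if q_node.op_type == "BipolarQuant":
            bit_width = 1
        else:
            # Assumption: the bit width is the initializer of input 3
            bit_width = model.get_initializer(q_node.input[3])
        # Skip nodes whose bit width is unknown or too large
        return bit_width is not None and bit_width <= max_bit_width

    return filter_function


class StubModel:
    """Hypothetical stand-in exposing only get_initializer."""

    def __init__(self, initializers):
        self._initializers = initializers

    def get_initializer(self, name):
        return self._initializers.get(name)


model = StubModel({"bw4": 4, "bw16": 16})
narrow = SimpleNamespace(op_type="Quant", input=["x", "s", "z", "bw4"])
wide = SimpleNamespace(op_type="Quant", input=["x", "s", "z", "bw16"])
keep = make_bitwidth_filter(8)
print(keep(model, narrow), keep(model, wide))
```

Using a generator function lets the threshold be fixed once at construction time while the returned closure keeps the two-argument signature expected of a filter.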

ConvertQuantActToMultiThreshold Objects

class ConvertQuantActToMultiThreshold(Transformation)

Converts Quant nodes in the activation path to MultiThreshold nodes.

The optional keyword argument filter_function presents a way to control which Quant and BipolarQuant nodes in the activation path are converted to MultiThreshold nodes. A warning will be emitted when a Quant node is not converted to a MultiThreshold node.

Arguments:

filter_function: Each candidate Quant and BipolarQuant node is first evaluated by this function. If the function returns False, then the node is not converted to a MultiThreshold node. The function is given the model and candidate node as parameters. By default, a filter function is used that disables the conversion of Quant nodes with a bit width larger than 8. Defaults to: default_filter_function_generator(max_multithreshold_bit_width=8)

finn.transformation.squeeze

Transformation to squeeze tensors by removing dimensions of size 1.

Squeeze Objects

class Squeeze(Transformation)

Squeezes, i.e., removes, dimensions of size 1.

Note: Use this transformation with great care. It currently serves only the purpose of turning the not well-supported 3D data layouts encountered in transformer models with a batch dimension of size 1 into 2D data layouts where the sequence dimension is treated as a batch dimension. Everything else is untested; it might break the model or simply lack support for certain node op-types.

apply

def apply(model: ModelWrapper)

Apply squeeze transformation to remove size-1 dimensions.

finn.transformation.streamline

Collection of default streamlining transformations.

Streamline Objects

class Streamline(Transformation)

Apply the streamlining transform, see arXiv:1709.04060.

apply

def apply(model)

Collects and applies the default list of streamlining transformations.

finn.transformation.streamline.absorb

AbsorbSignBiasIntoMultiThreshold Objects

class AbsorbSignBiasIntoMultiThreshold(Transformation)

Absorb scalar bias originating from signed int export back into MultiThreshold and re-evaluate the output datatype.

AbsorbAddIntoMultiThreshold Objects

class AbsorbAddIntoMultiThreshold(Transformation)

Absorb preceding Add ops into MultiThreshold by updating the threshold values. Only scalar/1D add vectors can be absorbed.
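The threshold update follows from x + a >= t being equivalent to x >= t - a. The sketch below checks this on scalars; `multithreshold` is a hypothetical simplification that counts how many thresholds the input reaches, and `absorb_add` is an illustrative helper, not the transformation itself.

```python
def multithreshold(x, thresholds):
    """Count the thresholds that x is greater than or equal to."""
    return sum(1 for t in thresholds if x >= t)


def absorb_add(thresholds, add_const):
    """Shift thresholds so the preceding Add can be dropped."""
    return [t - add_const for t in thresholds]


thresholds = [1, 3, 5]
add_const = 2
shifted = absorb_add(thresholds, add_const)

# Add followed by MultiThreshold equals MultiThreshold with shifted values
for x in range(-5, 10):
    assert multithreshold(x + add_const, thresholds) == multithreshold(x, shifted)
print(shifted)
```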

AbsorbMulIntoMultiThreshold Objects

class AbsorbMulIntoMultiThreshold(Transformation)

Absorb preceding Mul ops into MultiThreshold by updating the threshold values. Only positive scalar/1D mul vectors can be absorbed.
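For a positive factor m, x * m >= t is equivalent to x >= t / m, so the thresholds can be divided by m. With a negative factor the inequality would flip, which is consistent with the restriction to positive values. The helpers below are hypothetical and work on scalars only.

```python
def multithreshold(x, thresholds):
    """Count the thresholds that x is greater than or equal to."""
    return sum(1 for t in thresholds if x >= t)


def absorb_mul(thresholds, mul_const):
    """Rescale thresholds so the preceding Mul can be dropped."""
    if mul_const <= 0:
        raise ValueError("only positive factors preserve the comparison")
    return [t / mul_const for t in thresholds]


thresholds = [2, 4, 6]
mul_const = 2
scaled = absorb_mul(thresholds, mul_const)

# Mul followed by MultiThreshold equals MultiThreshold with scaled values
for x in range(-3, 6):
    assert multithreshold(x * mul_const, thresholds) == multithreshold(x, scaled)
print(scaled)
```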

FactorOutMulSignMagnitude Objects

class FactorOutMulSignMagnitude(Transformation)

Split multiply-by-constant nodes into two multiply-by-constant nodes, where the first node is a bipolar vector (of signs) and the second is a vector of magnitudes.

Absorb1BitMulIntoMatMul Objects

class Absorb1BitMulIntoMatMul(Transformation)

Absorb bipolar or binary multiplications into the preceding matrix multiply.

Absorb1BitMulIntoConv Objects

class Absorb1BitMulIntoConv(Transformation)

Absorb bipolar or binary multiplications into the preceding convolution.

AbsorbTransposeIntoMultiThreshold Objects

class AbsorbTransposeIntoMultiThreshold(Transformation)

For (NCHWTranspose -> MultiThreshold) move Transpose past MultiThreshold and set its data_layout mode to NHWC.

AbsorbTransposeIntoFlatten Objects

class AbsorbTransposeIntoFlatten(Transformation)

Absorb transpose node into succeeding flatten node, if H=W=1 and the first dimension stays the same. Can also be applied if flatten is implemented implicitly by a reshape node with shape [1, -1] and the first input dimension is 1.

AbsorbScalarMulAddIntoTopK Objects

class AbsorbScalarMulAddIntoTopK(Transformation)

Remove mul/add node prior to topk node if the op is scalar. Note that the TopK output probabilities will change, but the indices won't.

AbsorbConsecutiveTransposes Objects

class AbsorbConsecutiveTransposes(Transformation)

Remove (Transpose -> Transpose) patterns when the input and output of the pattern have the same layout.
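One way to picture the "same layout" condition is that the two permutations compose to the identity. The sketch below illustrates this with plain lists; compose_perms and is_identity are hypothetical helpers, and the actual transformation may check the condition differently.

```python
def compose_perms(first, second):
    """Permutation equivalent to applying Transpose(first), then Transpose(second)."""
    # Output axis i of the second transpose reads intermediate axis second[i],
    # which in turn is input axis first[second[i]].
    return [first[axis] for axis in second]


def is_identity(perm):
    return perm == list(range(len(perm)))


nchw_to_nhwc = [0, 2, 3, 1]
nhwc_to_nchw = [0, 3, 1, 2]

# NCHW -> NHWC -> NCHW restores the original layout, so the pair is removable
assert is_identity(compose_perms(nchw_to_nhwc, nhwc_to_nchw))
# Applying the same permutation twice does not restore the layout
assert not is_identity(compose_perms(nchw_to_nhwc, nchw_to_nhwc))
```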

AbsorbTransposeIntoResize Objects

class AbsorbTransposeIntoResize(Transformation)

For (NCHWTranspose -> Resize) move Transpose past Resize and change the Resize node's attributes accordingly.

finn.transformation.streamline.collapse_repeated

CollapseRepeatedOp Objects

class CollapseRepeatedOp(Transformation)

Collapse repeated consecutive operations with constant parameters into a single operation. make_collapsed_param_fxn must take two tensors and return a tensor which gives the equivalent result using a single op.
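The role of make_collapsed_param_fxn can be sketched with scalars standing in for tensors. The function collapse_repeated below is a hypothetical illustration, not the FINN implementation; it only shows how a two-argument combining function yields a single equivalent parameter.

```python
import operator


def collapse_repeated(params, make_collapsed_param_fxn):
    """Fold the constant parameters of a chain of identical ops into one parameter."""
    collapsed = params[0]
    for param in params[1:]:
        collapsed = make_collapsed_param_fxn(collapsed, param)
    return collapsed


# Add -> Add combines parameters by addition, Mul -> Mul by multiplication
add_param = collapse_repeated([1.5, 2.0], operator.add)
mul_param = collapse_repeated([1.5, 2.0], operator.mul)

x = 4.0
assert (x + 1.5) + 2.0 == x + add_param
assert (x * 1.5) * 2.0 == x * mul_param
```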

CollapseRepeatedAdd Objects

class CollapseRepeatedAdd(CollapseRepeatedOp)

Collapse repeated Add nodes into a single operation.

CollapseRepeatedMul Objects

class CollapseRepeatedMul(CollapseRepeatedOp)

Collapse repeated Mul nodes into a single operation.

finn.transformation.streamline.extract_norm_scale_bias

ExtractNormScaleBias Objects

class ExtractNormScaleBias(Transformation)

Extract LayerNormalization scale and bias into separate nodes and set initializers to 1 or 0 respectively.

finn.transformation.streamline.reorder

MoveAddPastMul Objects

class MoveAddPastMul(Transformation)

Move add operations past multiply operations on linear segments of the graph. The aim is to have them next to each other such that they can be collapsed into a single add.
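The reordering relies on distributivity: (x + a) * m equals x * m + a * m, so the add constant must be rescaled when it moves behind the multiplication. A minimal scalar sketch, with a hypothetical helper name:

```python
def move_add_past_mul(add_const, mul_const):
    """Return (mul_const, new_add_const) for the reordered Mul -> Add pair."""
    return mul_const, add_const * mul_const


x, a, m = 3.0, 0.5, 4.0
new_m, new_a = move_add_past_mul(a, m)

# Original order: Add then Mul. Reordered: Mul then Add with the rescaled constant.
assert (x + a) * m == x * new_m + new_a
```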

MoveScalarMulPastMatMul Objects

class MoveScalarMulPastMatMul(Transformation)

Move scalar mul operations past matmul operations. We want to have muls next to each other such that they can be collapsed into a single mul.

MoveScalarAddPastMatMul Objects

class MoveScalarAddPastMatMul(Transformation)

Move scalar add operations past matmul operations. We want to have adds next to each other such that they can be collapsed into a single add.

MoveAddPastConv Objects

class MoveAddPastConv(Transformation)

Move scalar and channelwise add operations past conv operations. We want to have adds next to each other such that they can be collapsed into a single add.

MoveScalarMulPastConv Objects

class MoveScalarMulPastConv(Transformation)

Move scalar mul operations past conv operations. We want to have muls next to each other such that they can be collapsed into a single mul.

MoveScalarMulPastConvTranspose Objects

class MoveScalarMulPastConvTranspose(Transformation)

Move scalar mul operations past ConvTranspose operations. We want to have muls next to each other such that they can be collapsed into a single mul.

MoveMulPastDWConv Objects

class MoveMulPastDWConv(Transformation)

Move channelwise mul operations past depthwise conv operations. We want to have muls next to each other such that they can be collapsed into a single mul.

MoveMulPastMaxPool Objects

class MoveMulPastMaxPool(Transformation)

Move non-negative scalar or channelwise mul operations past max pool operations. We want to have muls next to each other such that they can be collapsed into a single mul.
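The non-negativity requirement can be checked on a single pooling window: scaling commutes with max only when the factor does not reverse the ordering of the values. The helper names below are illustrative only.

```python
def scale_then_pool(window, c):
    """Mul followed by max pooling over one window."""
    return max(v * c for v in window)


def pool_then_scale(window, c):
    """Max pooling over one window followed by Mul."""
    return max(window) * c


window = [-3.0, 1.0, 2.0]

# Non-negative factor: both orders agree
assert scale_then_pool(window, 2.0) == pool_then_scale(window, 2.0)
# Negative factor: the maximum becomes the minimum, so the orders disagree
assert scale_then_pool(window, -1.0) != pool_then_scale(window, -1.0)
```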

MoveLinearPastEltwiseAdd Objects

class MoveLinearPastEltwiseAdd(Transformation)

Move linear operations (mul, add) past elementwise add operations where possible. Specifically, matches and transforms the following patterns:

(x*C) + (y*C) -> (x + y) * C

(x+A) + (y+B) -> (x + y) + (A + B)

where x and y are dynamic inputs, A, B, C are constant tensors (in general).
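Both rewrites are plain algebraic identities, which can be verified elementwise. The sketch below uses Python lists in place of tensors; all values are arbitrary examples.

```python
x = [1.0, 2.0]
y = [3.0, 4.0]
C = [0.5, 2.0]
A = [1.0, -1.0]
B = [0.25, 0.75]

# Pattern 1: (x*C) + (y*C) -> (x + y) * C
mul_before = [xi * ci + yi * ci for xi, yi, ci in zip(x, y, C)]
mul_after = [(xi + yi) * ci for xi, yi, ci in zip(x, y, C)]
assert mul_before == mul_after

# Pattern 2: (x+A) + (y+B) -> (x + y) + (A + B)
add_before = [(xi + ai) + (yi + bi) for xi, yi, ai, bi in zip(x, y, A, B)]
add_after = [(xi + yi) + (ai + bi) for xi, yi, ai, bi in zip(x, y, A, B)]
assert add_before == add_after
```

Note that the first pattern needs the same constant C on both branches, while the second allows different constants A and B because they can be summed.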

MoveScalarLinearPastInvariants Objects

class MoveScalarLinearPastInvariants(Transformation)

Move scalar linear operations (mul, add) past functions which are invariant to them. Specifically, matches and transforms the following patterns:

f(x*C) -> f(x) * C

f(x+C) -> f(x) + C

where x is a dynamic input, C is a constant tensor. Known f which obey this property are: Reshape, Flatten, Transpose, GlobalAveragePool.
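The invariance can be checked on small examples. The sketch below uses plain-Python stand-ins for two of the listed functions (an average over all values for GlobalAveragePool and a 2D transpose); these helpers are illustrative and not FINN code.

```python
def global_average(values):
    """Stand-in for GlobalAveragePool over a flat list of values."""
    return sum(values) / len(values)


def transpose(matrix):
    """Stand-in for a 2D Transpose."""
    return [list(row) for row in zip(*matrix)]


values = [1.0, 2.0, 3.0, 6.0]
c = 0.5

# f(x*C) == f(x) * C and f(x+C) == f(x) + C for the averaging stand-in
assert global_average([v * c for v in values]) == global_average(values) * c
assert global_average([v + c for v in values]) == global_average(values) + c

# The same holds for transpose, since it only rearranges elements
matrix = [[1.0, 2.0], [3.0, 4.0]]
scaled_first = transpose([[v * c for v in row] for row in matrix])
scaled_last = [[v * c for v in row] for row in transpose(matrix)]
assert scaled_first == scaled_last
```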

MakeMaxPoolNHWC Objects

class MakeMaxPoolNHWC(Transformation)

Convert (MaxPool, NHWCTranspose) into (NHWCTranspose, MaxPoolNHWC) and (NCHWTranspose, MaxPool) into (MaxPoolNHWC, NCHWTranspose).

MakeScaleResizeNHWC Objects

class MakeScaleResizeNHWC(Transformation)

Converts the inputs and outputs of all scale-based Resize and Upsample nodes from NCHW to NHWC.

MoveOpPastFork Objects

class MoveOpPastFork(Transformation)

Move node operations past graph forks. Used when a node before a fork can be merged with nodes in the branches.

MoveScalarLinearPastSplit Objects

class MoveScalarLinearPastSplit(Transformation)

Move scalar Mul and Add nodes past channel split operation.

MoveMaxPoolPastMultiThreshold Objects

class MoveMaxPoolPastMultiThreshold(Transformation)

Move MaxPool nodes past MultiThreshold nodes on linear segments of the graph.

MoveFlattenPastTopK Objects

class MoveFlattenPastTopK(Transformation)

Move flatten node past a succeeding topk node, if the "axis" attribute in topk is set to -1 and the data layout before the flatten is NHWC with H=W=1.

MoveFlattenPastAffine Objects

class MoveFlattenPastAffine(Transformation)

Moves a node that implements a (1, -1) reshape past a MatMul, Mul or Add node.

MoveTransposePastScalarMul Objects

class MoveTransposePastScalarMul(Transformation)

Moves a Transpose node past a scalar Mul node.

MoveIdenticalOpPastJoinOp Objects

class MoveIdenticalOpPastJoinOp(Transformation)

Move multiple identical operations on different branches past the common join node. The default move_node() method assumes that the join op preserves the shape.

move_node

def move_node(model, n, producers)

Should be overridden for some operations.

Returns:

bool - whether moving the node was successful

are_producers_identical

def are_producers_identical(model, producers)

Checks only op_types. Should be overridden for additional checks.

MoveAddPastJoinAdd Objects

class MoveAddPastJoinAdd(MoveIdenticalOpPastJoinOp)

move_node

def move_node(model, n, producers)

We use the base move_node method to move the first producer past the join node (and delete the rest).

MoveAffinePastJoinConcat Objects

class MoveAffinePastJoinConcat(MoveIdenticalOpPastJoinOp)

Applies to scalar linear or channelwise affine ops with the same parameter value.

finn.transformation.streamline.round_thresholds

Rounding and clipping of thresholds to integer representations.

RoundAndClipThresholds Objects

class RoundAndClipThresholds(Transformation)

For MultiThreshold, Thresholding, MVAU, and VVAU nodes operating on integer inputs/accumulators, round up (ceil) threshold values to the nearest integer and clip to the valid range. Type-casts thresholds (back) to the float32 container type (this is separate from the quantization annotation). Runs InferDataTypes() afterward to propagate any changes to the quantization data types.
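Rounding up is safe for integer inputs because x >= t and x >= ceil(t) are equivalent when x is an integer. The sketch below checks this exhaustively for a 4-bit signed input range. It assumes step-counting threshold semantics (x >= t) and a clipping range of [min, max + 1], which is an assumption of this sketch; the helper names are hypothetical and the float32 cast is omitted.

```python
import math


def multithreshold(x, thresholds):
    """Count how many thresholds the input reaches (assumed semantics: x >= t)."""
    return sum(1 for t in thresholds if x >= t)


def round_and_clip(thresholds, dtype_min, dtype_max):
    """Ceil thresholds, then clip to [dtype_min, dtype_max + 1] (assumed range)."""
    rounded = [math.ceil(t) for t in thresholds]
    return [min(max(t, dtype_min), dtype_max + 1) for t in rounded]


thresholds = [-12.5, -0.5, 2.25, 9.0]
int4_min, int4_max = -8, 7
new_thresholds = round_and_clip(thresholds, int4_min, int4_max)

# Every representable integer input produces the same output before and after
for x in range(int4_min, int4_max + 1):
    assert multithreshold(x, thresholds) == multithreshold(x, new_thresholds)
```

A threshold clipped to max + 1 is still never reached by any representable input, and one clipped to min is still always reached, which is why clipping does not change the result in this sketch.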

apply

def apply(model: ModelWrapper)

Apply the rounding and clipping to all thresholds in the model.

finn.transformation.streamline.sign_to_thres

ConvertSignToThres Objects

class ConvertSignToThres(Transformation)

Convert Sign node instances to MultiThreshold with threshold at 0.
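A scalar sketch of the idea: a single threshold at 0 yields a step output in {0, 1}, which can be mapped to {-1, +1}. The mapping via a scale of 2 and a bias of -1 is an assumption of this sketch, and the function name is hypothetical.

```python
def sign_via_threshold(x):
    """Bipolar sign from one threshold at 0 (assumed output mapping: 2 * step - 1)."""
    step = 1 if x >= 0 else 0  # single threshold at 0
    return 2 * step - 1


assert sign_via_threshold(-3.5) == -1
assert sign_via_threshold(2.0) == 1
```

In this sketch an input of exactly 0 maps to +1, whereas a mathematical sign function would return 0 there.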

finn.transformation.util

Utility functions for graph transformations and node type checking.

is_threshold

def is_threshold(node: NodeProto)

Check if node is a MultiThreshold operator.

is_attention

def is_attention(node: NodeProto)

Check if node is an Attention operator.

is_join_matmul

def is_join_matmul(node: NodeProto, model: ModelWrapper)

Check if node is a join (two input) matrix multiplication.

is_matmul

def is_matmul(node: NodeProto)

Check if node is a MatMul operator.

is_softmax

def is_softmax(node: NodeProto)

Check if node is a Softmax operator.

is_mul

def is_mul(node: NodeProto)

Check if node is an element-wise Mul.

is_add

def is_add(node: NodeProto)

Check if node is an element-wise Add.

is_end

def is_end(node: NodeProto, model: ModelWrapper)

Check if node is an end node.

is_scalar

def is_scalar(tensor)

Check whether tensor is a scalar, i.e., whether all dimensions are 1.
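The stated criterion can be expressed on a shape tuple as follows. The helper shape_is_scalar is a hypothetical stand-in that takes a shape rather than a tensor; it is not the library function itself.

```python
def shape_is_scalar(shape):
    """True if every dimension has size 1 (hypothetical shape-based check)."""
    return all(dim == 1 for dim in shape)


assert shape_is_scalar((1, 1, 1))
assert not shape_is_scalar((1, 3))
```

Under this criterion a tensor of shape (1, 1, 1) counts as a scalar even though its rank is 3.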

all_upstream_to_matmul

def all_upstream_to_matmul(node: NodeProto, model: ModelWrapper)

Get all upstream nodes to matrix multiplication.

op_types

def op_types(nodes: list[NodeProto]) -> list[str]

Projects a list of ONNX graph nodes to the string representation of the operator types.
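A minimal sketch of such a projection. To stay runnable without ONNX installed, it uses a small stand-in class (FakeNode, hypothetical) that only carries the op_type field found on ONNX nodes; the function body is an assumption about how the projection could look.

```python
from dataclasses import dataclass


@dataclass
class FakeNode:
    """Stand-in for an ONNX NodeProto; only the op_type field is modeled."""
    op_type: str


def op_types(nodes):
    """Map each node to its operator type string."""
    return [node.op_type for node in nodes]


nodes = [FakeNode("MatMul"), FakeNode("Add"), FakeNode("MultiThreshold")]
print(op_types(nodes))  # ['MatMul', 'Add', 'MultiThreshold']
```

A list like this is convenient for matching a sequence of nodes against an expected operator pattern.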

is_reshape

def is_reshape(node: NodeProto)

Check if node is a reshape operator.

is_transpose

def is_transpose(node: NodeProto)

Check if node is a transpose operator.

is_reshape_transpose

def is_reshape_transpose(node: NodeProto, model: ModelWrapper)

Check if node is reshape followed by transpose.

is_transpose_reshape

def is_transpose_reshape(node: NodeProto, model: ModelWrapper)

Check if node is transpose followed by reshape.

group_inputs_by_category

def group_inputs_by_category(node: NodeProto, model: ModelWrapper)

Group inputs by category, i.e., dynamic inputs first, followed by initializers. The order of inputs within each category is preserved.
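The ordering rule can be sketched on input names alone. The function below is hypothetical: it takes a list of input names and a set of initializer names instead of a node and a model, and returning a single flat list is an assumption of this sketch.

```python
def group_inputs(input_names, initializer_names):
    """Order inputs as dynamic inputs first, then initializers (stable within each)."""
    dynamic = [name for name in input_names if name not in initializer_names]
    initializers = [name for name in input_names if name in initializer_names]
    return dynamic + initializers


inputs = ["weights", "x", "bias", "y"]
initializer_names = {"weights", "bias"}

grouped = group_inputs(inputs, initializer_names)
assert grouped == ["x", "y", "weights", "bias"]
```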

finn.transformation

Table of Contents

finn.transformation.fpgadataflow.annotate_cycles

AnnotateCycles Objects

__init__

apply

finn.transformation.fpgadataflow.annotate_resources

AnnotateResources Objects

finn.transformation.fpgadataflow.attention

InferScaledDotProductAttention Objects

apply

AbsorbMultiThresholdIntoScaledDotProductAttention Objects

apply

finn.transformation.fpgadataflow.attention_heads

InferMultiHeads Objects

apply

MoveSplitMultiHeadsPastMultiThreshold Objects

apply

MoveMergeMultiHeadsPastMultiThreshold Objects

apply

is_multi_head_attention

UnrollMultiHeadAttention Objects

apply

finn.transformation.fpgadataflow.cleanup

CleanUp Objects

finn.transformation.fpgadataflow.compile_cppsim

CompileCppSim Objects

finn.transformation.fpgadataflow.convert_to_hw_layers

InferConvInpGen Objects

__init__

apply

InferFMPadding Objects

apply

InferThresholdingLayer Objects

__init__

apply

InferRequantLayer Objects

InferUpsample Objects

apply

InferAddStreamsLayer Objects

InferDuplicateStreamsLayer Objects

apply

InferChannelwiseLinearLayer Objects

InferLabelSelectLayer Objects

apply

InferGlobalAccPoolLayer Objects

apply

InferPool Objects

apply

InferPoolFromReduce Objects

apply

InferLookupLayer Objects

apply

InferConcatLayer Objects

apply

InferSplitLayer Objects

apply

InferStreamingEltwise Objects

InferBinaryMatrixVectorActivation Objects

__init__

apply

InferQuantizedMatrixVectorActivation Objects

apply

InferVectorVectorActivation Objects

__init__

apply

InferHWSoftmax Objects

__init__

apply

skip_first_node_transpose

InferShuffle Objects

apply

lift_to_rank1

InferElementwiseBinaryOperation Objects

reject_output_dequant

__init__

apply

InferReLUAsElementwiseMax Objects

reject_unsupported_dtypes

__init__

__init__

__init__

__init__

__init__

__init__

__init__

__init__

__init__

__init__

__init__

__init__

__init__

__init__