Train a custom CNN-based neural network model with PyTorch and quantize its layers to 1, 2, 4, or 8 bits using quantization-aware training (QAT). The process outputs an ONNX file, which is used as the input file for the FINN compiler.
Brevitas is a PyTorch library for neural network quantization, with support for both post-training quantization (PTQ) and quantization-aware training (QAT).
Brevitas: 0.10.2
Ubuntu: 20.04
Python: 3.10.12 (>= 3.8)
Torch: 2.1.0 (Brevitas 0.10.0 supports Torch versions higher than 1.9.1, up to 2.1.0)
CUDA: 12.2
Xilinx:
https://github.com/Xilinx/brevitas
https://xilinx.github.io/brevitas/getting_started
For more detail, please check the Brevitas sections of the thesis in the link below.