A single tile performs a very simple reduction operation: the kernel loads data from local memory, performs the add reduction, and stores the resulting value back.
Input data is brought to the local memory of the Compute tile via a Shim tile. The input data from the Shim tile has size N = 1024xi32. The data is copied to the AIE tile, where the reduction is performed. The single output value is then copied from the AIE tile back to the Shim tile.
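As a rough illustration of the data flow described above (N inputs in, one value out), here is a minimal host-side reference model in plain Python. It is not part of the design. The helper names `wrap_i32` and `reference_reduce_add` are hypothetical, and the 32-bit wraparound on overflow is an assumption about how an i32 accumulator would behave, not something the design files are confirmed to do.

```python
import random

N = 1024  # number of i32 input elements, as stated above


def wrap_i32(x: int) -> int:
    """Wrap a Python int into the signed 32-bit range (two's complement).

    Assumption: the accumulator is 32 bits wide and wraps on overflow.
    """
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def reference_reduce_add(values):
    """Scalar model of the add reduction: many i32 inputs, one i32 output."""
    acc = 0
    for v in values:
        acc = wrap_i32(acc + v)
    return acc


# Example input: N pseudo-random values from a fixed seed.
rng = random.Random(0)
data = [rng.randint(-1000, 1000) for _ in range(N)]
result = reference_reduce_add(data)
```

A model like this could serve as the expected value when checking the single output returned through the Shim tile.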
- vector_reduce_add.py: A Python script that defines the AIE array structural design using MLIR-AIE operations. This generates MLIR that is then compiled using aiecc to produce design binaries (i.e., XCLBIN and inst.bin for the NPU in Ryzen™ AI).
- vector_reduce_add_placed.py: An alternative version of the design in vector_reduce_add.py that is expressed in a lower-level version of IRON.
- reduce_add.cc: A C++ implementation of a vectorized add reduction operation for AIE cores. The code uses the AIE API, which is a C++ header-only library providing types and operations that get translated into efficient low-level intrinsics, and whose documentation can be found here. The source can be found here.
- test.cpp: This C++ code is a testbench for the design example targeting Ryzen™ AI (AIE2). The code is responsible for loading the compiled XCLBIN file, configuring the AIE module, providing input data, and executing the AIE design on the NPU. After executing, the program verifies the results.
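To give a feel for what a vectorized add reduction involves, the sketch below models the usual two-phase pattern in plain Python: accumulate lane by lane over fixed-width chunks, then fold the lanes into one scalar. This is a conceptual model only, not the code in reduce_add.cc, and it does not use the AIE API. The lane count `VEC = 16` and the function name are hypothetical, and overflow handling is left out.

```python
VEC = 16  # hypothetical lane count; the real kernel's vector width may differ


def vectorized_reduce_add(values, vec=VEC):
    """Model of a vectorized add reduction.

    Phase 1: add each chunk of `vec` elements into a vector of partial sums.
    Phase 2: reduce the partial sums horizontally into a single scalar.
    """
    if len(values) % vec != 0:
        raise ValueError("input length must be a multiple of the lane count")

    lanes = [0] * vec
    for base in range(0, len(values), vec):
        chunk = values[base:base + vec]
        lanes = [a + b for a, b in zip(lanes, chunk)]  # lane-wise add

    return sum(lanes)  # horizontal reduction across lanes


total = vectorized_reduce_add(list(range(1024)))
```

The result matches a plain scalar sum because integer addition is associative and commutative, so the order in which elements are combined does not matter.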
To compile the design:

    make

To compile the placed design:

    env use_placed=1 make

To compile the C++ testbench:

    make vector_reduce_add.exe

To run the design:

    make run
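For scripted use, the commands above could be chained from Python. The sketch below is a hypothetical helper, not something shipped with the example. It only uses the make targets listed above, sets `use_placed=1` on the first `make` step only (mirroring the commands as written; whether later steps also need it is not stated here), and defaults to a dry run that prints the commands without executing them.

```python
import os
import subprocess


def build_steps(placed=False):
    """Return (argv, extra_env) pairs in the order the commands appear above."""
    design_env = {"use_placed": "1"} if placed else {}
    return [
        (["make"], design_env),               # compile the design
        (["make", "vector_reduce_add.exe"], {}),  # compile the C++ testbench
        (["make", "run"], {}),                # run the design
    ]


def run_steps(steps, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for argv, extra_env in steps:
        prefix = "".join(f"{k}={v} " for k, v in extra_env.items())
        print("+ " + prefix + " ".join(argv))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(argv, env={**os.environ, **extra_env}, check=True)


run_steps(build_steps(placed=True))  # dry run: prints the three commands
```

Calling `run_steps(build_steps(), dry_run=False)` from the example directory would execute the steps for the default design.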