Hello,
I’m working on a stencil-based AI Engine kernel and want to implement a sliding window in local memory. The goal is to reuse rows that are already loaded and bring in only one new row per iteration. Here's the scenario:
Suppose I have 5 rows of data:
ROW 1
ROW 2
ROW 3
ROW 4
ROW 5
In the first iteration, I load ROW 1, ROW 2, ROW 3 into local memory (e.g., using input_buffer).
In the next iteration, I want to load only ROW 4 and discard ROW 1, so that local memory then holds ROW 2, ROW 3, ROW 4.
The same logic continues: shift, reuse, update only the latest row.
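To make the intended behavior concrete, here is a minimal plain C++ sketch of the row rotation I have in mind. It is not AI Engine API code: `RowWindow`, `make_row`, `column_sum`, and the sizes are hypothetical names and values chosen only for illustration. Instead of physically shifting rows, it keeps three slots and overwrites the oldest one, which is one possible way to get the "discard ROW 1, keep ROW 2 and ROW 3" effect.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical sizes, for illustration only.
constexpr std::size_t kWindowRows = 3;  // rows kept resident at once
constexpr std::size_t kRowWidth   = 4;  // elements per row

using Row = std::array<int, kRowWidth>;

// Ring buffer of rows: a new row overwrites the oldest slot,
// so the two rows that are reused are never copied or moved.
class RowWindow {
public:
    // Store a new row; once the window is full this replaces the oldest row.
    void push(const Row& row) {
        slots_[head_] = row;
        head_ = (head_ + 1) % kWindowRows;
        if (count_ < kWindowRows) ++count_;
    }

    bool full() const { return count_ == kWindowRows; }

    // Logical access: at(0) is the oldest resident row,
    // at(kWindowRows - 1) is the most recently pushed row.
    const Row& at(std::size_t i) const {
        assert(full() && i < kWindowRows);
        return slots_[(head_ + i) % kWindowRows];
    }

private:
    std::array<Row, kWindowRows> slots_{};
    std::size_t head_  = 0;  // next slot to overwrite (the oldest row once full)
    std::size_t count_ = 0;  // number of rows pushed so far, capped at kWindowRows
};

// Helper for the example: a row whose elements all equal its row number.
inline Row make_row(int id) {
    Row r{};
    r.fill(id);
    return r;
}

// Toy vertical stencil: sum of one column over the resident rows.
inline int column_sum(const RowWindow& w, std::size_t col) {
    int sum = 0;
    for (std::size_t i = 0; i < kWindowRows; ++i) sum += w.at(i)[col];
    return sum;
}
```

With this sketch, pushing ROW 1, ROW 2, ROW 3 and then ROW 4 leaves `at(0)`, `at(1)`, `at(2)` referring to ROW 2, ROW 3, ROW 4, and only the new row is written in each step. What I am missing is a way to get the equivalent behavior when the rows arrive through the kernel's input port.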
However, when I feed an input_buffer from a PLIO, the entire buffer appears to be overwritten with new data on every iteration.
This prevents me from preserving the previous rows in local memory across iterations.