Skip to content

Support firmware scratchpad feature#3056

Merged
andrej merged 7 commits into
Xilinx:mainfrom
andrej:scratchpad
May 15, 2026
Merged

Support firmware scratchpad feature#3056
andrej merged 7 commits into
Xilinx:mainfrom
andrej:scratchpad

Conversation

@andrej
Copy link
Copy Markdown
Collaborator

@andrej andrej commented May 7, 2026

Adds support for the XRT scratchpad memory feature, see example usage test.cpp and aie.mlir as well as docs in the code.

I'm working on a more usable higher-level abstraction for run-time parameters using this mechanism.

image

Copilot AI review requested due to automatic review settings May 7, 2026 21:47
@hunhoffe hunhoffe added this to the IRON 1.4 milestone May 7, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds AIEX IR and NPU binary emission support for the XRT/firmware “control scratchpad” feature, enabling host-provided scratchpad data to be consumed by the NPU command processor and applied to NPU memory/registers (via UPDATE_REG), with an end-to-end NPU2 test.

Changes:

  • Introduces aiex.npu.create_scratchpad and aiex.npu.update_from_scratchpad ops (with accompanying enum attr) and wires them into lowering/removal.
  • Extends NPU transaction binary emission to encode CREATE_SCRATCHPAD and UPDATE_REG instructions.
  • Adds a hardware test demonstrating host scratchpad BO usage + runtime sequence update into a core-visible buffer.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
test/npu-xrt/scratchpad_regwrite/test.cpp Host-side example/test writing scratchpad BO and validating returned value.
test/npu-xrt/scratchpad_regwrite/run.lit Lit harness to build the MLIR ELF + host test and run on NPU2.
test/npu-xrt/scratchpad_regwrite/aie.mlir AIEX runtime sequence creating scratchpad and applying UPDATE_REG into a tile buffer, synced by a lock.
lib/Targets/AIETargetNPU.cpp Adds binary encoding for CREATE_SCRATCHPAD and UPDATE_REG-from-scratchpad ops.
lib/Dialect/AIEX/Transforms/AIEXToStandard.cpp Removes the new AIEX NPU scratchpad ops during AIEX→Standard conversion.
lib/Dialect/AIEX/IR/AIEXDialect.cpp Includes generated enum defs and adds absolute-address helper for NpuUpdateFromScratchpadOp.
include/aie/Dialect/AIEX/IR/CMakeLists.txt Adds tablegen targets for AIEX enum decls/defs used by the new ops.
include/aie/Dialect/AIEX/IR/AIEXEnums.h New public header for generated AIEX enum declarations.
include/aie/Dialect/AIEX/IR/AIEXDialect.h Includes AIEX enums in the dialect public header.
include/aie/Dialect/AIEX/IR/AIEXAttrs.td Defines StateTableFunc enum attribute for UPDATE_REG function selection.
include/aie/Dialect/AIEX/IR/AIEX.td Adds scratchpad ops + documentation and pulls in AIEXAttrs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread test/npu-xrt/scratchpad_regwrite/test.cpp
Comment thread lib/Targets/AIETargetNPU.cpp
Comment thread include/aie/Dialect/AIEX/IR/CMakeLists.txt
Comment thread include/aie/Dialect/AIEX/IR/AIEX.td
Comment on lines +1158 to +1162
### Watch Out!

The firmware optimizes out repeated `load_pdi` operations if they refer to the same PDI.
To force a reload / device reset, intersperse a `load_pdi` to a different PDI between same-device reloads.
The easiest workaround is to create an empty `aie.device(npu2) @empty {}` and load that before the main device load.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting suggestion to force more configuration overhead 😄

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unfortunately what I've had to do in a lot of places. I guess maybe we should raise this as an issue with firmware people because it really is silly

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is it needed for?

Copy link
Copy Markdown
Collaborator Author

@andrej andrej May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this test: (a) Cause core to re-execute from the start (alternative to while(true)which in this case did not work), basically a reset and (b) reset the scratchpad parameter buffer to zero (admittedly a little heavy-handed, if it was only for that a write32 would probably work too).

More generally: If I write load_pdi, I'd like a load_pdi to happen. The above two examples are two cases where just omitting causes unexpected different semantics (e.g. buffers not initialized to the values that are in the PDI that we supposedly loaded).

Copy link
Copy Markdown
Collaborator

@hunhoffe hunhoffe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double check the modules, but otherwise looks good! I'm excited to see what you do with this feature.

Comment thread include/aie/Dialect/AIEX/IR/AIEXEnums.h Outdated
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
// (c) Copyright 2024 Advanced Micro Devices, Inc.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: year?

@ypapadop-amd
Copy link
Copy Markdown
Collaborator

Is this a new memory type, right?

@andrej
Copy link
Copy Markdown
Collaborator Author

andrej commented May 15, 2026

Is this a new memory type, right?

It's a new firmware feature. I've described how it works here.

@andrej andrej added this pull request to the merge queue May 15, 2026
@ypapadop-amd
Copy link
Copy Markdown
Collaborator

Is this a new memory type, right?

It's a new firmware feature. I've described how it works here.

Is there more information on it? Does it persist among dispatches / hwctx reconfiguration for example?

Merged via the queue into Xilinx:main with commit b410524 May 15, 2026
71 checks passed
@andrej andrej deleted the scratchpad branch May 15, 2026 18:24
@andrej
Copy link
Copy Markdown
Collaborator Author

andrej commented May 18, 2026

It persists across xrt::runs, for as long as the buffer object returned by get_ctrl_scratchpad_bo lives.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants