Create a scratchpad memory for data exchange between the host's main memory and the NPU command processor and copy its contents to the command processor's memory.
When the runtime (XRT) observes that this instruction is present in the runtime sequence, it will allocate a scratchpad memory of the specified size on the host.
When the command processor firmware executes this instruction, it copies the data in the runtime-allocated scratchpad region from the host's main memory to the NPU command processor's memory.
From there, you can write values from the copy of the scratchpad memory in the command processor to arbitrary locations in the NPU (with restrictions) via the `npu.update_from_scratchpad` op.
To get a handle to the allocated scratchpad memory from XRT, call the `run.get_ctrl_scratchpad_bo()` method on the `xrt::run` object in your host application (`test.cpp`).
An example can be found in `test/npu-xrt/scratchpad_regwrite`.
The `usage_type` attribute specifies the scratchpad layout; currently only a value of `0` is supported.
The `size` attribute specifies the size of the scratchpad in bytes.
The host's main memory address is patched into the instruction at runtime by XRT based on the `.ctrl.scratchpad` section in the ELF. The assembler (aiebu) generates patching information for this address when it encounters this opcode.
The scratchpad memory contains a `StateTable`, which the `npu.update_from_scratchpad` op indexes in units of 32-bit words. `StateTable` constraints: at most 32 entries, and a maximum total scratchpad size of 128 bytes.
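The `StateTable` layout and limits can be illustrated with a small host-side model. This is a minimal sketch, not the runtime's implementation: the helper names `pack_state_table` and `read_entry` are hypothetical, and little-endian byte order for each 32-bit entry is an assumption.

```python
import struct

MAX_ENTRIES = 32                      # StateTable limit stated above
WORD_BYTES = 4                        # entries are indexed as 32-bit words
MAX_BYTES = MAX_ENTRIES * WORD_BYTES  # 128 bytes total


def pack_state_table(values):
    """Pack unsigned 32-bit values into scratchpad bytes.

    Little-endian layout is an assumption made for this sketch.
    """
    if len(values) > MAX_ENTRIES:
        raise ValueError(f"StateTable holds at most {MAX_ENTRIES} entries")
    return struct.pack(f"<{len(values)}I", *values)


def read_entry(scratchpad, state_table_idx):
    """Read the 32-bit word at word index `state_table_idx`."""
    offset = state_table_idx * WORD_BYTES
    if state_table_idx < 0 or offset + WORD_BYTES > len(scratchpad):
        raise IndexError("state_table_idx is outside the scratchpad")
    return struct.unpack_from("<I", scratchpad, offset)[0]
```

For example, `read_entry(pack_state_table([0x10, 0x20, 0x30]), 2)` reads the third word, which sits at byte offset 8.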
Add a computed value based on scratchpad contents to an 8-byte section at the target address.
This instruction reads the contents of scratchpad memory created using the `npu.create_scratchpad` op (the `StateTable`), calculates a value, then adds the result to the memory location or register denoted by the given address as follows:
1. Reads the existing 64-bit value from the register pair:
2. Computes a delta from the contents of the scratchpad memory created using `npu.create_scratchpad` (the `StateTable`), using the selected function:
- This is always additive. It adds the computed delta to whatever value is already in the register pair. It cannot set an absolute value.
- The lower 2 bits of the first register are always cleared (4-byte aligned).
- The upper 16 bits of the second register are preserved unchanged.
- Always writes 8 contiguous bytes (both registers in the pair).
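The read-modify-write behavior described above can be modeled in a few lines. This is a sketch under stated assumptions rather than the firmware's actual logic: it assumes the 48-bit value occupies the full first register plus the lower 16 bits of the second, that the lower 2 bits are cleared after the addition, and that `mul`, `incr` and `decr` mean multiply, add and subtract `func_arg` respectively. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK48 = (1 << 48) - 1


def compute_delta(value, func, func_arg):
    """Assumed semantics of `func`; the exact firmware formulas may differ."""
    if func == "mul":
        return value * func_arg
    if func == "incr":
        return value + func_arg
    if func == "decr":
        return value - func_arg
    raise ValueError(f"unknown func: {func}")


def update_register_pair(reg_lo, reg_hi, delta):
    """Add `delta` to the 48-bit value held in a register pair.

    Assumed layout: bits 0..31 live in reg_lo, bits 32..47 live in the
    lower 16 bits of reg_hi.
    """
    current = (reg_lo & MASK32) | ((reg_hi & 0xFFFF) << 32)
    updated = (current + delta) & MASK48
    updated &= ~0x3                      # clear the lower 2 bits (4-byte aligned)
    new_lo = updated & MASK32
    new_hi = (reg_hi & 0xFFFF0000) | (updated >> 32)  # upper 16 bits preserved
    return new_lo, new_hi
```

Under these assumptions, a carry out of the first register propagates into the lower 16 bits of the second, while the upper 16 bits of the second register never change.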
### Rationale
The firmware instruction underpinning this operation was originally intended to patch shim buffer descriptor addresses only.
Because of this, the instruction always writes 48 bits (the size of a BD address) and zeroes the lower 2 bits, ensuring the value is a multiple of the addressable word size.
### Attributes
- `state_table_idx` is a 32-bit-word index offset into the scratchpad memory. The source value will be read from this offset into the scratchpad.
- `func` is the function applied to the state table value in firmware, which can be one of `mul`, `incr` or `decr`.
- `func_arg` is the argument to the function.
- `address`, `buffer`, `column` and `row` together resolve to the destination address that the value will be written to.
Address resolution is the same as for `npu.write32`:
- If `buffer` is specified, `address` is a word offset relative to that buffer's start address.
- If `column` and `row` are present, `address` is a local offset within that tile's address space.
- Otherwise, `address` is an absolute address into the AIE array.
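The three address-resolution cases can be sketched as follows. This is illustrative only: `resolve_address` and `buffer_base` are hypothetical names, a 4-byte word is assumed for the buffer-relative case, and the column and row shift values (25 and 20) are assumptions about the tile address layout that are not stated in this text.

```python
COLUMN_SHIFT = 25  # assumed bit position of the column field
ROW_SHIFT = 20     # assumed bit position of the row field
WORD_BYTES = 4     # assumed size of one word for buffer-relative offsets


def resolve_address(address, buffer_base=None, column=None, row=None):
    """Resolve the destination address for the three cases listed above."""
    if buffer_base is not None:
        # `address` is a word offset from the buffer's start address.
        return buffer_base + address * WORD_BYTES
    if column is not None and row is not None:
        # `address` is a local offset within the tile's address space.
        return (column << COLUMN_SHIFT) | (row << ROW_SHIFT) | address
    # Otherwise `address` is already absolute.
    return address
```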