|
1 | 1 | # Reproducing `bionpu` results |
2 | 2 |
|
3 | | -> Status: shell — full reproduction notes will land during the v0.1 |
4 | | -> extraction. This document tracks what's required end-to-end so a |
5 | | -> reader can run the benchmarks on their own machine. |
| 3 | +End-to-end reproduction recipe for everything in `bionpu` that's |
| 4 | +runnable today. v0.1 ships the byte-equality harness end-to-end; the |
| 5 | +full scan / basecall pipelines are v0.2 scope (see |
| 6 | +[`STATUS.md`](STATUS.md)) — this document covers what works now and |
| 7 | +documents the v0.2 driver shape so you can drive the kernels manually |
| 8 | +in the meantime. |
6 | 9 |
|
7 | | -## Hardware |
| 10 | +## 1. Hardware |
8 | 11 |
|
9 | | -- AMD Ryzen AI 9 HX (Strix family) or other AIE2P-equipped silicon. |
10 | | -- Linux (kernel ≥ 6.10 with `amdxdna` available). |
| 12 | +- AMD Ryzen AI 9 HX (Strix family) or other AIE2P-equipped silicon |
| 13 | + with the `amdxdna` accelerator exposed at `/dev/accel/accel0`. |
| 14 | +- Linux kernel ≥ 6.10 with `amdxdna.ko` loaded. |
11 | 15 |
|
12 | | -## Software prerequisites |
| 16 | +## 2. Software prerequisites |
13 | 17 |
|
14 | | -- `xdna-driver` built and `amdxdna.ko` loaded. |
15 | | -- XRT (Xilinx Runtime) installed (`/opt/xilinx/xrt`). |
16 | | -- `mlir-aie` built; `aiecc` on `$PATH`. |
17 | | -- Peano (LLVM-AIE) installed; `$PEANO_INSTALL_DIR` set. |
18 | | -- Python ≥ 3.11. |
| 18 | +- **`xdna-driver`** built and `amdxdna.ko` loaded. Verify: |
| 19 | + ```sh |
| 20 | + lsmod | grep amdxdna |
| 21 | + ls -l /dev/accel/accel0 |
| 22 | + ``` |
| 23 | +- **XRT** installed at `/opt/xilinx/xrt`: |
| 24 | + ```sh |
| 25 | + source /opt/xilinx/xrt/setup.sh |
| 26 | + xrt-smi examine # must list the NPU device |
| 27 | + ``` |
| 28 | +- **`mlir-aie`** built; `aiecc` on `$PATH`. The recommended `mlir-aie` |
| 29 | + is the wheel-built `ironenv` — see the [opensensor/genetics |
| 30 | + bring-up |
| 31 | + guide](https://github.com/opensensor/genetics/blob/main/docs/xdna-driver-build.md) |
| 32 | + for the canonical setup. |
| 33 | +- **Peano (LLVM-AIE)** installed; `$PEANO_INSTALL_DIR` points at the |
| 34 | + install tree. |
| 35 | +- **Python ≥ 3.11**. |
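The checks in this list can be bundled into one preflight script. The following is a minimal sketch, not part of `bionpu`: the `check` helper and the `missing` counter are hypothetical names, and the probes use only the commands and paths mentioned in the list above. It reports every item rather than stopping at the first failure.

```sh
#!/bin/sh
# Preflight sketch (hypothetical helper, not shipped with bionpu).
# Reports each prerequisite as ok / MISSING and counts the failures.
missing=0

check() {
    # $1 = label; remaining args = command that succeeds when the prerequisite is met
    label=$1; shift
    if "$@" >/dev/null 2>&1; then
        echo "ok       $label"
    else
        echo "MISSING  $label"
        missing=$((missing + 1))
    fi
}

check "amdxdna module loaded"     sh -c 'lsmod | grep -q amdxdna'
check "/dev/accel/accel0 present" test -e /dev/accel/accel0
check "xrt-smi on PATH"           sh -c 'command -v xrt-smi'
check "aiecc on PATH"             sh -c 'command -v aiecc'
check "PEANO_INSTALL_DIR is a dir" test -d "${PEANO_INSTALL_DIR:-/nonexistent}"
check "Python >= 3.11"            python3 -c 'import sys; sys.exit(sys.version_info < (3, 11))'

echo "$missing prerequisite(s) missing"
```

Source `/opt/xilinx/xrt/setup.sh` before running it, otherwise `xrt-smi` is reported as missing even when XRT is installed.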
19 | 36 |
|
20 | | -See `bionpu`'s upstream documentation in |
21 | | -[opensensor/genetics](https://github.com/opensensor/genetics) for the |
22 | | -NPU bring-up steps if the above are not yet on your system. |
23 | | - |
24 | | -## Install bionpu |
| 37 | +## 3. Install bionpu |
25 | 38 |
|
26 | 39 | ```sh |
27 | 40 | git clone https://github.com/opensensor/bionpu.git |
28 | 41 | cd bionpu |
29 | 42 | pip install -e ".[test]" |
30 | 43 | ``` |
31 | 44 |
|
32 | | -## Run a single benchmark |
| 45 | +This installs the `bionpu` CLI on `$PATH` and exposes the |
| 46 | +`bionpu.{verify,kernels,dispatch,bench,data,quant}` modules to your |
| 47 | +Python interpreter. |
| 48 | + |
| 49 | +## 4. Reproduce the byte-equality smoke check |
33 | 50 |
|
34 | | -CRISPR off-target scan against chr22, with byte-equality vs cas-offinder: |
| 51 | +This is the simplest end-to-end demonstration of the verify harness. |
| 52 | +No NPU required — just the committed reference data. |
35 | 53 |
|
36 | 54 | ```sh |
37 | | -benchmarks/crispr/run_chr.sh chr22 |
| 55 | +# Compare the canonical reference TSV against itself; expect EQUAL. |
| 56 | +bionpu verify crispr \ |
| 57 | + reference/crispr/casoffinder-canonical.tsv \ |
| 58 | + reference/crispr/casoffinder-canonical.tsv |
| 59 | +# Expected: result EQUAL, 422 records, matching SHA-256s. Exit code 0. |
| 60 | + |
| 61 | +# Negative control: mutate the input and expect DIVERGENT. |
| 62 | +# Write a mutated copy to /tmp/dirty.tsv; the committed reference is left untouched.
| 63 | +sed 's/^chr/chr_DIRTY_/' reference/crispr/casoffinder-canonical.tsv > /tmp/dirty.tsv |
| 64 | +bionpu verify crispr /tmp/dirty.tsv reference/crispr/casoffinder-canonical.tsv |
| 65 | +# Expected: result DIVERGENT, exit code 1, first divergence reported. |
38 | 66 | ``` |
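The two expected exit codes above (0 for EQUAL, 1 for DIVERGENT) can be asserted from a script instead of being checked by eye. This is a hedged sketch: `expect_status` is a hypothetical helper, not part of the `bionpu` CLI, and the block assumes it is run from the repository root. It skips itself when `bionpu` or the reference file is not found.

```sh
#!/bin/sh
# expect_status is a hypothetical helper, not part of the bionpu CLI.
expect_status() {
    # $1 = expected exit code; remaining args = command to run
    want=$1; shift
    got=0
    "$@" >/dev/null 2>&1 || got=$?
    if [ "$got" -eq "$want" ]; then
        echo "PASS (exit $got): $*"
    else
        echo "FAIL (exit $got, wanted $want): $*"
        return 1
    fi
}

ref=reference/crispr/casoffinder-canonical.tsv
if command -v bionpu >/dev/null 2>&1 && [ -f "$ref" ]; then
    # Positive control: a file compared against itself.
    expect_status 0 bionpu verify crispr "$ref" "$ref"
    # Negative control: mutated copy against the reference.
    sed 's/^chr/chr_DIRTY_/' "$ref" > /tmp/dirty.tsv
    expect_status 1 bionpu verify crispr /tmp/dirty.tsv "$ref"
else
    echo "skipped: bionpu or $ref not found (run from the repository root)"
fi
```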
39 | 67 |
|
40 | | -Basecalling against a small pod5 fixture, with byte-equality vs Dorado: |
| 68 | +## 5. Reproduce a kernel build (any one) |
| 69 | + |
| 70 | +Pick any kernel under `src/bionpu/kernels/`. For the CRISPR PAM filter: |
| 71 | + |
| 72 | +```sh |
| 73 | +cd src/bionpu/kernels/crispr/pam_filter |
| 74 | +export MLIR_AIE_DIR=<path/to/mlir-aie> |
| 75 | +export PEANO_INSTALL_DIR=<path/to/llvm-aie> |
| 76 | +make NPU2=1 |
| 77 | +# Expected: build/final.xclbin + build/insts.bin produced. |
| 78 | +``` |
| 79 | + |
| 80 | +Each kernel directory ships `MANIFEST.md` describing the inputs, |
| 81 | +outputs, expected on-tile placement, and any kernel-specific |
| 82 | +`make` flags. |
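After a build, it can help to confirm that both artifacts exist and are non-empty before moving on. The sketch below assumes the artifact names given for the PAM filter (`build/final.xclbin`, `build/insts.bin`) and that it is run from the repository root; `check_artifacts` is a hypothetical helper, and other kernels may produce different files, so consult their `MANIFEST.md`.

```sh
#!/bin/sh
# check_artifacts is a hypothetical helper; the two file names come from
# the PAM filter build step and may differ for other kernels.
check_artifacts() {
    # $1 = kernel directory that contains build/
    status=0
    for f in build/final.xclbin build/insts.bin; do
        if [ -s "$1/$f" ]; then        # -s: exists and is non-empty
            echo "ok       $1/$f"
        else
            echo "MISSING  $1/$f"
            status=1
        fi
    done
    return "$status"
}

check_artifacts src/bionpu/kernels/crispr/pam_filter \
    || echo "run 'make NPU2=1' in that directory first"
```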
| 83 | + |
| 84 | +## 6. Run a kernel against silicon (manual driver, v0.2-scope) |
| 85 | + |
| 86 | +The v0.2 `bionpu scan` / `bionpu basecall` drivers are not yet wired, |
| 87 | +so end-to-end pipeline runs go through the per-kernel host runner: |
41 | 88 |
|
42 | 89 | ```sh |
43 | | -benchmarks/basecalling/run_pod5.sh reference/basecalling/smoke.pod5 |
| 90 | +cd src/bionpu/kernels/crispr/pam_filter |
| 91 | +# After 'make NPU2=1': |
| 92 | +./host_runner --xclbin build/final.xclbin \ |
| 93 | + --insts build/insts.bin \ |
| 94 | + --in <input.bin> \ |
| 95 | + --out /tmp/npu_hits.tsv |
| 96 | +bionpu verify crispr /tmp/npu_hits.tsv reference/crispr/casoffinder-chr22-10guides.tsv |
44 | 97 | ``` |
45 | 98 |
|
46 | | -Pre-computed results for chr1, chr19, chr22 are checked into |
47 | | -`benchmarks/results/`. To regenerate, run the same scripts and they |
48 | | -will overwrite the JSON snapshots in place. |
| 99 | +The `host_runner` argv shape varies per kernel — see each kernel's |
| 100 | +`MANIFEST.md`. |
| 101 | + |
| 102 | +## 7. Energy methodology |
| 103 | + |
| 104 | +For the per-device energy figures the bench harness produces, read |
| 105 | +[`ENERGY_METHODOLOGY.md`](ENERGY_METHODOLOGY.md) first. The TL;DR: |
| 106 | + |
| 107 | +- CPU rail = AMD RAPL package counter, package-only, no DRAM. |
| 108 | +- GPU rail = `nvidia-smi` total board (compute + memory + VRMs). |
| 109 | +- NPU rail = `xrt-smi` AIE-partition firmware-internal estimate. |
| 110 | + |
| 111 | +A figure caption that compares any two of these rails without stating
| 112 | +what each one includes and excludes is not an honest comparison.
| 113 | + |
| 114 | +## 8. Sanity-log discipline |
| 115 | + |
| 116 | +Calibration evidence — which counters are AVAILABLE / UNAVAILABLE on |
| 117 | +your host, what the probe path returned, what the resolution path |
| 118 | +was — is recorded in |
| 119 | +[`src/bionpu/bench/energy/SANITY-LOG.md`](../src/bionpu/bench/energy/SANITY-LOG.md). |
| 120 | +**Append, never overwrite.** A run on a different host (different |
| 121 | +kernel / driver / governor) is a different measurement; record it |
| 122 | +as a new entry rather than editing in place. |
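The append-only rule maps directly onto the shell's `>>` redirection. The following is a sketch only: `append_entry` is a hypothetical helper and the entry layout is illustrative, so follow whatever layout the existing entries in `SANITY-LOG.md` already use.

```sh
#!/bin/sh
# append_entry is a hypothetical helper; the entry layout is illustrative only.
append_entry() {
    # $1 = log file, $2 = free-text note
    {
        echo
        echo "## $(date -u +%Y-%m-%dT%H:%M:%SZ) $(uname -n)"
        echo "- kernel: $(uname -r)"
        echo "- note: $2"
    } >> "$1"    # '>>' appends; '>' would truncate earlier entries
}

# Example (run from the repository root):
# append_entry src/bionpu/bench/energy/SANITY-LOG.md "RAPL package counter AVAILABLE"
```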
| 123 | + |
| 124 | +## What's deferred to v0.2 |
| 125 | + |
| 126 | +- `bionpu scan` and `bionpu basecall` end-to-end drivers (the kernels |
| 127 | + are migrated and buildable; what's missing is the Python that |
| 128 | + drives them as a single CLI invocation). |
| 129 | +- Pre-computed `benchmarks/results/{crispr,basecalling}/*.json` |
| 130 | + snapshots for chr1 / chr19 / chr22 / a representative pod5. |
| 131 | +- A tagged `v0.1` GitHub release with the headline numbers. |