Reduce Python Code Duplication #1476

Draft
hunhoffe wants to merge 7 commits into Xilinx:main from hunhoffe:beautify-programming-examples


@hunhoffe hunhoffe commented Mar 28, 2026

  • Add air-specific helpers to reduce internal duplication
  • Use helpers from mlir-aie where appropriate to avoid cross-project duplication

There are still some easy things that could be done to clean up the Python further, but this PR is already large so I will defer those to a later time.

hunhoffe and others added 7 commits March 27, 2026 15:22
Adds 5 in-kernel construct helpers to _air_ops_ext.py (auto-exported via
`from air.dialects.air import *`):
  - l1_memref_type / l2_memref_type: memory-space-tagged MemRef types
  - vec_type: 1D VectorType factory
  - identity_map_attr: 1D identity AffineMapAttr for transfer_read/write
  - tile_offset_1d: replaces the 12-line AffineMap offset boilerplate

Adds 2 runtime helpers to xrt_runner.py:
  - make_air_parser: ArgumentParser with the 4 standard flags (-v, -p,
    --compile-mode, --output-format) pre-populated
  - run_on_npu: dispatches compile-only vs compile-and-run in one call
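
As a rough illustration of the parser factory described above, here is a minimal standard-library sketch. Only the four flag spellings (-v, -p, --compile-mode, --output-format) and the compile-only / compile-and-run split come from the commit message. The function name `make_air_parser_sketch`, the `--verbose` long option, the `--output-format` choices, the defaults, and the `--n` flag are assumptions for illustration, not the actual xrt_runner.py code.

```python
import argparse


def make_air_parser_sketch(description=None):
    """Return an ArgumentParser with the four shared example flags pre-populated."""
    parser = argparse.ArgumentParser(description=description)
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="emit extra diagnostic output")
    parser.add_argument("-p", "--print-module-only", action="store_true",
                        help="print the generated module and exit")
    # The default below is assumed, not taken from the PR.
    parser.add_argument("--compile-mode",
                        choices=["compile-only", "compile-and-run"],
                        default="compile-and-run")
    # The choices and default below are assumed, not taken from the PR.
    parser.add_argument("--output-format",
                        choices=["xclbin", "elf"],
                        default="xclbin")
    return parser


# Each example can still layer its own flags on top of the shared ones.
parser = make_air_parser_sketch("vector add example")
parser.add_argument("--n", type=int, default=1024)  # hypothetical example-specific flag
args = parser.parse_args(["-p", "--compile-mode", "compile-only"])
```

Returning the parser rather than the parsed arguments is what lets each script append its own options before calling `parse_args()`.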

Adds programming_examples/utils.py as an alternative import path with
equivalent helpers (make_l1_memref, tiled_1d_offset, etc.) plus
vec_read/vec_write wrappers around subview+transfer_read/write.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

All examples now use helpers from utils.py (or equivalently from
air.dialects.air and air.backend.xrt_runner) to eliminate boilerplate:

In-kernel patterns eliminated:
  - MemRefType.get([...], dtype, memory_space=...) -> l1_memref_type/make_l1_memref
  - VectorType.get([N], dtype) -> vec_type/make_vec_type
  - AffineMapAttr.get(AffineMap.get_identity(1)) -> identity_map_attr/identity_map_1d
  - 12-line AffineMap.get/affine_apply offset block -> tile_offset_1d/tiled_1d_offset
  - subview+transfer_read/write pairs -> vec_read/vec_write

Boilerplate eliminated in __main__:
  - import argparse + 4 add_argument calls -> make_air_parser()
  - if/elif compile_mode XRTRunner+XRTBackend dispatch -> run_on_npu()
  - inline sampled_indices/sampled_values block -> stochastic_check()
  - if args.print_module_only: print(); exit(0) -> check_print_module()
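
The sampled-index check that stochastic_check() replaces can be sketched roughly as follows. The signature, the tolerance rule, and the return value are assumptions. This version works on plain Python sequences so that it runs without NumPy, whereas the examples themselves presumably operate on NumPy arrays.

```python
import math
import random


def stochastic_check_sketch(actual, expected, num_samples=100,
                            rtol=1e-3, atol=1e-6, seed=0):
    """Compare a random subset of positions instead of every element.

    Returns the sampled indices whose values disagree beyond the tolerance.
    """
    if len(actual) != len(expected):
        raise ValueError("length mismatch")
    rng = random.Random(seed)  # fixed seed keeps the sampled positions reproducible
    count = min(num_samples, len(expected))
    sampled_indices = rng.sample(range(len(expected)), count)
    return [
        i for i in sampled_indices
        if not math.isclose(actual[i], expected[i], rel_tol=rtol, abs_tol=atol)
    ]


expected = [float(i) for i in range(1000)]
good = list(expected)
bad = [x + 10.0 for x in expected]  # every element is off by ten

good_mismatches = stochastic_check_sketch(good, expected)
bad_mismatches = stochastic_check_sketch(bad, expected)
```

Sampling keeps verification cheap for large outputs, at the cost of possibly missing an isolated wrong element that falls outside the sample.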

Files updated: all 86 Python example scripts in programming_examples/

Note: examples with non-standard XRTRunner params (trace_offset, debug_ir,
omit_pingpong, sequential test runs) retain their original runner instantiation;
only the import consolidation and argparse refactor apply there.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fixes from code review (3 FAIL files + 2 additional findings):

1. vector_reduce_add/vector_reduce_add.py: NameError fixes
   - Replace l1_memref_type() (Style B name) -> make_l1_memref() (Style A)
   - Add `from air.dialects import arith` (was used but not imported)
   - Replace identity_map_attr() -> identity_map_1d() (Style A name)

2. rms_norm/rms_norm.py: Add `import numpy as np`
   (np.* calls in __main__ but no numpy import)

3. layer_norm/layer_norm.py: Add `import numpy as np`
   (same issue as rms_norm)

4. conv2d/conv2d.py: Add `import numpy as np`
   (np.int32, np.random, np.zeros used but no numpy import)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The l1_memref_type, l2_memref_type, and tile_offset_1d helpers added in
the previous commit referenced MemorySpace and affine_apply which were
not imported in _air_ops_ext.py's module scope. Fixed by adding explicit
top-level imports:
  - from ._air_enum_gen import MemorySpace as _MemorySpace
  - from .affine import apply as _affine_apply

This ensures the helpers work correctly when invoked inside a
@module_builder context.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

xrt-smi reports "NPU Strix Halo" but the regex only matched single-word
NPU names (\w+ stops at the space). The model check also had a typo
(run_on_2npu instead of run_on_npu).

Fixes:
- Regex: (\w+) -> ([\w ]+?) to capture multi-word names like "Strix Halo"
- Model check: `model in ["npu4", "Strix"]` -> `"Strix" in model` for
  substring match, covering "Strix", "Strix Halo", "Strix Point", etc.
- Typo fix: run_on_2npu -> run_on_npu (NameError on Strix systems)

Without this fix, all ryzen_ai_npu2 tests were UNSUPPORTED on Strix Halo
systems despite the NPU being present and XRT detecting it.
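
The effect of the two fixes can be reproduced in isolation. In the sketch below, the sample device line and both patterns are simplified stand-ins chosen for illustration. They are assumptions, not the actual xrt-smi output or the exact expressions in the test configuration.

```python
import re

# Hypothetical line in the style of an xrt-smi device table; real output may differ.
line = "|[0000:c5:00.1]  |NPU Strix Halo  |"

# Simplified stand-ins for the detection patterns before and after the fix.
old_pattern = re.compile(r"\|NPU (\w+)\W*\|")
new_pattern = re.compile(r"\|NPU ([\w ]+?)\W*\|")

# Old pattern: \w+ stops at the space before "Halo", so the closing "|" is never reached.
old_match = old_pattern.search(line)

# New pattern: the lazy class admits spaces and stops at the last word before the padding.
new_match = new_pattern.search(line)
model = new_match.group(1)

exact_hit = model in ["npu4", "Strix"]  # list membership requires an exact string
substring_hit = "Strix" in model        # substring test accepts any Strix variant

single_word = new_pattern.search("|[0000:c5:00.1]  |NPU Strix  |").group(1)
```

Because the quantifier is lazy, the captured name carries no trailing padding, so single-word names are still captured as before.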

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Delete the ~1100-line XRTBackend + XRTRunner implementation and replace
with a thin layer on top of mlir-aie's CachedXRTRuntime:

New public API (python/air/backend/xrt.py):
- compile_air(air_module, ...)  -> NPUKernel
  Compiles AIR dialect MLIR via aircc; no pyxrt dependency at compile time.
- AirRuntime(CachedXRTRuntime)
  Richer verify_results(): rtol/atol, mismatch budget, Pearson correlation,
  stochastic sparse sampling.
- get_air_runtime() -> AirRuntime (process-level singleton)
- aie.utils.tensor() used throughout; zero raw pyxrt calls in core files.
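
To make the verification criteria listed above concrete, here is a pure-Python sketch combining an rtol/atol comparison, a mismatch budget, and a Pearson-correlation floor. The names `verify_sketch`, `pearson`, `max_mismatches`, and `min_correlation` are hypothetical, and the tolerance rule `|a - e| <= atol + rtol * |e|` is an assumption. The real AirRuntime.verify_results() may use different parameters and semantics.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Degenerate case: correlation is undefined for constant input.
        return 1.0 if list(xs) == list(ys) else 0.0
    return cov / math.sqrt(var_x * var_y)


def verify_sketch(actual, expected, rtol=1e-3, atol=1e-6,
                  max_mismatches=0, min_correlation=None):
    """Return (ok, mismatch_count) under a tolerance, a budget, and an optional floor."""
    if len(actual) != len(expected):
        raise ValueError("length mismatch")
    mismatches = sum(
        1 for a, e in zip(actual, expected)
        if abs(a - e) > atol + rtol * abs(e)
    )
    ok = mismatches <= max_mismatches
    if min_correlation is not None:
        ok = ok and pearson(actual, expected) >= min_correlation
    return ok, mismatches


expected = [float(i) for i in range(100)]
actual = list(expected)
actual[7] += 5.0  # a single corrupted element

strict = verify_sketch(actual, expected)
lenient = verify_sketch(actual, expected, max_mismatches=1, min_correlation=0.99)
```

A mismatch budget tolerates a few outliers, while the correlation floor guards against outputs that stay within budget element-wise but are systematically wrong.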

Backward-compat shims for XRTBackend, XRTRunner, XRTCompileArtifact kept
so out-of-tree code continues to work during transition.

Migrate 140 call sites (python/air/backend/xrt_runner.py, programming_examples/,
test/xrt/) to compile_air() + get_air_runtime() / run_on_npu(). Remove all
make_xrt_runner(), make_xrt_backend(), and backend.unload() calls. Clean up
stale XRTRunner/XRTBackend imports. Format all changed files with black.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>