Context
Performance tuning requires measuring cycle counts on real hardware (e.g., RISC-V boards). Currently, this means running scp and ssh by hand and then manually parsing iree-benchmark-module output.
Objective
Create a HIL runner (src/tools/hil_runner.py) that treats remote hardware as a local execution function.
Scope of Work
- Device Manager: Abstraction for SSH/UART connection (using paramiko or pyserial).
  - Must handle "Copy Artifact" and "Execute Command".
- Profiler Wrapper:
  - Wraps iree-benchmark-module.
  - Parses standard stdout (benchmark_min_time, cpu_time) into JSON.
- Perf Lock: Ensures only one benchmark runs on the board at a time.
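
The Device Manager and Perf Lock items could be sketched roughly as follows. This is a minimal illustration, not a proposed final design. The names Device, SSHDevice, LocalDevice, and perf_lock, and the lock path /tmp/hil_runner.lock, are hypothetical. The lock relies on mkdir being atomic on the board, which is an assumption of this sketch rather than anything the issue prescribes. SSHDevice assumes paramiko is installed; LocalDevice is a stand-in that runs commands on the host so the flow can be exercised without a board.

```python
import shlex
import shutil
import subprocess
from contextlib import contextmanager
from typing import Optional, Protocol


class Device(Protocol):
    """Hypothetical interface: the two operations the Device Manager must handle."""

    def copy_artifact(self, local_path: str, remote_path: str) -> None: ...

    def execute(self, command: str) -> str: ...


class SSHDevice:
    """SSH-backed device (assumes paramiko is installed and the host is reachable)."""

    def __init__(self, host: str, user: str, password: Optional[str] = None):
        import paramiko  # lazy import so this module loads without paramiko

        self._client = paramiko.SSHClient()
        # AutoAddPolicy trusts unknown hosts; acceptable for a lab board, not in general.
        self._client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        self._client.connect(hostname=host, username=user, password=password)

    def copy_artifact(self, local_path: str, remote_path: str) -> None:
        sftp = self._client.open_sftp()
        try:
            sftp.put(local_path, remote_path)
        finally:
            sftp.close()

    def execute(self, command: str) -> str:
        _stdin, stdout, stderr = self._client.exec_command(command)
        out = stdout.read().decode()
        if stdout.channel.recv_exit_status() != 0:
            raise RuntimeError(stderr.read().decode().strip())
        return out


class LocalDevice:
    """Stand-in that runs everything on this machine (dry runs and tests)."""

    def copy_artifact(self, local_path: str, remote_path: str) -> None:
        shutil.copy(local_path, remote_path)

    def execute(self, command: str) -> str:
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        if result.returncode != 0:
            raise RuntimeError(result.stderr.strip())
        return result.stdout


@contextmanager
def perf_lock(device: Device, lock_dir: str = "/tmp/hil_runner.lock"):
    """Hold a lock on the device itself: mkdir fails if another run already holds it."""
    device.execute(f"mkdir {shlex.quote(lock_dir)}")  # raises RuntimeError if held
    try:
        yield
    finally:
        device.execute(f"rmdir {shlex.quote(lock_dir)}")
```

With these pieces, Test 1 reduces to something like `SSHDevice(host, user).execute("uname -a")`, and a benchmark run would sit inside `with perf_lock(device):`. One known gap in this sketch: if the runner dies while holding the lock, the directory stays behind and has to be cleared by hand or by a timeout policy.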
Acceptance Criteria
- Test 1: Connection Check.
  - Input: Device IP/User in .env.
  - Success: Tool can run uname -a on the remote device and return the kernel version.
- Test 2: End-to-End Profile.
  - Input: A compiled model.vmfb.
  - Condition: Run hil_runner.py --module model.vmfb.
  - Success: Returns a JSON object {"mean_latency_ms": 12.5, "peak_memory_kb": 4096} derived from the remote execution.
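
The parsing step behind Test 2 might look roughly like the sketch below. It assumes iree-benchmark-module prints the usual Google Benchmark console table (columns Benchmark, Time, CPU, Iterations). The sample text is illustrative and was not captured from a real run, and parse_benchmark_stdout is a hypothetical name. The sketch leaves out peak_memory_kb because the issue does not say how that value is measured.

```python
import json
import re

# One data row of the console table: name, wall time + unit, CPU time + unit, iterations.
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<real>[\d.]+)\s+(?P<real_unit>ns|us|ms|s)\s+"
    r"(?P<cpu>[\d.]+)\s+(?P<cpu_unit>ns|us|ms|s)\s+(?P<iters>\d+)"
)
TO_MS = {"ns": 1e-6, "us": 1e-3, "ms": 1.0, "s": 1e3}


def parse_benchmark_stdout(stdout: str) -> dict:
    """Average wall and CPU time over all benchmark rows, reported in milliseconds."""
    real_ms, cpu_ms = [], []
    for line in stdout.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # header, separator, or unrelated log line
        real_ms.append(float(match["real"]) * TO_MS[match["real_unit"]])
        cpu_ms.append(float(match["cpu"]) * TO_MS[match["cpu_unit"]])
    if not real_ms:
        raise ValueError("no benchmark rows found in output")
    return {
        "mean_latency_ms": sum(real_ms) / len(real_ms),
        "cpu_time_ms": sum(cpu_ms) / len(cpu_ms),
    }


# Illustrative sample only; real benchmark names and values will differ.
SAMPLE = """\
-----------------------------------------------------------------------
Benchmark                             Time             CPU   Iterations
-----------------------------------------------------------------------
BM_main/process_time/real_time     12.5 ms         12.1 ms           56
"""

if __name__ == "__main__":
    print(json.dumps(parse_benchmark_stdout(SAMPLE)))
```

Google Benchmark can also emit JSON directly through its --benchmark_format=json flag. If the iree-benchmark-module build on the board passes that flag through, reading the JSON would be sturdier than matching table rows with a regular expression.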