Commit e6a2a5f
AIESW-6823, AIESW-6824: Optimize cmd_chain latency and throughput test configurations
[WHY]
- Default configurations showed high variance and suboptimal performance
- Throughput mode defaulted to depth=2, causing performance degradation
- Needed stable, repeatable configurations for validation testing
- Required sufficient iterations for framework warm-up and measurement stability
[HOW]
- Conducted extensive empirical testing (27+ samples per configuration)
- Set latency: 20 iterations x 1000 runs (depth=1 automatic in latency mode)
- Set throughput: 20 iterations x 1000 runs x depth=1 (explicit override)
- Validated that depth=1 is mandatory for optimal throughput (depth>1 degrades throughput by 47-99%)
- Confirmed that 20 iterations is the minimum for stable measurements (10 iterations gave +/-47% variance)
- Verified that 1000 runs yields peak performance
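The settings above can be sketched as a small configuration object. This is only an illustration under stated assumptions: `ChainTestConfig`, `validate`, and the field names are hypothetical and do not come from the test framework; only the numeric values (20 iterations, 1000 runs, depth=1) are taken from this commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainTestConfig:
    """Hypothetical container for one cmd_chain test configuration."""
    mode: str        # "latency" or "throughput"
    iterations: int  # measurement iterations
    runs: int        # runs per iteration
    depth: int       # depth setting (1 in both configurations)

    def total_runs(self) -> int:
        # Total number of runs executed across all iterations.
        return self.iterations * self.runs


# Values as stated in the commit message.
LATENCY = ChainTestConfig(mode="latency", iterations=20, runs=1000, depth=1)
THROUGHPUT = ChainTestConfig(mode="throughput", iterations=20, runs=1000, depth=1)


def validate(cfg: ChainTestConfig) -> list[str]:
    """Return a list of problems based on the findings in this commit."""
    problems = []
    if cfg.iterations < 20:
        problems.append("fewer than 20 iterations: measurements may be unstable")
    if cfg.depth != 1:
        problems.append("depth != 1: throughput degradation was observed")
    return problems


if __name__ == "__main__":
    for cfg in (LATENCY, THROUGHPUT):
        print(cfg.mode, cfg.total_runs(), validate(cfg))
```

A check like `validate` could guard against a configuration drifting back to the old throughput default of depth=2.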
[RESULTS]
- Latency: 14.2 us average, range 13-16 us, +/-3 us variance
- Throughput: 71,673 ops/s average, range 66,764-75,508 ops/s, +/-13% variance
- Peak throughput: 75,508 ops/s (highest observed across all testing)
1 parent c92f20f commit e6a2a5f
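The commit message does not say how its variance figures were computed. As a rough sketch, the spread implied by the reported throughput range can be derived as follows; the helper name `spread_stats` is hypothetical, and the inputs are the average, minimum, and maximum quoted in [RESULTS].

```python
def spread_stats(mean: float, low: float, high: float) -> dict[str, float]:
    """Express a reported range as fractions of the reported mean."""
    return {
        "below_mean": (mean - low) / mean,   # how far the minimum sits under the mean
        "above_mean": (high - mean) / mean,  # how far the maximum sits over the mean
        "full_range": (high - low) / mean,   # total min-to-max spread
    }


# Throughput figures quoted in the commit message (ops/s).
throughput = spread_stats(mean=71_673, low=66_764, high=75_508)

if __name__ == "__main__":
    for name, value in throughput.items():
        print(f"{name}: {value:.1%}")
```

The same helper can be applied to a new batch of samples to compare their spread against the figures reported here.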
5 files changed
Lines changed: 9935 additions & 11 deletions
File tree
- archive/npu3
- cmd_chain_latency
- cmd_chain_throughput
Lines changed: 2 additions & 3 deletions