| 1 | +# Quick Test Reference Card |
| 2 | + |
| 3 | +**Option 5: Atomic CAS + Seqlock** |
| 4 | + |
| 5 | +--- |
| 6 | + |
| 7 | +## Run Tests |
| 8 | + |
| 9 | +```bash |
| 10 | +./run_comprehensive_tests.sh 8 |
| 11 | +``` |
| 12 | + |
| 13 | +--- |
| 14 | + |
| 15 | +## Expected Outcomes (8 Processes) |
| 16 | + |
| 17 | +### ✅ Initialization |
| 18 | + |
| 19 | +| Metric | Expected Value | Meaning | |
| 20 | +|--------|----------------|---------| |
| 21 | +| **INITIALIZER count** | 1 | Only one process wins CAS | |
| 22 | +| **SPIN-WAITER count** | 0-2 | Processes 1-2 may wait briefly | |
| 23 | +| **FAST PATH count** | 5-7 | Most processes skip initialization | |
| 24 | +| **Total time** | <5 seconds | Fast startup | |
| 25 | +| **Initializer time** | ~2000ms | Does full GPU enumeration | |
| 26 | +| **Spin-waiter time** | 50-200ms | Waits for initializer | |
| 27 | +| **Fast path time** | <50ms | Instant skip | |
| 28 | + |
| 29 | +### ✅ Memory Operations |
| 30 | + |
| 31 | +| Metric | Expected Value | Meaning | |
| 32 | +|--------|----------------|---------| |
| 33 | +| **Allocation failures** | 0 | No false OOMs | |
| 34 | +| **Completed processes** | 8/8 | All finish successfully | |
| 35 | +| **Memory accounting** | Within 10% | Accurate tracking | |
| 36 | + |
| 37 | +### ✅ High Contention |
| 38 | + |
| 39 | +| Metric | Expected Value | Meaning | |
| 40 | +|--------|----------------|---------| |
| 41 | +| **Thread completion** | 100% | No deadlocks | |
| 42 | +| **Failure rate** | 0% | All operations succeed | |
| 43 | +| **Throughput** | >1000 ops/sec | Excellent performance | |
| 44 | + |
| 45 | +### ✅ Seqlock (Partial Reads) |
| 46 | + |
| 47 | +| Metric | Expected Value | Meaning | |
| 48 | +|--------|----------------|---------| |
| 49 | +| **Inconsistency rate** | <5% | Seqlock working correctly | |
| 50 | +| **Warnings** | 0-20 | Minor inconsistencies acceptable | |
| 51 | +| **Failures** | 0 | No major torn reads | |
| 52 | + |
| 53 | +### ✅ Stress Test |
| 54 | + |
| 55 | +| Metric | Expected Value | Meaning | |
| 56 | +|--------|----------------|---------| |
| 57 | +| **Pass rate** | 18-20/20 | Stable over time | |
| 58 | +| **Orphaned processes** | 0 | Clean shutdown | |
| 59 | + |
| 60 | +--- |
| 61 | + |
| 62 | +## Visual Guide |
| 63 | + |
| 64 | +### Good Result Example |
| 65 | + |
| 66 | +``` |
| 67 | +┌──────────────────────────────────────────────────────┐ |
| 68 | +│ PHASE 3: Multi-Process Test (8 processes) │ |
| 69 | +└──────────────────────────────────────────────────────┘ |
| 70 | + |
| 71 | +Expected: Exactly 1 process is INITIALIZER (CAS winner) |
| 72 | +✓ PASS: Exactly 1 INITIALIZER (atomic CAS working correctly) |
| 73 | + |
| 74 | +Expected: 0-2 processes are SPIN-WAITERs (early arrivals) |
| 75 | +✓ PASS: SPIN-WAITER count acceptable: 1 |
| 76 | + |
| 77 | +Expected: Remaining processes take FAST PATH (late arrivals) |
| 78 | +✓ PASS: Majority took FAST PATH: 6/8 |
| 79 | + |
| 80 | +Expected: Initialization completes in <5 seconds |
| 81 | +✓ PASS: Total execution time: 2s (expected <5s) |
| 82 | + |
| 83 | +Expected: All allocations succeed (no OOM false positives) |
| 84 | +✓ PASS: No allocation failures (0 false OOMs) |
| 85 | + |
| 86 | +Expected: Seqlock retry rate: <1% |
| 87 | +✓ PASS: No seqlock warnings (perfect consistency) |
| 88 | + |
| 89 | +Expected: All processes complete without deadlock |
| 90 | +✓ PASS: All 8 processes completed (no deadlocks) |
| 91 | + |
| 92 | +Expected: Operations per second: >1000 ops/sec |
| 93 | +✓ PASS: Throughput excellent: 1234 ops/sec (>1000 expected) |
| 94 | + |
| 95 | +╔══════════════════════════════════════════════════════╗ |
| 96 | +║ ║ |
| 97 | +║ ✓ ALL VALIDATIONS PASSED ║ |
| 98 | +║ ║ |
| 99 | +╚══════════════════════════════════════════════════════╝ |
| 100 | +``` |
| 101 | + |
| 102 | +### Warning Example (Still Acceptable) |
| 103 | + |
| 104 | +``` |
| 105 | +⚠ WARN: SPIN-WAITER count: 3 (expected ≤2) |
| 106 | +✓ PASS: Majority took FAST PATH: 4/8 |
| 107 | +⚠ WARN: 12 seqlock warnings (minor inconsistencies under load) |
| 108 | +✓ PASS: All 8 processes completed (no deadlocks) |
| 109 | +``` |
| 110 | + |
| 111 | +**Interpretation**: System under higher load, but still working correctly. |
| 112 | + |
| 113 | +### Failure Example |
| 114 | + |
| 115 | +``` |
| 116 | +✗ FAIL: Expected 1 INITIALIZER, found 3 |
| 117 | +``` |
| 118 | + |
| 119 | +**Action**: Atomic CAS broken. Check compiler version (need GCC 4.9+). |
| 120 | + |
| 121 | +--- |
| 122 | + |
| 123 | +## Timing Cheat Sheet |
| 124 | + |
| 125 | +### Process Initialization Times |
| 126 | + |
| 127 | +``` |
| 128 | +INITIALIZER: ████████████████████ ~2000ms (Full init) |
| 129 | +SPIN-WAITER: ███ ~100ms (Brief wait) |
| 130 | +FAST PATH: █ <50ms (Instant skip) |
| 131 | +``` |
| 132 | + |
| 133 | +### By Process Rank (Typical) |
| 134 | + |
| 135 | +``` |
| 136 | +Rank 0: ████████████████████ 2000ms (INITIALIZER - First started) |
| 137 | +Rank 1: ███ 120ms (SPIN-WAITER - Lost CAS) |
| 138 | +Rank 2: ███ 115ms (SPIN-WAITER - Lost CAS) |
| 139 | +Rank 3: █ 35ms (FAST PATH - Late arrival) |
| 140 | +Rank 4: █ 28ms (FAST PATH - Late arrival) |
| 141 | +Rank 5: █ 22ms (FAST PATH - Late arrival) |
| 142 | +Rank 6: █ 18ms (FAST PATH - Late arrival) |
| 143 | +Rank 7: █ 15ms (FAST PATH - Late arrival) |
| 144 | +``` |
| 145 | + |
| 146 | +--- |
| 147 | + |
| 148 | +## Quick Validation Checklist |
| 149 | + |
| 150 | +**After running tests, verify:** |
| 151 | + |
| 152 | +- [ ] Compilation succeeded |
| 153 | +- [ ] Exactly 1 INITIALIZER found |
| 154 | +- [ ] At least 50% processes took FAST PATH |
| 155 | +- [ ] Total time <5 seconds |
| 156 | +- [ ] 0 allocation failures |
| 157 | +- [ ] Seqlock inconsistency rate <5% |
| 158 | +- [ ] All processes completed (no deadlock) |
| 159 | +- [ ] Throughput >500 ops/sec (>1000 = excellent) |
| 160 | +- [ ] Stress test pass rate ≥18/20 |
| 161 | + |
| 162 | +**If all checked**: Option 5 is working correctly! ✅ |
| 163 | + |
| 164 | +--- |
| 165 | + |
| 166 | +## Interpreting the Role Distribution |
| 167 | + |
| 168 | +### Perfect Distribution (Rank = Launch Order) |
| 169 | + |
| 170 | +``` |
| 171 | +Process Rank 0: INITIALIZER ← First to start, wins CAS |
| 172 | +Process Rank 1: SPIN-WAITER ← Second, loses CAS, waits |
| 173 | +Process Rank 2: FAST PATH ← Third, arrives late |
| 174 | +Process Rank 3: FAST PATH ← Fourth, arrives late |
| 175 | +Process Rank 4: FAST PATH ← Fifth, arrives late |
| 176 | +Process Rank 5: FAST PATH ← Sixth, arrives late |
| 177 | +Process Rank 6: FAST PATH ← Seventh, arrives late |
| 178 | +Process Rank 7: FAST PATH ← Eighth, arrives late |
| 179 | +``` |
| 180 | + |
| 181 | +**Why this is good**: The test script staggers process starts (rank × 10ms), so later ranks should take the fast path. |
| 182 | + |
| 183 | +### Good Distribution (Some Variation) |
| 184 | + |
| 185 | +``` |
| 186 | +1 INITIALIZER ← One process did initialization |
| 187 | +2 SPIN-WAITERs ← Two processes arrived early, waited |
| 188 | +5 FAST PATHs ← Five processes arrived late, skipped |
| 189 | +``` |
| 190 | + |
| 191 | +**Why this is acceptable**: System timing variations are normal. |
| 192 | + |
| 193 | +### Bad Distribution (Fast Path Not Working) |
| 194 | + |
| 195 | +``` |
| 196 | +1 INITIALIZER |
| 197 | +7 SPIN-WAITERs ← Everyone waiting! Fast path broken! |
| 198 | +0 FAST PATHs |
| 199 | +``` |
| 200 | + |
| 201 | +**Action**: Check atomic read implementation in fast path check. |
| 202 | + |
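The role distributions above can be tallied from a log with a few lines of shell. This is a minimal sketch, assuming each process writes exactly one line containing a marker such as `role=INITIALIZER`, `role=SPIN-WAITER`, or `role=FAST-PATH`. That marker format and the helper names `count_role` and `check_roles` are hypothetical, so adjust the patterns to whatever `multi_process.log` actually contains.

```bash
# Count the log lines that carry a given role marker.
# ASSUMPTION: one "role=<NAME>" marker per process; this format is hypothetical.
count_role() (
    # $1 = role name, $2 = log file
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c -F -- "role=$1" "$2" || true
)

# Print the role tally and succeed only for a healthy distribution:
# exactly one INITIALIZER and at least 50% of processes on the fast path.
check_roles() (
    log=$1
    init=$(count_role INITIALIZER "$log")
    spin=$(count_role SPIN-WAITER "$log")
    fast=$(count_role FAST-PATH "$log")
    total=$((init + spin + fast))
    echo "INITIALIZER=$init SPIN-WAITER=$spin FAST-PATH=$fast"
    [ "$init" -eq 1 ] && [ $((fast * 2)) -ge "$total" ]
)
```

Calling `check_roles` on a log prints the tally and returns a non-zero status for the "Bad Distribution" case, so it can gate a CI step.
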
| 203 | +--- |
| 204 | + |
| 205 | +## Performance Thresholds |
| 206 | + |
| 207 | +### Initialization Time (8 processes) |
| 208 | + |
| 209 | +| Time | Grade | Status | |
| 210 | +|------|-------|--------| |
| 211 | +| <3s | A+ | Excellent - Optimal performance | |
| 212 | +| 3-5s | A | Good - Expected performance | |
| 213 | +| 5-8s | B | Acceptable - Some contention | |
| 214 | +| 8-12s | C | Slow - Investigate contention | |
| 215 | +| >12s | F | Failure - Fast path not working | |
| 216 | + |
| 217 | +### Throughput (ops/sec) |
| 218 | + |
| 219 | +| Rate | Grade | Status | |
| 220 | +|------|-------|--------| |
| 221 | +| >1500 | A+ | Excellent - Minimal contention | |
| 222 | +| 1000-1500 | A | Good - Expected performance | |
| 223 | +| 500-1000 | B | Acceptable - Moderate contention | |
| 224 | +| 200-500 | C | Slow - High contention | |
| 225 | +| <200 | F | Failure - Serialization issues | |
| 226 | + |
| 227 | +### Seqlock Inconsistency Rate |
| 228 | + |
| 229 | +| Rate | Grade | Status | |
| 230 | +|------|-------|--------| |
| 231 | +| 0% | A+ | Perfect - No retries needed | |
| 232 | +| <1% | A | Excellent - Rare retries | |
| 233 | +| 1-5% | B | Good - Acceptable retries | |
| 234 | +| 5-10% | C | High - Investigate load | |
| 235 | +| >10% | F | Failure - Torn reads occurring | |
| 236 | + |
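The threshold tables above translate directly into a lookup. The sketch below is illustrative only: the function names are hypothetical, inputs are whole numbers, and because the tables list overlapping boundaries (for example `3-5s` and `5-8s`), it assumes a boundary value falls into the better grade.

```bash
# Map throughput (whole ops/sec) to a grade from the throughput table.
# ASSUMPTION: boundary values such as exactly 1000 get the better grade.
grade_throughput() (
    ops=$1
    if   [ "$ops" -gt 1500 ]; then echo "A+"
    elif [ "$ops" -ge 1000 ]; then echo "A"
    elif [ "$ops" -ge 500  ]; then echo "B"
    elif [ "$ops" -ge 200  ]; then echo "C"
    else                           echo "F"
    fi
)

# Map total initialization time (whole seconds, 8 processes) to a grade.
# ASSUMPTION: boundary values such as exactly 5 get the better grade.
grade_init_time() (
    secs=$1
    if   [ "$secs" -lt 3  ]; then echo "A+"
    elif [ "$secs" -le 5  ]; then echo "A"
    elif [ "$secs" -le 8  ]; then echo "B"
    elif [ "$secs" -le 12 ]; then echo "C"
    else                          echo "F"
    fi
)
```

For example, `grade_throughput 1234` prints `A` and `grade_init_time 2` prints `A+`.
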
| 237 | +--- |
| 238 | + |
| 239 | +## Troubleshooting Quick Guide |
| 240 | + |
| 241 | +| Symptom | Likely Cause | Quick Fix | |
| 242 | +|---------|--------------|-----------| |
| 243 | +| Multiple INITIALIZERs | Atomic CAS broken | Check GCC ≥4.9, verify C11 support | |
| 244 | +| No FAST PATH | Staggering not working | Check test script delays | |
| 245 | +| Allocation failures | Memory limit too low | Increase `CUDA_DEVICE_MEMORY_LIMIT` | |
| 246 | +| High inconsistency | Too much contention | Run with fewer processes | |
| 247 | +| Deadlock | Semaphore issue | Check semaphore timeout logs | |
| 248 | +| Compilation error | Missing atomics | Install GCC 4.9+ or Clang 3.1+ | |
| 249 | + |
| 250 | +--- |
| 251 | + |
| 252 | +## Log File Locations |
| 253 | + |
| 254 | +After a test run, logs are saved to `/tmp/hami_comprehensive_[timestamp]/`: |
| 255 | + |
| 256 | +``` |
| 257 | +compile.log - Compilation output |
| 258 | +single_process.log - Single process test |
| 259 | +multi_process.log - Main multi-process test |
| 260 | +stress_test.log - Stress test iterations |
| 261 | +init_times.txt - Extracted initialization times |
| 262 | +results_summary.txt - Final summary |
| 263 | +``` |
| 264 | + |
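Because every run creates a new directory, a small helper for locating the latest one can save some typing. This is a sketch under stated assumptions: `latest_run_dir` is a hypothetical name, and it picks the newest directory by modification time rather than by parsing the timestamp in the name, whose exact format is not specified here.

```bash
# Print the most recently modified run directory under a base directory.
# ASSUMPTION: run directories share the "hami_comprehensive_" prefix shown above.
# Note: parsing ls output breaks on paths containing newlines; fine for /tmp logs.
latest_run_dir() (
    base=${1:-/tmp}
    # The trailing slash in the glob restricts matches to directories.
    ls -dt "$base"/hami_comprehensive_*/ 2>/dev/null | head -n 1
)
```

The printed path keeps its trailing slash, so `cat "$(latest_run_dir)results_summary.txt"` shows the final summary of the latest run.
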
| 265 | +--- |
| 266 | + |
| 267 | +## One-Line Validation |
| 268 | + |
| 269 | +```bash |
| 270 | +# Quick pass/fail check |
| 271 | +./run_comprehensive_tests.sh 8 && echo "✅ PASS" || echo "❌ FAIL" |
| 272 | +``` |
| 273 | + |
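One caveat with the `&& … || …` form: the overall exit status is that of the last `echo`, so the line returns 0 even when the tests fail. Where the status matters (CI jobs, wrapper scripts), an explicit `if` keeps it. The sketch below uses a hypothetical helper name, `run_and_report`, and, like the one-liner, assumes the test script exits non-zero on failure.

```bash
# Run a command, print a verdict, and preserve the command's exit status.
run_and_report() {
    if "$@"; then
        echo "✅ PASS"
    else
        rc=$?   # exit status of the command that just failed
        echo "❌ FAIL (exit $rc)"
        return "$rc"
    fi
}
```

Usage: `run_and_report ./run_comprehensive_tests.sh 8`.
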
| 274 | +--- |
| 275 | + |
| 276 | +**Keep this card handy when running tests!** |
| 277 | + |
| 278 | +For detailed explanations, see `TEST_SUITE_DOCUMENTATION.md` |