---
name: functional-test
description: >
  Use this skill when running functional tests to validate PerfSpect code changes,
  when the user says "run functional tests", "test my changes", "check for regressions",
  or when verifying a code change did not break existing functionality.
---

> **Skill Loaded:** "Using functional-test skill."

# Functional Test Runner

Run targeted PerfSpect functional tests on a remote target to validate code changes. Identify the specific tests affected by a change, run them, and verify the output aligns with the change.

## Test script

`../tools/perfspect/functional_test.sh` (relative to the perfspect repo root). Verify the file exists before proceeding.

## Prerequisites

1. **Built binary.** Run `make` (x86_64) or `make perfspect-aarch64` (ARM64). The binary must be at `./perfspect` (or set `PERFSPECT_DIR`).
2. **Remote target.** The user must provide: hostname/IP (`TARGET`), SSH user (`USER_NAME`), private key path (`PRIVATE_KEY_PATH`). Password-less sudo must be configured on the target.
3. **Target dependencies.** `stress-ng` on the target. For flame tests: `java` and `/tmp/primes.java` (copy from `../tools/perfspect/primes.java`).

## Workflow

### Step 1 — Analyze the code change

Run `git diff main...HEAD` (or the appropriate base). Read the diff. Identify:

- **What changed**: flag names, validation logic, error messages, output formats, collection behavior, report generation, table definitions, script content.
- **Behavioral impact**: Does the change alter a CLI flag? A validation rule? An error message string? An output file format? A collection path? A report table?

### Step 2 — Identify affected test categories

Use the code-to-category mapping below to determine which `TEST_*` categories are affected.

| Changed path | Categories |
|---|---|
| `cmd/config/` | `TEST_CONFIG` |
| `cmd/flamegraph/` | `TEST_FLAME` |
| `cmd/lock/` | `TEST_LOCK` |
| `cmd/metrics/` | `TEST_METRICS` |
| `cmd/report/` | `TEST_REPORT` |
| `cmd/benchmark/` | `TEST_BENCHMARK` |
| `cmd/telemetry/` | `TEST_TELEMETRY` |
| `cmd/root.go` | All — trace the specific change to narrow |
| `internal/app/` | All — trace the specific change to narrow |
| `internal/workflow/` | All reporting commands — trace to narrow |
| `internal/extract/` | `TEST_REPORT`, `TEST_TELEMETRY`, `TEST_METRICS` |
| `internal/target/` | All — affects SSH/local execution |
| `internal/script/` | All — affects script execution |
| `internal/report/` | `TEST_REPORT`, `TEST_BENCHMARK`, `TEST_TELEMETRY`, `TEST_METRICS`, `TEST_FLAME` |
| `internal/table/` | `TEST_REPORT`, `TEST_BENCHMARK`, `TEST_TELEMETRY` |
| `internal/cpus/` | All — CPU detection used everywhere |
| `internal/progress/` | All — progress UI used everywhere |
| `internal/util/` | All — trace the specific change to narrow |
| `main.go`, `go.mod`, `go.sum` | All |
| `scripts/`, `tools/` | All — embedded resources |
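
When a branch touches many files, the mapping above can be applied mechanically. The helper below is an illustrative sketch, not part of `functional_test.sh`; the function name is made up, and only a few table rows are shown — the table remains authoritative:

```shell
#!/usr/bin/env bash
# Illustrative helper: map a changed file path to TEST_* categories
# using a subset of the rows from the table above.
category_for() {
  case "$1" in
    cmd/config/*)       echo "TEST_CONFIG" ;;
    cmd/flamegraph/*)   echo "TEST_FLAME" ;;
    cmd/lock/*)         echo "TEST_LOCK" ;;
    cmd/metrics/*)      echo "TEST_METRICS" ;;
    cmd/report/*)       echo "TEST_REPORT" ;;
    cmd/benchmark/*)    echo "TEST_BENCHMARK" ;;
    cmd/telemetry/*)    echo "TEST_TELEMETRY" ;;
    internal/extract/*) echo "TEST_REPORT TEST_TELEMETRY TEST_METRICS" ;;
    *)                  echo "ALL (trace the change to narrow)" ;;
  esac
}

# Classify a sample of changed paths; in practice, feed the helper
# the output of `git diff --name-only main...HEAD`.
for f in cmd/lock/lock.go internal/extract/extract.go main.go; do
  echo "$f -> $(category_for "$f")"
done
```

Shared-code paths deliberately fall through to the catch-all: per the table, those changes must be traced to specific behavior rather than auto-mapped.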

### Step 3 — Identify specific affected tests

Read the test catalog for each affected category. Load **only** the doc files for affected categories:

| Category | Test catalog |
|---|---|
| `TEST_CONFIG` | [docs/config-tests.md](docs/config-tests.md) |
| `TEST_FLAME` | [docs/flame-tests.md](docs/flame-tests.md) |
| `TEST_LOCK` | [docs/lock-tests.md](docs/lock-tests.md) |
| `TEST_METRICS` | [docs/metrics-tests.md](docs/metrics-tests.md) |
| `TEST_REPORT` | [docs/report-tests.md](docs/report-tests.md) |
| `TEST_BENCHMARK` | [docs/benchmark-tests.md](docs/benchmark-tests.md) |
| `TEST_TELEMETRY` | [docs/telemetry-tests.md](docs/telemetry-tests.md) |

Within the loaded catalog, find every test whose behavior intersects with the change using these criteria:

1. **Flag changes** — Tests that pass the changed flag in `t_args`.
2. **Error message changes** — Tests whose `t_expect_stderr` matches the changed error string.
3. **Output format changes** — Tests that exercise the changed format via `--format` in `t_args`.
4. **Collection behavior changes** — Tests that exercise the changed collection path (scope, granularity, duration, live mode, workload-driven, etc.).
5. **Shared infrastructure changes** — If the change is in shared code (`internal/target/`, `internal/script/`, `internal/workflow/`, `internal/app/`, `cmd/root.go`, `main.go`), trace the change to the specific behavior and find tests that trigger it across categories. Do not blindly run all tests.
6. **stdout/stderr pattern changes** — Tests whose `t_expect_stdout` or `t_expect_stderr` contains text the change modifies.
7. **Custom validation function changes** — Tests with `t_expect_func` that validate output artifacts affected by the change.

Build a list of specific test names (`t_name` values) and their category.
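
For criteria 1, 2, and 6, a grep over the catalog usually builds the list quickly. The snippet below is a sketch against a fabricated stand-in catalog — the test names, flag, and error string are invented; in practice, grep the real `docs/<category>-tests.md` files for the flag or string from your diff:

```shell
# Fabricated catalog excerpt standing in for docs/<category>-tests.md.
cat > /tmp/sample-catalog.md <<'EOF'
t_name="metrics_duration"
t_args="metrics --duration 10"
t_name="metrics_bad_duration"
t_args="metrics --duration -1"
t_expect_stderr="invalid duration"
EOF

# Criterion 1: tests that pass the changed flag in t_args.
grep -n 't_args=.*--duration' /tmp/sample-catalog.md

# Criteria 2/6: tests whose expectations contain the changed string.
grep -n 't_expect_stderr=.*invalid duration' /tmp/sample-catalog.md
```

Grep narrows the candidates; still read each matching test's full entry to confirm it actually intersects with the change.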
| 86 | + |
| 87 | +### Step 4 — Predict expected test outcomes |
| 88 | + |
| 89 | +For each identified test, determine whether the code change should: |
| 90 | + |
| 91 | +- **Not alter the test result** (regression check) — The test must still PASS with the same output patterns. |
| 92 | +- **Change the test's expected behavior** — The test's expectations (`t_expect_exit`, `t_expect_stdout`, `t_expect_stderr`, `t_expect_func`) no longer match the new code. Flag this to the user: the test script itself must be updated. Explain what the new expected values must be. |
| 93 | +- **Make a previously-skipped test runnable** — If the change adds support for something that was previously guarded. |
| 94 | + |
| 95 | +### Step 5 — Run the affected test categories |
| 96 | + |
| 97 | +Disable all categories except those containing affected tests: |
| 98 | + |
| 99 | +```bash |
| 100 | +TARGET=<host> USER_NAME=<user> PRIVATE_KEY_PATH=<key> \ |
| 101 | + PERFSPECT_DIR=. \ |
| 102 | + TEST_CONFIG=false TEST_FLAME=false TEST_LOCK=false TEST_METRICS=false \ |
| 103 | + TEST_REPORT=false TEST_BENCHMARK=false TEST_TELEMETRY=false \ |
| 104 | + <enable affected categories here>=true \ |
| 105 | + ../tools/perfspect/functional_test.sh -q -v |
| 106 | +``` |
| 107 | + |
| 108 | +Add `NO_ROOT=true` if the remote user does not have password-less sudo. |
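
Typing seven `TEST_*` assignments by hand is error-prone. As a convenience (not part of `functional_test.sh`), a short loop can emit the full assignment list from the affected-category set built in Step 2; the `AFFECTED` value below is an example:

```shell
# All categories recognized by the test script, and an example
# affected subset derived in Step 2.
ALL="CONFIG FLAME LOCK METRICS REPORT BENCHMARK TELEMETRY"
AFFECTED="METRICS TELEMETRY"

# Emit TEST_<cat>=true for affected categories, =false for the rest.
flags=""
for c in $ALL; do
  case " $AFFECTED " in
    *" $c "*) flags="$flags TEST_$c=true" ;;
    *)        flags="$flags TEST_$c=false" ;;
  esac
done
echo "$flags"
```

Paste the emitted assignments into the command above in place of the explicit `TEST_*=false` list and the `<enable affected categories here>=true` placeholder.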

### Step 6 — Verify output aligns with the change

Do not stop at PASS/FAIL. For each affected test:

1. **Read the test output.** Examine `test/output/<N>-<test_name>/stdout.txt`, `stderr.txt`, and `perfspect.log`.
2. **Verify the change is reflected.** Follow the output verification guidance in the category's doc file. Examples:
   - Error message changed → confirm `stderr.txt` contains the new text.
   - New output field added → confirm it appears in `stdout.txt` or generated report files.
   - Chart/report generation changed → confirm output HTML/JSON/CSV contains expected new content.
   - Bug fix that eliminated ERROR log entries → confirm `perfspect.log` no longer contains `level=ERROR` for the affected path.
   - Collection behavior changed → confirm `stderr.txt` shows expected collection messages and `stdout.txt` shows expected output files.
3. **Check for unintended side effects.** Scan output of non-target tests in the same category for unexpected ERRORs or changed output patterns.
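
The ERROR-log check in particular is easy to script. The log content below is a fabricated stand-in so the sketch is self-contained; in practice, point the grep at `test/output/<N>-<test_name>/perfspect.log`:

```shell
# Fabricated log standing in for test/output/<N>-<test_name>/perfspect.log.
cat > /tmp/perfspect.log <<'EOF'
time=2025-01-01T00:00:00Z level=INFO msg="collection started"
time=2025-01-01T00:00:10Z level=INFO msg="collection finished"
EOF

# A bug fix that eliminated ERROR entries should leave the log clean.
if grep -q 'level=ERROR' /tmp/perfspect.log; then
  echo "regression: ERROR entries remain"
else
  echo "clean: no ERROR entries"
fi
```

The same pattern applies to the other checks in the list: grep `stderr.txt` for the new error text, or the generated report files for a new field.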

### Step 7 — Report to user

Provide:
- The list of tests identified as affected and why.
- PASS/FAIL status of each.
- For each affected test: what was verified in the output and whether the change is reflected correctly.
- Any tests whose expectations must be updated in the test script (with the specific `t_expect_*` values that must change).
- Any tests that passed but whose output reveals a concern.

## Environment variable reference

| Variable | Default | Purpose |
|---|---|---|
| `PERFSPECT_DIR` | `.` | Path to directory containing the `perfspect` binary |
| `ROOT_OUTPUT_DIR` | `test/output` | Output directory for test artifacts |
| `TARGET` | _(empty)_ | Remote target hostname/IP (empty = local) |
| `USER_NAME` | _(empty)_ | SSH username for remote target |
| `PRIVATE_KEY_PATH` | _(empty)_ | SSH private key path for remote target |
| `NO_ROOT` | `false` | Set to `true` to run without root |
| `TEST_CONFIG` | `true` | Run config tests |
| `TEST_FLAME` | `true` | Run flame tests |
| `TEST_LOCK` | `true` | Run lock tests |
| `TEST_METRICS` | `true` | Run metrics tests |
| `TEST_REPORT` | `true` | Run report tests |
| `TEST_BENCHMARK` | `true` | Run benchmark tests |
| `TEST_TELEMETRY` | `true` | Run telemetry tests |