-
Notifications
You must be signed in to change notification settings - Fork 0
test: improve test suite patterns and fix structural weaknesses #19
Description
Summary
Static analysis of the GSD extension test suite (336 files) identified a legacy pattern, several structural weaknesses, and a set of improvements. 4 items are active correctness issues (tests that can silently pass when they should fail); the rest are structural debt.
Issues
1. test-helpers.ts — createTestContext() legacy pattern (18 files) — correctness issue
18 test files use createTestContext() from test-helpers.ts. Assertions run at module level and call process.exit(1) on failure. The test runner (node --test) sees a crash, not a structured test failure — no test name, no timing, no ability to rerun individually. A failure in assertion 1 silently aborts all subsequent assertions in the file.
Affected files (sample): planning-crossval.test.ts, regex-hardening.test.ts, and ~16 others using createTestContext().
Fix: Migrate all 18 files to test() blocks with node:assert/strict. Delete test-helpers.ts when migration is complete.
2. git-service.test.ts — bare assertions inside describe() body — correctness issue
30+ assert.deepStrictEqual(inferCommitType(...), ...) calls sit directly in the describe('git-service', ...) callback, not inside test() blocks. The first failure throws and silently aborts all remaining assertions. None of these show up as individual named test cases in CI output.
Fix: Wrap each assertion in a test('inferCommitType: "X" → fix', () => { ... }).
3. Temp dir cleanup not in t.after() — correctness issue
Multiple integration tests call rmSync(base, ...) inline at the end of the test body. If the test throws before reaching that line, the temp directory leaks and subsequent test runs may encounter stale state.
Affected files (sample): integration/doctor.test.ts (inner suites at lines 97–161, 163–243+).
Fix: Replace inline rmSync with t.after(() => rmSync(base, { recursive: true, force: true })) in all affected tests. Use integration/auto-secrets-gate.test.ts as the reference pattern.
4. debug-logger.test.ts — unguarded global state mutation — correctness issue
enableDebug/disableDebug toggle a process-level singleton. If a test throws between enableDebug(tmp) and disableDebug(), subsequent tests in the file start with debug mode active and a stale log path.
Fix: Add afterEach(() => disableDebug()) as an unconditional guard. Or refactor debug-logger.ts to accept a state object instead of using a module singleton.
5. diff-context.test.ts — vacuous assertions against process.cwd()
Tests using process.cwd() include the comment "The result may be empty if the repo is totally clean" — i.e., they pass regardless of what the function returns, as long as it doesn't throw. These tests provide no signal.
Fix: Replace with a controlled temp git repo (use makeTempRepo from test-utils.ts), commit known files, and assert the function returns exactly those files.
6. assertEq in test-helpers.ts uses JSON.stringify equality — correctness issue
if (JSON.stringify(actual) === JSON.stringify(expected)) {False positives on: undefined values (omitted in JSON), NaN (serializes to null), non-enumerable properties, arrays with undefined holes.
Fix: Use assert.deepStrictEqual wrapped in try/catch during the migration (#1 above). Moot once test-helpers.ts is deleted.
7. Fixture duplication between integration test files
generateDecisionsMarkdown and generateRequirementsMarkdown are copy-pasted between integration/token-savings.test.ts and integration/integration-lifecycle.test.ts. The duplication is acknowledged in a comment. If the fixture format changes, both files must be updated.
Fix: Extract to tests/fixtures/context-store-fixtures.ts.
8. Two divergent ESM loaders
resolve-ts.mjs and resolve-ts-hooks.mjs overlap in purpose but have different logic. resolve-ts-hooks.mjs is strictly more capable (handles .tsx, dist→src rewriting, @gsd/ remapping). Both exist and are used in different contexts. Divergent resolution can hide import errors that only surface in one test mode.
Fix: Standardize all test commands on resolve-ts-hooks.mjs. Delete resolve-ts.mjs.
9. Coverage threshold is meaningless
Current floor: --statements=40 --lines=40 --branches=20 --functions=20. With 330+ test files this is trivially easy to pass and provides no safety signal.
Fix: Raise to --statements=70 --lines=70 --branches=50 --functions=70, or add per-module floors for critical paths (state.ts, gsd-db.ts, auto.ts).
10. DI harness pattern nearly unused
atomic-write.test.ts demonstrates the right way to test I/O error paths: inject an ops interface, drive failures deterministically, no real filesystem. This pattern exists but is used in only one file. session-lock, git-service, and safe-fs all have complex error handling tested with real processes/filesystem — slow and fragile.
Fix: Extend the harness pattern from atomic-write.test.ts to session-lock, safe-fs, and the merge-conflict paths in git-service.
Priority
| # | Type | Impact |
|---|---|---|
| 1 | Correctness | High — 18 files report crashes instead of failures |
| 2 | Correctness | Medium — 30+ assertions invisible to test runner |
| 3 | Correctness | Medium — temp dir leaks on test failure |
| 4 | Correctness | Medium — test state pollution |
| 5 | Signal | Medium — vacuous assertions pass always |
| 6 | Correctness | Low — moot after #1 is fixed |
| 7 | Maintainability | Low |
| 8 | Reliability | Low |
| 9 | Signal | High — threshold provides no safety net |
| 10 | Architecture | High — best pattern for I/O error testing is barely used |