
Add CLI usability improvements: unknown argument detection and failed scenario re-run#104

Merged
lambdalisue merged 2 commits into main from feat/unknown-args-detection on Jan 27, 2026
Conversation

@lambdalisue
Member

Summary

  • Add unknown CLI argument detection with helpful hints and typo suggestions
  • Add --failed / -F flag to re-run only scenarios that failed in previous run
  • Persist run state to .probitas/last-run.json for failure tracking

Why

Unknown Argument Detection
Users often mistype command-line options (e.g., --tag instead of -s tag:api or --name instead of a selector). Without proper feedback, these mistakes result in silent failures where arguments are treated as file paths, causing confusion. This change provides immediate, actionable feedback with suggestions for similar options using Levenshtein distance.
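
As a sketch of the suggestion mechanism: the actual CLI uses the @std/text levenshteinDistance function, but an inline distance implementation and an illustrative option list are used here so the example is self-contained; the names `KNOWN_OPTIONS` and `suggestOption` are assumptions, not the real API.

```typescript
// Inline Levenshtein distance (the real code uses @std/text).
function levenshtein(a: string, b: string): number {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) =>
      i === 0 ? j : j === 0 ? i : 0,
    ),
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1, // deletion
        dp[i][j - 1] + 1, // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1), // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// Illustrative option list; the real CLI derives this from its parseArgs config.
const KNOWN_OPTIONS = ["--failed", "--selector", "--reporter", "--help"];

// Suggest the closest known option when the distance is small enough.
function suggestOption(unknown: string): string | undefined {
  let best: string | undefined;
  let bestDist = Infinity;
  for (const opt of KNOWN_OPTIONS) {
    const d = levenshtein(unknown, opt);
    if (d < bestDist) {
      bestDist = d;
      best = opt;
    }
  }
  return bestDist <= 3 ? best : undefined;
}
```

A typo like `--failde` is two edits from `--failed`, so it yields a suggestion, while an unrelated string stays below the threshold and produces none.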

Failed Scenario Re-run
During iterative development and debugging, developers frequently need to re-run only the scenarios that failed. Previously, users had to manually note which scenarios failed and use selectors to target them. The --failed flag automates this workflow by persisting failure state between runs, significantly improving the debugging experience.

Both features follow the principle of providing helpful, contextual feedback to improve developer experience without requiring deep knowledge of CLI internals.

Test Plan

  • Unknown argument detection tests (13 test cases)
  • State persistence tests (11 test cases covering load/save/round-trip)
  • Integration with existing run command tests
  • All 39 existing tests continue to pass
  • Manual verification:
    • Run scenarios with intentional failures
    • Use --failed to re-run only failed scenarios
    • Test unknown argument with typo suggestions

Detect unknown options in `run` and `list` commands using the parseArgs
`unknown` callback. Provides contextual hints for common mistakes like
--tag, --name, and --filter, and suggests similar options for typos using
Levenshtein distance.

Adds the -F/--failed option to the `probitas run` command, which filters
execution to only scenarios that failed in the previous run. State is
persisted to .probitas/last-run.json after each run.

- Add state persistence module (src/cli/state.ts) with load/save functions
- Add FailedScenarioFilter to run protocol for subprocess communication
- Apply failed filter after selector filtering (AND logic)
- Add .probitas/ to .gitignore for machine-specific state
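
A minimal sketch of the last-run state shape and its (de)serialization; the field names and function names below are assumptions based on this description, not the actual src/cli/state.ts API.

```typescript
// Assumed shape of .probitas/last-run.json (hypothetical field names).
interface FailedScenarioRef {
  name: string;
  file: string; // path relative to the project root
}

interface LastRunState {
  failed: FailedScenarioRef[];
}

function serializeState(state: LastRunState): string {
  return JSON.stringify(state, null, 2);
}

function parseState(text: string): LastRunState | undefined {
  try {
    const data = JSON.parse(text);
    // A malformed file behaves like "no previous run" rather than erroring.
    if (!Array.isArray(data.failed)) return undefined;
    return data as LastRunState;
  } catch {
    return undefined;
  }
}
```

Treating a missing or corrupt state file as "no previous run" keeps `--failed` safe to use on a fresh checkout.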
Copilot AI review requested due to automatic review settings January 27, 2026 14:51
@lambdalisue merged commit 2b85fa4 into main on Jan 27, 2026
8 checks passed
@lambdalisue deleted the feat/unknown-args-detection branch on January 27, 2026 14:56

Copilot AI left a comment


Pull request overview

This PR adds two CLI usability improvements to enhance the developer experience: unknown argument detection with helpful suggestions and a --failed flag to re-run only scenarios that failed in previous runs.

Changes:

  • Unknown CLI argument detection with typo suggestions using Levenshtein distance
  • --failed / -F flag to re-run only previously failed scenarios
  • State persistence to .probitas/last-run.json for tracking failures between runs

Reviewed changes

Copilot reviewed 11 out of 13 changed files in this pull request and generated 3 comments.

Show a summary per file:

src/cli/unknown_args.ts: Implements unknown argument detection with contextual hints for common mistakes
src/cli/unknown_args_test.ts: Comprehensive tests (13 cases) for unknown argument detection
src/cli/state.ts: Manages state persistence to the .probitas/ directory for last-run tracking
src/cli/state_test.ts: Tests (11 cases) for state save/load operations
src/cli/commands/run.ts: Integrates the unknown-args handler and --failed flag, saves run state after execution
src/cli/commands/list.ts: Adds the unknown-args handler for a consistent CLI experience
src/cli/_templates/run_protocol.ts: Adds the FailedScenarioFilter interface for filtering by name and file
src/cli/_templates/run.ts: Implements failed-scenario filtering in the subprocess
deno.json / deno.lock: Adds the @std/text dependency for Levenshtein distance calculations
assets/usage-run.txt: Documents the --failed flag with examples
README.md: Adds a quick example for the --failed flag
.gitignore: Excludes the .probitas/ directory from version control


Comment on lines +92 to +101
const failedSet = new Set(
  failedFilter.map((f) => `${f.name}|${f.file}`),
);
scenarios = scenarios.filter((s) => {
  const scenarioFile = s.origin?.path ?? "unknown";
  // Check both absolute path and relative path matching
  return failedSet.has(`${s.name}|${scenarioFile}`) ||
    failedFilter.some((f) =>
      f.name === s.name && scenarioFile.endsWith(f.file)
    );

Copilot AI Jan 27, 2026


The failed filter matching logic has a bug. The Set built on lines 92-94 uses `${f.name}|${f.file}`, where f.file is a relative path from the state file. However, line 98 checks it against `${s.name}|${scenarioFile}`, where scenarioFile is an absolute path from s.origin?.path. Since these formats don't match (relative vs. absolute), the Set lookup on line 98 will always fail.

Additionally, the endsWith fallback check on line 100 doesn't respect path separators, which could cause false positives. For example, if the stored path is "api/test.probitas.ts", both "/project/api/test.probitas.ts" and "/project/myapi/test.probitas.ts" would incorrectly match.

To fix this, you should either:

  1. Convert scenarioFile to a relative path (relative to cwd) before comparison
  2. Or normalize both paths to absolute paths before building the Set

The safer approach is option 1: convert scenarioFile to a relative path for comparison.
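
A minimal sketch of option 1; `matchesFailed` is a hypothetical helper, not the actual run.ts code, and the cwd-stripping here stands in for a proper path library such as @std/path.

```typescript
// Compare relative paths exactly instead of using a bare endsWith,
// so "myapi/test.probitas.ts" cannot match "api/test.probitas.ts".
function matchesFailed(
  scenarioName: string,
  scenarioAbsPath: string,
  failed: { name: string; file: string }[],
  cwd: string,
): boolean {
  // Convert the absolute path to a cwd-relative path (simplified;
  // real code should use a path library for cross-platform handling).
  const rel = scenarioAbsPath.startsWith(cwd + "/")
    ? scenarioAbsPath.slice(cwd.length + 1)
    : scenarioAbsPath;
  return failed.some((f) => f.name === scenarioName && f.file === rel);
}
```

With this, "/project/api/test.probitas.ts" matches a stored "api/test.probitas.ts" while "/project/myapi/test.probitas.ts" does not.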

Comment on lines +232 to +237
console.warn(
"No previous run state found. Running all matching scenarios.",
);
} else if (lastRunState.failed.length === 0) {
console.log("No failed scenarios from previous run.");
return EXIT_CODE.SUCCESS;

Copilot AI Jan 27, 2026


Inconsistent use of console methods for user messages. Line 232 uses console.warn() to output "No previous run state found. Running all matching scenarios." but line 236 uses console.log() for "No failed scenarios from previous run."

Both messages are informational (not errors or warnings), so they should use consistent output methods. Using console.warn() may be confusing since it outputs to stderr and typically indicates a problem. Consider using console.log() for both messages, or use console.info() if you want to distinguish informational messages from regular output.

Comment on lines +227 to +248
// Handle --failed flag: load previous run state and build filter
let failedFilter: FailedScenarioFilter[] | undefined;
if (parsed.failed) {
  const lastRunState = await loadLastRunState(cwd);
  if (!lastRunState) {
    console.warn(
      "No previous run state found. Running all matching scenarios.",
    );
  } else if (lastRunState.failed.length === 0) {
    console.log("No failed scenarios from previous run.");
    return EXIT_CODE.SUCCESS;
  } else {
    failedFilter = lastRunState.failed.map((f) => ({
      name: f.name,
      file: f.file,
    }));
    logger.debug("Loaded failed filter from previous run", {
      count: failedFilter.length,
      scenarios: failedFilter,
    });
  }
}

Copilot AI Jan 27, 2026


The --failed flag functionality lacks integration test coverage. While unit tests exist for state persistence (state_test.ts), there are no automated tests verifying:

  1. The actual filtering behavior when --failed is used with the run command
  2. The interaction between --failed and selectors (e.g., "probitas run -F -s tag:api")
  3. That scenarios are correctly matched based on name and file path
  4. Edge cases like scenarios with the same name in different files

Given that the PR description mentions manual verification is still pending, adding automated integration tests would provide better confidence in the implementation and prevent regressions.
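
For illustration, such a test could exercise a model of the filtering pipeline like the sketch below; `filterScenarios` is a hypothetical stand-in for "selectors first, then failed filter (AND logic)" and is not the actual probitas API.

```typescript
interface Scenario {
  name: string;
  file: string;
  tags: string[];
}

// Model of the pipeline: apply the tag selector, then intersect with
// the failed set (matching on both name and file).
function filterScenarios(
  scenarios: Scenario[],
  selectorTag: string | undefined,
  failed: { name: string; file: string }[] | undefined,
): Scenario[] {
  let out = scenarios;
  if (selectorTag) out = out.filter((s) => s.tags.includes(selectorTag));
  if (failed) {
    out = out.filter((s) =>
      failed.some((f) => f.name === s.name && f.file === s.file)
    );
  }
  return out;
}

const scenarios: Scenario[] = [
  { name: "login", file: "api/auth.probitas.ts", tags: ["api"] },
  // Same scenario name in a different file (edge case 4 above).
  { name: "login", file: "ui/auth.probitas.ts", tags: ["ui"] },
  { name: "list", file: "api/list.probitas.ts", tags: ["api"] },
];
const failed = [{ name: "login", file: "api/auth.probitas.ts" }];
```

Assertions on this model would pin down that `-F` alone re-runs only the failed scenario from the matching file, and that `-F -s tag:ui` yields nothing because the ui `login` did not fail.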
