audit v1.6.0
✨ Highlights
Promptfoo evaluation coverage for the audit pipeline's stage prompts, plus a deterministic safety net in the Feedback stage and supporting skills/standards.
Added
- Promptfoo eval suites for the pure-reasoning stages. New `evals/` harness that loads each stage's shipped `prompts/*.md` verbatim and grades real model output against the stage schemas, covering Dedupe (05), Report (08), Gapfill (04), and Feedback (07). Each suite pairs schema validation with behavioral assertions (root-cause clustering, reachable-only reporting, coverage-driven task generation, and sibling-targeting) for deterministic, regression-proof grading.
- Cross-file schema validation helper (`evals/lib/validate-schema.cjs`) that resolves relative `$ref`s (e.g. `gapfill_output` / `feedback_output` → `hunt_task.schema.json`) via `ajv`, which promptfoo's built-in `is-json` cannot do alone.
- No-retest floor in the Feedback stage. `partitionRetests()` deterministically drops any generated hunt task that re-targets an already-proven `finding.file`, backstopping a semantic rule no JSON schema can express. Drops are logged per-task and counted in the stage summary, never silently discarded, and are covered by 6 new unit tests.
- New skills/standards: `promptfoo-evals` skill (with cheatsheet) and `redteam-plugin-development` standards for authoring redteam plugins and graders.
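The no-retest floor can be pictured with a minimal sketch. This is not the shipped implementation: only the names `partitionRetests` and `finding.file` come from these notes, while the `HuntTask` shape, the `targetFiles` field, the exact-string path comparison, and the returned `reason` strings are assumptions made for illustration.

```typescript
// Hypothetical shapes; the real hunt_task schema is not reproduced here.
interface Finding {
  file: string; // path of an already-proven finding
}

interface HuntTask {
  id: string;
  targetFiles: string[]; // assumed field name for the files a task targets
}

interface DroppedTask {
  task: HuntTask;
  reason: string; // returned so the caller can log every drop
}

interface Partition {
  kept: HuntTask[];
  dropped: DroppedTask[];
}

// Split generated tasks into those that are safe to run and those that
// re-target a file where a finding has already been proven.
function partitionRetests(tasks: HuntTask[], proven: Finding[]): Partition {
  const kept: HuntTask[] = [];
  const dropped: DroppedTask[] = [];

  for (const task of tasks) {
    // Assumption: paths are compared as exact strings.
    const retested = task.targetFiles.filter((file) =>
      proven.some((finding) => finding.file === file)
    );

    if (retested.length > 0) {
      dropped.push({
        task,
        reason: `re-targets proven file(s): ${retested.join(", ")}`,
      });
    } else {
      kept.push(task);
    }
  }

  return { kept, dropped };
}
```

In this sketch the stage would log each `dropped[i].reason` and report `dropped.length` in its summary, which matches the "logged per-task and counted" behavior described above.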
Changed
- Tightened stage prompt contracts. `04-gapfill.md`, `05-dedupe.md`, `07-feedback.md`, and `08-report.md` now pin exact output key names and re-list every required field, eliminating schema-compliance drift (e.g. wrong root keys, dropped `rationale`, renamed `coverage_analysis` arrays) observed across models.
- `07-feedback.md` no longer permits bundling a proven sink file into a broader "sweep" task; it must target only new sibling locations.
- Fixed a `report.schema.json` / `trace.schema.json` inconsistency that would have rejected admin-gated findings (`auth_required` on entry points).
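The key-name drift described above (wrong root keys, renamed arrays) is the kind of failure a small exact-key check can catch. The following is a hedged sketch, not code from the `evals/` harness: `checkExactKeys` and the key list in the usage note are hypothetical, and the `pass`/`score`/`reason` result shape is modeled on what promptfoo's custom JavaScript assertions can return.

```typescript
// Result shape modeled on promptfoo's grading result; treated as an assumption here.
interface GradingResult {
  pass: boolean;
  score: number;
  reason: string;
}

// Check that a model's JSON output has exactly the expected root keys:
// nothing missing, nothing renamed, nothing extra.
function checkExactKeys(output: string, requiredKeys: string[]): GradingResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(output);
  } catch {
    return { pass: false, score: 0, reason: "output is not valid JSON" };
  }

  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { pass: false, score: 0, reason: "root value is not a JSON object" };
  }

  const keys = Object.keys(parsed);
  const missing = requiredKeys.filter((key) => keys.indexOf(key) === -1);
  const unexpected = keys.filter((key) => requiredKeys.indexOf(key) === -1);
  const pass = missing.length === 0 && unexpected.length === 0;

  return {
    pass,
    score: pass ? 1 : 0,
    reason: pass
      ? "root keys match the contract"
      : `missing: [${missing.join(", ")}]; unexpected: [${unexpected.join(", ")}]`,
  };
}
```

For example, calling `checkExactKeys(modelOutput, ["coverage_analysis", "tasks"])` (where `tasks` is an invented key name) would fail an output that renamed `coverage_analysis`, and the `reason` string would list both the missing and the unexpected key.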
Dependencies
- Bumped `@anthropic-ai/claude-agent-sdk` to v0.3.181.