- `INTENT.md` at repo root — Hermes Labs convention; one-page invariants doc covering accepts/refuses/non-goals + verification contract.
- `evals/sample-detection-rate.sh` — runnable detection-rate check that scans the bundled samples and asserts the expected outcome (4 known-bad files flagged, 1 known-clean file passes). The smallest reproducible eval surface for the README's flagship claims.
- `tests/test_docs_consistency.py` — mechanical CI gate (three assertions) that fails the build if the README opener, the latest CHANGELOG entry, and the `pytest --collect-only` count drift apart. Catches the fabrication-class pattern where a chisel pass updates one surface but leaves a stale figure on another. Replaces manual eyeball-grep audits with a `pip install lintlang && pytest tests/test_docs_consistency.py`-checkable invariant.
- README chiseled to Hermes Labs Flagship Standard v1. Quantified opener with named benchmarks (154 tests, 7 H1–H7 detectors, 6 HERM v1.1 dimensions, validated against 28 comparison files, ~2ms per file scan). Added a "How it differs from LLM-based config review" anti-pattern section with concrete cost/time/determinism comparison. Expanded "When NOT to use" to 5 named scenarios. Added a reproduce-yourself line pointing at `evals/sample-detection-rate.sh`.
- Chisel pass — README + structural docs only. No detector changes.
- Tier B coverage against `flagship-standard.md`: 6/7 (B6 plugin path is the acknowledged miss; queued for v0.3 when a formal `Protocol`/`register()` extension surface lands).
- An in-progress E1–E5 epistemic-failure detector set lives on local branch `wip/eseries-integration` (commit `b199987`, session `0214f811`, 2026-04-22). Merge into v0.3 requires (a) porting six broader E1 sycophancy patterns from the older `epistemic.py` into the canonical `detectors_epistemic.py`, (b) stripping the in-session-invented "B09 adversarial-school / attack V16" framework references from code comments (they reference no external corpus), (c) adding an `--include-epistemic` opt-in flag so existing CIs are not surprised by new default detectors.
- H5 layered exemption system — three-layer filtering reduces false positives on negatives:
- Layer 1: Structural exemptions (HTML comments, code blocks, generated-file markers)
- Layer 2: Phrase-level exemptions (privacy disclaimers, UI labels, descriptive text, idiomatic expressions)
- Layer 3: Safety-context keyword window (existing behavior, now the fallback)
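The three layers above can be sketched as an ordered filter chain. This is a minimal illustration under stated assumptions — the real H5 exemption lists, helper names, and the Layer 3 keyword-window logic are not reproduced here:

```python
import re

# Illustrative exemption data -- the shipped H5 lists are larger.
PHRASE_EXEMPTIONS = ("we do not sell your data", "do not reply to this email")
GENERATED_MARKERS = ("@generated", "do not edit")

def strip_structural(text: str) -> str:
    """Layer 1: drop HTML comments and fenced code blocks."""
    text = re.sub(r"<!--.*?-->", "", text, flags=re.DOTALL)
    text = re.sub(r"```.*?```", "", text, flags=re.DOTALL)
    return text

def is_exempt_line(line: str) -> bool:
    """Layer 2: phrase-level exemptions (disclaimers, UI labels)."""
    low = line.lower()
    return any(p in low for p in PHRASE_EXEMPTIONS)

def negative_instruction_lines(text: str) -> list[str]:
    """Return lines that still look like negative instructions
    after the exemption layers have run."""
    if any(m in text.lower() for m in GENERATED_MARKERS):
        return []  # generated files are exempt wholesale
    hits = []
    for line in strip_structural(text).splitlines():
        if is_exempt_line(line):
            continue
        # Layer 3 stand-in: a crude negative-instruction signal.
        if re.search(r"\b(never|don't|do not)\b", line, re.IGNORECASE):
            hits.append(line.strip())
    return hits
```

Ordering matters: structural stripping runs first so code samples never reach the phrase layer, and the keyword check only sees lines that survived both exemption passes.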
- Expanded vague qualifier detection — catches figurative verbs (`lean into`, `err on the side of`, `double down on`, `keep it simple`), broader ambiguous conditionals (`if appropriate`, `when possible`)
- H6 code-aware format detection — strips fenced code blocks, inline code, filenames, and CLI flags before counting format keywords (prevents a `--json` flag from triggering mixed-format warnings)
- Multi-file summary table — box-drawing table with per-file verdict, findings breakdown, and scan timing (terminal output only, shown when >1 file scanned)
- Vague qualifier deduplication — identical matches within a file are reported once
- Development status upgraded from Alpha to Production/Stable
- Author email updated to rbosch@lpci.ai
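The H6 code-stripping idea above can be sketched as a pre-pass over the text. A hedged illustration, not the actual lintlang implementation — the regexes and the keyword list here are assumptions:

```python
import re

FORMAT_KEYWORDS = ("json", "yaml", "xml", "markdown")  # illustrative list

def strip_code_spans(text: str) -> str:
    """Remove fenced blocks, inline code, filenames, and CLI flags
    so they never count as format keywords."""
    text = re.sub(r"```.*?```", " ", text, flags=re.DOTALL)    # fenced blocks
    text = re.sub(r"`[^`]*`", " ", text)                       # inline code
    text = re.sub(r"\S+\.(?:json|ya?ml|xml|md)\b", " ", text)  # filenames
    text = re.sub(r"(?<!\w)--[\w-]+", " ", text)               # CLI flags
    return text

def format_keyword_counts(text: str) -> dict[str, int]:
    """Count format keywords only in the prose that remains."""
    cleaned = strip_code_spans(text).lower()
    return {
        kw: len(re.findall(rf"\b{kw}\b", cleaned))
        for kw in FORMAT_KEYWORDS
    }
```

With this pre-pass, a prompt that says "Respond in JSON" and also documents a `--json` flag counts one format mention, not two, so no mixed-format warning fires.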
- Breaking: Replaced numeric HERM score with PASS/REVIEW/FAIL verdict in terminal and markdown output
  - ❌ FAIL — any CRITICAL or HIGH finding
  - ⚠️ REVIEW — any MEDIUM finding
  - ✅ PASS — only LOW/INFO findings or none
- Terminal output now leads with verdict + severity summary instead of dimension bars
- Markdown report restructured around verdict + findings (no score in header)
- JSON output: verdict at top level, HERM score moved under `herm` key (preserved for programmatic use)
- `patterns` command simplified to show H1-H7 detectors only
- `--fail-on fail|review` CLI flag for verdict-based CI gating
- `compute_verdict()` function in public API
- `test_verdict.py` with 10 dedicated verdict logic tests
- `.md` extension support in `scan_directory` (SKILL.md files were silently skipped)
- Expanded `is_prompt_like` regex to recognize SKILL.md format (description/purpose/role patterns)
- SKILL.md files now get proper coverage instead of defaulting to 65% (low confidence)
- Scanning directories with .md instruction files now includes them automatically
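As an illustration of the kind of heading-pattern check the `is_prompt_like` expansion implies — the patterns below are invented for this sketch and are not lintlang's shipped regex:

```python
import re

# Illustrative patterns only -- the real is_prompt_like regex differs.
PROMPT_LIKE = re.compile(
    r"^\s*#*\s*(description|purpose|role|system prompt|instructions)\s*:?",
    re.IGNORECASE | re.MULTILINE,
)

def is_prompt_like(text: str) -> bool:
    """Heuristic: does this markdown look like an agent/skill prompt?"""
    return bool(PROMPT_LIKE.search(text))
```

Recognizing SKILL.md-style headings this way is what lets those files get a real coverage figure instead of the 65% low-confidence default.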
- `--fail-under` (HERM score threshold) still works but `--fail-on` is preferred
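The verdict rules above reduce to a small pure function. lintlang does export a `compute_verdict()`, but its actual signature is not shown in this changelog, so the types below are assumptions:

```python
from enum import IntEnum

class Severity(IntEnum):
    INFO = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4

def compute_verdict(severities: list[Severity]) -> str:
    """Map a scan's finding severities to a PASS/REVIEW/FAIL verdict
    (sketch of the documented mapping, not the shipped signature)."""
    if any(s >= Severity.HIGH for s in severities):
        return "FAIL"    # any CRITICAL or HIGH finding
    if any(s == Severity.MEDIUM for s in severities):
        return "REVIEW"  # any MEDIUM finding
    return "PASS"        # only LOW/INFO findings, or none
```

A ternary verdict like this is what makes `--fail-on fail|review` a clean CI gate: the exit decision is a string comparison, not a threshold on a continuous score.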
- Updated project URLs for PyPI backlinks (Homepage, Documentation, Repository, Bug Tracker, Changelog)
- Standardized package metadata (author: Hermes Labs, email: lpcisystems@gmail.com)
- Fixed publish workflow to use API token authentication
- Added community health files (CONTRIBUTING.md, SECURITY.md, CODE_OF_CONDUCT.md)
- Added dependabot configuration
First public release.
- HERM v1.1 scoring engine (6 dimensions, 8 signal categories, coverage/confidence)
- H1-H7 structural detectors with Finding dataclass
- YAML, JSON, and plain text parsers with auto-detection
- Terminal (ANSI), Markdown, and JSON output formats
- `--fail-under` flag for CI gating
- `lintlang scan` — scan files or directories
- `lintlang patterns` — list available patterns and dimensions
- `python -m lintlang` support via `__main__.py`
- `--format`, `--patterns`, `--min-severity`, `--no-suggestions` flags
- Dynamic pattern choices from registry
- H1: Empty/short/vague tool descriptions, duplicate names, word overlap (Jaccard + stopwords)
- H2: Missing constraint scaffolding, unbounded retry loops
- H3: Phantom required fields, missing param descriptions, generic names, nested object inspection
- H4: Context boundary erosion, missing scope signals
- H5: Negative instruction density, vague qualifiers
- H6: Mixed output formats, missing format specs, template variable detection
- H7: System message placement, consecutive roles, orphan tool results
- `scan_file()`, `scan_directory()`, `scan_config()`
- `ScanResult`, `HermResult`, `AgentConfig`, `Finding`, `Severity` exports
- PEP 561 `py.typed` marker
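As an illustration of the shape of the `Finding` dataclass these detectors emit — the field names below are assumptions for this sketch, not lintlang's actual definition:

```python
from dataclasses import dataclass, field
from enum import IntEnum

class Severity(IntEnum):
    INFO = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4

@dataclass
class Finding:
    detector: str                  # e.g. "H3" (hypothetical field name)
    severity: Severity
    message: str
    path: str = ""                 # location within the scanned config
    suggestions: list[str] = field(default_factory=list)

# A detector would construct findings like this:
f = Finding("H3", Severity.HIGH, "required field 'foo' never described")
```

Because `Finding` is a dataclass and the package ships a `py.typed` marker, downstream tools get both structural equality on findings and full type-checker support out of the box.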