Releases · razroo/iso

19 Apr 05:36

iso-eval-v0.1.0

7699c97

@razroo/iso-eval v0.1.0

Initial release of @razroo/iso-eval — behavioral eval runner for AI coding agents.

agentmd lints prompt structure, isolint lints prompt prose, and iso-harness fans out the compiled source into every harness file layout. None of them answers the question "did the agent actually do the task?" That is what iso-eval scores.

You give it a suite of tasks (baseline workspace + prompt + checks); it snapshots the workspace per trial, hands it to a runner, and verifies the resulting filesystem / command state against your checks.

v0.1 scope

- Deterministic `fake` runner (executes `$ …` lines as shell in the snapshotted workspace), which exercises the orchestration layer offline and in CI
- Checks: `command`, `file_exists`, `file_contains`, `file_not_contains`, `file_matches`, `llm_judge`
- Real-agent runners (`claude-code`, `codex`, `cursor-agent`) are coming in v0.2; the library API already accepts any `RunnerFn` today
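The snapshot, run, and verify flow described above can be sketched in TypeScript. This is a minimal in-memory illustration, not iso-eval's actual code: `Workspace`, `Check`, `extractShellLines`, `snapshot`, `runCheck`, and `score` are hypothetical names, the workspace is modeled as a map of file paths to contents rather than a real directory, the check field names are assumptions, and only three of the check kinds are shown.

```typescript
// Hypothetical in-memory model of a workspace: file path -> contents.
type Workspace = Record<string, string>;

// A subset of the check kinds listed above; the field names are assumptions.
type Check =
  | { type: "file_exists"; path: string }
  | { type: "file_contains"; path: string; value: string }
  | { type: "file_not_contains"; path: string; value: string };

// Collect the commands from `$ …` lines, as a fake runner might before
// executing them in the snapshotted workspace.
function extractShellLines(prompt: string): string[] {
  const commands: string[] = [];
  for (const line of prompt.split("\n")) {
    const match = /^\s*\$\s+(.+?)\s*$/.exec(line);
    if (match) commands.push(match[1]);
  }
  return commands;
}

// Copy the baseline so every trial starts from the same state.
function snapshot(baseline: Workspace): Workspace {
  return { ...baseline };
}

function runCheck(ws: Workspace, check: Check): boolean {
  const content: string | undefined = ws[check.path];
  switch (check.type) {
    case "file_exists":
      return content !== undefined;
    case "file_contains":
      return content !== undefined && content.indexOf(check.value) !== -1;
    case "file_not_contains":
      // Assumption: a missing file fails the check instead of passing vacuously.
      return content !== undefined && content.indexOf(check.value) === -1;
  }
}

function score(ws: Workspace, checks: Check[]): { passed: number; total: number } {
  const passed = checks.filter((check) => runCheck(ws, check)).length;
  return { passed, total: checks.length };
}

// Example trial: the runner is simulated by writing a file by hand.
const baselineWorkspace: Workspace = { "README.md": "# demo\n" };
const trialWorkspace = snapshot(baselineWorkspace);
trialWorkspace["out.txt"] = "hello\n";

const trialScore = score(trialWorkspace, [
  { type: "file_exists", path: "out.txt" },
  { type: "file_contains", path: "out.txt", value: "hello" },
  { type: "file_not_contains", path: "README.md", value: "TODO" },
]);
```

Because each trial works on a copy, the baseline stays untouched and repeated trials are comparable. A real runner would execute the extracted commands in a directory on disk instead of mutating a map.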

Install

```bash
npm install -D @razroo/iso-eval
# -D installs into the project, so run the binary through npx (or an npm script)
npx iso-eval run eval.yml
```

See the package README for the full suite shape and library API.

18 Apr 16:29

CharlieGreenman

iso-v0.1.1

ddb92ac

iso-v0.1.1

Full Changelog: iso-harness-v0.2.0...iso-v0.1.1

18 Apr 16:29

CharlieGreenman

iso-harness-v0.3.0

ddb92ac

iso-harness-v0.3.0

Full Changelog: iso-harness-v0.2.0...iso-harness-v0.3.0

18 Apr 15:38

CharlieGreenman

iso-harness-v0.2.0

78183d5

iso-harness v0.2.0

What's new

`iso-harness validate`

Schema-check the iso/ source directory without writing anything:

```bash
iso-harness validate --source iso/
iso-harness validate --source iso/ --format json
```

Catches: missing `command` on an MCP server, non-string `env` values, duplicate agent names, unknown target-override keys (typos such as `cursor: skip` written as `Cursor: skip`), non-string `model` fields, and empty descriptions/bodies.
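A few of the checks listed above could look roughly like the following TypeScript sketch. It is an illustration under stated assumptions, not the validator's real code: the `IsoSource` shape, the `validateSource` function, the issue paths, and the `KNOWN_TARGETS` list are all hypothetical, and the split between errors and warnings follows one reading of these notes (empty descriptions and unknown override keys warn, everything else errors).

```typescript
interface Issue {
  severity: "error" | "warning";
  path: string;
  message: string;
}

// Hypothetical source shape; the real iso/ schema is richer than this.
interface McpServer { command?: unknown; env?: Record<string, unknown> }
interface Agent { name: string; description?: string; overrides?: Record<string, unknown> }
interface IsoSource { mcpServers: Record<string, McpServer>; agents: Agent[] }

// Placeholder harness names for the sketch, not the tool's actual target list.
const KNOWN_TARGETS = ["cursor", "claude"];

function validateSource(source: IsoSource): Issue[] {
  const issues: Issue[] = [];

  for (const serverName of Object.keys(source.mcpServers)) {
    const server = source.mcpServers[serverName];
    if (typeof server.command !== "string" || server.command === "") {
      issues.push({ severity: "error", path: `mcp.${serverName}.command`, message: "missing command" });
    }
    const env = server.env || {};
    for (const key of Object.keys(env)) {
      if (typeof env[key] !== "string") {
        issues.push({ severity: "error", path: `mcp.${serverName}.env.${key}`, message: "env value must be a string" });
      }
    }
  }

  const seenNames: string[] = [];
  for (const agent of source.agents) {
    if (seenNames.indexOf(agent.name) !== -1) {
      issues.push({ severity: "error", path: `agents.${agent.name}`, message: "duplicate agent name" });
    }
    seenNames.push(agent.name);

    if (!agent.description || agent.description.trim() === "") {
      issues.push({ severity: "warning", path: `agents.${agent.name}.description`, message: "empty description" });
    }
    for (const key of Object.keys(agent.overrides || {})) {
      // Keys are compared case-sensitively, so `Cursor` does not match `cursor`.
      if (KNOWN_TARGETS.indexOf(key) === -1) {
        issues.push({ severity: "warning", path: `agents.${agent.name}.${key}`, message: "unknown target override" });
      }
    }
  }
  return issues;
}
```

Collecting every issue in one pass, rather than throwing on the first one, is what lets a single run report all problems in the source at once.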

`build` now gates on validation

`iso-harness build` runs the validator first and refuses to write output if the source has schema errors. Warnings (empty description, empty body, unknown-harness overrides) are surfaced in the build summary but do not block.

This is a behavior change: if your iso/ source had a latent schema bug, previous versions would silently generate wrong output across all four harnesses. 0.2.0 fails fast instead.
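The gating behavior can be sketched as a small TypeScript function. This is an assumption-laden illustration, not iso-harness's implementation: `buildWithGate`, `BuildResult`, and the `emit` callback are hypothetical names, and the output path in the example is made up.

```typescript
interface Issue {
  severity: "error" | "warning";
  message: string;
}

interface BuildResult {
  written: string[];
  warnings: string[];
}

// Run the (already computed) validation issues through a gate:
// any error aborts before `emit` is called; warnings pass through.
function buildWithGate(issues: Issue[], emit: () => string[]): BuildResult {
  const errors = issues.filter((issue) => issue.severity === "error");
  if (errors.length > 0) {
    throw new Error(`refusing to write output: ${errors.length} schema error(s)`);
  }
  const warnings = issues
    .filter((issue) => issue.severity === "warning")
    .map((issue) => issue.message);
  return { written: emit(), warnings };
}

// Warnings only: output is emitted and the warning is reported alongside it.
const gatedBuild = buildWithGate(
  [{ severity: "warning", message: "empty description" }],
  () => ["out/agents.md"], // hypothetical output path
);
```

Checking for errors before calling `emit` is the point of the design: nothing is written to disk unless the whole source passed.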

Test coverage

18 unit tests (was: 1 smoke-build script). Covers validation, build, source loading, frontmatter skip rules, TOML escaping, and the refuse-to-emit-on-error guarantee.

Full changelog: packages/iso-harness/CHANGELOG.md

18 Apr 15:37

CharlieGreenman

agentmd-v0.3.0

78183d5

agentmd v0.3.0

What's new

`lint --format sarif`

`agentmd lint` can now emit SARIF 2.1.0 for upload to GitHub code scanning. It uses the same dialect as `isolint --format sarif`: the driver name is `agentmd`, rule IDs are the L-codes, and severities map to `error` / `warning` / `note`.

```bash
agentmd lint prompts/*.md --format sarif > agentmd.sarif
# then upload with github/codeql-action/upload-sarif@v3
```
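For readers unfamiliar with SARIF, the mapping described above can be sketched in TypeScript. The `Diagnostic` shape and `toSarif` function are hypothetical and only the minimal SARIF 2.1.0 fields are produced; agentmd's actual output may carry more detail.

```typescript
type Level = "error" | "warning" | "note";

// Hypothetical internal diagnostic shape; `code` is an L-code such as "L12".
interface Diagnostic {
  file: string;
  line: number;
  code: string;
  level: Level;
  message: string;
}

function toSarif(diagnostics: Diagnostic[]) {
  // Each distinct L-code is declared once as a rule on the driver.
  const ruleIds: string[] = [];
  for (const d of diagnostics) {
    if (ruleIds.indexOf(d.code) === -1) ruleIds.push(d.code);
  }

  return {
    $schema: "https://json.schemastore.org/sarif-2.1.0.json",
    version: "2.1.0",
    runs: [
      {
        tool: { driver: { name: "agentmd", rules: ruleIds.map((id) => ({ id })) } },
        results: diagnostics.map((d) => ({
          ruleId: d.code,
          level: d.level,
          message: { text: d.message },
          locations: [
            {
              physicalLocation: {
                artifactLocation: { uri: d.file },
                region: { startLine: d.line },
              },
            },
          ],
        })),
      },
    ],
  };
}

// Serialize the log the way a CLI might write it to agentmd.sarif.
const sarifJson = JSON.stringify(
  toSarif([
    { file: "prompts/review.md", line: 14, code: "L12", level: "warning", message: "duplicate # Agent: heading" },
  ]),
  null,
  2,
);
```

The file path and message in the example are invented for illustration.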

Full changelog: packages/agentmd/CHANGELOG.md

18 Apr 15:21

CharlieGreenman

agentmd-v0.2.0

7b502b0

agentmd v0.2.0

Highlights

- `lint` accepts multiple files, shell globs, and stdin (`-`).
- Machine-readable lint output: `--format json` (for CI tools) and `--format github` (workflow annotations).
- `render -` reads source from stdin for pipeline composition.
- `--flag=value` works anywhere alongside `--flag value`.
- `test --timeout <ms>` kills hung backends instead of stalling.
- New: `agentmd --version`.
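Accepting both `--flag=value` and `--flag value`, with `-` standing for stdin, can be sketched as follows. This is a hypothetical parser written for illustration, not agentmd's own: `parseArgs` and `VALUE_FLAGS` are assumed names, and the sketch treats only `--format` and `--timeout` as value-taking flags.

```typescript
// Flags that consume a value; any other `--flag` is treated as boolean.
const VALUE_FLAGS = ["format", "timeout"];

interface ParsedArgs {
  flags: Record<string, string | true>;
  positional: string[];
}

function parseArgs(argv: string[]): ParsedArgs {
  const flags: Record<string, string | true> = {};
  const positional: string[] = [];

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];

    // Anything not starting with `--` is positional, including a bare `-` (stdin).
    if (arg.slice(0, 2) !== "--") {
      positional.push(arg);
      continue;
    }

    // `--flag=value` form: split on the first `=`.
    const eq = arg.indexOf("=");
    if (eq !== -1) {
      flags[arg.slice(2, eq)] = arg.slice(eq + 1);
      continue;
    }

    // `--flag value` form: consume the next argument when the flag takes one.
    const flagName = arg.slice(2);
    if (VALUE_FLAGS.indexOf(flagName) !== -1 && i + 1 < argv.length) {
      flags[flagName] = argv[i + 1];
      i++;
    } else {
      flags[flagName] = true;
    }
  }
  return { flags, positional };
}

const parsedExample = parseArgs(["lint", "--format=json", "a.md", "--timeout", "500", "-"]);
// parsedExample.flags      -> { format: "json", timeout: "500" }
// parsedExample.positional -> ["lint", "a.md", "-"]
```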

Parser

- Strip the UTF-8 BOM and normalize CRLF / bare CR line endings so Windows and editor-exported files parse correctly (previously the `# Agent:` heading would silently fail).
- Flag duplicate `# Agent:` headings as warning L12 instead of silently dropping them.
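The normalization step can be expressed in a couple of lines of TypeScript. This is a sketch of the technique, not the parser's actual code; `normalizeSource` is a hypothetical name.

```typescript
// Remove a leading UTF-8 BOM, then turn CRLF and bare CR into LF.
function normalizeSource(text: string): string {
  return text.replace(/^\uFEFF/, "").replace(/\r\n?/g, "\n");
}

const normalizedExample = normalizeSource("\uFEFF# Agent: demo\r\nrule one\rrule two\n");
// normalizedExample === "# Agent: demo\nrule one\nrule two\n"
```

Without the BOM strip, the first line begins with an invisible character, so a parser matching `# Agent:` at the start of a line would not recognize the heading.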

Linter

- L9 split into L9a (agent heading), L9b (procedure), L9c (at least one rule); each check is now individually filterable.
- L3 duplicate-ID diagnostics now carry a line number and point at the first definition.

Tests

85 tests (was 69). New subprocess CLI coverage for stdin, multi-file, globs, JSON/github formats, --flag=value.

18 Apr 14:54

CharlieGreenman

agentmd-v0.1.0

e4a2e71

agentmd v0.1.0

First release of @razroo/agentmd — a structured-markdown dialect for authoring agent prompts, with a linter for structure and a fixture-driven harness that measures per-rule adherence against a target model.
