
# Demo Corpus Methodology

## How a demo is structured

Every repo follows this layout:

```
<repo>/
├── README.md              ← what the app pretends to do, plus the warning
├── DEMO.md                ← seeded-finding summary, prose
├── demo.yaml              ← seeded-finding inventory, machine-readable
├── src/                   ← application code with seeded SAST findings
├── infra/
│   ├── terraform/         ← AWS/Azure/GCP IaC with seeded misconfigs
│   └── k8s/               ← K8s manifests with seeded misconfigs
├── Dockerfile             ← container with seeded misconfigs
├── <manifest>             ← requirements.txt | package.json | go.mod | Cargo.toml | <project>.csproj
├── .github/workflows/     ← CI/CD with seeded misconfigs
└── tests/                 ← minimal smoke tests, no security assertions
```
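For machine consumption, `demo.yaml` is the contract. Here is a minimal sketch of a reader, assuming a top-level `findings` list whose entries carry an `id`, a `capability`, and a `file`; those field names are illustrative guesses, not a published schema:

```python
# Hypothetical reader for demo.yaml. The shape sketched in the comment below
# is an assumption about the inventory format, not the published one.
import yaml  # pip install pyyaml

# Assumed shape of demo.yaml:
#   findings:
#     - id: SAST-001
#       capability: sast          # sast | iac | sca | pipeline
#       file: src/app.py
#       rule: hardcoded-secret

def load_inventory(path: str = "demo.yaml") -> list[dict]:
    """Return the seeded-finding inventory as a list of dicts."""
    with open(path) as fh:
        return yaml.safe_load(fh)["findings"]
```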

## Capability coverage matrix

Every repo must cover, at minimum:

| Capability | Floor | Ceiling | Realised via |
| --- | --- | --- | --- |
| SAST | 8 findings | 15 findings | Application code in `src/` |
| IaC | 5 findings | 10 findings | `infra/terraform/`, `infra/k8s/`, `Dockerfile` |
| SCA | 3 vulnerable deps | 6 vulnerable deps | Manifest pinned to historical CVE'd versions |
| SBOM | 1 generated SBOM | 1 generated SBOM | Derived from the manifest, written to `dist/sbom.json` during scan |
| Pipeline misconfig | 3 findings | 6 findings | `.github/workflows/<name>.yml` |

Total per repo: 19–37 seeded findings, plus one generated SBOM. Across corpus v0.1.0: 246 seeded findings (106 SAST · 66 IaC · 40 SCA · 34 pipeline misconfig).
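Given an inventory in the assumed shape from the earlier snippet, the floor/ceiling matrix becomes a mechanical check. A sketch (the bounds mirror the table above; the SBOM row is a generated artifact, not a finding count, so it is omitted):

```python
from collections import Counter

# Floor/ceiling per capability, copied from the matrix above.
BOUNDS = {
    "sast": (8, 15),
    "iac": (5, 10),
    "sca": (3, 6),
    "pipeline": (3, 6),
}

def coverage_failures(findings: list[dict]) -> list[str]:
    """Return one message per capability that falls outside its bounds."""
    counts = Counter(f["capability"] for f in findings)
    return [
        f"{cap}: {counts.get(cap, 0)} seeded, expected {lo}-{hi}"
        for cap, (lo, hi) in BOUNDS.items()
        if not lo <= counts.get(cap, 0) <= hi
    ]

# Usage: assert not coverage_failures(load_inventory()), "repo out of bounds"
```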

## Token budget per scan (measured against corpus v0.1.0)

The whole corpus is 217 files / ~3,300 LOC / ~110 KB of scannable text, which means a full-corpus run is:

| Layer | Model | Input tokens, full corpus | Output tokens (est.) |
| --- | --- | --- | --- |
| Layer 3 triage | Claude Haiku 4.5 (Bedrock EU) | ~28,000 | ~6,000 |
| Layer 4 validation | Claude Sonnet 4.5 (Bedrock EU) | ~5,500 | ~1,500 |

That works out to ~2,800 input tokens per repo for triage across the ten-repo corpus; the counts are low because the apps are small by design. At Bedrock EU pricing a full-corpus scan costs a few pence, so a weekly cadence plus rule- or model-triggered re-scans comes to well under £100/year for continuous validation that can be rerun indefinitely.
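The cost claim is easy to reproduce from the table. A back-of-envelope sketch; the per-million-token prices below are placeholders inserted for illustration, not actual Bedrock EU rates, so substitute the current price card before quoting a number:

```python
# (input $/MTok, output $/MTok) -- ASSUMED placeholder prices, not real rates
PRICES = {"haiku_triage": (1.00, 5.00), "sonnet_validation": (3.00, 15.00)}

# (input tokens, output tokens) per full-corpus scan, from the table above
USAGE = {"haiku_triage": (28_000, 6_000), "sonnet_validation": (5_500, 1_500)}

def scan_cost_usd() -> float:
    total = 0.0
    for layer, (tin, tout) in USAGE.items():
        pin, pout = PRICES[layer]
        total += tin / 1e6 * pin + tout / 1e6 * pout
    return total

print(f"per scan: ${scan_cost_usd():.3f}")              # ~$0.10 at assumed prices
print(f"52 weekly scans: ${52 * scan_cost_usd():.2f}")  # ~$5/year before re-scans
```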

The corpus deliberately stays in this size envelope. Adding more repos is fine; growing existing repos past the 1,500-LOC cap requires a charter amendment so the cadence cost doesn't drift.
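One way to keep that cap honest is a pre-merge guard. A hypothetical sketch, where the extension set is an assumption about what counts as scannable text and nothing like this script is implied to exist in the corpus today:

```python
# Hypothetical CI guard for the 1,500-LOC-per-repo cap. Tune the extension
# set to whatever the corpus actually treats as scannable text.
import sys
from pathlib import Path

SCANNABLE = {".py", ".js", ".ts", ".go", ".rs", ".cs", ".tf", ".yaml", ".yml", ".json"}
LOC_CAP = 1_500

def repo_loc(root: str = ".") -> int:
    """Count non-blank lines across scannable files under root."""
    return sum(
        sum(1 for line in p.read_text(errors="ignore").splitlines() if line.strip())
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix in SCANNABLE
    )

if __name__ == "__main__":
    loc = repo_loc()
    if loc > LOC_CAP:
        sys.exit(f"{loc} LOC exceeds the {LOC_CAP}-LOC cap; file a charter amendment")
    print(f"{loc} LOC, within cap")
```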

## Naming convention

All demo repos use invented brand names that do not map to any real company. Every repo's `README.md` includes the warning banner. None of the apps reference real customer data, real internal hostnames, or any DevSecAI internal infrastructure.

## What the demo is not

The demo corpus is not:

- Real customer code (none of these are based on any real customer's systems)
- A measurement of real-world false-positive rate (use customer scan data for that)
- A penetration-testing surface (these apps don't run; static analysis only)
- A way to inflate scan volume: the cadence is published, the trigger reasons are tagged, and the workload is reproducible by anyone

It is:

- A repeatable way to validate the engine end-to-end on every change
- A reproducible artifact a prospective customer can clone and inspect
- A regression baseline that lets us catch detection-rate drops within minutes of a release