Reference application corpus used to validate and regression-test the ARKO decision engine across SAST, IaC, SCA, SBOM, and CI/CD pipeline misconfiguration — published as the ARKO Coverage Demos.
Ten deliberately vulnerable applications spanning ten industries: pharma, fintech, energy, healthcare, retail, insurance, logistics, automotive, telco, and govtech. Each demo repo triggers validation scans on push; the full corpus is re-scanned on rule updates, on model identifier changes, and weekly. Detection rates are recorded as a time series.
Comparable public corpora from security vendors serve similar roles for their engines. This corpus serves three roles:
- Regression signal — rule or model changes surface detection regressions quickly against a fixed baseline.
- Coverage proof — each demo exercises all shipped capabilities in one scan; failures isolate broken layers of the engine.
- Independent verification — evaluators can clone a demo application, run their preferred scanner, and compare results to published methodology.
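
As a rough illustration of the regression-signal role, the sketch below compares a scan's detection rate against a fixed baseline. The `ScanResult` type, its fields, and the `tolerance` parameter are hypothetical names chosen for this example; they are not part of ARKO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanResult:
    """Hypothetical summary of one corpus scan (not an ARKO data type)."""

    seeded: int    # seeded findings declared across the scanned demos
    detected: int  # seeded findings the scan actually reported

    @property
    def detection_rate(self) -> float:
        # Guard against an empty corpus to avoid dividing by zero.
        return self.detected / self.seeded if self.seeded else 0.0


def is_regression(baseline: ScanResult, current: ScanResult,
                  tolerance: float = 0.0) -> bool:
    """Return True when the current rate drops below baseline by more than tolerance."""
    return current.detection_rate < baseline.detection_rate - tolerance


# Illustrative numbers only; they are not measured results.
baseline = ScanResult(seeded=40, detected=38)
after_rule_update = ScanResult(seeded=40, detected=35)
print(is_regression(baseline, after_rule_update))
```

A non-zero `tolerance` would let small fluctuations pass without raising an alert; whether that is desirable depends on how stable the underlying scans are.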
| Repo | Industry | Stack | Frameworks emphasised |
|---|---|---|---|
| halix-clinical-platform | Pharma / clinical trials | Python · FastAPI · Postgres | HIPAA, GDPR, 21 CFR Part 11 |
| quaylink-payments | Fintech / payments | Node · Express · Postgres | PCI DSS 4.0, SOC 2 |
| gridcore-telemetry | Energy / smart grid | Go · Kafka · TimescaleDB | NIS2, IEC 62443, NIST CSF |
| medvane-records | Healthcare / patient records | Java · Spring Boot · MySQL | HIPAA, HITECH, ISO 27799 |
| bazaarly-checkout | Retail / e-commerce | Python · Django · Stripe | PCI DSS 4.0, GDPR |
| assurix-underwriting | Insurance | C# · .NET 8 · SQL Server | SOC 2, NAIC Model Law |
| palletwise-fleet | Logistics / fleet tracking | Node · MongoDB · MQTT | GDPR, ISO 28000 |
| axleware-telematics | Automotive / IoT | Rust · MQTT · TimescaleDB | ISO/SAE 21434, UN R155 |
| ringnode-provisioning | Telecom / subscriber API | Python · Flask · Redis | NIST 800-53, ETSI TS 103 |
| civicore-identity | GovTech / citizen ID | Go · Postgres · OIDC | NIST 800-53 High, eIDAS |

- Application source — small but functional service with seeded SAST findings (SQLi, command injection, hardcoded secrets, weak crypto, insecure deserialisation, path traversal, SSRF, IDOR, etc.) appropriate to the stack.
- `infra/terraform/` — provisioning code with seeded misconfigurations (open security groups, unencrypted storage, public buckets, missing flow logs, weak IAM).
- `Dockerfile` and `infra/k8s/` — container/orchestration config with seeded misconfigs (root user, `:latest` tag, no resource limits, `privileged: true`).
- `requirements.txt` / `package.json` / `go.mod` / `Cargo.toml` — dependency manifests pinned to versions with known CVEs, exercising SCA and CycloneDX SBOM generation.
- `.github/workflows/` — CI/CD workflow with seeded misconfigurations (no permission scoping, hardcoded secrets, unsafe `pull_request_target` patterns).
- `demo.yaml` — manifest declaring the seeded findings. ARKO's regression suite uses this to grade detection.
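
To make the grading idea concrete, here is a minimal sketch of how a manifest of seeded findings might be compared with a scan report. The schema is an assumption made for illustration: the `rule` and `path` keys, the sample findings, and the `grade` function are hypothetical, and the real `demo.yaml` and report formats may differ. The expected findings are written as plain Python data rather than parsed from YAML so the example needs only the standard library.

```python
import json


def finding_key(finding: dict) -> tuple[str, str]:
    """Identify a finding by (rule, path); this key choice is an assumption."""
    return (finding["rule"], finding["path"])


def grade(expected: list[dict], reported: list[dict]) -> dict:
    """Compare seeded findings with reported ones and summarise detection."""
    expected_keys = {finding_key(f) for f in expected}
    reported_keys = {finding_key(f) for f in reported}
    detected = expected_keys & reported_keys
    missed = expected_keys - reported_keys
    rate = len(detected) / len(expected_keys) if expected_keys else 0.0
    return {
        "detected": sorted(detected),
        "missed": sorted(missed),
        "detection_rate": rate,
    }


# Hypothetical seeded findings, standing in for the contents of demo.yaml.
expected = [
    {"rule": "sql-injection", "path": "app/db.py"},
    {"rule": "hardcoded-secret", "path": "app/config.py"},
    {"rule": "public-bucket", "path": "infra/terraform/storage.tf"},
]

# Hypothetical report content, standing in for report.json.
report = json.loads("""
[
  {"rule": "sql-injection", "path": "app/db.py"},
  {"rule": "public-bucket", "path": "infra/terraform/storage.tf"},
  {"rule": "weak-hash", "path": "app/auth.py"}
]
""")

result = grade(expected, report)
print(result["missed"])
```

Findings the scanner reports that are not in the manifest (here the `weak-hash` entry) are ignored by this sketch; a fuller grader might track them separately as unexpected findings.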
See `_meta/CHARTER.md` for methodology, `_meta/METHODOLOGY.md` for re-scan cadence, and each repo's `DEMO.md` for that repo's seeded-finding inventory.
Every demo repo is also a fully self-contained ARKO test harness:
```shell
git clone https://github.com/DevSecAI/halix-clinical-platform
cd halix-clinical-platform
arko scan --tenant=demos --report=json > report.json
arko demo verify --expected=demo.yaml --actual=report.json
```

These applications contain intentional vulnerabilities for the purpose of security tool evaluation. They are not intended for production use and do not represent ARKO, DevSecAI, or any real customer's security posture. Every file is licensed under the Vulnerability Disclosure Agreement inherited from the OWASP WebGoat tradition.