Hi,
I came across your paper "The Last Harness You'll Ever Build" (arXiv:2604.21003) and wanted to reach out — I'm the author of forge-harness, a Claude Code plugin for harness engineering.
Reading your paper, I noticed we've independently arrived at several of the same core ideas:
Points of convergence
- Self-evolution as a first-class concern: both systems treat harness evolution as a structured loop, not an ad-hoc process
- Outer-loop architecture: observe → critique → synthesize → integrate → verify
- Harness as primary product: the harness infrastructure itself is what needs to be engineered and validated, not just the tasks it orchestrates
What forge-harness adds
forge-harness focuses on the knowledge validation layer that sits alongside (not instead of) automated optimization:
- steel-quench: multi-wave adversarial validation that separates attack from defense
- source-grounding-audit: phantom claim detection (file phantom / capability phantom / measurement phantom)
- harvest-loop: session learning capture with a devil×innovator synthesis gate
- sim-conductor: pre-deployment transfer validation
Preprint: Kwon, Sungjin. forge-harness: Engineering Methods for Robust AI Collaboration Harnesses. Zenodo, 2026. https://doi.org/10.5281/zenodo.20397566
Proposal
Would you be open to a related-work cross-citation? I believe the two projects are complementary:
- adal-cli / your approach: automated self-evolution with team-wide learning
- forge-harness: structured adversarial validation + phantom detection, zero infrastructure
Also, if there's a more appropriate venue for this discussion (paper repo, Discord, etc.), happy to move there.
Thanks for the paper. It's validating to see the same outer-loop architecture emerge independently from different directions.