When the research skill identifies that a system implements a well-known distributed pattern (leader election, replication, consensus, distributed transactions, etc.), it could instantiate a parameterized formal model for that pattern rather than discovering properties purely through heuristic code analysis.
The idea is to maintain a library of human-written, verified specification templates for common patterns. The research skill identifies which patterns a system implements, instantiates the appropriate model with system-specific parameters, and extracts properties for the catalog. This sidesteps the trust problem of LLM-generated specs because the models are human-authored and verified — the LLM's job is pattern recognition and parameter mapping, not formal reasoning.
Open questions:
- What specification language? TLA+ has the largest existing library of distributed systems specs, but a lighter-weight DSL mapping directly to Antithesis assertion types might be more practical.
- What's the right granularity for "common patterns"? Too broad (e.g., "consensus") and the template doesn't capture enough; too narrow and the library becomes unmanageably large.
- How does the research skill reliably identify which pattern a system implements? Pattern recognition in code is fuzzy — what's the confidence/review model?
- How do we handle systems that implement variations on standard patterns?
Context: discussion about using formal specifications to improve the research → workload pipeline's property catalog precision.
When the research skill identifies that a system implements a well-known distributed pattern (leader election, replication, consensus, distributed transactions, etc.), it could instantiate a parameterized formal model for that pattern rather than discovering properties purely through heuristic code analysis.
The idea is to maintain a library of human-written, verified specification templates for common patterns. The research skill identifies which patterns a system implements, instantiates the appropriate model with system-specific parameters, and extracts properties for the catalog. This sidesteps the trust problem of LLM-generated specs because the models are human-authored and verified — the LLM's job is pattern recognition and parameter mapping, not formal reasoning.
Open questions:
Context: discussion about using formal specifications to improve the research → workload pipeline's property catalog precision.