feat: Architect Fitness Function Agent for verifying LLM-generated code correctness #549

Draft: Copilot wants to merge 2 commits into master from copilot/think-architects-fitness-function

Copilot AI commented Mar 11, 2026

LLMs produce plausible code that compiles and passes tests but misses correctness invariants. For example, a SQLite reimplementation passed all of its tests yet ran 20,000x slower, because a missing is_ipk check caused O(n) full table scans instead of O(log n) B-tree lookups. Fitness function-driven development (ThoughtWorks) addresses this by defining measurable acceptance criteria before code generation.
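To make the complexity gap concrete, here is a minimal sketch that counts key comparisons for a full scan versus a binary search over sorted integer keys. It is an illustration only: `Lookup`, `linearLookup`, and `binaryLookup` are hypothetical names, and a sorted array stands in for a B-tree; none of this is the SQLite code path or code from this PR.

```kotlin
// Illustrative only: a sorted IntArray stands in for an indexed structure.
// Both functions return how many key comparisons they performed.
data class Lookup(val found: Boolean, val comparisons: Int)

// O(n): inspects keys one by one, like a full table scan.
fun linearLookup(keys: IntArray, target: Int): Lookup {
    var comparisons = 0
    for (k in keys) {
        comparisons++
        if (k == target) return Lookup(true, comparisons)
    }
    return Lookup(false, comparisons)
}

// O(log n): halves the search range on every step, like an index lookup.
fun binaryLookup(sortedKeys: IntArray, target: Int): Lookup {
    var lo = 0
    var hi = sortedKeys.size - 1
    var comparisons = 0
    while (lo <= hi) {
        val mid = (lo + hi) ushr 1
        comparisons++
        val k = sortedKeys[mid]
        when {
            k == target -> return Lookup(true, comparisons)
            k < target -> lo = mid + 1
            else -> hi = mid - 1
        }
    }
    return Lookup(false, comparisons)
}
```

With one million sorted keys and the last key as the target, `linearLookup` performs one comparison per key, while `binaryLookup` needs at most 20. Both return the same answer, which is why a test that only checks results cannot tell them apart.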

New: FitnessFunctionAgent SubAgent

A SubAgent<FitnessFunctionContext, ToolResult.AgentResult> that evaluates code against architect-defined fitness functions via LLM analysis.

Data model:

  • FitnessFunctionType: PERFORMANCE, ALGORITHM_COMPLEXITY, DEPENDENCY_COUNT, CODE_VOLUME, API_CORRECTNESS, TEST_COVERAGE, ARCHITECTURE, CUSTOM
  • FitnessFunction — name, description, type, threshold, rationale
  • FitnessFunctionResult — pass/fail, score, actual vs expected, evidence, recommendation
  • FitnessReport — aggregated pass/fail with action items
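The data model above might look roughly like the following Kotlin sketch. The enum values come from the list above; every property name and type is an assumption inferred from the field descriptions (for example, `threshold` is modelled as a free-form string), so the classes in the PR may differ.

```kotlin
// Hypothetical shapes inferred from the field lists above; the PR's actual
// property names and types may differ.
enum class FitnessFunctionType {
    PERFORMANCE, ALGORITHM_COMPLEXITY, DEPENDENCY_COUNT, CODE_VOLUME,
    API_CORRECTNESS, TEST_COVERAGE, ARCHITECTURE, CUSTOM
}

data class FitnessFunction(
    val name: String,
    val description: String,
    val type: FitnessFunctionType,
    val threshold: String,   // assumed free-form, e.g. "O(log n)"
    val rationale: String
)

data class FitnessFunctionResult(
    val function: FitnessFunction,
    val passed: Boolean,
    val score: Double,
    val actual: String,
    val expected: String,
    val evidence: String,
    val recommendation: String
)

// Aggregates individual results: the report passes only if every result
// passes, and each failure contributes one action item.
data class FitnessReport(val results: List<FitnessFunctionResult>) {
    val passed: Boolean
        get() = results.all { it.passed }
    val actionItems: List<String>
        get() = results.filterNot { it.passed }
            .map { "${it.function.name}: ${it.recommendation}" }
}
```

Deriving `passed` and `actionItems` from the result list, rather than storing them, keeps the report consistent with its results by construction.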

Built-in fitness functions (auto-selected by context):

  • Performance: actual throughput vs reference (flags 20,000x regressions)
  • Algorithm complexity: indexed lookups must be O(log n), not O(n)
  • Dependency count: solution scope must match problem scope
  • Code volume: proportionality to problem complexity
  • API correctness (database context): fdatasync vs fsync, primary key B-tree routing
  • Test coverage: performance invariants must be benchmarked

Usage:

FitnessFunctionAgent(projectPath = "./src", llmService = llm)
// or via tool: /fitness-agent projectPath="./db-impl" fitnessFunctions=["performance", "primary-key-lookup"] context="database implementation"
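As a rough illustration of context-based selection and of the performance check, the sketch below picks built-in function ids from a context string and compares measured throughput with a reference. The function names, the ids other than "performance", the keyword match on "database", and the `maxSlowdown` parameter are all assumptions made for this example. In the PR the evaluation is done via LLM analysis, not by code like this.

```kotlin
// Hypothetical selection logic: every context gets the general checks, and
// a database context additionally gets the API-correctness check.
fun selectBuiltIns(context: String): List<String> {
    val selected = mutableListOf(
        "performance", "algorithm-complexity", "dependency-count",
        "code-volume", "test-coverage"
    )
    if (context.contains("database", ignoreCase = true)) {
        selected += "api-correctness"
    }
    return selected
}

data class PerformanceCheck(val slowdown: Double, val passed: Boolean)

// Compares actual throughput with a reference; a slowdown above
// maxSlowdown fails the check.
fun checkThroughput(
    referenceOpsPerSec: Double,
    actualOpsPerSec: Double,
    maxSlowdown: Double
): PerformanceCheck {
    require(referenceOpsPerSec > 0.0 && actualOpsPerSec > 0.0) {
        "throughput must be positive"
    }
    val slowdown = referenceOpsPerSec / actualOpsPerSec
    return PerformanceCheck(slowdown, slowdown <= maxSlowdown)
}
```

For example, `checkThroughput(1_000_000.0, 50.0, 2.0)` reports a slowdown of 20,000 and fails, which is the kind of regression the performance function is meant to flag.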

Other changes

  • AgentType.ARCHITECT_FITNESS — accessible via "fitness" / "architect-fitness" strings
  • ToolType.FitnessAgent — registered in tool registry with FitnessFunctionAgentSchema
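One way the string lookup for the new agent type could work is sketched below. Only `ARCHITECT_FITNESS` and its two string aliases come from the description above; the `aliases` property and the `fromString` helper are hypothetical, and the project's other `AgentType` entries are omitted.

```kotlin
// Hypothetical sketch: the real enum has further entries and may resolve
// strings differently.
enum class AgentType(val aliases: Set<String>) {
    ARCHITECT_FITNESS(setOf("fitness", "architect-fitness"));

    companion object {
        // Case-insensitive lookup; returns null for unknown names.
        fun fromString(value: String): AgentType? {
            val key = value.lowercase()
            return values().firstOrNull { key in it.aliases }
        }
    }
}
```

Returning null for unknown names leaves the caller to decide whether to fall back to a default agent or report an error.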

Warning

I tried to connect to the following addresses, but was blocked by firewall rules:

  • dl.google.com
    • Triggering command: /usr/lib/jvm/temurin-17-jdk-amd64/bin/java /usr/lib/jvm/temurin-17-jdk-amd64/bin/java --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.prefs/java.util.prefs=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.prefs/java.util.prefs=ALL-UNNAMED --add-opens=java.base/java.nio.charset=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.xml/javax.xml.namespace=ALL-UNNAMED -XX:MaxMetaspaceSize=2g -Xmx10g -Dfile.encoding=UTF-8 -Duser.country -Duser.language=en -Duser.variant -cp (dns block)


…d code correctness

Co-authored-by: phodal <472311+phodal@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Discuss architects fitness function agent" to "feat: Architect Fitness Function Agent for verifying LLM-generated code correctness" on Mar 11, 2026