Agent Team v2: Self-Improving Closed Loop, Performance Optimization & Deploy Pipeline #10

Merged: LeftTwixWand merged 36 commits into master from orchestration-v5 on Mar 24, 2026

@LeftTwixWand LeftTwixWand commented Mar 23, 2026

Summary

Agent team redesign with performance optimization. 88% token reduction for simple tasks via direct agent routing that bypasses CodeOrchestrator.

Agent Team Redesign

  • SendToAgent fast path — Thread routes directly to agents, no more AgentSelector + CodeOrchestrator for simple tasks
  • DotNet owns all .NET operations — build, run, test, publish with auto-discovery of .csproj/.sln
  • Shell scoped to raw CLI only — npm, pip, cargo, scripts
  • Aspire agent — typed tools for restart, traces, logs, health (keeps MCP for observability)
  • Engineered [Description] attributes on all 15+ interface methods
  • Three-layer instructions (Identity → Rules → Reference) for all agents
  • Workspace restriction removed — full PC access
  • Model tier optimization — FileSystem/Git on Fast (Nano), DotNet on Sonnet, Shell on Haiku

Performance Optimization

  • 4KB result truncation in SendToAgent (prevents Thread context bloat)
  • 8KB output truncation in GetResponse (prevents all agent context bloat)
  • Actionable error messages with agent-specific recovery suggestions
  • Non-LLM pre-filters for deterministic build errors in CodeOrchestrator
  • Filter LLM wrapper agents from AgentSelector candidates
  • Filter orchestrator catalog to selected agents only
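
The two truncation limits above could be applied with a small helper along these lines. This is a minimal sketch, not the code from this PR: `ResultTruncator` and its members are hypothetical names, and counting the limit in characters rather than bytes is an assumption.

```csharp
using System;

// Demo: a 5,000-character tool result is cut down to the 4 KB budget.
var result = new string('x', 5000);
var safe = ResultTruncator.Truncate(result, ResultTruncator.SendToAgentLimit);
Console.WriteLine(safe.Length < result.Length); // True

public static class ResultTruncator
{
    // Limits taken from the PR description; measuring them in characters is an assumption.
    public const int SendToAgentLimit = 4 * 1024;
    public const int GetResponseLimit = 8 * 1024;

    public static string Truncate(string text, int maxChars)
    {
        if (text.Length <= maxChars) return text;

        int cut = maxChars;
        // Do not split a surrogate pair at the cut point.
        if (cut > 0 && char.IsHighSurrogate(text[cut - 1])) cut--;

        int dropped = text.Length - cut;
        // Tell the caller (and the model) that output was cut, and by how much.
        return text.Substring(0, cut) + $"\n[truncated {dropped} characters]";
    }
}
```

Appending a marker rather than cutting silently lets the calling agent decide whether to request the remainder.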

Bug Fixes

  • RunDotnetAsync 120s timeout — GUI apps no longer block forever
  • GetResponseStream [ResponseTimeout] — streaming methods now have 5-min timeout
  • StreamResponseCore cancellation — clean handling, not counted as errors
  • Per-provider 10s timeout in BuildContextBlock
  • Telegram webhook — 5-min CancellationTokenSource instead of CancellationToken.None
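
The timeout fixes above share one pattern: give each operation its own deadline through a `CancellationTokenSource` instead of passing `CancellationToken.None`. A minimal sketch of that pattern follows. `Timeouts.RunWithTimeoutAsync` is a hypothetical helper, not a method from this PR, and returning a fallback value on timeout is an assumption about how a slow context provider might be handled.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Demo: a provider that would take 30 s is abandoned after 100 ms.
var context = await Timeouts.RunWithTimeoutAsync(
    async ct => { await Task.Delay(TimeSpan.FromSeconds(30), ct); return "provider context"; },
    TimeSpan.FromMilliseconds(100),
    "(provider timed out)",
    CancellationToken.None);
Console.WriteLine(context); // prints "(provider timed out)"

public static class Timeouts
{
    public static async Task<T> RunWithTimeoutAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        TimeSpan timeout,
        T fallback,
        CancellationToken outer)
    {
        // Linked source: cancels when the caller cancels OR when the timeout elapses.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(outer);
        cts.CancelAfter(timeout);
        try
        {
            return await operation(cts.Token);
        }
        catch (OperationCanceledException) when (!outer.IsCancellationRequested)
        {
            // Treated as a timeout because the caller did not cancel: degrade gracefully.
            return fallback;
        }
    }
}
```

If the caller's own token is cancelled, the exception propagates, so a user-initiated cancel is not mistaken for a timeout.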

Self-Improvement Foundation (partial — deploy via Aspire SDK planned separately)

  • SelfImprove tool — Thread coordinates agents to read code, analyze, write fixes, build
  • Aspire log monitor — recurring 30-min health check via DurableJobs
  • Deploy capability is a TODO — will use Aspire SDK WithCommand("deploy") + ResourceCommandService

Performance Impact

| Metric | Before | After |
| --- | --- | --- |
| Simple task tokens | ~21K+ | ~3-5K |
| Simple task time | 2+ min | 5-20s |
| Models for simple tasks | Opus 4.6 | gpt-5.4-mini + nano |
| CodeOrchestrator for builds | Always | Never (direct routing) |
| Stuck on GUI apps | Forever | 120s timeout + kill |

Test plan

  • dotnet build IAW.slnx — 0 errors
  • dotnet test test/Core.Tests — 409 passed (2 pre-existing failures)
  • Simple task routing verified via Aspire traces (DotNet build, File read, Git status, Aspire health)
  • Token usage verified: <5K tokens for simple tasks vs 21K+ before
  • No AgentSelector or CodeOrchestrator involvement in simple tasks

🤖 Generated with Claude Code

LeftTwixWand and others added 30 commits March 23, 2026 11:28
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rever

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a new method that resolves agent interfaces by their AgentDisplayName static virtual property (case-insensitive), enabling lookup by human-readable names like "Shell" or "dotnet".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…to Orchestrate

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- IThread instructions now explicitly route dotnet build/run/publish to Shell
- DotNet reserved for structured operations (test with filters, format)
- DotNetAgent.BuildAsync resolves directory paths to .csproj/.sln automatically
- FindSolutionPath now falls back to .csproj when no .sln exists
- Reverted DotNet from [Llm<Fast>] to default — Nano too weak for tool reasoning

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…oop, prompt engineering

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…s for all agents

Add System.ComponentModel [Description] attributes to all tool methods on IDotNet,
IShell, IGit, and IFileSystem interfaces so the LLM receives precise per-tool context.
Replace flat AgentInstructions strings with Identity → Rules → Reference three-layer
structure on all four interfaces.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
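
As an illustration of the `[Description]` approach in this commit, the sketch below attaches a description to a tool method and reads it back through reflection, which is roughly what a tool-schema generator would do. `IDotNetSketch` and the description text are hypothetical stand-ins. The real `IDotNet` interface and its wording are not shown in this PR.

```csharp
using System;
using System.ComponentModel;
using System.Reflection;
using System.Threading.Tasks;

// Demo: read the description the LLM would receive for one tool method.
var method = typeof(IDotNetSketch).GetMethod("BuildAsync")!;
var description = method.GetCustomAttribute<DescriptionAttribute>()!.Description;
Console.WriteLine($"{method.Name}: {description}");

// Hypothetical stand-in for an agent interface.
public interface IDotNetSketch
{
    [Description("Builds a project or solution. Accepts a .csproj or .sln path, or a directory that is resolved to one.")]
    Task<string> BuildAsync(string path);
}
```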
Extends IDotNet interface with RunAsync (120s timeout, kills process tree on timeout) and ListProjectsAsync (synchronous directory enumeration of .csproj/.sln/.slnx). Adds using IAW.Agents.System to both files to resolve CommandResult.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds RestartResourceAsync, ListResourcesAsync, GetTracesAsync, and
GetLogsAsync typed interface methods to IAspire/AspireAgent, backed by
Aspire MCP CallToolAsync calls. Switches agent to Sonnet46 model and
updates AgentInstructions with deployment-focused rules.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ent, self-improvement flow

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…inutes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…strator

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New agent in IAW.Agents.Fun namespace:
- IEmoji interface with TranslateAsync method
- EmojiAgent uses Claude 4.5 Haiku for fast emoji translation
- Registered and discoverable via AgentInterfaceResolver
- Thread routes to it via SendToAgent("Emoji", ...)

Verified: ☕💻🌅😍✨🎉

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rdcoded pipeline

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… logic

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds POST /deploy to IAW.MCP that runs dotnet build IAW.slnx, parses
output via DeployVerifier, and reverts via git checkout on failure.
5-minute timeout. Stop/start of the assistant is handled by the Aspire agent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- DeployAsync stops assistant, POSTs to MCP /deploy to rebuild, then starts with fresh binary
- deploy-verify DurableJob fires 60s after activation to check all resources are Running
- Added Microsoft.Extensions.Http package reference to Agents.csproj for IHttpClientFactory
- Updated IAspire instructions: Deploy for code changes, RestartResource for simple restarts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LeftTwixWand and others added 5 commits March 23, 2026 19:20
Updated SelfImproveAsync prompt to direct the agent to call Aspire Deploy (which stops,
rebuilds, and starts fresh) rather than RestartResource when deploying code changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…oji agent

Remaining changes from IAW self-improvement testing:
- EmojiAgent created by IAW (IEmoji.cs + EmojiAgent.cs in src/Agents/Fun/)
- DeployEndpoint.cs fixed by IAW to build assistant project only
- IAW-modified files from self-improvement iterations

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…oint

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…spire SDK

Removed:
- /deploy endpoint in MCP server (wrong approach)
- DeployVerifier + tests (only needed for endpoint)
- EmojiAgent files (IAW-created, will be recreated via proper deploy)
- Deploy plans (superseded by Aspire SDK approach)

Cleaned:
- AspireAgent.DeployAsync → TODO stub (will use Aspire WithCommand)
- Removed IHttpClientFactory dependency from AspireAgent
- Removed Microsoft.Extensions.Http package from Agents.csproj
- MCP Program.cs cleaned of deploy endpoint registration
- SelfImprove prompt simplified

The self-improving deploy will be redesigned using:
- Aspire SDK WithCommand("deploy") in AppHost
- ResourceCommandService for stop→build→start
- WithExplicitStart() for shadow test instances

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…scriptions

Critical:
- C1: Cache AgentInterfaceResolver.ScanInterfaces() to avoid full assembly scan on every SendToAgent call
- C2: Fix misleading CleanLogsAsync description (it reads logs, doesn't clean)

Important:
- I1: Fix DeployAsync description to not claim rebuild capability (TODO stub)
- I3: Remove deploy-verify job from OnActivateAsync (only schedule from DeployAsync)
- I4: Add Aspire to error message's available agents list

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
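
Item C1 could be implemented by wrapping the scan in a `Lazy<T>`, so that it runs once and every later lookup hits a dictionary. The sketch below is an assumption about the shape of such a cache, not the actual `AgentInterfaceResolver`. `CachedResolver`, `IShellSketch`, and `IDotNetSketch` are hypothetical, and the scan is passed in as a delegate so the example stays self-contained. The case-insensitive comparer mirrors the display-name lookup described in an earlier commit.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Demo: two lookups, one scan.
int scans = 0;
var resolver = new CachedResolver(() =>
{
    scans++; // stands in for an expensive assembly scan
    return new[] { ("Shell", typeof(IShellSketch)), ("DotNet", typeof(IDotNetSketch)) };
});
Console.WriteLine(resolver.Resolve("dotnet")?.Name); // prints "IDotNetSketch"
Console.WriteLine(resolver.Resolve("SHELL")?.Name);  // prints "IShellSketch"
Console.WriteLine(scans);                            // prints "1"

// Hypothetical stand-ins for agent interfaces.
public interface IShellSketch { }
public interface IDotNetSketch { }

public sealed class CachedResolver
{
    private readonly Lazy<IReadOnlyDictionary<string, Type>> _byName;

    public CachedResolver(Func<IEnumerable<(string DisplayName, Type Interface)>> scan)
    {
        // The factory runs at most once, even if several threads resolve concurrently.
        _byName = new Lazy<IReadOnlyDictionary<string, Type>>(() =>
        {
            var map = new Dictionary<string, Type>(StringComparer.OrdinalIgnoreCase);
            foreach (var (name, type) in scan()) map[name] = type;
            return map;
        }, LazyThreadSafetyMode.ExecutionAndPublication);
    }

    public Type? Resolve(string displayName) =>
        _byName.Value.TryGetValue(displayName, out var type) ? type : null;
}
```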
@LeftTwixWand
Contributor Author

Code review

Found 3 issues:

  1. TryDeterministicFix returns "add_using_system", but the retry loop only handles "skip" and "retry". The action is therefore dead code: it falls through to the LLM retry path without applying any fix.

    static string? TryDeterministicFix(string buildOutput)
    {
        if (buildOutput.Contains("CS0246") && buildOutput.Contains("IAW.Agents"))
            return "skip";
        if (buildOutput.Contains("CS0103") && buildOutput.Contains("'Console'"))
            return "add_using_system";
        if (buildOutput.Contains("The process cannot access the file"))
            return "skip";
        if (buildOutput.Contains("timed out"))
            return "retry";
        return null;
    }
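
One way to keep this class of bug from recurring is to replace the string actions with an enum and dispatch through a switch that throws on anything unhandled. The sketch below is illustrative only. `FixAction`, `BuildTriage`, and the wording of each next step are hypothetical, and what "skip" and "retry" do in the real retry loop is an assumption.

```csharp
using System;

Console.WriteLine(BuildTriage.Classify("error CS0103: The name 'Console' does not exist"));
// prints "AskLlm"
Console.WriteLine(BuildTriage.Classify("dotnet build timed out"));
// prints "Retry"

public enum FixAction { Skip, Retry, AskLlm }

public static class BuildTriage
{
    // Same string checks as the excerpt above, minus the unhandled action.
    public static FixAction Classify(string buildOutput)
    {
        if (buildOutput.Contains("CS0246") && buildOutput.Contains("IAW.Agents")) return FixAction.Skip;
        if (buildOutput.Contains("The process cannot access the file")) return FixAction.Skip;
        if (buildOutput.Contains("timed out")) return FixAction.Retry;
        return FixAction.AskLlm; // no deterministic answer: let the LLM look at it
    }

    // A new enum member without a case here fails loudly instead of falling through.
    public static string NextStep(FixAction action) => action switch
    {
        FixAction.Skip => "stop retrying and report the error",
        FixAction.Retry => "rerun the build unchanged",
        FixAction.AskLlm => "send the build output to the LLM for a fix",
        _ => throw new ArgumentOutOfRangeException(nameof(action), action, "Unhandled fix action"),
    };
}
```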

  2. DotNetAgent.RunAsync reads stdout and then stderr sequentially via ReadToEndAsync. If the child process fills the stderr pipe buffer while the parent is still draining stdout, the child blocks on its stderr write and the parent keeps waiting for stdout to close, so both hang until the 120s timeout fires. It should use Task.WhenAll to read both streams concurrently.

    try
    {
        var output = await process.StandardOutput.ReadToEndAsync(timeoutCts.Token);
        var error = await process.StandardError.ReadToEndAsync(timeoutCts.Token);
        await process.WaitForExitAsync(timeoutCts.Token);
        sw.Stop();
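
A minimal sketch of the concurrent-read fix suggested here, combined with the 120s timeout and process-tree kill described earlier in this PR. `ProcessRunner` is a hypothetical name, and the surrounding details of the real DotNetAgent.RunAsync are not shown in the excerpt. The `ReadToEndAsync(CancellationToken)` overload assumes .NET 7 or later.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

var result = await ProcessRunner.RunAsync("dotnet", "--version", TimeSpan.FromSeconds(120));
Console.WriteLine($"exit={result.ExitCode} stdout={result.Output.Trim()}");

public static class ProcessRunner
{
    public static async Task<(int ExitCode, string Output, string Error)> RunAsync(
        string fileName, string arguments, TimeSpan timeout)
    {
        var startInfo = new ProcessStartInfo(fileName, arguments)
        {
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false,
        };
        using var process = Process.Start(startInfo)
            ?? throw new InvalidOperationException($"Could not start {fileName}");
        using var timeoutCts = new CancellationTokenSource(timeout);
        try
        {
            // Start both reads before awaiting either, so neither pipe can fill up
            // while the other is being drained.
            var stdoutTask = process.StandardOutput.ReadToEndAsync(timeoutCts.Token);
            var stderrTask = process.StandardError.ReadToEndAsync(timeoutCts.Token);
            await Task.WhenAll(stdoutTask, stderrTask);
            await process.WaitForExitAsync(timeoutCts.Token);
            return (process.ExitCode, await stdoutTask, await stderrTask);
        }
        catch (OperationCanceledException)
        {
            // Timed out: kill the child and anything it spawned (for example a GUI app).
            process.Kill(entireProcessTree: true);
            throw new TimeoutException($"{fileName} did not finish within {timeout.TotalSeconds:0}s");
        }
    }
}
```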

  3. SelfImproveAsync hardcodes E:\IAW as the source path. This is machine-specific and will break in CI or on other developer machines. It should read the path from an environment variable or resolve it from AppContext.BaseDirectory.

You must accomplish this by calling SendToAgent multiple times. The IAW source code is at E:\IAW.
Available agents: FileSystem (read/write files), DotNet (build/test), Git (commit), Aspire (restart to deploy), Roslyn (analyze code).
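
Both options from this review item can be combined: prefer an environment variable, then walk up from AppContext.BaseDirectory until the solution file is found. The sketch below is one possible shape, not the fix that was applied. `SourceRoot` and the variable name `IAW_SOURCE_ROOT` are hypothetical, and IAW.slnx is used as the marker because the test plan builds it.

```csharp
using System;
using System.IO;

Console.WriteLine(SourceRoot.Resolve(AppContext.BaseDirectory) ?? "(source root not found)");

public static class SourceRoot
{
    public static string? Resolve(
        string startDirectory,
        string markerFile = "IAW.slnx",        // solution file named in the test plan
        string envVar = "IAW_SOURCE_ROOT")     // hypothetical override variable
    {
        // 1. An explicit override wins, if it points at a real directory.
        var fromEnv = Environment.GetEnvironmentVariable(envVar);
        if (!string.IsNullOrWhiteSpace(fromEnv) && Directory.Exists(fromEnv))
            return Path.GetFullPath(fromEnv);

        // 2. Otherwise walk up from the start directory until the marker file appears.
        for (var dir = new DirectoryInfo(startDirectory); dir is not null; dir = dir.Parent)
        {
            if (File.Exists(Path.Combine(dir.FullName, markerFile)))
                return dir.FullName;
        }

        return null; // caller decides how to report a missing source tree
    }
}
```

The resolved path can then be interpolated into the prompt in place of the literal E:\IAW.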

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

…, remove dead code

Review fixes:
- Remove SelfImprove tool from ThreadAgent — self-improvement will be a separate agent
- Remove self-improvement references from IThread instructions
- Fix DotNetAgent.RunAsync stdout/stderr deadlock — use Task.WhenAll for concurrent reads
- Fix TryDeterministicFix dead "add_using_system" — return null to trigger LLM retry
- Cache AgentInterfaceResolver scan results (from earlier commit)

Self-improvement loop will be properly designed as a dedicated agent with Aspire SDK.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@LeftTwixWand LeftTwixWand merged commit dd2f9d6 into master Mar 24, 2026
1 check passed