Skip to content

ApproverAgent authorization loop + ProposeOptions interactive UX#36

Merged
LeftTwixWand merged 1 commit into
masterfrom
phase1-approver-agent
Apr 9, 2026
Merged

ApproverAgent authorization loop + ProposeOptions interactive UX#36
LeftTwixWand merged 1 commit into
masterfrom
phase1-approver-agent

Conversation

@LeftTwixWand
Copy link
Copy Markdown
Contributor

@LeftTwixWand LeftTwixWand commented Apr 9, 2026

Summary

Replaces three disconnected approval systems with a single agent-based primitive and adds a ProposeOptions tool that makes Telegram inline keyboards actually render.

  • ApproverAgent — per-user [Reentrant] Orleans grain backed by a Fast-tier LLM. Decides tool authorization dynamically from a natural-language policy list stored in durable state. No hardcoded risk levels, no hardcoded denylist, no timeouts — the LLM is the judge.
  • GatedAIFunction — wraps every agent tool at discovery time. Every tool call blocks on approver.Authorize(...) before invocation. Policy/UI tools are exempt so they don't waste LLM turns.
  • Event-driven Telegram deliveryApprovalRequested → Telegram subscriber renders a localized inline keyboard → user taps → CallbackRouter resolves directly via IApprover.ResolveApprovalApprovalResolved → original message edited.
  • 4-key decision modelallow_once / allow_thread / allow_user / deny. The Approver LLM produces labels in the user's conversation language (pulled from the Thread grain's history, not the sub-agent's).
  • Policy management is conversationalAddApproverPolicy / RemoveApproverPolicy / ListApproverPolicies tools on Thread that the LLM calls when the user says things like "don't ask me about builds anymore".
  • ProposeOptions — new Thread tool, void impl that pushes an OptionsPart onto a per-turn hint list. ResponseStreamer drains it and renders buttons. The < 200 chars short-circuit is dropped. RichContentParser gains an A)/B) letter fallback. Thread.AgentInstructions gains a USER INTERACTION section.

Deleted

ApprovalGateGrain, IApprovalGate, ApprovalRequest, ApprovalDecision, PendingApproval, ApprovalResult, UISession.RegisterApproval/ResolveApproval, ApprovalGateTests — all replaced by the single ApproverAgent path.

Security hardening (post-code-review)

  • [Reentrant] on ApproverAgent — without this, Authorize awaiting its TCS would deadlock against the ResolveApproval grain call that needs to complete it.
  • TCS waiter registered before publishing the event (race window closed).
  • ExtractUserIdFromGrainKey requires a numeric head — non-user grains (CodeOrchestrator, AgentRegistry) no longer bind to bogus Approvers.
  • ExtractThreadIdFromGrainKey strips sub-agent interface suffixes so Thread-scoped policies actually match sub-agent tool calls.
  • Approver LLM pulls recent turn snippets from the Thread grain (not the sub-agent's empty history) for correct language detection.
  • DiscoverInterfaceToolsEnabled = false on ApproverAgent prevents auto-exposed IApprover methods from becoming self-callable tools.
  • Approval callback ownership check — only the user who owns an approval can tap its buttons.
  • GatedAIFunction.BuildArgumentsPreview redacts api_key / token / password / authorization / Bearer … before sending args to the Approver LLM.

Test plan

  • dotnet build IAW.slnx — 0 warnings, 0 errors
  • dotnet test test/Core.Tests — 467 passed, 1 skipped (includes 5 new ApproverAgentTests)
  • dotnet test test/Integration.Tests — 7 passed
  • aspire start — all 13 resources Running & Healthy
  • Telegram logs confirm: Subscribed to notification, job completed, orchestration progress, and approval streams
  • Manual E2E on real Telegram bot: send a prompt that proposes options via ProposeOptions → verify inline keyboard renders
  • Manual E2E on real Telegram bot: trigger a tool that needs approval → verify ApprovalRequested keyboard shows localized labels → tap allow_thread → verify policy stored + tool proceeds
  • Manual E2E: ask assistant "don't ask me about dotnet builds" → verify AddApproverPolicy tool is called and persists

🤖 Generated with Claude Code

Replaces three disconnected approval systems (ApprovalGateGrain,
UISession.RegisterApproval, orphaned NotificationService.SendApprovalAsync)
with a single agent-based primitive: ApproverAgent is a per-user reentrant
Orleans grain that decides tool authorization dynamically via its own Fast-tier
LLM, stores natural-language policies in durable state, and drives an
end-to-end Telegram inline-keyboard flow (ApprovalRequested → Telegram ↔
callback → ResolveApproval).

Every tool invocation now flows through a GatedAIFunction wrapper that blocks
execution until the Approver allows or denies. There are no hardcoded risk
levels, denylists, or timeouts — the LLM is the judge, guided only by its
system prompt and stored policies, which are written and removed conversationally
via new Thread tools (AddApproverPolicy / RemoveApproverPolicy /
ListApproverPolicies).

ProposeOptions is a new Thread-level tool whose implementation is ~10 lines:
it pushes an OptionsPart onto a per-turn hint list that ResponseStreamer drains
and renders as inline-keyboard buttons. The < 200 chars short-circuit in
ResponseStreamer is dropped; RichContentParser gains an A)/B) fallback so
lettered LLM prose still renders as buttons. Thread.AgentInstructions gains a
USER INTERACTION section telling the LLM to always use ProposeOptions instead
of inlining choices.

Security hardening from the post-implementation review:
- [Reentrant] on ApproverAgent so the blocking Authorize TCS can be completed
  from a concurrent ResolveApproval call on the same grain.
- TCS waiter registered before publishing the event to close the race window.
- ExtractUserIdFromGrainKey requires a numeric head so non-user grains
  (CodeOrchestrator, AgentRegistry) don't accidentally bind to a bogus
  IApprover.
- ExtractThreadIdFromGrainKey strips sub-agent interface suffixes so
  thread-scoped policies actually match sub-agent tool calls.
- Approver LLM pulls recent turn snippets from the Thread grain (not the
  sub-agent) so localized button labels reflect the user's actual language.
- DiscoverInterfaceToolsEnabled = false on ApproverAgent prevents
  self-resolution loops via auto-exposed IApprover methods.
- Approval callback ownership check — only the user who owns an approval can
  resolve it; other users' taps are rejected.
- GatedAIFunction redacts api_key / token / password / authorization / bearer
  strings from the args preview before it leaves the silo for the Approver LLM.

Deleted: ApprovalGateGrain, IApprovalGate, ApprovalRequest, ApprovalDecision,
PendingApproval, ApprovalResult, UISession.RegisterApproval/ResolveApproval,
ApprovalGateTests. The two UISessionTests approval cases are removed; the two
Phase2IntegrationTests approval cases are removed; 5 new ApproverAgentTests
cover AddPolicy, ListPolicies, RemovePolicy-empty, ResolveApproval no-op,
Thread-scoped policy storage.

Full verification: build clean (0 warnings, 0 errors), Core.Tests 467 passed,
Integration.Tests 7 passed, Aspire AppHost boots all 13 resources Running &
Healthy, Telegram logs "Subscribed to ... approval streams".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@LeftTwixWand LeftTwixWand self-assigned this Apr 9, 2026
@LeftTwixWand LeftTwixWand added the enhancement New feature or request label Apr 9, 2026
@LeftTwixWand LeftTwixWand merged commit ebaa4ff into master Apr 9, 2026
1 check failed
@LeftTwixWand LeftTwixWand deleted the phase1-approver-agent branch April 9, 2026 14:18
LeftTwixWand added a commit that referenced this pull request Apr 9, 2026
Replace discovery-time GatedAIFunction with MAF function-calling middleware,
introduce automatic memory via a QdrantClient-backed MessageAIContextProvider,
delete the over-partitioned memory/preference/knowledge agents, and migrate
all context providers to MAF's AIContextProvider pipeline.

## Authorization
- Delete GatedAIFunction and its per-tool wrap logic.
- Add Agent.Authorization.cs with ToolApprovalMiddleware: registered via
  `.AsBuilder().Use(...).Build()` on the AIAgent produced by AsAIAgent.
  Denies terminate the call with context.Terminate = true; any grain exception
  fails closed as Deny + telemetry (never silently Allow).
- ApproverAgent adds a memo table keyed by SHA-256(toolName|argsJson) fingerprint.
  Memo hits return cached Allow decisions without an LLM call. Memo writes only
  happen on ResolveApproval with scope Thread/User.
- Approver holds DelayDeactivation(5min) across the await tcs.Task HITL wait
  and asserts the pending entry's UserId matches its grain key before resolving.
- Override ResolveApproverGrainKey() => null on Approver so its own LLM calls
  bypass the authorization middleware (prevents infinite recursion).
- Simplify ToolAuthorizationRequest to
  (AgentId, AgentDisplayName, ToolName, ArgumentsJson, RecentMessages).
  Threadid is derived at judgment time from the sub-agent grain key.
- New telemetry counters: ApproverFailures, ApproverDenies, ApproverMemoHits,
  ApproverLlmJudgments.
- Remove ToolAuthorizationRequested / ToolAuthorized event constants (only
  ToolDenied is published now).

## Memory
- Delete the five over-partitioned memory agents (User/Project/Episode/
  Pattern/Code) plus MemoryAgentBase, IMemoryAgent, MemoryEntry, and
  MemoryContextProvider. Delete PreferenceAgent / IPreference /
  PreferenceRule / PreferenceContextProvider. Delete KnowledgeAgent /
  IKnowledge — its typed data folds into memory with optional tags.
- Add src/Core/Memory: MemoryHit, IMemoryLookup, IawMemoryProvider.
  IawMemoryProvider : MessageAIContextProvider + IMemoryLookup.
  Uses QdrantClient DIRECTLY (same pattern as RAGContextProvider and
  Agent.IngestChunksAsync) — no new NuGet packages, no SK adapters.
  Per-user collection `user-memory-{userId}` with payload fields
  content/userId/threadId/role/createdAtTicks/sourceTelegramMsgId.
- ProvideMessagesAsync injects top-5 relevant memories as a single
  ChatRole.System "## Memories" message. StoreAIContextAsync persists
  each new request/response message with embedding + metadata.
- LookupOriginAsync searches top-1 for explainability and maps to MemoryHit.
- Add ForwardMessageHint UIPart; Thread adds an Explain tool that calls
  IMemoryLookup.LookupOriginAsync, pushes the hint to pending UI hints, and
  returns the stored text + timestamp.
- TelegramMessageSender gains ForwardMessageAsync; ResponseStreamer calls it
  for every ForwardMessageHint emitted during a turn so the user sees the
  original message forwarded back to them.
- Telegram layer stamps `message.MessageId` onto
  `ChatMessage.SourceTelegramMsgId` when handing off to Thread; Agent.cs
  carries it through `ProduceLlmStreamAsync` by constructing an
  M.E.AI ChatMessage with AdditionalProperties["iaw.sourceTelegramMsgId"].
- Register IawMemoryProvider + IMemoryLookup in IAWSiloExtensions, resolve
  it in Agent.OnActivateAsync and add to ChatClientAgentOptions.AIContextProviders.
- Remove IUserProfile.RememberFact / RecallFacts (and all callers) — memory
  is now a concern of IawMemoryProvider, not UserProfile.
- Wire IMemoryLookup into ExplainabilityAgent alongside Approver policy search.

## Context providers → MAF AIContextProvider
- Delete IAgentContextProvider interface entirely. Single pipeline, no parallel
  plumbing.
- Migrate UserContextProvider, RAGContextProvider, AgentRoutingContextProvider,
  and PolicyContextProvider to `MessageAIContextProvider`. All read userId /
  threadId from `AIAgent.CurrentRunContext?.Session?.StateBag` via a shared
  ContextProviderIdentity helper.
- Delete orphan providers (Project, Task, TaskLedger, TaskStream, TaskResult)
  that had no wire-up — their tests went with them. EventFlowIntegrationTests
  now calls ledger.GetContextBlockAsync directly.
- Delete Agent.GetContextProviders/BuildContextBlock + the instructions
  concatenation in StreamResponseCore. Providers are now registered via
  ChatClientAgentOptions.AIContextProviders at activation, plus a
  `GetAdditionalAIContextProviders()` virtual hook for thread-specific
  providers (Thread adds RAG + AgentRouting).

## Identity plumbing
- Agent.OnActivateAsync parses userId/threadId from the grain key and pushes
  them to `session.StateBag["iaw.userId"]` / `"iaw.threadId"` right after
  CreateSessionAsync, so every provider can read the same identity regardless
  of which grain it runs inside.

## Tests
- Rewrite ApproverAgentTests: allow/deny/ask judgment coverage, ask-flow
  pending entry, ResolveApproval memo write, second Authorize hits memo
  without reissuing ApprovalRequested, ResolveApproval cross-user rejection.
- Remove MemoryAgentBase / MemoryAgent / PreferenceAgent / Explainability /
  MemoryEntry / MemoryBase / orphan context-provider tests.
- Trim UserProfileTests (RememberFact/RecallFacts gone) and
  ArchitectureGuardV2Tests (memory base check gone).
- EventFlowIntegrationTests exercises ledger.GetContextBlockAsync directly.

Builds cleanly, `dotnet test test/Core.Tests` and
`dotnet test test/Integration.Tests` green. Tested live through the Aspire
MCP: assistant / mcp / devui / telegram rebuilt and run Healthy with the new
wiring. Precursor: #36 (ApproverAgent authorization loop + ProposeOptions UX).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LeftTwixWand added a commit that referenced this pull request Apr 20, 2026
* MAF middleware authorization + IawMemoryProvider architectural refactor

Replace discovery-time GatedAIFunction with MAF function-calling middleware,
introduce automatic memory via a QdrantClient-backed MessageAIContextProvider,
delete the over-partitioned memory/preference/knowledge agents, and migrate
all context providers to MAF's AIContextProvider pipeline.

## Authorization
- Delete GatedAIFunction and its per-tool wrap logic.
- Add Agent.Authorization.cs with ToolApprovalMiddleware: registered via
  `.AsBuilder().Use(...).Build()` on the AIAgent produced by AsAIAgent.
  Denies terminate the call with context.Terminate = true; any grain exception
  fails closed as Deny + telemetry (never silently Allow).
- ApproverAgent adds a memo table keyed by SHA-256(toolName|argsJson) fingerprint.
  Memo hits return cached Allow decisions without an LLM call. Memo writes only
  happen on ResolveApproval with scope Thread/User.
- Approver holds DelayDeactivation(5min) across the await tcs.Task HITL wait
  and asserts the pending entry's UserId matches its grain key before resolving.
- Override ResolveApproverGrainKey() => null on Approver so its own LLM calls
  bypass the authorization middleware (prevents infinite recursion).
- Simplify ToolAuthorizationRequest to
  (AgentId, AgentDisplayName, ToolName, ArgumentsJson, RecentMessages).
  Threadid is derived at judgment time from the sub-agent grain key.
- New telemetry counters: ApproverFailures, ApproverDenies, ApproverMemoHits,
  ApproverLlmJudgments.
- Remove ToolAuthorizationRequested / ToolAuthorized event constants (only
  ToolDenied is published now).

## Memory
- Delete the five over-partitioned memory agents (User/Project/Episode/
  Pattern/Code) plus MemoryAgentBase, IMemoryAgent, MemoryEntry, and
  MemoryContextProvider. Delete PreferenceAgent / IPreference /
  PreferenceRule / PreferenceContextProvider. Delete KnowledgeAgent /
  IKnowledge — its typed data folds into memory with optional tags.
- Add src/Core/Memory: MemoryHit, IMemoryLookup, IawMemoryProvider.
  IawMemoryProvider : MessageAIContextProvider + IMemoryLookup.
  Uses QdrantClient DIRECTLY (same pattern as RAGContextProvider and
  Agent.IngestChunksAsync) — no new NuGet packages, no SK adapters.
  Per-user collection `user-memory-{userId}` with payload fields
  content/userId/threadId/role/createdAtTicks/sourceTelegramMsgId.
- ProvideMessagesAsync injects top-5 relevant memories as a single
  ChatRole.System "## Memories" message. StoreAIContextAsync persists
  each new request/response message with embedding + metadata.
- LookupOriginAsync searches top-1 for explainability and maps to MemoryHit.
- Add ForwardMessageHint UIPart; Thread adds an Explain tool that calls
  IMemoryLookup.LookupOriginAsync, pushes the hint to pending UI hints, and
  returns the stored text + timestamp.
- TelegramMessageSender gains ForwardMessageAsync; ResponseStreamer calls it
  for every ForwardMessageHint emitted during a turn so the user sees the
  original message forwarded back to them.
- Telegram layer stamps `message.MessageId` onto
  `ChatMessage.SourceTelegramMsgId` when handing off to Thread; Agent.cs
  carries it through `ProduceLlmStreamAsync` by constructing an
  M.E.AI ChatMessage with AdditionalProperties["iaw.sourceTelegramMsgId"].
- Register IawMemoryProvider + IMemoryLookup in IAWSiloExtensions, resolve
  it in Agent.OnActivateAsync and add to ChatClientAgentOptions.AIContextProviders.
- Remove IUserProfile.RememberFact / RecallFacts (and all callers) — memory
  is now a concern of IawMemoryProvider, not UserProfile.
- Wire IMemoryLookup into ExplainabilityAgent alongside Approver policy search.

## Context providers → MAF AIContextProvider
- Delete IAgentContextProvider interface entirely. Single pipeline, no parallel
  plumbing.
- Migrate UserContextProvider, RAGContextProvider, AgentRoutingContextProvider,
  and PolicyContextProvider to `MessageAIContextProvider`. All read userId /
  threadId from `AIAgent.CurrentRunContext?.Session?.StateBag` via a shared
  ContextProviderIdentity helper.
- Delete orphan providers (Project, Task, TaskLedger, TaskStream, TaskResult)
  that had no wire-up — their tests went with them. EventFlowIntegrationTests
  now calls ledger.GetContextBlockAsync directly.
- Delete Agent.GetContextProviders/BuildContextBlock + the instructions
  concatenation in StreamResponseCore. Providers are now registered via
  ChatClientAgentOptions.AIContextProviders at activation, plus a
  `GetAdditionalAIContextProviders()` virtual hook for thread-specific
  providers (Thread adds RAG + AgentRouting).

## Identity plumbing
- Agent.OnActivateAsync parses userId/threadId from the grain key and pushes
  them to `session.StateBag["iaw.userId"]` / `"iaw.threadId"` right after
  CreateSessionAsync, so every provider can read the same identity regardless
  of which grain it runs inside.

## Tests
- Rewrite ApproverAgentTests: allow/deny/ask judgment coverage, ask-flow
  pending entry, ResolveApproval memo write, second Authorize hits memo
  without reissuing ApprovalRequested, ResolveApproval cross-user rejection.
- Remove MemoryAgentBase / MemoryAgent / PreferenceAgent / Explainability /
  MemoryEntry / MemoryBase / orphan context-provider tests.
- Trim UserProfileTests (RememberFact/RecallFacts gone) and
  ArchitectureGuardV2Tests (memory base check gone).
- EventFlowIntegrationTests exercises ledger.GetContextBlockAsync directly.

Builds cleanly, `dotnet test test/Core.Tests` and
`dotnet test test/Integration.Tests` green. Tested live through the Aspire
MCP: assistant / mcp / devui / telegram rebuilt and run Healthy with the new
wiring. Precursor: #36 (ApproverAgent authorization loop + ProposeOptions UX).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* CLAUDE.md: add brainstorming & design conversations section

Codifies the working mode for design conversations — sharpening
questions, honest pushback, distinct prototypes, closing the loop into
phase-1 plans. Switch to execution mode only on explicit "go"/"build".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant