
feat: voice-to-text reliability — eager init, health checks, ORT feed#14

Merged
LeftTwixWand merged 45 commits into master from feat/proper-voice2text-hosted-service
Mar 30, 2026

@LeftTwixWand LeftTwixWand commented Mar 30, 2026

Summary

  • Add nuget.config with ORT Azure DevOps feed so Microsoft.ML.OnnxRuntime.Foundry native binaries restore on any machine (fixes silent failure on fresh PCs)
  • Replace lazy SemaphoreSlim init with eager IHostedService — Whisper model downloads/loads at startup, failures surface in Aspire Dashboard immediately instead of hanging silently on first voice message
  • Add WhisperHealthCheck reporting Healthy/Degraded/Unhealthy in Aspire Dashboard based on IWhisperReadiness state
  • Add timeouts (5min download, 2min load) via Task.WhenAny since Foundry Local SDK doesn't accept CancellationToken
  • Upgrade AddWhisperProvider<T> to register as singleton + interface alias + hosted service + health check (single shared instance)
  • Organize LLM/embedding models into provider namespaces (Core.AI.Models.OpenAI, .Anthropic, .GitHub, .Ollama, .XAI, .Google), with 4 complete provider blocks in AppHost for easy switching
  • Add new models: Gpt54 (OpenAI reasoning), Gpt41Nano/Gpt41Mini/O4Mini (GitHub with tool calling), GitHub TextEmbedding3Small
  • Fix deprecated GitHub Models endpoint — migrated to models.github.ai/inference with openai/ prefixed model IDs
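
The timeout approach from the summary (racing the SDK call against a delay with Task.WhenAny, because the call takes no CancellationToken) can be sketched roughly as below. `WithTimeout` and its parameters are hypothetical names chosen for illustration, not the PR's actual code. Note that the timed-out operation is abandoned, not cancelled, so it may keep running in the background.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskTimeoutExtensions
{
    // Hypothetical helper: awaits `operation`, but throws TimeoutException
    // if it has not finished within `timeout`. The underlying operation is
    // NOT cancelled; only the wait is abandoned.
    public static async Task WithTimeout(this Task operation, TimeSpan timeout, string description)
    {
        using var timerCts = new CancellationTokenSource();
        Task delay = Task.Delay(timeout, timerCts.Token);

        Task completed = await Task.WhenAny(operation, delay).ConfigureAwait(false);
        if (completed != operation)
        {
            throw new TimeoutException($"{description} did not complete within {timeout}.");
        }

        timerCts.Cancel();                      // stop the pending timer
        await operation.ConfigureAwait(false);  // rethrow the operation's own exception, if any
    }
}
```

Under these assumptions, a download task would be wrapped as `downloadTask.WithTimeout(TimeSpan.FromMinutes(5), "Whisper model download")`. On .NET 6 and later, the built-in `Task.WaitAsync(TimeSpan)` gives the same "abandon the wait" behavior without a custom helper.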

Problem

Voice-to-text via Foundry Local silently failed on a second PC: no logs, no errors, and no response after sending voice messages. Root cause: lazy initialization held a SemaphoreSlim forever when the model download/load stalled, and the fire-and-forget webhook handler swallowed the timeout.
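
As a rough illustration of the eager-initialization fix, the sketch below starts model setup when the host starts and exposes the outcome to a health check. It assumes the Microsoft.Extensions.Hosting and Microsoft.Extensions.Diagnostics.HealthChecks packages. The PR names IWhisperReadiness and WhisperHealthCheck but does not show their members, so the `WhisperState` enum, the `WhisperInitializer` class, the `initialize` delegate, and the state-to-status mapping are all assumptions.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using Microsoft.Extensions.Hosting;

// Hypothetical shape of the readiness contract.
public enum WhisperState { Initializing, Ready, Failed }

public interface IWhisperReadiness
{
    WhisperState State { get; }
    Exception? Error { get; }
}

// Kicks off initialization at host startup instead of on the first voice message.
public sealed class WhisperInitializer : IHostedService, IWhisperReadiness
{
    private readonly Func<Task> _initialize; // stands in for model download + load
    private volatile WhisperState _state = WhisperState.Initializing;

    public WhisperInitializer(Func<Task> initialize) => _initialize = initialize;

    public WhisperState State => _state;
    public Exception? Error { get; private set; }
    public Task Initialization { get; private set; } = Task.CompletedTask;

    public Task StartAsync(CancellationToken cancellationToken)
    {
        // Run in the background so a slow download does not block host startup.
        Initialization = Task.Run(RunAsync);
        return Task.CompletedTask;
    }

    public Task StopAsync(CancellationToken cancellationToken) => Task.CompletedTask;

    private async Task RunAsync()
    {
        try
        {
            await _initialize().ConfigureAwait(false);
            _state = WhisperState.Ready;
        }
        catch (Exception ex)
        {
            Error = ex;                   // recorded before the state flips
            _state = WhisperState.Failed; // failure is visible, nothing hangs
        }
    }
}

// Assumed mapping: Ready -> Healthy, Initializing -> Degraded, Failed -> Unhealthy.
public sealed class WhisperHealthCheck : IHealthCheck
{
    private readonly IWhisperReadiness _readiness;

    public WhisperHealthCheck(IWhisperReadiness readiness) => _readiness = readiness;

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
    {
        HealthCheckResult result = _readiness.State switch
        {
            WhisperState.Ready => HealthCheckResult.Healthy("Whisper model loaded."),
            WhisperState.Initializing => HealthCheckResult.Degraded("Whisper model is still initializing."),
            _ => HealthCheckResult.Unhealthy("Whisper initialization failed.", _readiness.Error),
        };
        return Task.FromResult(result);
    }
}
```

Because the initializer records a failure instead of leaving a lock held, callers can check `State` and fail fast with a clear error rather than waiting indefinitely.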

Additionally, all LLM models lived in a flat Core.AI.Models namespace, which made it impossible to switch between providers (OpenAI, Anthropic, GitHub Models, Ollama) without hunting through individual model files. The GitHub Models endpoint had also been deprecated (Oct 2025).

Test plan

  • dotnet restore IAW.slnx succeeds on fresh machine (ORT feed resolves native binaries)
  • dotnet build — zero C# compilation errors
  • WhisperModel tests pass (6/6)
  • LLMModel tests pass (20/20)
  • EmbeddingModel tests pass (7/7)
  • Aspire Dashboard shows "Initializing Foundry Local for Whisper..." at startup
  • Health check /health reports whisper status (Healthy or Unhealthy with error)
  • Voice message transcription works (or fails fast with clear error, no hang)
  • AppHost builds with each provider block uncommented (OpenAI, Anthropic, GitHub, Ollama)

🤖 Generated with Claude Code

LeftTwixWand and others added 30 commits March 24, 2026 07:27
Add dedicated IAWSystem agent (Opus 4.6) that autonomously diagnoses,
fixes, builds, tests, and deploys changes to the IAW system. Uses
SendToAgent to delegate to Aspire, DotNet, FileSystem, Roslyn, Git.

Fix Aspire DeployAsync: build solution via DotNet agent before restart
(Aspire runs --no-build on start). Schedule deploy-verify health check.

Update Thread routing to include IAWSystem for self-improvement requests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Orleans default 5-minute ResponseTimeout kills IAWSystem requests
before the multi-step orchestration (LLM calls + sub-agent delegation)
can complete. Override GetResponse with 30-minute timeout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Downgrade from Opus 4.6 to Sonnet 4.6 (3-5x faster LLM responses)
- Rewrite instructions: simple edits = FileSystem + DotNet + Git only
- Explicitly ban Shell/Roslyn for file edits (was causing misrouting)
- Minimize tool call chain for common operations

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Private [Description] methods in Agent.Scheduling.cs were never discovered
by tool registration. Added RegisterSchedulingTools() that scans for
non-public [Description] methods on the Agent base class.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Thread generates a 2-5 word title from the first user+assistant exchange
via a lightweight LLM call. Title is cached in durable state. Excluded
from AI tool discovery to prevent LLM self-invocation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Creates a new Telegram forum topic, registers it via SetTopicId,
and auto-renames the topic after the first response using
thread.GetTitle(). Uses TryAutoRenameTopicAsync helper called from
both exit paths of StreamResponseAsync.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Lists all custom topics with message counts and inline Delete buttons.
Delete callback closes the Telegram forum topic, clears thread history,
and removes the project from UserProfile.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…firmation topic

TryAutoRenameTopicAsync now tracks renamed topics in-memory to avoid
calling EditForumTopicAsync on every message in chat- topics.
HandleCleanupDeleteAsync sends confirmation to the invoking topic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CodeValidator.Sanitize used TrimStart() which left \r on Windows,
causing EndsWith(';') to always fail — usings were never removed.
Fixed by using Trim() instead.

Added 2-minute per-build timeout to CodeOrchestratorAgent.TryBuild
to prevent test hangs when dotnet build takes too long.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
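
The TrimStart() bug described in the commit above can be reproduced in isolation. This is a minimal sketch with hypothetical helper names, not the actual CodeValidator code: on a line read from a CRLF file, TrimStart() leaves the trailing `\r` in place, so the line no longer ends with `;`, while Trim() removes it.

```csharp
using System;

public static class TrimDemo
{
    // Mirrors the buggy check: leading whitespace is removed, trailing '\r' stays.
    public static bool IsStatementWithTrimStart(string line) =>
        line.TrimStart().EndsWith(';');

    // Mirrors the fixed check: Trim() also strips the trailing '\r'.
    public static bool IsStatementWithTrim(string line) =>
        line.Trim().EndsWith(';');
}
```

For the input `"  using System.IO;\r"`, the first method returns false and the second returns true.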
…ad safety

CodeValidator.ValidIawNamespaces was missing IAW.Agents.Fun, causing
Sanitize() to strip usings for the new Fun agents.

Replaced HashSet<string> with ConcurrentDictionary for _renamedTopics
since TelegramBotService is a singleton handling concurrent updates.
Also fixed naming to follow _camelCase convention.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
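
A thread-safe "already renamed" set of the kind the commit above describes could be built on ConcurrentDictionary roughly as follows. The class name, the `int` topic ID type, and the method name are assumptions for illustration. TryAdd is atomic, so only one of several concurrent callers sees `true` for a given topic.

```csharp
using System.Collections.Concurrent;

public sealed class RenamedTopicTracker
{
    // The byte value is a placeholder; only the keys matter.
    private readonly ConcurrentDictionary<int, byte> _renamedTopics = new();

    // Returns true only for the first caller that marks this topic.
    public bool TryMarkRenamed(int topicId) => _renamedTopics.TryAdd(topicId, 0);
}
```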
Update dependency versions and add planning/spec files for the /newchat scheduling fix. Changes: bumped multiple Aspire packages (Hosting.Orleans, Hosting.JavaScript, Hosting.Qdrant, Qdrant.Client, Azure.Storage.Blobs, Hosting.Azure.CosmosDB, Hosting.Azure.Storage, etc.) to 13.2.0 in Directory.Packages.props; upgraded Aspire.AppHost SDK to 13.2.0 in src/IAW.AppHost/Aspire.csproj; added planning and design docs for the newchat/cleanup + scheduling-tool fix (docs/superpowers/plans/... and docs/superpowers/specs/...); and added an mcp entry to .claude/settings.local.json. These changes prepare the codebase for the new Telegram topic management and scheduling tool registration work described in the docs.
mcp-call.sh and mcp-client.sh are manual curl wrappers that duplicate
what the native IAW MCP server (.mcp.json) already provides.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix .mcp.json: add type:http to iaw, fix aspire args, add microsoft-learn MCP
- Add Orleans gateway connection retry filter to IAWClientExtensions
- Remove manual delay in Telegram StreamSubscriber (handled by retry filter)
- Clean up settings.local.json permissions, enable all project MCP servers
- Add Aspire skill definitions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Eugene <59283295+ScientistFromMars@users.noreply.github.com>
After Whisper transcribes a voice message, reply with the transcribed
text so the user can see what was understood before the LLM responds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LeftTwixWand and others added 15 commits March 30, 2026 17:14
…d agents

- Add Qwen25_7B/14B model classes with explicit Ollama tags for GPU sizing
- Add WithEmbedding<T>() builder chain mirroring WithLLM<T>() pattern
- Add NoOpEmbeddingGenerator fallback (replaces crash on missing cloud keys)
- Rewrite AddEmbeddingProvider() to support Ollama/OpenAI/GitHub/NoOp
- Fix Ollama connection string resolution (strip tag before sanitizing)
- Switch 6 agents from concrete models to tiers (Fast/Balanced/Reasoning)
- Fix /cleanup command: DeleteForumTopicAsync instead of CloseForumTopicAsync
- Fix CodeOrchestrator: discover files instead of hardcoding Program.cs
- Remove dead ScriptExecutor and tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…reporting

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
DownloadAsync/LoadAsync don't accept CancellationToken, so use
Task.WhenAny with Task.Delay to enforce the 5min/2min timeouts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move all LLM and embedding models into provider sub-namespaces under
Core.AI.Models (OpenAI, Anthropic, GitHub, Ollama, XAI, Google).
AppHost now has 4 complete provider blocks — uncomment one to switch:
OpenAI, Anthropic, GitHub Models, or Ollama local.

- Add new models: Gpt54 (OpenAI reasoning), Gpt41Nano, Gpt41Mini,
  O4Mini (GitHub with tool calling), GitHub TextEmbedding3Small
- Fix deprecated GitHub Models endpoint (models.github.ai/inference)
- Update GitHub model IDs to use openai/ prefix (new API requirement)
- Add / normalization to ServiceKey for prefixed model IDs
- Move embedding models to provider namespaces (MxbaiEmbedLarge → Ollama,
  TextEmbedding3Small → OpenAI)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@LeftTwixWand LeftTwixWand merged commit c6d473c into master Mar 30, 2026
1 check passed
@LeftTwixWand LeftTwixWand deleted the feat/proper-voice2text-hosted-service branch March 30, 2026 17:30