Backend.AI GO v1.6.0
Backend.AI GO v1.6.0 ships the Squad Overview tab with a live Agent Activity Grid and Execution Timeline, a full CLI suite for headless operation, and extensive fixes to squad, agent, and chat state persistence across navigation.
202 commits since v1.5.4 (171,055 lines added, 60,288 lines deleted).
New Features
- Squad Overview tab with live Agent Activity Grid and chronological Execution Timeline (E1-8:, E1-9:, parent,)
- TokenUsageBar and AgentActivityCard common components, activity summary and stall-detection selectors, and the squadTimelineSlice Zustand slice powering the new Squad Overview
- Full CLI suite for headless operation: Agent Runtime, Node/Mesh networking, Squad Management, and Supervisor policy/monitoring
- Startup section in Settings → General with auto-load model policy (none / lastUsed / explicit), sequential loader with RAM budget and timeout, landing page selector, and auto-restore chat session opt-out
- Auto-restore most recent chat session on /chat entry with opt-out
- Unified agent activity state model with dynamic liveness thresholds per activity type and token-stream heartbeat for streaming inference steps
- Background execution hardening for long-running jobs
- Tray residency policy with onboarding prompt
- Approval-waiting pinned section with cross-squad aggregation
- Rich Team Dashboard with agent row list, name filter, and status chip toggles
- Independent headless build without desktop feature dependency, plus headless diffusion browser, file picker, and directory selection
- Standalone .baimodel packager script and fast-head manifest reader that inspects bundles without scanning the full archive
- Auto-select model after loading with Start Chat notification action
- --parallel parameter configuration for inference servers
- Engine auto-update toggle in Settings → General → Updates
- Support new GGUF split format (name-NNNNN-of-NNNNN.gguf)
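To make the split naming pattern above concrete, here is a minimal TypeScript sketch of how a shard filename of the form name-NNNNN-of-NNNNN.gguf could be parsed and checked for completeness. It is an illustration only, not the application's scanner: the names `GgufShard`, `parseGgufShard`, and `missingShardIndices` are hypothetical, and the sketch assumes five-digit, 1-based shard indices.

```typescript
// Hypothetical shape for one parsed shard; not taken from the Backend.AI GO codebase.
interface GgufShard {
  baseName: string;
  index: number; // assumed 1-based
  total: number;
}

// Matches name-NNNNN-of-NNNNN.gguf with exactly five digits in each counter.
const SPLIT_PATTERN = /^(.+)-(\d{5})-of-(\d{5})\.gguf$/;

function parseGgufShard(fileName: string): GgufShard | null {
  const match = SPLIT_PATTERN.exec(fileName);
  if (match === null) return null;
  const [, baseName = "", indexText = "", totalText = ""] = match;
  const index = Number(indexText);
  const total = Number(totalText);
  // Reject counters that cannot describe a valid shard position.
  if (index < 1 || index > total) return null;
  return { baseName, index, total };
}

// Returns the shard indices that are absent for the model named by the first shard.
function missingShardIndices(fileNames: string[]): number[] {
  const shards = fileNames
    .map((name) => parseGgufShard(name))
    .filter((s): s is GgufShard => s !== null);
  const first = shards[0];
  if (first === undefined) return [];
  const present = new Set(
    shards
      .filter((s) => s.baseName === first.baseName && s.total === first.total)
      .map((s) => s.index),
  );
  const missing: number[] = [];
  for (let i = 1; i <= first.total; i++) {
    if (!present.has(i)) missing.push(i);
  }
  return missing;
}
```

A loader built along these lines could refuse to register a model until `missingShardIndices` returns an empty array, which is one way to avoid presenting a partially downloaded model as usable.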
Improvements
- Complete shared-runtime refactor closing headless/desktop parity across engine, router, scheduler, squad, supervisor, process, diffusion, plugin, channel, provider, translation, mcp, and creation flows
- Lock-free routing table for the inference hot path eliminates chat freezes during concurrent model loads
- Canonical API layer architecture with frozen legacy tauri.ts, domain adapter methods, and component-level call migration
- Native async fn in traits replacing the async-trait crate
- Typed Zustand stores replace window.dispatchEvent internal protocol
- tokio::fs replaces sync std::fs in async contexts
- State persistence audit across squad/agent navigation: scroll, drafts, and log streams now survive tab switches
- Major codebase decomposition: adapter.ts (3978 → 281 lines), squadStore.ts (1515 → 32-line barrel), modelStore.ts, hfStore.ts, agentStore.ts, clawStore.ts, chatApi.ts, chatStore.ts, src/types/squad.ts, lib.rs, management_api/server.rs, models/manager.rs, settings/types.rs, SettingsPage, ModelsPage, ApiSettingsPage
- Consolidate shared domain types into core and unify state adapter layer
- Enforce domain service access pattern for extension areas
- Single source of truth mandate for Tauri/REST parity with api-parity rules extended to filesystem operations
- Common UI component usage promoted to lint/PR checklist
- Emphasize Backend.AI GO as an Agentic AI Platform on webpage
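As an illustration of the move from window.dispatchEvent to typed stores listed above, the sketch below shows the general idea in TypeScript. It uses a small hand-rolled store as a stand-in for Zustand so that it runs without dependencies; the `ModelLoadState` shape and the `createStore` helper are assumptions made for this example, not the project's actual store definitions.

```typescript
// Hypothetical payload; the real store shapes are not given in these notes.
interface ModelLoadState {
  modelId: string | null;
  status: "idle" | "loading" | "ready" | "failed";
}

type Listener<T> = (state: T, prev: T) => void;

// Minimal stand-in for a Zustand-style store: typed state, typed updates, typed subscribers.
function createStore<T extends object>(initial: T) {
  let state: T = initial;
  const listeners = new Set<Listener<T>>();
  return {
    getState: (): T => state,
    setState: (partial: Partial<T>): void => {
      const prev = state;
      state = { ...prev, ...partial };
      listeners.forEach((listener) => listener(state, prev));
    },
    subscribe: (listener: Listener<T>): (() => void) => {
      listeners.add(listener);
      // The returned function removes the subscription.
      return () => {
        listeners.delete(listener);
      };
    },
  };
}

const modelLoadStore = createStore<ModelLoadState>({ modelId: null, status: "idle" });

// Subscribers receive a typed state object instead of an untyped CustomEvent detail.
const seen: string[] = [];
const unsubscribe = modelLoadStore.subscribe((s) => {
  seen.push(`${s.modelId}:${s.status}`);
});

modelLoadStore.setState({ modelId: "demo-model", status: "loading" });
modelLoadStore.setState({ status: "ready" });
unsubscribe();
modelLoadStore.setState({ status: "idle" }); // no longer observed
```

Compared with string-named DOM events, a misspelled field or an invalid status value in this style fails at compile time rather than being silently ignored at runtime.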
Bug Fixes
- Squad/agent/chat state persistence: SquadChat container log streams, monitored squad / activity feed / token usage, SquadMonitor UI state, BudgetMeter subscriptions, squad event subscriptions, AgentChat mount-time refetch, squad monitor drafts, ChannelsTab and AgentPage/CoworkPage cleanup handlers no longer wipe state on unmount
- Chat freeze during model load caused by outer Mutex on InferenceCoordinator
- Chat cancellation now propagates to backend and frees inference server slots
- Graceful shutdown on macOS Cmd+Q and SIGTERM cleanly releases inference slots and cancels in-flight chat streams
- GGUF model deletion now cleans up orphan directories and stale model cards
- Sharded MLX model stability fixes — disappearing models, wrong IDs, empty structure cards
- MLX capability auto-detection (vision, audio, tool calling) from config.json
- Null audio_config guard and tool-calling detection from tokenizer special tokens
- Progress race condition between timer task and coordinator
- Squad approval context propagation through agent runtime
- Tokio runtime entry before wiring suspend detector
- Headless data directory isolation and REST response shape alignment
- Headless flows: translation file handling, directory selection, session restore notification with model loading
- Register /startup/apply-model-policy route scope in Management API
- Repair auto-load review regressions in the Startup flow
- Auto-select and Start Chat not working after model loading
- Pass --jinja flag to mlxcel engine for tool calling support
- Security: enforce originating key scopes on session-authenticated requests, align Secure cookie flag with TLS state, fail startup when setup token generation fails, propagate real caller identity to registry audit entries, propagate TLS config to ServerConfig
- Numeric GPU temperature on macOS with all-smi 0.19.0
- Missing i18n strings for aria-labels and node titles
- Chat context not shared between demo playbook main prompt and follow-up questions
- Fall back to chat model when utility model is unavailable for title generation
- Filter active downloads from orphaned download detection
- Linux keyring dependency leaking into headless builds
- Explicit headless graceful shutdown on SIGTERM
CI/CD Improvements
- Criterion smoke benchmarks covering critical performance paths
- Security regression suite for the integrated Tauri + headless architecture
- Architecture check script with file size threshold warnings
- Automated Team Dashboard acceptance tests
- API parity verification promoted from documentation to automated tests
- Bump GitHub Actions to Node 24 compatible versions
- Teams release notification added to packaging workflow
- make watch-server target for bgo-server hot rebuild
- libdrm installed in CI so headless GPU monitoring works on Linux runners
- Resolve lint, format, and clippy warnings across the codebase
Technical Details
- Shared runtime bridges and service adapters route all domain services (engine, router, scheduler, squad, supervisor, process, diffusion, plugin, channel, provider, translation, mcp, creation) through a single runtime, enabling full parity between the desktop app and headless aigo-server.
- Canonical API layer architecture freezes legacy tauri.ts and promotes all backend calls through domain adapter methods. Component-level tauriInvoke calls were migrated to the adapter layer, and ESLint rules enforce the domain boundary.
- Unified agent activity state model provides a single source of truth for agent status across Squad Overview, Team Dashboard, and the new chronological Execution Timeline. Dynamic liveness thresholds per activity type and token-stream heartbeat during inference steps eliminate false-positive stall detection on long-running jobs.
- Lock-free inference routing table removes the outer Mutex on InferenceCoordinator that previously serialized chat completions behind background model loads.
- State persistence audit rewired subscription lifecycles at the store level so squad/agent/chat UI state (scroll, drafts, log streams, container logs, budget, approvals) now survives navigation across tabs and pages.
- Codebase decomposition broke up oversized modules (adapter.ts 3978 → 281-line barrel, squadStore.ts 1515 → 32-line barrel, lib.rs, management_api/server.rs, models/manager.rs, settings/types.rs, SettingsPage, ModelsPage, ApiSettingsPage, chatStore.ts, etc.) into subdomain slices, service modules, and section containers, aided by a new architecture check script with file size threshold warnings.
- DTO validation and API parity tests, promoted from documentation to automated test suites, now guard Rust↔TypeScript serialization and Tauri/REST endpoint parity.
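The stall-detection behavior described in the list above (per-activity liveness thresholds plus a token-stream heartbeat) can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the activity type names, the threshold values, and the `AgentActivity` shape are hypothetical and are not taken from the actual selectors.

```typescript
// Hypothetical activity categories; the real model may use different ones.
type ActivityType = "inference" | "toolCall" | "idle";

// Assumed thresholds in milliseconds. Long-running tool calls get more slack than
// streaming inference, and idle agents are never considered stalled.
const STALL_THRESHOLD_MS: Record<ActivityType, number> = {
  inference: 30000,
  toolCall: 120000,
  idle: Number.POSITIVE_INFINITY,
};

interface AgentActivity {
  type: ActivityType;
  startedAt: number; // epoch ms when the activity began
  lastHeartbeatAt: number | null; // epoch ms of the most recent streamed token, if any
}

// Each streamed token refreshes the liveness signal for the current activity.
function recordTokenHeartbeat(activity: AgentActivity, now: number): AgentActivity {
  return { ...activity, lastHeartbeatAt: now };
}

// An activity is stalled when its latest sign of life is older than the
// threshold configured for its activity type.
function isStalled(activity: AgentActivity, now: number): boolean {
  const lastSignal =
    activity.lastHeartbeatAt !== null ? activity.lastHeartbeatAt : activity.startedAt;
  return now - lastSignal > STALL_THRESHOLD_MS[activity.type];
}
```

With these assumed values, an inference step that started at t = 0 and has streamed nothing is flagged at t = 40000, while the same step with a token received at t = 35000 is not. That difference is what keeps a slow but still streaming generation from being reported as stalled.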
Dependencies
- Upgrade all Cargo dependencies and fix pre-existing test failures
- llama.cpp → b8665
- mlxcel → 0.0.23 (from 0.0.15)
- all-smi → 0.19.0
- GitHub Actions bumped to Node 24 compatible versions
- async-trait crate removed in favor of native async fn in traits
Breaking Changes
- bgo/bago → aigo rename. The CLI binary, URL scheme, and internal identifiers have been renamed to aigo. bgo:// deep links and the legacy bgo CLI command are no longer supported. Update scripts, launch shortcuts, and integrations accordingly.
- mlx-server → mlxcel-server rename across docs and architecture notes.
- Legacy tauri.ts frozen. All new backend calls must go through the canonical domain adapter layer. Direct tauriInvoke imports in components and pages are now flagged by ESLint; Squad/Plugin/Cowork pages have boundary regression tests that block direct transport imports.
- Major store decomposition: adapter.ts, squadStore.ts, modelStore.ts, hfStore.ts, agentStore.ts, clawStore.ts, chatApi.ts, chatStore.ts, and src/types/squad.ts are now barrels that re-export from subdomain slices. Internal imports should use the barrel path; deep imports into former internal files may break.
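For integrations that store or generate deep links, the URL scheme rename above can be handled with a small rewrite pass. The following TypeScript sketch shows one possible approach; the function name `migrateDeepLinks` and the example link path are hypothetical, and only the bgo:// scheme named in these notes is rewritten.

```typescript
// Rewrites legacy bgo:// deep links found anywhere in a piece of text to the aigo:// scheme.
// The word boundary avoids touching longer scheme names that merely end in "bgo".
function migrateDeepLinks(text: string): string {
  return text.replace(/\bbgo:\/\//g, "aigo://");
}

// Hypothetical stored shortcut; the path after the scheme is illustrative only.
const legacyShortcut = "Open chat: bgo://chat/123";
const migratedShortcut = migrateDeepLinks(legacyShortcut);
console.log(migratedShortcut);
```

Because links that already use aigo:// do not match the pattern, the rewrite can safely be run more than once over the same data.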
Known Issues
None.
What's Changed
- fix(security): enforce originating key scopes on session-authenticated requests by @inureyes
- fix: align Secure cookie flag with actual TLS state for external bindings by @inureyes
- fix: fail startup when setup token generation fails during bootstrap by @inureyes
- fix: propagate real caller identity to registry audit entries by @inureyes
- fix: propagate TLS config to ServerConfig and harden bootstrap failure handling by @inureyes
- feat: add engine auto-update toggle in Settings > General > Updates by @inureyes
- refactor: define canonical API layer architecture and freeze legacy tauri.ts by @inureyes
- refactor: migrate backend calls from legacy tauri.ts to domain adapter methods by @inureyes
- fix: pass --jinja flag to mlxcel engine for tool calling support by @inureyes
- refactor: move component-level tauriInvoke calls to adapter layer by @inureyes
- fix: correct timer loop bug and rebalance progress bands for model loading by @inureyes
- fix: populate MLX model structure card metadata from config.json by @inureyes
- refactor: unify test mocking strategy around adapter interface by @inureyes
- fix: detect MLX model capabilities (vision, audio, tool_calling) from config.json by @inureyes
- refactor: decompose chatStore.ts into domain sub-modules by @inureyes
- fix: guard null audio_config and detect tool calling from tokenizer special tokens by @inureyes
- refactor: decompose ApiSettingsPage into section containers by @inureyes
- refactor: decompose SettingsPage into section containers by @inureyes
- refactor: decompose ModelsPage into section containers by @inureyes
- fix: resolve progress race condition between timer task and coordinator by @inureyes
- refactor: replace window.dispatchEvent internal protocol with typed Zustand stores by @inureyes
- fix: resolve sharded MLX model displaying shard filename as model ID by @inureyes
- refactor: decompose lib.rs into bootstrap composition root modules by @inureyes
- refactor: organize generate_invoke_handler into domain-grouped sections by @inureyes
- refactor: decompose management_api/server.rs into domain route builders by @inureyes
- fix: prevent sharded MLX models from disappearing after app restart by @inureyes
- fix: resolve MLX model structure card showing empty values by @inureyes
- refactor: split settings/types.rs into domain sub-modules by @inureyes
- refactor: split models/manager.rs into scanner/index/service/gguf_cache modules by @inureyes
- refactor: convert sync std::fs to tokio::fs in async contexts by @inureyes
- fix: close remaining chat adapter bypasses and /mesh route regressions by @inureyes
- fix: restore chat adapter memory and stats contracts by @inureyes
- fix: handle macOS Cmd+Q and SIGTERM for graceful inference shutdown by @inureyes
- feat: auto-select model after loading and add Start Chat notification action by @inureyes
- feat: add --parallel parameter configuration for inference servers by @inureyes
- fix: propagate chat cancellation to backend and free inference server slots by @inureyes
- refactor: audit and catalog core/* vs root module overlap by @inureyes
- refactor: remove dead core modules and re-export shared domain types by @inureyes
- refactor: create adapter layer for Tauri-specific state and AppHandle access by @inureyes
- feat: achieve independent headless build without desktop feature dependency by @inureyes
- chore: promote common UI component usage to lint/PR checklist by @inureyes
- fix: resolve auto-select and Start Chat not working after model loading by @inureyes
- refactor: consolidate shared domain types into core and unify state adapter layer by @inureyes
- refactor: reduce frontend-backend DTO duplication by @inureyes
- test: promote API parity verification from documentation to automated tests by @inureyes
- refactor: enforce domain service access pattern for new extension areas by @inureyes
- fix: fall back to chat model when utility model is unavailable for title generation by @inureyes
- fix: filter active downloads from orphaned download detection by @inureyes
- feat: support new GGUF split format (name-NNNNN-of-NNNNN.gguf) by @inureyes
- fix: share chat context between demo playbook main prompt and follow-ups by @inureyes
- refactor: advance shared runtime parity across desktop and headless by @inureyes
- feat: add Start Chat button to ModelLoadingStatus popup on successful load by @inureyes
- feat: add standalone .baimodel packager script by @inureyes
- feat: fast-head manifest reader for .baimodel packages by @inureyes
- fix: add missing i18n strings for aria-labels and node titles by @inureyes
- fix: isolate bgo-server headless data dir and align REST response shapes by @inureyes
- chore: add make watch-server for bgo-server hot rebuild by @inureyes
- fix(pool): remove outer Mutex from InferenceCoordinator to prevent chat freeze during model load by @inureyes
- refactor(pool): introduce lock-free routing table for inference hot path by @inureyes
- chore: bump mlxcel to v0.0.18 by @inureyes
- chore: update all-smi to 0.19.0 by @inureyes
- fix(models): GGUF model deletion leaves orphan directory and stale model card in UI by @inureyes
- chore: add architecture check script with file size threshold warnings by @inureyes
- test: add security regression suite for integrated Tauri+headless architecture by @inureyes
- test: add criterion smoke benchmarks for critical performance paths by @inureyes
- feat(oobe): persist downloaded model id to lastSelectedChatModelId by @inureyes
- feat(settings): add startupLandingPage setting with route tracking and resolver by @inureyes
- feat(settings): add startupModelPolicy setting and applyStartupModelPolicy action by @inureyes
- feat(chat): auto-restore most recent session on /chat entry with opt-out setting by @inureyes
- feat(models): add per-model autoLoadOnStartup flag by @inureyes
- feat(common): add global ModelLoadingBanner component by @inureyes
- fix(monitor): show numeric GPU temperature on macOS by @inureyes
- feat(settings): add Startup section to Settings → General by @inureyes
- feat(headless): apply startup model policy on Management API boot by @inureyes
- feat(models-ui): auto-load toggle on model card and detail drawer by @inureyes
- feat(startup-loader): sequential model loader with RAM budget and timeout by @inureyes
- feat(settings): implement startupModelPolicy "explicit" branch by @inureyes
- feat(common): extend ModelLoadingBanner with multi-model status and failure handling by @inureyes
- feat(settings): show auto-load models indicator in Startup section by @inureyes
- fix(security): register /startup/apply-model-policy route scope by @inureyes
- chore(ci): bump actions to Node 24 compatible versions by @inureyes
- fix(startup): repair autoload review regressions by @inureyes
- fix: restore quality gate regression coverage by @inureyes
- docs: mandate single source of truth for Tauri/REST parity by @inureyes
- refactor: split adapter.ts into per-domain modules by @inureyes
- refactor: split squadStore.ts into subdomain slices by @inureyes
- refactor: split modelStore.ts into slices by @inureyes
- refactor: split hfStore.ts into state/actions/selectors slices by @inureyes
- refactor: split agentStore.ts into state/actions/selectors slices by @inureyes
- refactor: split src/types/squad.ts into feature-based type files by @inureyes
- refactor: split clawStore.ts into state/actions/selectors slices by @inureyes
- refactor: split chatApi.ts into focused feature modules by @inureyes
- refactor: delete dead legacy exports from tauri.ts by @inureyes
- refactor: rename all bgo/bago identifiers to aigo by @inureyes
- fix: make session restore notification functional with model loading by @inureyes
- fix: make directory selection work in headless mode by @inureyes
- update: upgrade all Cargo dependencies and fix pre-existing test failures by @inureyes
- refactor: replace async-trait with native async fn in traits by @inureyes
- feat: add Agent Runtime CLI commands for headless agent execution by @inureyes
- feat: add Node/Mesh networking CLI commands for distributed inference by @inureyes
- feat: add Squad Management CLI commands for multi-agent orchestration by @inureyes
- feat: add Supervisor CLI commands for policy and monitoring management by @inureyes
- fix: complete headless translation file flows by @inureyes
- fix: address CLI follow-ups for node and supervisor by @inureyes
- docs: state persistence audit across squad/agent navigation by @inureyes
- feat: unified agent activity state model by @inureyes
- docs: rename mlx-server to mlxcel-server and expand architecture notes by @inureyes
- feat: Overview tab in Squad page with live subscription by @inureyes
- feat: tray residency policy with onboarding prompt by @inureyes
- feat: dynamic liveness thresholds by activity type by @inureyes
- feat: token-stream heartbeat for streaming inference steps by @inureyes
- feat: approval-waiting pinned section with cross-squad aggregation by @inureyes
- feat: background execution hardening by @inureyes
- feat: agent row list with name filter and status chip toggles by @inureyes
- fix: close Team Dashboard acceptance blockers by @inureyes
- fix: restore squad approval context for Team Dashboard aggregation by @inureyes
- test: automate Team Dashboard acceptance for by @inureyes
- fix: resolve lint, format, and clippy warnings across codebase by @inureyes
- fix: enter tokio runtime before wiring suspend detector by @inureyes
- fix: remove SquadChat local squad refetch and mount-time loadHistory flash (F3) by @inureyes
- fix: stop wiping monitored squad / activity feed / token usage on unmount by @inureyes
- fix: persist SquadChat container log streams in store keyed by squadId by @inureyes
- fix: move squad event subscriptions to store-level lifecycle by @inureyes
- fix: lift SquadMonitor UI state out of local useState by @inureyes
- fix: Move BudgetMeter subscriptions and status to store layer by @inureyes
- fix: remove AgentChat mount-time refetch and persist input drafts by @inureyes
- fix: persist squad monitor drafts and clear deleted squad state by @inureyes
- feat: add TokenUsageBar common component for token usage display by @inureyes
- feat: add AgentActivityCard common component for grid and card-based views by @inureyes
- feat: add activity summary and stall-detection selectors by @inureyes
- feat: add squadTimelineSlice for agent state transitions by @inureyes
- feat: add AgentActivityGrid card-grid view for Squad Overview by @inureyes
- feat: add ExecutionTimeline chronological state transition view by @inureyes
- fix: remove ChannelsTab cleanup-on-unmount tearing down channel subscriptions by @inureyes
- fix: remove AgentPage/CoworkPage cleanupEventListeners on unmount by @inureyes
- fix: harden recent merge follow-ups by @inureyes