Generated: 2026-04-19 | Last updated: 2026-05-11 | Branch: feature/990-direct-model-migration
Strategy Direction (2026-05-10)
Mandatory MAF direct-model invocation. Portal-managed Foundry Agent records (V2 prompt-agent path) are retired from framework runtime code. Each service's MAF Agent is now constructed in-process inside the existing FastAPI handler, over a pluggable ChatClient (FoundryChatClient by default, but provider-agnostic). Foundry remains the model-deployment plane, the telemetry backend (via OTel → Application Insights), and the evaluation surface; it is no longer an agent-runtime intermediation layer. The retired FoundryAgentInvoker JSON-text tool-call parser has been replaced by native MAF function-calling.
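The in-process shape described above can be illustrated without the real libraries. In this minimal sketch, ChatClient, EchoChatClient, InProcessAgent, and handle_invoke are hypothetical stand-ins, not the MAF or holiday_peak_lib API; they only show an agent being built inside the request handler over a swappable client.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional, Protocol


class ChatClient(Protocol):
    """Hypothetical provider-agnostic client interface (not the MAF type)."""

    async def complete(self, prompt: str) -> str: ...


class EchoChatClient:
    """Stand-in client for local runs; a Foundry-backed client would plug in here."""

    async def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


@dataclass
class InProcessAgent:
    """Hypothetical agent wrapper that delegates to whichever client it is given."""

    name: str
    client: ChatClient

    async def run(self, message: str) -> str:
        return await self.client.complete(message)


async def handle_invoke(message: str, client: Optional[ChatClient] = None) -> dict:
    """Body of the single request handler: the agent is constructed here, in-process."""
    agent = InProcessAgent(
        name="inventory-health-check",
        client=client or EchoChatClient(),  # default client, replaceable per call
    )
    return {"agent": agent.name, "output": await agent.run(message)}
```

Because the client is passed in rather than imported by the agent, a test double or a different provider can be swapped in without adding a second entry point.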
Why now. The 2026-04-28 “Mandatory Foundry Invocation Policy” in ADR-005 was reversed on 2026-05-10 (same ADR, append-only amendment). Drivers: (1) the 2–5s per-request overhead did not yield a proportionate operational return; (2) tool-calling fidelity is structurally cleaner with native MAF function-calling than with the hand-rolled JSON-text parser; (3) the inventory hosted-agent precedent (commit 4cf0e546, 2026-04-25) demonstrated the direct-model shape end-to-end — it was deleted only because it had been shipped as a parallel entry point alongside main.py, which is the dual-runtime anti-pattern the new policy explicitly forbids.
Single-architecture guardrail. No service may ship a second entry point (e.g., hosted_main.py, parallel ResponsesHostServer, secondary port) alongside main.py. The MAF Agent is constructed inside the existing FastAPI handler. This is the explicit lesson from the deleted inventory hosted_main.py.
Portal-tracking manifests. Each active agent service now carries agent.yaml and .foundry/agent-metadata.yaml as local Foundry portal-tracking metadata only. These artifacts do not introduce hosted_main.py, ResponsesHostServer, entrypoint.sh, port 8088, or any parallel runtime; deployment and invocation still run through DirectModelInvoker inside the existing FastAPI app.
Tool calling. Tools register in two places, additively: (a) MCP server for A2A (unchanged), and (b) the in-process MAF Agent(tools=[...]) for native function-calling. The _inject_tool_prompt / _extract_tool_calls_from_text / schema_tools_injected JSON-parsing path in lib/src/holiday_peak_lib/agents/foundry.py was deleted in Wave 4c of the cutover.
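The additive registration can be sketched as one helper that writes to both surfaces. Everything here is hypothetical: mcp_tools, agent_tools, register_tool, and get_stock_level are illustration names, and the real MCP server and MAF Agent APIs are not shown.

```python
from typing import Callable, Dict, List


def get_stock_level(sku: str) -> int:
    """Hypothetical tool: returns a canned stock level for illustration."""
    return {"SKU-1": 12}.get(sku, 0)


# (a) Hypothetical MCP-style registry, keyed by tool name, for A2A calls.
mcp_tools: Dict[str, Callable[..., object]] = {}

# (b) Hypothetical list that would be handed to the in-process agent as tools=[...].
agent_tools: List[Callable[..., object]] = []


def register_tool(fn: Callable[..., object]) -> Callable[..., object]:
    """Register one callable in both places so the two surfaces cannot drift."""
    mcp_tools[fn.__name__] = fn
    if fn not in agent_tools:  # keep registration idempotent
        agent_tools.append(fn)
    return fn


register_tool(get_stock_level)
```

Registering through a single helper is one way to keep the MCP and function-calling tool sets identical; the repository may wire this differently.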
Cutover plan status: doc/ADR (Wave 0), lib DirectModelInvoker + AgentBuilder.with_direct_models() (Wave 1), inventory-health-check pilot (Wave 2), and all remaining agent services (Wave 3) are complete. Wave 4a/4b removed legacy app-factory/builder wiring plus ensure-foundry-agents workflow/scripts. Wave 4c removed FoundryAgentInvoker, /foundry/agents/ensure, and the V2 provisioning code path from framework runtime code. Portal-managed agents in project aipholidaris are intentionally left untouched by this repository change and will be deprovisioned manually by the owner.
Status of historical 2026-04-19 hotfix notes that touch the now-retired path: “FoundryAgentInvoker replaces legacy FoundryInvoker” and “reasoning_effort parameter support in Foundry pipeline” remain accurate historical descriptions of PR #802-era runtime behavior, but that path has been removed from framework runtime code. Do not extend them; future work belongs in DirectModelInvoker.
Current Main Snapshot (2026-05-10)
Audience-IA cutover wave (April–May 2026)
A large block of the audience-segmented IA, design-system cleanup, and
deploy-portal preview work landed on main in this wave. Each row is a
merged squash commit on main.
| PR | Epic / Issue | Scope |
| --- | --- | --- |
| #1075 | #1060 | A11y + performance quality gates: bundle-budget gate (apps/ui/budgets.json, scripts/check-bundle-budgets.mjs), Web Vitals reporter wiring, Lighthouse CI workflow, ESLint outline-none rules, deletion of legacy HomeSplitHero. |
| #1076 | #1046 | Retailer pages: /retailers/{value,agents,roi,comparators,case-studies} with ROICalculator, AgentCatalog, ComparatorMatrix, CaseStudyEmptyState molecules. ROI methodology pinned in docs/methodology/retailer-roi.md (75% buyer-time savings, 22% dispute reduction, ±40% CI band). |
| Epic | Title | State | Notes |
| --- | --- | --- | --- |
| | Audience-router foundation (CODEOWNERS, tokens, route groups) | ✅ Closed | Landed in earlier wave. |
| #1020 | Audience-segmented IA | ✅ Effective closure | Route groups (#1015), dual tokens (#1016), LaneSwitch (#1017), per-section SEO + sitemap (#1018), CODEOWNERS 5-second-test (#1019), axe-core CI all in main. |
| #1026 | mkdocs-as-/docs sub-path | ✅ Effective closure | #1021 #1023 #1024 #1025 shipped non-strict in PR #1079; #1022 Pagefind v1 + ?q= cross-link in PR #1081; --strict gate flip in PR #1084 (130 → 0 warnings via the rewrite_external_links.py hook + GitHub-style toc slugifier). |
| #1039 | Deploy-portal one-click preview | 🟡 v1 in PR #1078 | Sub-issues #1027–#1038 scaffolded; real ARM kickoff + SignalR client + production GA gated on third-party / Microsoft Red Team pen-test (#1027). |
| #1046 | Retailer pages | ✅ v1 in PR #1076 | #1040–#1045 shipped. |
| #1053 | Builder pages | ✅ v1 in PR #1077 | #1047–#1052 shipped. |
| #1060 | A11y + perf quality gates | ✅ Closed (#1075) | Bundle-budget advisory at v1; flip strict via vars.MKDOCS_STRICT_BUILD-style toggle once dependency-trim follow-up lands. |
| #1061 | UI design-system cleanup + roll-forward | ✅ Effective closure | F1–F6 (#1055–#1060) all merged; ADR-035 in place. |
Tracking-only epics (out of session)
| Epic | Title | State | Notes |
| --- | --- | --- | --- |
| #990 | R1 — MAF backend cutover (superseded 2026-05-10) | 🟡 Re-scoped | Original plan rolled FoundryAgentInvoker (PR #802) and the strict-4s pipeline (PR #859) to the remaining services. Re-scoped on 2026-05-10 by the ADR-005 Mandatory MAF Invocation Policy amendment: the rollout target is now DirectModelInvoker (MAF Agent + FoundryChatClient, single FastAPI entry point per service). Wave 3 migrated all 26 agent services. Wave 4c removed the legacy portal-agent runtime/provisioning path from code. The 42 V2 portal agents in project aipholidaris are intentionally left for manual deprovisioning outside this code cleanup. |
Executive demo homepage refactor (2026-04-28): apps/ui/app/page.tsx now routes to a single-page, scroll-driven executive demo where the robots are the primary narrative device. The homepage now stages the cold open, hero search, CRM boardroom, discovery duo, truth pipeline, catalog galaxy, cart/checkout, inventory, logistics, returns/support, platform telemetry, and scenario close as one continuous presentation surface. Supporting additions include route-local demo components, a reusable agent profile drawer, and additive AgentRobot scene props (facing, pointAt, lookAt, toolOverride, scenePeer).
Commerce journey agent continuity (2026-04-28): The storefront drill-down routes now keep visible agent presence past the homepage. AgentRobotOverlay accepts semantic size presets and scene-staging props, and category, orders, order detail, cart, search, product, and checkout surfaces now stage primary and secondary agents along the buyer journey.
Commerce stage contract + drawer/operator follow-through (2026-04-28): Shopping-flow routes now share apps/ui/components/templates/CommerceAgentLayout.tsx so the primary-stage robot, side-cast robot, and telemetry chip are declared uniformly. AgentProfileDrawer now renders input/output schemas, curated sample payloads, a live sample-run action, and an in-place trace explorer modal; scenario drill-down drawers also receive live monitor metrics instead of static placeholders.
Drawer streaming + telemetry completion (2026-04-28): AgentProfileDrawer sample runs now stream over /invoke/stream, commerce telemetry chips now hydrate from persisted _telemetry across search, product enrichment, and product-graph summary invoke seams, and /order/[id] now keeps logistics-route-issue-detection as the default side cast until return flow activation swaps in returns + support.
Frontend regression hardening (2026-04-28): Added focused drawer unit coverage plus Playwright specs for the executive demo narrative, light/dark visual regression, and admin cockpit readiness (apps/ui/tests/e2e/demo-narrative.spec.ts, apps/ui/tests/e2e/dark-mode-regression.spec.ts, apps/ui/tests/e2e/cockpit-readiness.spec.ts).
Scenario drill-down briefs (2026-04-28): Added /scenarios/[id] route briefs for discovery, customer 360, truth, and checkout flows, backed by shared scenario metadata so the homepage close scene can jump into a dedicated brief before sending users to live product or operator surfaces.
MAF direct-model Wave 4c cleanup (2026-05-11): removed the retired portal-agent FoundryAgentInvoker, JSON-text tool-call parser, V2 provisioning code path, and /foundry/agents/ensure endpoint from framework runtime code. Readiness now validates direct-model maf-direct targets.
Catalog-search strict 4s pipeline (v6): Entire intelligent pipeline runs inside asyncio.wait_for(timeout=4.0). Intent classification via GPT-5-nano with reasoning_effort="minimal" (1.5s budget). Parallel keyword + hybrid search fan-out. Direct product construction from AI Search documents (no CRUD round-trip). Fire-and-forget history writes. Deployed as strict-4s-v6 on 2 AKS replicas.
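The budget structure of the strict pipeline can be sketched as nested asyncio.wait_for calls around a parallel fan-out. This is an illustration under assumptions, not the service's code: run_pipeline, its callable parameters, the document fields id and title, and the "unknown" intent fallback on a classifier timeout are all hypothetical.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

PIPELINE_TIMEOUT_S = 4.0  # overall budget named in the text
INTENT_TIMEOUT_S = 1.5    # intent-classification budget named in the text

Docs = List[Dict[str, Any]]
_background: set = set()  # holds fire-and-forget tasks so they are not garbage-collected


async def run_pipeline(
    query: str,
    classify: Callable[[str], Awaitable[str]],
    keyword_search: Callable[[str], Awaitable[Docs]],
    hybrid_search: Callable[[str], Awaitable[Docs]],
    write_history: Callable[[str], Awaitable[None]],
    timeout_s: float = PIPELINE_TIMEOUT_S,
) -> Dict[str, Any]:
    async def _inner() -> Dict[str, Any]:
        try:
            intent = await asyncio.wait_for(classify(query), timeout=INTENT_TIMEOUT_S)
        except asyncio.TimeoutError:
            intent = "unknown"  # assumption: degrade instead of failing the request

        # Parallel fan-out: both searches run concurrently.
        keyword_docs, hybrid_docs = await asyncio.gather(
            keyword_search(query), hybrid_search(query)
        )

        # Build products directly from search documents, de-duplicated by id.
        seen, products = set(), []
        for doc in keyword_docs + hybrid_docs:
            if doc["id"] not in seen:
                seen.add(doc["id"])
                products.append({"id": doc["id"], "title": doc.get("title", "")})

        # Fire-and-forget history write: scheduled, never awaited on the hot path.
        task = asyncio.create_task(write_history(query))
        _background.add(task)
        task.add_done_callback(_background.discard)
        return {"intent": intent, "products": products}

    # The whole pipeline, including the fan-out, shares one outer budget.
    return await asyncio.wait_for(_inner(), timeout=timeout_s)
```

When the outer budget expires, asyncio.wait_for cancels the inner coroutine and raises asyncio.TimeoutError, so the caller decides how to respond to a blown budget.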
Historical reasoning_effort support in the retired Foundry pipeline: FoundryAgentInvoker plumbed reasoning_effort through _PreparedInvocation to MAF options during the PR #802 runtime era. The active direct-model path owns future runtime parameter work.
Live integration test suite: 11 tests (10 parametrized queries + summary report) validate the live APIM→AKS→AI Foundry→AI Search pipeline. 10/10 within budget, avg ~2.77s.
Historical FoundryAgentInvoker migration: PR #802 replaced legacy FoundryInvoker with FoundryAgentInvoker and fixed silent tool-dropping during the portal-agent runtime era. That runtime path is now retired in favor of DirectModelInvoker.
agent-framework upgraded from unpinned to >=1.0.1 GA across all 27 Python service packages; resolves ContextProvider vs BaseContextProvider import incompatibility.
Memory tier operations parallelized with asyncio.gather for reduced latency; new memory tools (get_memory, set_memory, search_memory) and gather_adapters helper available.
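As an illustration of the asyncio.gather pattern, the sketch below reads one key from several tiers concurrently. InMemoryTier and read_all_tiers are hypothetical names and the tier labels are assumptions; the library's own get_memory, set_memory, search_memory, and gather_adapters signatures are not reproduced here.

```python
import asyncio
from typing import Dict, Optional, Sequence


class InMemoryTier:
    """Hypothetical tier stub; a real tier would wrap a backing store."""

    def __init__(self, name: str, data: Dict[str, str], delay_s: float = 0.0):
        self.name = name
        self._data = data
        self._delay_s = delay_s

    async def get(self, key: str) -> Optional[str]:
        await asyncio.sleep(self._delay_s)  # simulate I/O latency
        return self._data.get(key)


async def read_all_tiers(key: str, tiers: Sequence[InMemoryTier]) -> Dict[str, Optional[str]]:
    """Query every tier concurrently; total latency tracks the slowest tier, not the sum."""
    values = await asyncio.gather(*(tier.get(key) for tier in tiers))
    return {tier.name: value for tier, value in zip(tiers, values)}
```

asyncio.gather returns results in the order the awaitables were passed, which is what lets the values be zipped back onto their tiers.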
AKS deployments now reconcile through Flux CD GitOps (ADR-017); kubectl-apply path removed.
ADR-017 Phase 2 (in progress): migrating from rendered YAML to Flux HelmRelease CRDs for in-cluster Helm rendering. Pilot: ecommerce-catalog-search deployed via HelmRelease; remaining 25 services migrate incrementally. New HelmRelease manifests in .kubernetes/releases/agents/.
CRUD and agent services run in separate Kubernetes namespaces (ADR-026).
API Center governance and APIM MCP strategy implemented (ADR-027).
Self-healing runtime completed with incident lifecycle state machine, remediation policy, and audit trail.
UI route smoke validation: yarn test --runInBand tests/unit/pagesRender.test.tsx in apps/ui passes after the executive demo refactor and commerce continuity rollout.
Drawer/admin/page focused UI validation: yarn test --runInBand --runTestsByPath tests/unit/AgentProfileDrawer.test.tsx tests/unit/AdminServiceDashboardPage.test.tsx tests/unit/pagesRender.test.tsx passes after the drawer enrichment and commerce layout rollout.
Final telemetry seam validation: yarn test --runInBand --runTestsByPath tests/unit/productService.test.ts tests/unit/ProductGraphCanvas.test.tsx passes after wiring enrichment-backed product loads and graph-summary invocations into persisted telemetry.
Full local test run: 1796 passed (1136 lib + 660 app), 0 failures.
Catalog-search unit tests: 34 passed (including 14 intelligent-mode tests), ~3.3s.
Live integration tests: 10/10 queries within 7s wall-clock budget, avg ~2.77s.
CI gate: lint, test, CodeQL, pip-audit, contract-gate all passing on main.
35 Architecture Decision Records (ADR-001 through ADR-035).
Note
Historical sections below preserve earlier v1.1.0 and v2.0.0 planning/trackers for audit context and may include superseded planning entries.
Frontend API Integration: Enhanced checkout, order tracking, inventory pages
Test Coverage: 635 tests passing (up from 386)
Runtime Hotfix Notes (2026-03-06)
Truth Export Compatibility: Added a truth_export.schemas_compat fallback to keep truth-export functional when runtime images resolve an older holiday-peak-lib package that does not expose holiday_peak_lib.schemas.truth.
Notebook Live Checks: Updated the Product Truth Layer notebook live integration cell to support Cosmos SDK query compatibility differences and improved PostgreSQL sample payload parsing.
Lib Config Test Determinism: Resolved issue #29 by isolating env-file loading in lib/tests/test_config.py via _env_file=None, preventing local .env leakage into test assertions (python -m pytest lib/tests/test_config.py -q → 20 passed).
Runtime Hotfix Notes (2026-03-12)
Mandatory CI Gate Enforcement (Issue #30): Required smoke/health checks in deploy workflows now fail deterministically on transport failures and non-200 responses; transport failures are normalized as hard failures in required checks; permissive handling is retained only for advisory/non-gating diagnostics and cleanup paths.
Azure AI Search Provisioning/Runtime Activation (Issue #32): Shared infrastructure now provisions the Azure AI Search service; the azd postprovision hook ensures the catalog-products index after service readiness; the deploy workflow and Helm rendering propagate AI_SEARCH_ENDPOINT, AI_SEARCH_INDEX, and AI_SEARCH_AUTH_MODE; and catalog-search falls back to adapter retrieval when Search is unavailable or empty.
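The Issue #30 gating rule above (required checks fail on transport failures and on non-200 responses, advisory checks stay permissive) can be sketched as follows. fetch_status and check_endpoint are hypothetical names; the actual workflow steps are not shown in this document.

```python
import urllib.error
import urllib.request
from typing import Callable


def fetch_status(url: str, timeout_s: float = 10.0) -> int:
    """Return the HTTP status code; transport errors propagate as exceptions."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # a non-2xx response is still a response, not a transport error


def check_endpoint(
    url: str,
    required: bool,
    fetch: Callable[[str], int] = fetch_status,
) -> bool:
    """Return True when the check passes.

    Required checks fail on any transport failure or non-200 status.
    Advisory checks never fail the gate.
    """
    try:
        ok = fetch(url) == 200
    except OSError:  # URLError, timeouts, and connection resets all derive from OSError
        ok = False   # normalize transport failures to a hard failure
    return ok or not required
```

A deploy step would exit non-zero whenever a required check returns False, which is what makes the failure deterministic rather than dependent on how the transport error surfaced.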
Runtime Hotfix Notes (2026-03-14)
Main Branch Stabilization (Issue #246): CRUD semantic search fallback now degrades safely to repository search when semantic payloads are non-canonical; product enrichment now returns None when enrichment URL is not explicitly configured, restoring deterministic fallback behavior expected by tests.
Quality Gate Revalidation: Repository validation on main now reports 1430 passed, 2 skipped for pytest and a passing pylint score (9.87/10, above fail-under threshold).
Branch Hygiene (Issue #248): temporary remediation branches/artifacts were pruned from local operations clones to keep stabilization flow aligned to main.
Runtime Hotfix Notes (2026-03-17)
AGC Subnet Drift Realignment: Shared infra defaults now pin the delegated agc subnet to 10.0.12.0/24 so azd provision matches the live dev VNet layout and stops attempting a destructive subnet replacement during canonical deploy runs.
GitHub OIDC Hook Refresh: Root POSIX azd postprovision and postdeploy hooks now refresh Azure CLI login from the live GitHub Actions OIDC token immediately before Azure/AKS operations, retrying empty or malformed token responses explicitly so remote deploy failures surface as actionable OIDC refresh errors instead of opaque JSON parsing crashes.
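The retry behavior can be sketched in Python, although the real hooks are POSIX shell. refresh_oidc_token and OidcRefreshError are hypothetical names, and the attempt count and delay are assumptions; the sketch relies only on the GitHub Actions token endpoint returning JSON with a value field.

```python
import json
import time
from typing import Callable


class OidcRefreshError(RuntimeError):
    """Raised when no usable token is obtained after all attempts."""


def refresh_oidc_token(
    request_token: Callable[[], str],  # returns the raw response body
    attempts: int = 3,                 # assumption: retry count is illustrative
    delay_s: float = 1.0,              # assumption: fixed delay between attempts
) -> str:
    last_problem = "no attempts made"
    for attempt in range(1, attempts + 1):
        body = request_token()
        if not body.strip():
            last_problem = "empty response"
        else:
            try:
                payload = json.loads(body)
            except json.JSONDecodeError as exc:
                last_problem = f"malformed JSON ({exc.msg})"
            else:
                token = payload.get("value") if isinstance(payload, dict) else None
                if token:
                    return token
                last_problem = "response has no 'value' field"
        if attempt < attempts:
            time.sleep(delay_s)
    # Surface one actionable error instead of a raw JSON parsing traceback.
    raise OidcRefreshError(
        f"OIDC token refresh failed after {attempts} attempts: {last_problem}"
    )
```

Classifying each bad response before retrying is what turns an opaque parser crash into a message that names the failure mode.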
Runtime Hotfix Notes (2026-03-19)
Dependency Management Hardening (Issue #316): Agent service pyproject.toml dependencies now use minimum version constraints for key runtime libraries and include a local editable holiday-peak-lib source mapping for development workflows.
Lockfile and Build Reproducibility: App .dockerignore files now include uv.lock in Docker build context, service Dockerfiles use frozen uv sync installs, and CI validates yarn.lock / uv.lock freshness before linting.
Infrastructure Secret and Region Hygiene: Shared infrastructure now uses a random newGuid()-seeded fallback for PostgreSQL admin password generation and parameterizes Azure AI Foundry location instead of hardcoding the region.
UI Product Enrichment Monitoring (Issue #352): Admin now exposes a dedicated enrichment monitoring route (/admin/enrichment-monitor) with real-time stage/job visibility, retry controls on monitor failures/log entries, and throughput + approval-rate visual indicators; search now surfaces explicit mode/intent signal chips; top navigation includes a pipeline status indicator that links directly to the monitor.
Runtime Hotfix Notes (2026-04-05)
Catalog Search Strict AI Search Runtime Hardening (Issue #675): Deployment workflow now propagates CATALOG_SEARCH_REQUIRE_AI_SEARCH=true for ecommerce-catalog-search and false for other services, Helm render hooks pass the variable into container manifests, and runtime/docs now codify fail-closed readiness with bounded startup/readiness seeding instead of continuity fallback in AKS strict mode.
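A fail-closed readiness decision driven by CATALOG_SEARCH_REQUIRE_AI_SEARCH might look like the sketch below. The function names and response payloads are hypothetical, and the bounded startup/readiness seeding mentioned above is not modeled.

```python
import os
from typing import Dict, Mapping, Tuple


def ai_search_required(env: Mapping[str, str] = os.environ) -> bool:
    """Strict mode is on only when the variable is explicitly 'true'."""
    return env.get("CATALOG_SEARCH_REQUIRE_AI_SEARCH", "false").strip().lower() == "true"


def readiness(search_ok: bool, env: Mapping[str, str] = os.environ) -> Tuple[int, Dict[str, str]]:
    """Return (HTTP status, body) for a readiness probe."""
    if search_ok:
        return 200, {"status": "ready", "search": "ai_search"}
    if ai_search_required(env):
        # Fail closed: strict mode reports not-ready instead of serving a fallback.
        return 503, {"status": "not_ready", "reason": "ai_search_unavailable"}
    # Non-strict services keep the continuity fallback.
    return 200, {"status": "ready", "search": "fallback"}
```

Returning 503 from readiness lets Kubernetes withhold traffic from the pod until AI Search is reachable, instead of silently serving degraded results.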
Runtime Hotfix Notes (2026-04-06)
CRUD Product Listing Availability Guardrail: GET /api/products now fails open to an empty list when repository access times out or is temporarily unavailable, preventing repeated 503 responses while preserving a stable read contract for storefront and admin surfaces.
CRUD Redis Auth Secret Wiring: Shared infrastructure now persists the Redis primary key into Key Vault (redis-primary-key) and propagates REDIS_PASSWORD_SECRET_NAME through azd/Helm hooks; CRUD startup resolves the secret at runtime and readiness now authenticates Redis pings using the configured secure URL path.
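The fail-open guardrail on GET /api/products described above can be sketched as a wrapper that maps timeouts and connection failures to an empty list. list_products, the fetch callable, and the 2-second timeout are assumptions for illustration, not the CRUD service's code.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

Product = Dict[str, Any]


async def list_products(
    fetch: Callable[[], Awaitable[List[Product]]],
    timeout_s: float = 2.0,  # assumption: the real timeout is not stated in the text
) -> List[Product]:
    """Fail open: repository timeouts or outages yield [] instead of an error."""
    try:
        return await asyncio.wait_for(fetch(), timeout=timeout_s)
    except (asyncio.TimeoutError, ConnectionError):
        # Temporary unavailability: keep the read contract stable for callers.
        return []
```

Only availability failures are swallowed; any other exception still propagates, so programming errors are not hidden behind an empty listing.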
Runtime Hotfix Notes (2026-04-10)
AKS Redis Runtime Contract Hardening: Helm render hooks now suppress passwordless REDIS_URL injection when REDIS_HOST is already available, preserving the intended Key Vault secret-resolution path for Redis credentials in deployed services.
CRUD Deployment Readiness Smoke Tightening: deploy-azd rollout checks now validate CRUD via /api/ready instead of /api/health, so PostgreSQL and Redis dependency regressions fail deployment validation instead of passing shallow liveness checks.
CRUD Dev Readiness Probe Alignment: Dev/local Helm rendering no longer downgrades CRUD readiness to /health, so dependency outages are surfaced consistently in-cluster instead of being masked by process-only liveness.
PostgreSQL Auth Contract Alignment: POSTGRES_AUTH_MODE now drives the Flexible Server auth policy in Bicep and a pre-rollout workflow guard verifies the live server matches the configured runtime auth mode before CRUD is redeployed.
CRUD Entra Principal Alignment: Entra-mode deployment outputs now resolve POSTGRES_USER to the CRUD workload identity principal (<project>-<env>-crud-identity), matching the pod identity used for token acquisition instead of the legacy agentpool principal.