Holiday Peak Hub — A Reference Architecture for Agentic Microservices

Version: 1.0 Last Updated: 2026-04-12


What Are Agentic Microservices?

Agentic Microservices extend the traditional microservices pattern by embedding AI agent capabilities directly inside each service boundary. Instead of bolting a monolithic AI layer on top of existing services, each microservice carries its own agent runtime that can:

  • Plan dynamically at runtime based on context (not just execute static call sequences)
  • Use tools via the Model Context Protocol (MCP) to call other services
  • Maintain memory across interactions using a tiered persistence strategy
  • Route between models (SLM for simple queries, LLM for complex ones) based on real-time complexity assessment
  • Self-heal by detecting, classifying, and remediating infrastructure incidents autonomously

This pattern is distinct from both traditional microservices (deterministic, code-defined flows) and monolithic AI gateways (single-point bottleneck for all AI interactions).


Why This Repository Is Both Framework AND Product

Holiday Peak Hub is a framework AND a product, not just a reference implementation. lib/holiday_peak_lib/ is an opinionated agentic-microservices framework with stable, versioned seams that architects can fork and adopt. apps/ is a production-grade retail platform built on it: 26 agents running in a production-shaped environment under real SLOs, with AGC weighted-canary routing, continuous evaluation, three-tier memory, and full observability. Both halves ship together. Distribution under Azure-Samples/ is a channel, not a quality tier.

Canonical positioning: .github/instructions/repository-purpose.instructions.md.

| Characteristic | Implementation |
|---|---|
| Framework layer | holiday-peak-lib provides BaseRetailAgent, AgentBuilder, ModelTarget, FastAPIMCPServer, three-tier memory, guardrails, routing strategy, evaluation runners, telemetry (versioned, contracted, designed for adoption) |
| Product layer | 1 transactional microservice (crud-service) + 26 agent services across CRM/eCommerce/Inventory/Logistics/Product Management/Search/Truth Layer + 1 Next.js frontend |
| Microsoft Agent Framework (MAF) | agent-framework>=1.2.0 runs agent_framework.Agent in-process over a pluggable ChatClient (default agent_framework_foundry.FoundryChatClient). The retired portal-agent runtime path was removed in Wave 4c; Foundry remains the model-deployment, telemetry, and evaluation backend. |
| SLM-first routing | Every request starts with GPT-5-nano (fast, cheap); complex queries escalate to GPT-5 (rich) |
| Three-tier memory | Hot (Redis, <50ms) → Warm (Cosmos DB, 100-500ms) → Cold (Blob, archival) |
| Agent-to-agent communication | MCP protocol for structured tool calls between agents |
| Event-driven async | Azure Event Hubs for decoupled CRUD → Agent processing |
| GitOps deployment | Flux CD reconciles rendered Helm manifests to AKS with namespace isolation |
| Self-healing runtime | Incident lifecycle: detect → classify → remediate → verify → escalate |
| 1796 automated tests | 1136 lib + 660 app tests, 89% coverage, CI/CD enforced |

Microsoft Technology Stack

This reference architecture is built entirely on Microsoft's AI and cloud platform:

| Layer | Technology | Purpose | Documentation |
|---|---|---|---|
| AI Runtime | Microsoft Agent Framework | Agent execution, tool forwarding, message protocol | MAF Python API |
| AI Models | Azure AI Foundry | Model hosting, Agents V2 API, prompt governance | Foundry Agents quickstart |
| Search | Azure AI Search | Vector + hybrid search, semantic ranking | AI Search overview |
| Warm Memory | Azure Cosmos DB | User profiles, search history, agent state | Cosmos DB for NoSQL |
| Hot Memory | Azure Cache for Redis | Session state, real-time context | Redis quickstart |
| Cold Memory | Azure Blob Storage | Archival, catalog snapshots, images | Blob Storage overview |
| Messaging | Azure Event Hubs | Async CRUD → Agent event processing | Event Hubs Python SDK |
| Compute | Azure Kubernetes Service | Namespace-isolated agent pods with KEDA autoscaling | AKS overview |
| API Gateway | Azure API Management | Traffic management, auth, AI policies | APIM overview |
| CRUD Data | Azure Database for PostgreSQL | Transactional data (orders, products, users) | PostgreSQL Flexible Server |
| Secrets | Azure Key Vault | Connection strings, API keys, certificates | Key Vault overview |
| Observability | Azure Monitor + Application Insights | Distributed tracing, KQL queries | OpenTelemetry for Azure |
| GitOps | Flux CD on AKS | Continuous reconciliation, drift detection | Flux on AKS |
| Frontend | Azure Static Web Apps | Next.js 15 hosting with managed SSL | SWA overview |
| Identity | Microsoft Entra ID | JWT validation, RBAC, managed identity | Entra ID overview |
| CI/CD | GitHub Actions | Build, test, deploy workflows | Actions documentation |

Architectural Patterns Demonstrated

1. Agentic Microservices (Core Pattern)

Each agent service is a self-contained FastAPI application that:

  • Owns its domain logic
  • Hosts its own agent runtime (via BaseRetailAgent)
  • Exposes REST endpoints for Frontend/CRUD and MCP tools for agents
  • Subscribes to Event Hub topics for async processing
  • Maintains independent memory namespaces

2. SLM-First Model Routing

Every request starts with the fast model (SLM). The agent assesses the request's complexity and escalates to the rich model (LLM) only when needed. The project reports that this cuts model cost by 60-80% while maintaining quality for complex queries.
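A minimal sketch of the escalate-on-complexity idea, not the logic in base_agent.py. The names `answer`, `looks_complex`, `fake_slm`, and `fake_llm` are hypothetical, and the word-count and "UNSURE" heuristics are placeholders for whatever complexity assessment the real agent performs.

```python
from dataclasses import dataclass
from typing import Callable

ModelFn = Callable[[str], str]  # stand-in for a chat-model call


@dataclass(frozen=True)
class RoutedAnswer:
    model: str  # "slm" or "llm"
    text: str


def looks_complex(query: str, draft: str, max_words: int = 20) -> bool:
    """Toy heuristic (assumption): long queries or an unsure draft escalate."""
    return len(query.split()) > max_words or "UNSURE" in draft


def answer(query: str, slm: ModelFn, llm: ModelFn) -> RoutedAnswer:
    """Always try the small model first; call the large one only on escalation."""
    draft = slm(query)
    if looks_complex(query, draft):
        return RoutedAnswer("llm", llm(query))
    return RoutedAnswer("slm", draft)


# Fake models so the sketch runs without any cloud dependency.
def fake_slm(query: str) -> str:
    return "UNSURE" if "compare" in query.lower() else f"slm:{query}"


def fake_llm(query: str) -> str:
    return f"llm:{query}"


simple = answer("Where is my order?", fake_slm, fake_llm)
hard = answer("Compare these two laptops for video editing", fake_slm, fake_llm)
```

In this sketch the simple query is answered by the small model alone, while the comparison query pays for both calls. Under these assumptions, that is the trade-off: the saving depends on how often escalation is triggered.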

3. Three-Tier Memory Architecture

Agents operate on context assembled from three tiers with different latency and cost profiles, read in parallel via asyncio.gather:

  • Hot (Redis): Session state, <50ms reads, TTL-based expiry
  • Warm (Cosmos DB): User profiles, search history, enrichment state, 100–500ms
  • Cold (Blob Storage): Interaction logs, catalog snapshots, archival
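The parallel read can be sketched as follows. This is an illustration under stated assumptions: `read_hot`, `read_warm`, `read_cold`, and `assemble_context` are hypothetical names, and the sleeps stand in for Redis, Cosmos DB, and Blob Storage calls. Only `asyncio.gather` itself comes from the text above.

```python
import asyncio


async def read_hot(session_id: str) -> dict:
    await asyncio.sleep(0.01)  # simulated Redis latency
    return {"cart": ["sku-1"]}


async def read_warm(user_id: str) -> dict:
    await asyncio.sleep(0.03)  # simulated Cosmos DB latency
    return {"segment": "loyal"}


async def read_cold(user_id: str) -> dict:
    await asyncio.sleep(0.05)  # simulated Blob Storage latency
    return {"last_snapshot": "2026-01"}


async def assemble_context(session_id: str, user_id: str) -> dict:
    """Read all three tiers concurrently; a failed tier degrades to {}."""
    hot, warm, cold = await asyncio.gather(
        read_hot(session_id),
        read_warm(user_id),
        read_cold(user_id),
        return_exceptions=True,  # one slow or broken tier must not sink the others
    )
    tiers = {"hot": hot, "warm": warm, "cold": cold}
    return {
        name: ({} if isinstance(value, BaseException) else value)
        for name, value in tiers.items()
    }


context = asyncio.run(assemble_context("s-1", "u-1"))
```

Because the reads overlap, the wall-clock cost of assembling context is close to the slowest tier rather than the sum of all three.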

4. Event-Driven Agent Processing

CRUD operations publish domain events to Azure Event Hubs. Agents subscribe to relevant topics and process asynchronously:

Frontend → CRUD Service → Event Hubs → Agent(s)
                           ↑ publish       ↓ consume
                      (order.created)  (enrich, classify, alert)
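The publish/consume flow in the diagram can be sketched locally. This is not the Event Hubs SDK: an `asyncio.Queue` stands in for the hub, and `crud_publish`, `agent_consume`, and the handler table are hypothetical names used only for illustration.

```python
import asyncio


async def crud_publish(bus: asyncio.Queue, event_type: str, payload: dict) -> None:
    """CRUD side: emit a domain event and return without waiting for agents."""
    await bus.put({"type": event_type, "data": payload})


async def agent_consume(bus: asyncio.Queue, handlers: dict, processed: list) -> None:
    """Agent side: handle subscribed event types, ignore everything else."""
    while True:
        event = await bus.get()
        if event is None:  # sentinel used by this sketch to stop the consumer
            break
        handler = handlers.get(event["type"])
        if handler is not None:
            processed.append(handler(event["data"]))


async def main() -> list:
    bus: asyncio.Queue = asyncio.Queue()
    processed: list = []
    handlers = {"order.created": lambda data: f"enriched order {data['order_id']}"}
    consumer = asyncio.create_task(agent_consume(bus, handlers, processed))
    await crud_publish(bus, "order.created", {"order_id": "o-42"})
    await crud_publish(bus, "order.cancelled", {"order_id": "o-43"})  # no subscriber
    await bus.put(None)
    await consumer
    return processed


processed = asyncio.run(main())
```

The publisher never references the agent, which is the decoupling the pattern is after. A real deployment would add the partitioning, checkpointing, and retry behavior that this in-memory stand-in omits.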

5. MCP Tool Protocol for Agent-to-Agent Communication

Agents expose tools via FastAPIMCPServer that other agents can invoke:

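# `mcp` is assumed to be the service's FastAPIMCPServer instance and `adapter`
# its inventory adapter; both are created elsewhere in the service.
@mcp.tool()
async def get_inventory_status(sku: str) -> dict:
    """Check real-time stock level for an SKU."""
    return await adapter.check_stock(sku)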

This enables compositional intelligence: the checkout agent can call the inventory agent's tools without tight coupling.

6. Enrichment Guardrails

The EnrichmentGuardrail validates agent outputs before they reach downstream consumers:

  • Schema conformance (Pydantic model validation)
  • Content policy enforcement (no hallucinated data)
  • Confidence thresholds (reject low-confidence enrichments)
  • Audit trail (every decision logged with evidence)
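The four checks above can be illustrated with a small standalone sketch. It is not the EnrichmentGuardrail API: `Enrichment`, `Decision`, `check_enrichment`, and `audit_log` are hypothetical. A standard-library dataclass stands in for the Pydantic model, and "has at least one internal source" is an assumed proxy for the content policy.

```python
from dataclasses import dataclass, field


@dataclass
class Enrichment:
    sku: str
    description: str
    confidence: float
    source_ids: list  # internal records that back the generated text


@dataclass
class Decision:
    accepted: bool
    reasons: list = field(default_factory=list)


audit_log: list = []  # every decision is recorded, accepted or not


def check_enrichment(raw: dict, min_confidence: float = 0.8) -> Decision:
    reasons = []
    try:
        # Schema check: missing or unexpected keys raise TypeError.
        # (A dataclass does not validate field types; Pydantic would.)
        item = Enrichment(**raw)
    except TypeError as exc:
        reasons.append(f"schema: {exc}")
    else:
        if not item.source_ids:
            reasons.append("no supporting source")
        if item.confidence < min_confidence:
            reasons.append("low confidence")
    decision = Decision(accepted=not reasons, reasons=reasons)
    audit_log.append({"input": raw, "accepted": decision.accepted, "reasons": reasons})
    return decision


ok = check_enrichment(
    {"sku": "A1", "description": "Wool scarf", "confidence": 0.93, "source_ids": ["pim:A1"]}
)
rejected = check_enrichment(
    {"sku": "A1", "description": "Wool scarf", "confidence": 0.4, "source_ids": []}
)
```

Collecting every failed check rather than stopping at the first keeps the audit entry complete, which helps when a reviewer needs to see why an enrichment was rejected.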

7. GitOps Deployment with Flux CD

All services are deployed via rendered Kubernetes manifests reconciled by Flux CD:

  1. CI builds container images → pushes to ACR
  2. Helm templates rendered → committed to manifests branch
  3. Flux detects changes → applies to AKS namespaces
  4. Health checks validate rollout → auto-rollback on failure
  5. Drift detection ensures desired state is maintained

8. Self-Healing Runtime

The platform includes autonomous incident management:

  • Detection: Health probes, Azure Monitor alerts, APIM diagnostics
  • Classification: Incident type mapping (pod crash, memory pressure, model degradation)
  • Remediation: Strategy-specific handlers (AKS restart, APIM circuit break, Redis flush)
  • Verification: Post-remediation health check with configurable thresholds
  • Escalation: Human notification when automated remediation fails
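The lifecycle above can be sketched as a single function. This is a toy model under stated assumptions: `classify`, `handle_incident`, the signal fields, and the thresholds are all hypothetical, and the remediation is a lambda rather than a real AKS, APIM, or Redis action.

```python
from typing import Callable


def classify(signal: dict) -> str:
    """Map a raw signal to an incident type (thresholds are made up)."""
    if signal.get("restarts", 0) > 3:
        return "pod_crash"
    if signal.get("memory_pct", 0) > 90:
        return "memory_pressure"
    return "unknown"


def handle_incident(
    signal: dict,
    remediations: dict,  # allowlist: incident type -> remediation callable
    is_healthy: Callable[[], bool],
    notify: Callable[[str], None],
) -> list:
    """Run detect -> classify -> remediate -> verify, escalating on failure."""
    trail = ["detect"]
    kind = classify(signal)
    trail.append(f"classify:{kind}")
    action = remediations.get(kind)
    if action is not None:
        action()
        trail.append("remediate")
        if is_healthy():
            trail.append("verify:ok")
            return trail
        trail.append("verify:failed")
    notify(kind)  # no allowlisted action, or the fix did not hold
    trail.append("escalate")
    return trail


state = {"healthy": False}
notified: list = []
remediations = {"pod_crash": lambda: state.update(healthy=True)}

healed = handle_incident(
    {"restarts": 5}, remediations, lambda: state["healthy"], notified.append
)
escalated = handle_incident(
    {"latency_ms": 900}, remediations, lambda: state["healthy"], notified.append
)
```

Keeping remediations in an explicit allowlist means an unrecognized incident type can only ever reach the human escalation path, never an improvised action.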

Domain Architecture

The 26 agent services are organized into 7 bounded contexts:

%%{init: {'theme':'base', 'themeVariables': {'primaryColor':'#FFB3BA','primaryTextColor':'#000','primaryBorderColor':'#FF8B94','lineColor':'#BAE1FF','secondaryColor':'#BAE1FF','tertiaryColor':'#FFFFFF'}}}%%
graph TB
    subgraph CRUD["CRUD Service (PostgreSQL)"]
        API["REST API<br/>31 endpoints"]
    end

    subgraph ECOM["eCommerce Domain"]
        CS["Catalog Search"]
        PDE["Product Detail Enrichment"]
        CI["Cart Intelligence"]
        CKS["Checkout Support"]
        OS["Order Status"]
    end

    subgraph CRM["CRM Domain"]
        PA["Profile Aggregation"]
        SP["Segmentation &<br/>Personalization"]
        CAM["Campaign Intelligence"]
        SA["Support Assistance"]
    end

    subgraph INV["Inventory Domain"]
        AT["Alerts & Triggers"]
        HC["Health Check"]
        JIT["JIT Replenishment"]
        RV["Reservation Validation"]
    end

    subgraph LOG["Logistics Domain"]
        CSel["Carrier Selection"]
        ETA["ETA Computation"]
        RS["Returns Support"]
        RID["Route Issue Detection"]
    end

    subgraph PM["Product Management Domain"]
        ACP["ACP Transformation"]
        ASO["Assortment Optimization"]
        CSV["Consistency Validation"]
        NC["Normalization &<br/>Classification"]
    end

    subgraph SRCH["Search Domain"]
        SEA["Search Enrichment Agent"]
    end

    subgraph TRUTH["Truth Layer Domain"]
        TI["Truth Ingestion"]
        TE["Truth Enrichment"]
        TH["Truth HITL"]
        TX["Truth Export"]
    end

    UI["Next.js Frontend"] --> API
    API -->|events| EH["Azure Event Hubs"]
    EH --> ECOM & CRM & INV & LOG & PM & SRCH & TRUTH
    API <-->|sync + circuit breaker| ECOM

How to Use This Reference

For Platform Architects

Start with Architecture Overview and the ADR Index to understand the decision landscape. Review the MAF Integration Rationale for the agent runtime design.

For Service Developers

Read the lib README to understand the micro-framework, then look at any app's main.py + adapters.py + agents.py for the service pattern. The Standalone Deployment Guide covers single-service deployment.

For DevOps Engineers

Start with the Infrastructure README and Deployment Guide. Review ADR-017 (Flux CD) and ADR-026 (Namespace Isolation) for the GitOps model.

For AI/ML Engineers

Review the Foundry Agent Invocation Flow and MAF Integration Rationale. The SLM-first routing logic is in lib/src/holiday_peak_lib/agents/base_agent.py.


Comparison with Alternative Architectures

| Approach | Pros | Cons | When to Use |
|---|---|---|---|
| Agentic Microservices (this repo) | Domain isolation, independent scaling, per-agent memory | More services to operate, cross-agent latency | Multi-domain platforms with distinct AI capabilities per domain |
| Monolithic AI Gateway | Single deployment, centralized model management | Single point of failure, no domain isolation, all-or-nothing scaling | Simple chatbot or single-purpose AI applications |
| AI Sidecar Pattern | Minimal code changes to existing services | Limited agent autonomy, no inter-agent communication | Adding AI to legacy services without rewrite |
| Orchestrator Agent | Centralized planning, simpler routing | Bottleneck at orchestrator, hard to scale horizontally | Workflows with a single decision-maker |

Microsoft Documentation Cross-References

| Topic | Official Documentation |
|---|---|
| Building AI agents with Microsoft Agent Framework | MAF Python SDK |
| Azure AI Foundry agent creation and management | Foundry Agents overview |
| Model Context Protocol (MCP) for tool integration | MCP specification |
| Microservices architecture on Azure | Azure Architecture Center: Microservices |
| Event-driven architecture patterns | Azure Architecture Center: Event-driven |
| AKS best practices | AKS baseline architecture |
| Cosmos DB data modeling | Cosmos DB data modeling |
| Azure Well-Architected Framework | WAF overview |

Related Documents

4. Enrichment Guardrails

The EnrichmentGuardrail enforces that AI-generated content is always grounded in company-owned data (PIM, DAM, CRM). Agents never generate without a verifiable internal data source.

5. GitOps with Namespace Isolation

Flux CD reconciles rendered Helm manifests from a manifests branch. Each domain gets its own Kubernetes namespace with RBAC boundaries and network policies (ADR-017, ADR-026).

6. Self-Healing Runtime

An autonomous incident lifecycle (detect → classify → remediate → verify → escalate) handles infrastructure misconfigurations without human intervention, with audit trails and allowlisted remediation actions.


Getting Started

  1. Understand the architecture: Start with Solution Architecture Overview
  2. Explore the lib: Read lib/README.md for the shared framework
  3. Deploy a single service: Follow the Standalone Deployment Guide
  4. Deploy everything: Use azd up with the Deployment Guide
  5. Run tests: python -m pytest at the repository root (1796 tests)

Related Microsoft Guidance