M6: Enhanced Health Check with LLM and RAG Status #655

@kovtcharov

Priority: P2 — Operational reliability

Effort: Small (1-2 days)

Problem

/health returns {"status": "ok"} whenever the FastAPI process is running, even if Lemonade Server is down or no model is loaded. Web apps that check health before accepting traffic therefore get false positives.

Deliverable

Extended health endpoint that reports component status:

GET /health
{
  "status": "ok",
  "components": {
    "api": { "status": "ready" },
    "llm": {
      "status": "ready",
      "backend": "lemonade",
      "model": "Qwen3.5-35B-A3B-GGUF",
      "url": "http://localhost:8000"
    },
    "rag": {
      "status": "ready",
      "documents": 3,
      "total_chunks": 247
    }
  }
}

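Component status values: ready, unavailable, error, not_configured. The top-level "status" is "ok", or "degraded" when the LLM is unavailable but the API is up.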
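
One way the top-level field could be derived from the component statuses is sketched below. This is an illustration, not the project's implementation: the names `rollup` and `build_health` are hypothetical, and the rule that "not_configured" counts as healthy while "unavailable" or "error" on any component yields "degraded" is an assumption that goes beyond what the issue states (it only specifies the LLM-unavailable case).

```python
from typing import Any, Dict

# Component-level values listed in the issue.
COMPONENT_STATUSES = ("ready", "unavailable", "error", "not_configured")


def rollup(components: Dict[str, Dict[str, Any]]) -> str:
    """Collapse component statuses into the top-level "status" field.

    Assumption (not stated in the issue): "not_configured" is treated as
    healthy, and any "unavailable" or "error" component yields "degraded".
    """
    for name, info in components.items():
        if info["status"] not in COMPONENT_STATUSES:
            raise ValueError(f"unknown status for {name}: {info['status']!r}")
    statuses = {info["status"] for info in components.values()}
    if statuses & {"unavailable", "error"}:
        return "degraded"
    return "ok"


def build_health(llm: Dict[str, Any], rag: Dict[str, Any]) -> Dict[str, Any]:
    """Assemble the response body in the shape shown in the issue."""
    # The API component is "ready" by definition if this code is running.
    components = {"api": {"status": "ready"}, "llm": llm, "rag": rag}
    return {"status": rollup(components), "components": components}
```

Because the top-level "status" key is always present, a client that only reads that field keeps working without changes.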

Files to change

Acceptance criteria

  • /health returns component-level status
  • Returns top-level "status": "degraded" if the LLM is unavailable but the API is up
  • Response time under 500 ms (short timeout on the Lemonade ping)
  • Backward compatible: the top-level "status" field still works for simple checks
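
The response-time criterion suggests bounding the Lemonade ping with a deadline. A minimal sketch of that idea follows, using only the standard library. The names `probe_llm`, `ping`, and `timeout_s` are hypothetical, the 0.3 s default is an assumed value chosen to leave headroom under 500 ms, and the mapping of exceptions to "unavailable" versus "error" is likewise an assumption. In a real handler, `ping` would be whatever coroutine performs the HTTP request to Lemonade Server.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


async def probe_llm(
    ping: Callable[[], Awaitable[Dict[str, Any]]],
    timeout_s: float = 0.3,  # assumed budget, not taken from the issue
) -> Dict[str, Any]:
    """Run a ping coroutine under a deadline and map the outcome to a status."""
    try:
        details = await asyncio.wait_for(ping(), timeout=timeout_s)
    except (asyncio.TimeoutError, OSError):
        # Deadline exceeded, connection refused, or a similar network failure.
        return {"status": "unavailable", "backend": "lemonade"}
    except Exception:
        # Anything else (for example a malformed reply) is reported as an error.
        return {"status": "error", "backend": "lemonade"}
    # The ping succeeded; merge whatever details it returned (e.g. the model).
    return {"status": "ready", "backend": "lemonade", **details}
```

With this shape, a hung backend costs the health check at most `timeout_s` rather than the full client timeout, and the endpoint can still answer with a "degraded" body instead of failing.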

Context

Full milestone plan: docs/plans/webapp-integration.md

Labels

  • domain:surfaces (Agent UI, Telegram, WhatsApp, Slack/Discord, mobile)
  • enhancement (New feature or request)
  • track:consumer-app (Hermes-competitor consumer product: mobile-first, voice + messaging + memory + skills)
