All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- API package split: monolithic `api/main.py` refactored into `api/app.py`, `lifespan.py`, `middleware.py`, `dependencies.py`, `router_factory.py`, and `api/endpoints/*`.
- Domain packs namespaced by business domain: `domain_packs/research/`, `productivity/`, `hr/`, `finance/`, `legal/`, `common/` (HR packs moved from `domain_packs/rh/` to `domain_packs/hr/`).
- Built-in pack registration: consolidated in `pack_kernel/builtin_packs.py` (called from `api/lifespan.py`).
- Documentation updated for the new layout (`README.md`, `domain_packs/README.md`, `docs/architecture.md`, `CONTRIBUTING.md`, `connectors/README.md`).
- Third and fourth domain packs: `summariser` (`SummariserPack`, single-agent bullet summary) and `analysis_only` (`AnalysisOnlyPack`, `AnalystAgent` on pre-supplied research). Registered in `platform/__init__.py` with control-plane policies and typed API routes (`POST /packs/summariser/run`, `POST /packs/analysis_only/run`).
- `domain_packs/README.md`: catalogue of built-in packs and authoring guide.
- API helpers `_pack_primary_text`, `_invoke_pack_run`, `_iter_pack_stream_events`, `_serialize_pack_result` so typed bodies with `text` (not only `query`) work on pack routes.
- Nine vertical domain packs: `meeting_prep`, `rfp_assistant`, `support_triage`, `executive_brief`, `contract_reviewer`, `financial_memo`, plus HR packs under `domain_packs/hr/`: `talent_screening`, `job_description_writer`, `hr_policy_qa`. Shared base: `domain_packs/common/structured_llm.py` (`StructuredLLMPack`).
- `pack_kernel/builtin_packs.py`: single registration source for all built-in packs.
- Platform kernel renamed to `pack_kernel/`: the former top-level `platform/` package shadowed Python's stdlib `platform` module (required a fragile `conftest.py` bootstrap and a `sys.path` scan hack). All imports now use `pack_kernel`; the old `platform/` path is removed (no compat shim, since any shim would still shadow the stdlib).
- Proxy-aware rate limiting: honour `X-Forwarded-For` / `Forwarded` only from trusted peers (`TRUST_PROXY_HEADERS`, `FORWARDED_ALLOW_IPS`); per-Bearer-token buckets when `API_KEY` is set; Helm prod values and uvicorn `--forwarded-allow-ips` documented.
- Regulated vertical pack compliance scaffolding: `PackPolicy.human_review_required`, server-injected mandatory `disclaimer` + `human_review_required` on outputs for HR/legal/finance packs; `COMPLIANCE.md` per regulated pack; `tests/test_compliance.py`.
- `api/dependencies.py`: injects shared connector into any pack whose `__init__` accepts `connector=` (not only `research_analysis`).
- `control_plane/__init__.py`: policy table covers all registered packs.
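
The proxy-aware rate limiting entry above can be illustrated with a minimal sketch. It assumes the trusted peers come from a CIDR list such as the one `FORWARDED_ALLOW_IPS` would hold; the function name `resolve_client_ip` and the right-to-left walk are illustrative assumptions, not the project's actual implementation.

```python
import ipaddress
from typing import Iterable, Optional


def resolve_client_ip(
    peer_ip: str,
    forwarded_for: Optional[str],
    trusted_proxies: Iterable[str],
) -> str:
    """Return the address a rate limiter could key its bucket on (hypothetical helper)."""
    networks = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]

    def is_trusted(address: str) -> bool:
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            return False
        return any(ip in network for network in networks)

    # An untrusted peer could forge the header, so it is ignored entirely.
    if not forwarded_for or not is_trusted(peer_ip):
        return peer_ip

    # Walk the chain right to left and stop at the first hop that is not
    # one of our own proxies.
    hops = [hop.strip() for hop in forwarded_for.split(",") if hop.strip()]
    for hop in reversed(hops):
        if is_trusted(hop):
            continue
        try:
            ipaddress.ip_address(hop)
        except ValueError:
            # Malformed entry: fall back to the socket peer.
            return peer_ip
        return hop
    return peer_ip
```

With `trusted_proxies=["10.0.0.0/8"]`, a request from peer `10.0.0.5` carrying `X-Forwarded-For: 198.51.100.7, 10.0.0.4` resolves to `198.51.100.7`, while the same header sent directly by `203.0.113.9` is ignored.
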
- `validate_pack_body`: scans every string field on typed pack requests for injection/SSRF patterns (closes gap where only the primary `query` / `text` label was validated).
- `InputValidator.check_content_safety`: per-field length + pattern checks for document-sized inputs.
- Bandit CI scope extended to `platform/`, `domain_packs/`, `connectors/`, `control_plane/`.
- LLM JSON cap: `StructuredLLMPack` rejects parsed responses over 512 KiB.
- Prompt injection guards: `domain_packs/common/prompt_safety.py` wraps untrusted content in delimiters; vertical packs use `format_vertical_prompt`.
- Production auth: `Settings` rejects `ENVIRONMENT=production` without `API_KEY`.
- Bandit clean-up: retry jitter via tenacity `wait_random_exponential` (stdlib `random`, not `secrets`); pack traffic split uses `random.choices` with `# nosec B311`; no bare `except: pass` on connection close.
- Pyright: `domain_packs/` included in typecheck scope; agent init narrowed for optional members.
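
As an illustration of the LLM JSON cap entry above, here is a minimal sketch of a size check on a parsed response. The helper name `parse_capped_json`, the error type, and the choice to measure the re-serialised UTF-8 size are assumptions for this example; the entry only states that `StructuredLLMPack` rejects parsed responses over 512 KiB.

```python
import json

MAX_JSON_BYTES = 512 * 1024  # 512 KiB, the cap named in the changelog entry


class ResponseTooLargeError(ValueError):
    """Hypothetical error type; the real pack may raise something else."""


def parse_capped_json(raw: str, max_bytes: int = MAX_JSON_BYTES) -> object:
    """Parse an LLM response as JSON and reject oversized payloads."""
    parsed = json.loads(raw)
    # Assumption: size is measured on the re-serialised UTF-8 form.
    size = len(json.dumps(parsed, ensure_ascii=False).encode("utf-8"))
    if size > max_bytes:
        raise ResponseTooLargeError(
            f"parsed response is {size} bytes, limit is {max_bytes}"
        )
    return parsed
```
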
- Platform kernel: `platform/` (`BaseDomainPack`, `PackRegistry`), `domain_packs/research_analysis/` (`ResearchAnalysisPack`); `core/graph.py` shim (`MultiAgentGraph` alias); `DEFAULT_PACK_ID`; contract and API tests.
- Per-run cost attribution (`core/cost.py`): `CostTracker` callback handler accumulates token costs per model run using a configurable pricing table; `BudgetExceededError` and `AgentBudgetExceededError` raised when `budget_usd` is exceeded; `pack_run_cost_usd_total` Prometheus counter emitted per run. `cost_usd` exposed on agent, pack, and API responses; HTTP 402 returned when budget is exceeded. New settings: `PACK_DEFAULT_BUDGET_USD`, `LLM_COST_TABLE_PATH`.
- Typed pack schemas + auto API wiring (`platform/base_pack.py`, `platform/registry.py`): `input_schema` / `output_schema` / `version` ClassVars on `BaseDomainPack`; `get_schemas()` and `list_packs_with_metadata()` on `PackRegistry`; dynamic `_build_pack_router()` generates per-pack endpoints at startup. New endpoints: `GET /packs`, `GET /packs/{pack_id}/versions`, `PATCH /packs/{pack_id}/versions/{version}/weight`. `ResearchAnalysisInput` / `ResearchAnalysisOutput` Pydantic schemas in `domain_packs/research_analysis/schemas.py`.
- Pack versioning + traffic split (`platform/registry.py`): `PackVersion` dataclass; `_registry` refactored to `dict[str, list[PackVersion]]`; `set_weights()` for traffic-split configuration. `X-Pack-Version` request header pins to a specific version; `X-Pack-Version-Used` response header reports the actual version used. Sticky session support via `get_pack_version_for_session()` (SQLite backend). `save_run` stores `pack_version` metadata.
- Second domain pack `research_only` (`ResearchOnlyPack`) with typed `/packs/research_only/run` routes.
- Retrieval connectors: `example_memory`, `http` (`CONNECTOR_HTTP_URL`), and `rag` (`RAG_ENABLED`); API injection via `CONNECTOR_ENABLED` into `ResearchAnalysisPack`; optional `connector=` on the pack constructor.
- Control plane enforcement: `PolicyRegistry`, `control_plane/enforce.py` (per-pack query limits, budget ceiling, stream timeout); foundation types in `connectors/` and `control_plane/`.
- Sticky pack versions on Redis and Postgres run-history backends (`get_pack_version_for_session`).
- `examples/custom_pack/`: `SummariserPack` reference implementation showing how to author a third-party domain pack.
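
The pack versioning and traffic-split entry above can be sketched roughly as follows. The fields on `PackVersion` and the function `choose_version` are assumptions made for this example; only the `PackVersion` name, the weighted split, and the pinning behaviour of `X-Pack-Version` come from the changelog.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PackVersion:
    # Assumed fields; the real dataclass may carry more metadata.
    version: str
    weight: float


def choose_version(
    versions: Sequence[PackVersion],
    pinned: Optional[str] = None,
    rng: random.Random = random.Random(),
) -> PackVersion:
    """Pick a version: honour an explicit pin, else split traffic by weight."""
    if pinned is not None:
        # Mirrors the X-Pack-Version header: a pin bypasses the weights.
        for candidate in versions:
            if candidate.version == pinned:
                return candidate
        raise KeyError(f"unknown pack version: {pinned}")
    weights = [candidate.weight for candidate in versions]
    # Non-cryptographic randomness is fine for load splitting.
    return rng.choices(versions, weights=weights, k=1)[0]  # nosec B311
```

The version returned here is what a response header such as `X-Pack-Version-Used` would report back to the caller.
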
- `agents/base_agent.py`: `budget_usd` constructor parameter added; `cost_usd` property reads from attached `CostTracker`.
- `domain_packs/research_analysis/pack.py`: cost propagation through the pipeline; `cost_usd` property on the pack instance; optional connector merge in the research phase.
- `api/models.py`: `cost_usd` field added to `RunResponse` and `ResearchResponse`.
- `infra/Dockerfile`: copies `platform/`, `domain_packs/`, `connectors/`, `control_plane/` into the runtime image.
- Mock LLM provider (`LLM_PROVIDER=mock`): run the full pipeline without any API key using deterministic `FakeListChatModel` responses from langchain-core
- Real-time SSE streaming via `MultiAgentGraph.stream_events()` using LangGraph's `astream_events(version="v2")`: replaces the batch `phase_completed` events with true `phase_started`, `phase_completed`, and `token` events streamed as nodes execute
- Redis run history backend (`RedisRunHistory`): stores run history in Redis hashes with sorted sets for chronological and session-based ordering; eliminates the need for a SQLite PVC when `MEMORY_BACKEND=redis`
- PostgreSQL run history backend (`PostgresRunHistory`): stores run history in a `run_history` table with native JSONB session filtering
- `create_run_history(settings)` factory in `core/memory.py`: auto-selects the matching backend with graceful fallback to SQLite
- `RunHistoryStore` protocol for type-safe backend interchangeability
- Terraform: private cluster endpoints for all three clouds
  - EKS: `endpoint_private_access = true`, configurable `public_access_cidrs`
  - GKE: `private_cluster_config` + `master_authorized_networks_config`
  - AKS: `api_server_access_profile` with `authorized_ip_ranges`
- Terraform: dedicated VPC for EKS (replaces default VPC) with private/public subnets, NAT Gateway, and proper route tables
- Prometheus metrics: `llm_request_duration_seconds` (Histogram) and `llm_tokens_total` (Counter by direction) in `_invoke_llm_with_retry`
- `contextvars.copy_context()` in `_run_in_executor`: `request_id` now propagates into thread pool workers for log correlation
- Helm NetworkPolicy: namespace-scoped ingress + egress rules (DNS + HTTPS), enabled by default in `values.prod.yaml`
- 7 unit tests for `extract_text_content` (multi-modal, list, fallback paths)
- 6 `pytest-asyncio` tests for SSE streaming with `httpx.AsyncClient`
- 2 tests for Mock LLM provider
- Terraform module for Azure AKS (`infra/terraform/modules/aks/`) with Log Analytics workspace, auto-scaling node pool, System-Assigned Managed Identity, and Helm chart deployment
- AKS entry point (`infra/terraform/aks/`) with `subscription_id` (mandatory since AzureRM 4.x) and `redis_url` variables
- `redis_url` variable added to EKS and GKE modules/entry points for secret parity with AKS
- BREAKING (SSE): `/run/stream` event types changed:
  - `agent_switch` → `phase_started` + `phase_completed` (emitted in real time)
  - New `token` event type for LLM token-level streaming
- BREAKING (Terraform): Provider version bumps across all modules:
  - `hashicorp/azurerm` `~> 3.0` → `~> 4.0`
  - `hashicorp/aws` `~> 5.0` → `~> 6.0`
  - `hashicorp/google` `~> 5.0` → `~> 7.0`
  - `hashicorp/helm` `~> 2.12` → `~> 3.1`
  - `hashicorp/kubernetes` `~> 2.25` → `~> 3.0`
- CORS fail-closed in production: no wildcard unless `CORS_ORIGINS` is explicitly set
- `/docs` and `/redoc` disabled when `ENVIRONMENT=production`
- Rate limiter initialisation moved from import time to lifespan (no Redis connection on import)
- `thread_pool_max_workers` default raised from 4 to 8
- `_extract_text_content` renamed to `extract_text_content` (public API)
- `to_dict()` on dataclasses uses `dataclasses.asdict()` instead of manual dict
- `vars(report)` replaced with `report.to_dict()` for consistency
- `except AgentError` narrowed to `except (AgentExecutionError, AgentTimeoutError, AgentValidationError)`; `AgentConfigurationError` now propagates to the caller
- `MultiAgentGraph._get_executor()` uses double-checked locking
- `MultiAgentGraph.close()` uses `wait=True` and resets `_executor = None`
- `_NoOpTracer` singleton cached at module level
- `_safe_eval` return type narrowed from `Any` to `int | float`
- Trivy CI action pinned to SHA (`@6e7b7d1f...`) instead of `@master`
- Redis default password removed from `docker-compose.yml`: fails if `REDIS_PASSWORD` not set
- Terraform root directory converted to documentation-only (per-cloud entry points)
- CRITICAL: `request_id` lost in thread pool; `contextvars.copy_context()` now propagates context variables into executor threads
- CRITICAL: Race condition in `MultiAgentGraph._get_executor()`; concurrent `arun()` calls could create multiple thread pools
- CRITICAL: `_run_in_executor` silently fell back to unbounded default executor when `_executor is None`; now raises `RuntimeError`
- CRITICAL: Resource leak in `_stream_pipeline`; `MultiAgentGraph` was not closed via context manager, leaking the internal `ThreadPoolExecutor`
- `type: ignore[union-attr]` in `analyst.py` replaced with proper `extract_text_content()` call for multi-modal LLM content
- ConversationMemory warning for non-SQLite backends changed to `logger.info` with actionable persistence guidance
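
The executor fixes listed above combine two techniques: double-checked locking for lazy pool creation, and `contextvars.copy_context()` so context variables reach worker threads. The following is a minimal sketch under stated assumptions: the class `ExecutorOwner` and its `submit` method are hypothetical stand-ins for the relevant parts of `MultiAgentGraph`, not its actual code.

```python
import contextvars
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Optional

request_id: contextvars.ContextVar[str] = contextvars.ContextVar(
    "request_id", default="-"
)


class ExecutorOwner:
    """Hypothetical stand-in for the executor handling in MultiAgentGraph."""

    def __init__(self, max_workers: int = 8) -> None:
        self._max_workers = max_workers
        self._executor: Optional[ThreadPoolExecutor] = None
        self._lock = threading.Lock()

    def _get_executor(self) -> ThreadPoolExecutor:
        # First check avoids taking the lock on the hot path.
        if self._executor is None:
            with self._lock:
                # Second check: another thread may have created the pool
                # while this one was waiting for the lock.
                if self._executor is None:
                    self._executor = ThreadPoolExecutor(
                        max_workers=self._max_workers
                    )
        return self._executor

    def submit(self, fn: Callable[..., Any], *args: Any) -> Future:
        # Snapshot the caller's context so request_id is visible in the worker.
        ctx = contextvars.copy_context()
        return self._get_executor().submit(ctx.run, fn, *args)

    def close(self) -> None:
        with self._lock:
            if self._executor is not None:
                self._executor.shutdown(wait=True)
                self._executor = None
```

Without the `copy_context()` snapshot, a worker thread would see the default value of `request_id` rather than the one set by the request handler.
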
- CI job `integration`: `pytest -m integration` with Docker + `uv sync --extra redis --extra postgres`
- E2E tests (`tests/test_integration_real.py`): full `MultiAgentGraph.run()` with real SqliteSaver, PostgresSaver, and RedisSaver (redis-stack image for RedisJSON)
- README: `RATE_LIMIT_BACKEND`, Redis rate limiter fail-open policy, test suite layout (mocked vs integration), `/health` LLM semantics, Prometheus metrics reference table
- `/ready` readiness probe endpoint, separate from the `/health` liveness probe
- Fail-fast checkpointer in production (`_fallback_or_raise` in `core/memory.py`)
- OTel `insecure=True` conditional on `OTEL_EXPORTER_OTLP_INSECURE` or localhost endpoint
- LLM retry with backoff activated in all 6 agent graph nodes
- `Content-Security-Policy: default-src 'self'` header (replaces deprecated `X-XSS-Protection`)
- Helm NetworkPolicy template (opt-in via `networkPolicy.enabled`)
- Tests: SSE done event validation, timeout error event, shutdown 503 guard, rate limiting on /research, auth exempt paths, Redis/Postgres checkpointer mocks, populated session history
- Tests: all 5 security headers asserted (CSP, Referrer-Policy, Cache-Control, X-Frame-Options, X-Content-Type-Options)
- Tests: `/ready` endpoint (200 OK, 503 LLM not init, 503 shutting down)
- Tests: `/health` with LLM initialised, a regression test for the `llm_provider.value` bug
- Tests: vectorstore happy path (Chroma + PGVector with mocked imports)
- Tests: `recall_history` tool (empty, populated, error, no-summary-key)
- Tests: Tavily/SerpAPI search provider branches (success + ImportError)
- CORS production hardening documented in README Security section
- Terraform remote state backend warning and instructions in `versions.tf`
- All 20+ Makefile targets documented in README
- `_node_validate` defaults to `is_sufficient=False` on error (fail-close instead of fail-open)
- `/research` endpoint uses `session_id` as `thread_id` (unified with `/run/stream`)
- `GET /sessions/{id}/history` now runs `list_runs_by_session` via `_run_in_executor`
- `ResearchRequest.session_id` gains `max_length=128` + alphanumeric pattern validation
- `MultiAgentGraph` executor uses `get_settings().thread_pool_max_workers`
- Helm `readinessProbe` points to `/ready` instead of `/health`
- CI Docker cache key content-based (`hashFiles`) instead of commit SHA
- Security workflow `pip-audit` syncs with `--extra anthropic` to match production image
- `analyst.py` except blocks now log `exc_info=True` for debugging
- `_extract_text_content` used consistently in all fallback paths
- Rate limit middleware now excludes `/ready` alongside `/health`
- CRITICAL: `settings.llm_provider.value` → `settings.llm_provider`; `LLMProvider` is `Literal` (str), not Enum; `.value` caused `AttributeError` on `/health`
- Release versioning: `pyproject.toml`, Helm `Chart.yaml`, and API `_APP_VERSION` set to `0.3.0`
- `Chart.yaml` maintainer `your-name` → `brescou`
- `docs/security.md` link `your-org` → `brescou`
- Terraform `kubernetes_secret`: use provider-valid `data` map (invalid `string_data` removed in GKE and EKS modules)
- CI infra-lint: validate Terraform modules only; root module uses legacy child providers + `count` (not validatable in CI without credentials)
- Helm: moved `REDIS_URL` from ConfigMap to Secret (passwords must not be in cleartext ConfigMaps)
- `CLAUDE.md` directory tree: added `core/observability.py`
- Dockerfile: pinned base images by SHA digest; documented `--build-arg LLM_EXTRAS`
- Provider-agnostic LLM factory (`core/llm.py`) supporting Anthropic, OpenAI, Google, AWS Bedrock, Azure OpenAI, and Ollama
- `get_settings()` with `@lru_cache` replacing module-level singleton
- `ConversationMemory.list_runs_by_session()` with SQL-side session filtering
- Optional Bearer token API authentication (`API_KEY` env var)
- SSE stream timeout enforcement via `asyncio.timeout`
- Shared LLM and checkpointer instances pre-warmed at FastAPI lifespan startup
- Helm chart for Kubernetes deployment (`infra/helm/`)
- Terraform modules for GKE and EKS (`infra/terraform/`)
- Multi-agent example patterns: sequential, parallel, supervisor, human-in-the-loop
- `agents/models.py`: decoupled `ResearchResult` and `AnalysisReport` dataclasses
- `_extract_text_content()` helper for safe json.loads on multi-modal LLM responses
- `MultiAgentGraph` context manager protocol (`__enter__` / `__exit__`)
- Graceful shutdown guard: endpoints return 503 when `_shutting_down` is set
- `session_id` validation: `max_length=128`, `pattern=^[a-zA-Z0-9_-]+$` on `RunRequest`
- `THREAD_POOL_MAX_WORKERS` setting (default 4) for configurable thread pool sizing
- `.dockerignore` to exclude .env, .git, tests, docs from Docker build context
- PodDisruptionBudget template for Helm chart
- `@model_validator` on Settings enforcing `POSTGRES_URL` when `memory_backend=postgres`
- Exponent guard (`max 1000`) on calculator `ast.Pow` operator
- `math.isfinite()` guard before `int(result)` in calculator tool
- `tests/test_config.py`: settings caching, llm_config, cross-field validators
- `tests/test_observability.py`: structured logging and OTel tracing coverage
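
The two calculator guards in the list above (the exponent limit on `ast.Pow` and the `math.isfinite()` check before `int(result)`) might look roughly like this. It is a sketch, not the project's calculator tool: the supported operator set, the helper names, and the error messages are assumptions.

```python
import ast
import math
import operator
from typing import Union

Number = Union[int, float]
MAX_EXPONENT = 1000  # the limit named in the changelog entry

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def _eval_node(node: ast.AST) -> Number:
    if isinstance(node, ast.Constant):
        if isinstance(node.value, bool) or not isinstance(node.value, (int, float)):
            raise ValueError("only numeric literals are allowed")
        return node.value
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval_node(node.operand)
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left, right = _eval_node(node.left), _eval_node(node.right)
        # Exponent guard: refuse powers that could exhaust CPU or memory.
        if isinstance(node.op, ast.Pow) and abs(right) > MAX_EXPONENT:
            raise ValueError(f"exponent exceeds {MAX_EXPONENT}")
        return _BINARY_OPS[type(node.op)](left, right)
    raise ValueError("unsupported expression")


def safe_eval(expression: str) -> Number:
    """Evaluate a small arithmetic expression without calling eval()."""
    try:
        result = _eval_node(ast.parse(expression, mode="eval").body)
    except (OverflowError, ZeroDivisionError, SyntaxError) as exc:
        raise ValueError(str(exc)) from exc
    if not isinstance(result, (int, float)):
        raise ValueError("result is not a real number")
    if isinstance(result, float):
        # int(inf) and int(nan) raise, so check finiteness first.
        if not math.isfinite(result):
            raise ValueError("result is not finite")
        if result.is_integer():
            return int(result)
    return result
```

For example, `safe_eval("2 ** 10")` returns `1024`, while `safe_eval("2 ** 1001")` and `safe_eval("1e308 * 10")` both raise `ValueError`.
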
- `MultiAgentGraph` now uses the configured memory backend instead of hardcoded `MemorySaver`
- All LLM providers now honour `max_tokens` (or equivalent) from settings
- `sanitize_log_data` now recurses into list values and checks sensitive key before type
- Middleware registration order: rate limiter now executes before auth (brute-force protection)
- `_stream_pipeline` uses thread-safe `get_shared_llm()` / `get_shared_checkpointer()` accessors
- SSE error events return generic messages instead of leaking `str(exc)` internals
- ConversationMemory read operations (`get_run`, `list_runs`, `list_runs_by_session`) protected by `_lock`
- SQLite isolation level changed from `None` to `DEFERRED` with `threading.Lock` protection
- `_input_validator` renamed to `input_validator` (cross-module import convention)
- Conftest fixtures use real `ResearchResult` / `AnalysisReport` instances (function-scope)
- LLM/checkpointer initialization uses double-checked locking with `threading.Lock`
- Agents return partial state updates instead of full `{**state, ...}` copies
- `run()` / `run_structured()` DRY-ed via shared `_execute()` method in both agents
- Retry logic uses jitter (`0.5 + random.random()`) in exponential backoff
- Dockerfile comment corrected: `--frozen` → `--locked`
- Docker Compose Redis healthcheck uses `-a ${REDIS_PASSWORD:-changeme}`
- Helm `replicas` field conditional on `autoscaling.enabled`
- CI workflow: `permissions: contents: read`, Python matrix (3.12+3.13), parallel lint/test
- `pyproject.toml` coverage threshold raised from 50% to 70%
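
The jittered backoff entry above can be sketched as follows. The jitter factor `0.5 + random.random()` is taken from the changelog; applying it multiplicatively to a doubling base delay, the `cap` value, and the helper names are assumptions for this example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay before retry number `attempt` (0-based), with jitter."""
    # Jitter factor in [0.5, 1.5) spreads out clients that failed together.
    jitter = 0.5 + random.random()  # nosec B311 - not security sensitive
    return min(cap, base * (2 ** attempt) * jitter)


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception until max_attempts is reached."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(backoff_delay(attempt))
    raise RuntimeError("max_attempts must be at least 1")
```

Passing a custom `sleep` callable keeps the helper testable without real waiting.
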
- `dir()` antipattern in `ResearchAgent._node_summarize` replaced with proper scoping
- Redis URL no longer logged in plain text (credentials stripped)
- `_is_sensitive_key` now uses word-boundary regex to avoid false positives
- Redis `requirepass` now compatible with healthcheck (`-a` flag added)
- Terraform GKE `deletion_protection` condition: `"production"` → `"prod"`
- SSTI regex refined to avoid false positives on `{{config}}` while catching real injections
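
To illustrate the `_is_sensitive_key` entry above, here is a minimal sketch of boundary-aware matching. The word list and the function name `is_sensitive_key` are assumptions. It uses alphanumeric lookarounds rather than a plain `\b`, because `\b` treats `_` as a word character and would therefore miss keys such as `api_key`.

```python
import re

# Assumed word list; the project's real list is not shown in the changelog.
_SENSITIVE_WORDS = ("password", "secret", "token", "key")

# Match a sensitive word only when it is not glued to other letters or
# digits, so "api_key" matches but "monkey" and "keyboard" do not.
_SENSITIVE_RE = re.compile(
    r"(?<![a-z0-9])(?:" + "|".join(_SENSITIVE_WORDS) + r")(?![a-z0-9])",
    re.IGNORECASE,
)


def is_sensitive_key(name: str) -> bool:
    """Return True when a log field name looks like it holds a credential."""
    return _SENSITIVE_RE.search(name) is not None
```

A substring check such as `"key" in name` would redact `monkey` and `keyboard`; the boundary-aware pattern avoids those false positives. One known limitation of this sketch is that camelCase names such as `apiKey` are not matched.
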
- Initial release of langgraph-agent-stack template
- `ResearchAgent` and `AnalystAgent` with LangGraph state machines
- `MultiAgentGraph` orchestrator with conditional routing
- FastAPI REST API with SSE streaming
- SQLite / Redis / PostgreSQL checkpointing via `ConversationMemory`
- Security: `InputValidator`, `RateLimiter`, bandit SAST, gitleaks scanning
- CI/CD with GitHub Actions (lint, test, Docker build)
- Docker multi-stage build and docker-compose for local development