Release v0.6.0 — Model abstraction, observability & quantitative eval · addiescode-sj/readmycareer.com

v0.6.0 — Model abstraction, observability & quantitative eval

This release closes the "last-mile" gap in measuring, exposing, and proving the pipeline. It adds a provider-agnostic model layer, end-to-end agent telemetry with an admin view, and content-quality eval metrics, along with a single source of truth for model ids and pricing.

✨ Added

Model-provider abstraction — a provider-agnostic ModelAdapter (agents/lib/model-adapter.ts) with Gemini and OpenAI implementations. The gap-analysis stage can run on OpenAI via the provider field on POST /api/analyze or the MODEL_PROVIDER env var; planning stays on Gemini to preserve the context-cache path.
Single model & pricing registry — model ids are centralized in agents/lib/models.ts (GEMINI_MODEL / OPENAI_MODEL, env-overridable) and per-1M-token prices in config/model-pricing.json, read by both the TS agent layer and the Python eval harness. Setting GEMINI_MODEL once switches in-process agents, app routes, and the spawned MCP skills together.
Agent observability — structured per-stage telemetry (latency, tokens, cache hits, retries, success/failure) emitted as JSON logs and aggregated in-process; metrics persist to a new RLS-enabled agent_runs table (written by the API route — agents never touch the DB) and surface at /admin/observability.
Quantitative eval — gap-analysis recall/precision vs. labeled gaps, run-to-run variance (--repeat N), regression baseline + diff (--save-baseline / --compare-baseline); a Grounding / Citation Rate with per-case source attribution; a cross-model comparison harness; and a README results-table generator.
Korean eval report — each agent-harness run regenerates a human-readable Korean report (documents/agent-eval-report.md); the cost methodology is documented in documents/cost-calculation.ko.md.
Reference-grounding tooling — the career-knowledge-base sync labels reference docs by Drive folder and returns a by_doc_type breakdown, plus a retrieval inspector to view ranked hits per query / doc_type.
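
To make the gap-analysis recall/precision and run-to-run variance metrics above concrete, here is a minimal sketch of how such scores could be computed. It is not the harness's actual code: the function names (`scoreGaps`, `variance`), the exact-match comparison on normalized labels, and the use of population variance are all assumptions made for illustration.

```typescript
// Hypothetical sketch: set-based precision/recall for gap analysis.
// Assumption: a predicted gap counts as correct only if it exactly matches
// a labeled gap after normalization. The real harness may match differently.

interface PrecisionRecall {
  precision: number;
  recall: number;
  truePositives: number;
}

// Normalize labels so casing and stray whitespace do not cause mismatches.
function normalize(label: string): string {
  return label.trim().toLowerCase();
}

// Normalize and de-duplicate while preserving first-seen order.
function unique(labels: string[]): string[] {
  const normalized = labels.map(normalize);
  return normalized.filter((label, i) => normalized.indexOf(label) === i);
}

function scoreGaps(predicted: string[], labeled: string[]): PrecisionRecall {
  const predictedUnique = unique(predicted);
  const labeledUnique = unique(labeled);
  const truePositives = predictedUnique.filter(
    (gap) => labeledUnique.indexOf(gap) !== -1,
  ).length;

  // Treat an empty denominator as a score of 0 instead of NaN.
  const precision =
    predictedUnique.length === 0 ? 0 : truePositives / predictedUnique.length;
  const recall =
    labeledUnique.length === 0 ? 0 : truePositives / labeledUnique.length;
  return { precision, recall, truePositives };
}

// Population variance of a metric across repeated runs (assumed definition).
function variance(values: number[]): number {
  if (values.length === 0) return 0;
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  return (
    values.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / values.length
  );
}
```

With a shape like this, each repeated run would yield one `PrecisionRecall` value per case, and `variance` over the recall (or precision) values would give a run-to-run stability figure that a saved baseline could be compared against.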

🔧 Changed

agents/orchestrator.ts routes all LLM calls through the ModelAdapter interface instead of the inline callGemini helper (context-cache logic moved into the Gemini adapter unchanged); runCareerAnalysis gained optional provider and onMetric parameters.
Admin access control — added an is_admin flag + SECURITY DEFINER public.is_admin() helper; /admin/observability and its route now require the admin role, and agent_runs reads are admin-only via RLS.
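
The routing change described above can be pictured with a small sketch. Only `ModelAdapter`, `provider`, `onMetric`, and the `MODEL_PROVIDER` env var come from these notes; everything else here (`StageMetric`, `StubAdapter`, `resolveProvider`, `runStage`, the method name `generate`) is hypothetical. The sketch also assumes that the request field takes precedence over the env var and that Gemini is the fallback, neither of which is stated in the release.

```typescript
// Illustrative only: not the actual contents of agents/lib/model-adapter.ts.

type Provider = "gemini" | "openai";

// Hypothetical per-stage metric shape passed to onMetric.
interface StageMetric {
  stage: string;
  provider: Provider;
  latencyMs: number;
  success: boolean;
}

// Provider-agnostic interface; `generate` is an assumed method name.
interface ModelAdapter {
  readonly provider: Provider;
  generate(prompt: string): Promise<string>;
}

// Stub standing in for a real SDK-backed Gemini or OpenAI adapter.
class StubAdapter implements ModelAdapter {
  readonly provider: Provider;
  constructor(provider: Provider) {
    this.provider = provider;
  }
  async generate(prompt: string): Promise<string> {
    return `[${this.provider}] ${prompt}`;
  }
}

// Assumption: request field wins over the env value, and anything that is
// not "openai" falls back to "gemini".
function resolveProvider(requested?: string, envValue?: string): Provider {
  const candidate = (requested ?? envValue ?? "gemini").toLowerCase();
  return candidate === "openai" ? "openai" : "gemini";
}

// Runs one stage through an adapter and reports a metric on success or failure.
async function runStage(
  stage: string,
  adapter: ModelAdapter,
  prompt: string,
  onMetric?: (metric: StageMetric) => void,
): Promise<string> {
  const startedAt = Date.now();
  let success = false;
  try {
    const output = await adapter.generate(prompt);
    success = true;
    return output;
  } finally {
    if (onMetric) {
      onMetric({
        stage,
        provider: adapter.provider,
        latencyMs: Date.now() - startedAt,
        success,
      });
    }
  }
}
```

In this picture, a caller would build the gap-analysis adapter from `resolveProvider(body.provider, process.env.MODEL_PROVIDER)` while always passing a Gemini adapter to the planning stage, and an API route could use `onMetric` to collect rows for persistence so that the agent code itself never touches the database.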

Full changelog: see CHANGELOG.md. Compare: v0.5.2...v0.6.0
