
Add evaluation get-started guide (GTM-1941)#2713

Draft
felixkrrr wants to merge 1 commit into main from cursor/GTM-1941-evals-onboarding-guides-8f5c

Conversation

@felixkrrr
Contributor

Summary

Adds a proper Get Started guide for the Evaluation section, addressing GTM-1941. The current evaluation docs have detailed reference pages but lack a clear onboarding path for new users.

What changed

New file: content/docs/evaluation/get-started.mdx

A structured get-started guide that follows the same pattern as the existing observability and prompt-management get-started pages:

  • "Use AI" tab — points users to the Langfuse Skill for agent-assisted setup
  • "Do it yourself" tab — a decision tree with three paths:
    1. Monitor Production — set up LLM-as-a-Judge on live traces (for teams that already have traces flowing)
    2. Test Before Shipping — run experiments with datasets and evaluators via SDK (for teams building/iterating on prompts)
    3. Human Review — set up annotation queues for domain expert review (for teams needing ground truth)
  • Full Python and JS/TS code examples for the experiments path
  • "Next steps" section guiding users to combine methods, build datasets, add to CI, and track trends
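The shape of the "Test Before Shipping" path can be illustrated with a minimal, SDK-agnostic sketch of the experiments loop: run each dataset item through the application, score the output with an evaluator, and aggregate. Note that `run_app` and `exact_match` below are hypothetical stand-ins; the guide itself wires this up with the Langfuse Python and JS/TS SDKs.

```python
def run_app(question: str) -> str:
    # Placeholder for the LLM application under test (hypothetical).
    return question.strip().lower()

def exact_match(output: str, expected: str) -> float:
    # Simplest possible evaluator: 1.0 on an exact match, else 0.0.
    return 1.0 if output == expected else 0.0

def run_experiment(dataset: list[dict]) -> dict:
    # Score every dataset item and aggregate into a single run result.
    scores = [
        exact_match(run_app(item["input"]), item["expected"])
        for item in dataset
    ]
    return {"n": len(scores), "avg_score": sum(scores) / len(scores)}

dataset = [
    {"input": " Paris ", "expected": "paris"},
    {"input": "Berlin", "expected": "berlin"},
    {"input": "Rome", "expected": "roma"},
]
print(run_experiment(dataset))  # {'n': 3, 'avg_score': 0.6666666666666666}
```

In the real guide the dataset, the per-item runs, and the scores are persisted via the SDK so runs can be compared over time rather than printed once.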

Updated files

  • content/docs/evaluation/meta.json — added get-started to the navigation, placed between overview and core-concepts
  • content/docs/meta.json — updated the top-level "Set up Evals" link to point to /docs/evaluation/get-started instead of /docs/evaluation/overview
  • content/docs/evaluation/overview.mdx — updated the "Getting Started" section to reference the new get-started guide

Analysis of current gaps

The evaluation docs currently have:

  • overview.mdx — high-level intro with a brief "Getting Started" section that just lists links
  • core-concepts.mdx — detailed concept explanations
  • 5 evaluation method pages (LLM-as-a-Judge, annotation queues, scores via SDK, scores via UI, score analytics)
  • 4 experiment pages (data model, datasets, experiments via SDK, experiments via UI)

What was missing:

  • A proper onboarding flow that helps users choose the right evaluation method for their situation
  • Quick-start code examples in one place (the experiments-via-sdk page has examples but is 1300+ lines of reference docs)
  • A decision framework for choosing between monitoring, experiments, and human review
  • Consistency with the observability and prompt-management sections that both have dedicated get-started pages
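The missing decision framework maps directly onto the three paths described above. A toy encoding of that mapping (the function name and flags are illustrative, not part of the guide):

```python
def recommend_path(has_live_traces: bool,
                   iterating_on_prompts: bool,
                   needs_ground_truth: bool) -> str:
    # Illustrative encoding of the guide's decision tree:
    # ground truth -> Human Review; live traces -> Monitor Production;
    # otherwise -> Test Before Shipping.
    if needs_ground_truth:
        return "Human Review (annotation queues)"
    if has_live_traces and not iterating_on_prompts:
        return "Monitor Production (LLM-as-a-Judge)"
    return "Test Before Shipping (experiments via SDK)"

print(recommend_path(True, False, False))   # Monitor Production (LLM-as-a-Judge)
print(recommend_path(False, True, False))   # Test Before Shipping (experiments via SDK)
print(recommend_path(False, False, True))   # Human Review (annotation queues)
```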

Linear Issue: GTM-1941


…uation method

- Create content/docs/evaluation/get-started.mdx with three paths:
  Monitor Production (LLM-as-a-Judge), Test Before Shipping (experiments),
  and Human Review (annotation queues)
- Follow same pattern as observability and prompt-management get-started pages
  with AI agent / Do it yourself tabs
- Add to evaluation section navigation (meta.json)
- Update top-level docs sidebar to link to get-started instead of overview
- Update overview.mdx to reference the new get-started page

Co-authored-by: felixkrrr <felixkrrr@users.noreply.github.com>
@vercel

vercel bot commented Mar 25, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| langfuse-docs | Ready | Preview, Comment | Mar 25, 2026 3:48pm |
