Proposal: ship WFGY 16-problem RAG failure labels as an optional Helicone tagging scheme #5612

@onestardao

Hi Helicone team,

Helicone is already a powerful gateway and observability layer for LLM applications. Many of those apps are RAG or RAG+agent pipelines, and users often lack a clean taxonomy to describe why a given request failed.

I maintain the WFGY RAG 16 Problem Map, an open-source project focused on RAG / LLM failure modes and diagnostics.

Repo (MIT):
https://github.com/onestardao/WFGY

Main RAG failure map page:
https://github.com/onestardao/WFGY/tree/main/ProblemMap/README.md

WFGY includes:

  • A 16-label RAG failure taxonomy (retrieval, prompt, structure, infra)
  • A triage prompt that takes a request/trace and assigns one of those labels
  • Structured fix suggestions for each label

The same map is already used or cited by:

  • RAGFlow and LlamaIndex RAG debugging docs
  • ToolUniverse – Harvard MIMS Lab
  • Rankify – University of Innsbruck
  • Multimodal RAG Survey – QCRI LLM Lab
  • Curated lists such as Awesome LLM Apps and Awesome Data Science – academic

Proposal

Add WFGY’s 16-problem map as an optional tagging scheme inside Helicone, for example:

  1. A small recipe / docs section showing how to:
    • Export or sample failing Helicone traces for a RAG app.
    • Run the WFGY triage prompt on each trace and compute a rag_failure_type tag.
    • Push those tags back into Helicone metadata.
  2. Optionally, a template dashboard that:
    • Breaks down requests by rag_failure_type.
    • Helps users see whether they mostly suffer from retrieval issues, prompt design issues, infra issues, etc.
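
To make the recipe concrete, here is a minimal Python sketch of the tag-and-aggregate loop. Everything in it is an assumption for illustration: the trace shape (`request_id`, `error`), the `keyword_triage` stand-in (the real step would call an LLM with the WFGY triage prompt), the four coarse groups used in place of the 16 labels, and the `push` callback. How tags are actually written back (for example via Helicone custom properties) is deliberately left to the caller rather than guessed here.

```python
from collections import Counter
from typing import Callable, Dict, Iterable

# Assumption: coarse groups taken from the four families named in this issue.
# The actual WFGY map defines 16 finer-grained labels.
FAILURE_GROUPS = {"retrieval", "prompt", "structure", "infra"}

Trace = Dict[str, str]

# Hypothetical sample of failing traces; a real run would export these from Helicone.
SAMPLE_TRACES = [
    {"request_id": "req-1", "error": "no documents retrieved"},
    {"request_id": "req-2", "error": "upstream timeout"},
    {"request_id": "req-3", "error": "answer ignored context"},
]


def keyword_triage(trace: Trace) -> str:
    """Toy stand-in for the WFGY triage prompt (which would be an LLM call)."""
    text = trace.get("error", "").lower()
    if "no documents" in text:
        return "retrieval"
    if "timeout" in text:
        return "infra"
    return "unclassified"


def tag_traces(
    traces: Iterable[Trace],
    triage: Callable[[Trace], str],
    push: Callable[[str, Dict[str, str]], None],
) -> Counter:
    """Label each trace, hand the tag to `push`, and count tags per label."""
    counts: Counter = Counter()
    for trace in traces:
        label = triage(trace)
        if label not in FAILURE_GROUPS:
            label = "unknown"  # keep unexpected triage output out of the taxonomy
        push(trace["request_id"], {"rag_failure_type": label})
        counts[label] += 1
    return counts


if __name__ == "__main__":
    pushed: Dict[str, Dict[str, str]] = {}
    breakdown = tag_traces(SAMPLE_TRACES, keyword_triage, pushed.__setitem__)
    print(dict(breakdown))
```

The returned counter is the same breakdown a template dashboard would chart by rag_failure_type. Passing `push` as a callback keeps the sketch independent of any particular write-back mechanism.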

This would give Helicone users a practical, standardized language for RAG failures on top of the existing observability features.

If this sounds aligned with the roadmap, I’d be happy to draft a short guide or example for a PR.
