Suggestion: WFGY as an interpretability-oriented failure map for LLM systems #524

@onestardao

Hi, thanks for this excellent interpretability and responsible ML list.

I would like to suggest WFGY as a resource that sits at the intersection of interpretability, debugging, and evaluation for LLM systems.

Why it is relevant to interpretability:

  • Instead of only asking “why did the model say this?”, WFGY asks “which failure mode of the system is active right now?”.
  • This shifts the focus from opaque model internals to explicit, human-readable categories.

The central artifact:

WFGY 2.0 ProblemMap
https://github.com/onestardao/WFGY/tree/main/ProblemMap

  • 16 failure modes for LLM / RAG pipelines, each with: definition, examples, and diagnostic prompts.
  • Many of them relate directly to interpretability themes, e.g. symbol collapse, entropy collapse, semantic recursion, and multi-agent chaos memory.
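
To make the "definition, examples, diagnostic prompts" structure concrete, here is a minimal sketch of how one entry of such a failure map might be represented and turned into a triage checklist. It is purely illustrative: the `FailureMode` class, its field names, the `triage_checklist` helper, and all placeholder strings are assumptions for this example, not the ProblemMap's actual format or content.

```python
from dataclasses import dataclass, field


@dataclass
class FailureMode:
    """Hypothetical record for one failure-map entry (not the ProblemMap's real schema)."""
    name: str
    definition: str
    examples: list[str] = field(default_factory=list)
    diagnostic_prompts: list[str] = field(default_factory=list)


def triage_checklist(modes: list[FailureMode]) -> list[str]:
    """Flatten every diagnostic prompt into 'mode name: prompt' lines."""
    return [f"{m.name}: {p}" for m in modes for p in m.diagnostic_prompts]


# Placeholder content only; real definitions and prompts live in the ProblemMap.
modes = [
    FailureMode(
        name="entropy collapse",
        definition="<definition from the ProblemMap>",
        examples=["<example>"],
        diagnostic_prompts=["<diagnostic prompt 1>", "<diagnostic prompt 2>"],
    ),
    FailureMode(
        name="semantic recursion",
        definition="<definition from the ProblemMap>",
        diagnostic_prompts=["<diagnostic prompt>"],
    ),
]

for line in triage_checklist(modes):
    print(line)
```

The point of the sketch is only that each category is an explicit, human-readable record, so "which failure mode is active?" becomes a lookup over named entries rather than an inspection of model internals.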

Support material:

  • The WFGY 1.0 PDF gives the mathematical structure and experimental evidence.
  • The WFGY 3.0 Singularity Demo provides 131 high-stakes questions that can serve as a challenging evaluation set.

If you think “system-level interpretability & debugging frameworks” are in scope, WFGY might be a useful addition. I can propose a short entry via PR.
