Suggestion: WFGY as an interpretability-oriented failure map for LLM systems

Hi, thanks for maintaining this excellent list on interpretability and responsible ML.

I would like to suggest WFGY as a resource that sits at the intersection of interpretability, debugging and evaluation for LLM systems.

- Project: WFGY (All Principles Return to One)
- Repo: https://github.com/onestardao/WFGY
- License: MIT
- Stars: ~1.4k

Why it is relevant to interpretability:

- Instead of only asking “why did the model say this?”, WFGY asks “which failure mode of the **system** is active right now?”.  
- This shifts the focus from opaque model internals to explicit, human-readable categories.

The central artifact:

**WFGY 2.0 ProblemMap**  
https://github.com/onestardao/WFGY/tree/main/ProblemMap  

- 16 failure modes for LLM / RAG pipelines, each with: definition, examples, and diagnostic prompts.  
- Several of them relate directly to interpretability themes, e.g. symbol collapse, entropy collapse, semantic recursion, multi-agent chaos memory.
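
To make the shape of an entry concrete, here is a minimal sketch of how one failure mode could be held as a record and flattened into a triage checklist. This is only an illustration: the `FailureMode` class, its field names and the `triage_checklist` helper are hypothetical and are not taken from the WFGY repo, and the definition and prompt strings are placeholders rather than text from the ProblemMap.

```python
from dataclasses import dataclass, field


@dataclass
class FailureMode:
    """Hypothetical record for one failure-map entry (not WFGY's actual schema)."""
    name: str
    definition: str
    examples: list[str] = field(default_factory=list)
    diagnostic_prompts: list[str] = field(default_factory=list)


def triage_checklist(modes: list[FailureMode]) -> list[tuple[str, str]]:
    """Flatten the map into (mode name, diagnostic prompt) pairs for a reviewer."""
    return [(m.name, p) for m in modes for p in m.diagnostic_prompts]


# Placeholder entries: the mode names appear in the list above,
# the definition and prompt strings are stand-ins, not real content.
failure_map = [
    FailureMode(
        name="entropy collapse",
        definition="<definition copied from the map>",
        diagnostic_prompts=["<diagnostic prompt copied from the map>"],
    ),
    FailureMode(name="symbol collapse", definition="<definition copied from the map>"),
]

checklist = triage_checklist(failure_map)
```

A reviewer could walk `checklist` top to bottom when a pipeline misbehaves; entries without diagnostic prompts simply contribute nothing to the list.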

Support material:

- **WFGY 1.0 PDF** gives the mathematical structure and experimental evidence.  
- **WFGY 3.0 Singularity Demo** provides 131 high-stakes questions that can be used as a challenging evaluation set.

If you consider “system-level interpretability & debugging frameworks” to be in scope, WFGY might be a useful addition. I am happy to open a PR with a short entry.


Suggestion: WFGY as an interpretability-oriented failure map for LLM systems #524
