Open
Description
Hi, thanks for this excellent interpretability and responsible ML list.
I would like to suggest WFGY as a resource that sits at the intersection of interpretability, debugging, and evaluation for LLM systems.
- Project: WFGY (All Principles Return to One)
- Repo: https://github.com/onestardao/WFGY
- License: MIT
- Stars: ~1.4k
Why it is relevant to interpretability:
- Instead of only asking “why did the model say this?”, WFGY asks “which failure mode of the system is active right now?”.
- This shifts the focus from opaque model internals to explicit, human-readable categories.
The central artifact:
WFGY 2.0 ProblemMap
https://github.com/onestardao/WFGY/tree/main/ProblemMap
- 16 failure modes for LLM / RAG pipelines, each with: definition, examples, and diagnostic prompts.
- Many of them relate directly to interpretability themes, e.g. symbol collapse, entropy collapse, semantic recursion, multi-agent chaos memory.
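To make the shape of an entry concrete, here is a minimal sketch of how one failure mode (definition, examples, diagnostic prompts) could be represented and turned into a debugging checklist. The `FailureMode` class, the `checklist` helper, and the placeholder strings are my own illustration, not code or text from the WFGY repo; only the two mode names come from the ProblemMap.

```python
from dataclasses import dataclass, field


@dataclass
class FailureMode:
    """Hypothetical record mirroring one ProblemMap entry."""
    name: str
    definition: str
    examples: list[str] = field(default_factory=list)
    diagnostic_prompts: list[str] = field(default_factory=list)


def checklist(modes: list[FailureMode]) -> list[tuple[str, str]]:
    """Flatten all diagnostic prompts into (mode name, prompt) pairs."""
    return [(m.name, p) for m in modes for p in m.diagnostic_prompts]


# Placeholder content only; real definitions and prompts live in the ProblemMap.
modes = [
    FailureMode(
        name="symbol collapse",
        definition="<definition from the ProblemMap entry>",
        diagnostic_prompts=["<diagnostic prompt A>", "<diagnostic prompt B>"],
    ),
    FailureMode(
        name="entropy collapse",
        definition="<definition from the ProblemMap entry>",
        diagnostic_prompts=["<diagnostic prompt C>"],
    ),
]

for mode_name, prompt in checklist(modes):
    print(f"[{mode_name}] {prompt}")
```

The point of the flat checklist is that a reviewer can walk through it against a misbehaving pipeline and record which named mode matches, rather than inspecting model internals.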
Support material:
- WFGY 1.0 PDF gives the mathematical structure and experimental evidence.
- WFGY 3.0 Singularity Demo provides 131 high-stakes questions that can be used as a challenging evaluation set.
If you think “system-level interpretability & debugging frameworks” are in scope, WFGY might be a useful addition. I am happy to open a PR with a short entry.