Commit 617f82e

Update README.md
1 parent 5d9f891 commit 617f82e

1 file changed

Lines changed: 2 additions & 0 deletions

README.md

@@ -3,7 +3,9 @@
 ![xFail banner](assets/X.Fail.png)
 
 A focused evaluation harness built to expose the real failure modes of LLM code reasoning. This isn’t a pass/fail scoreboard; it’s a diagnostic layer for models that are pretending to understand requirements.
+
 ![version](https://img.shields.io/badge/version-v0.0.4-orange)
+
 ## Why xFail?
 
 Benchmarks like HumanEval, MBPP, and SWE-Bench measure surface accuracy. xFail is designed to classify failure behavior and tie it to concrete model breakdowns.

0 commit comments
