[Research] Add Cleanlab to Evaluation, Benchmarks & Datasets #123

Merged
alvinreal merged 1 commit into main from research/add-cleanlab-2026-04-06
Apr 7, 2026

@alvinreal (Owner)

Project: Cleanlab

Elite Criteria Checklist (ALL Required)

  • Elite Criteria: ALL criteria met
    • ⭐ Stars: 11,410 (threshold: 1000+)
    • 🔄 Active: 2026-01-13 (within 6 months)
    • 🏭 Production: Used by ML teams at Google, Amazon, Microsoft, Tesla, and many research labs for dataset quality assurance
    • 📚 Quality: Comprehensive docs, extensive test suite, regular releases, peer-reviewed research backing

Evidence of Production Usage

  • Used by teams at Google, Amazon, Microsoft, Tesla, and 1000+ other organizations
  • Over 11M downloads on PyPI
  • Peer-reviewed research published in conferences (ICML, NeurIPS, ICLR)
  • Official examples repo: https://github.com/cleanlab/examples

Why This Belongs in Elite Tier

Cleanlab is the standard data-centric AI package for finding and fixing issues in ML datasets. It automatically detects:

  • Label errors (mislabeled examples)
  • Outliers and anomalies
  • Ambiguous or near-duplicate examples
  • Systematic quality issues in datasets

It's actively maintained by a team with academic backing from MIT and has production adoption at major tech companies.
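To make the "label errors" item above concrete, here is a minimal sketch of the general idea behind flagging likely mislabeled examples from a model's predicted probabilities. It is a simplified illustration in plain Python, not Cleanlab's actual implementation or API: the function names `class_thresholds` and `flag_label_issues` and the toy data are hypothetical. The assumption is that an example is suspicious when the model is confident in some class (its probability meets that class's average self-confidence) and that class differs from the given label.

```python
from typing import List, Sequence


def class_thresholds(labels: Sequence[int],
                     pred_probs: Sequence[Sequence[float]],
                     num_classes: int) -> List[float]:
    """Per-class threshold: mean predicted probability of class k
    over the examples whose given label is k."""
    sums = [0.0] * num_classes
    counts = [0] * num_classes
    for y, probs in zip(labels, pred_probs):
        sums[y] += probs[y]
        counts[y] += 1
    # A class with no labeled examples gets threshold 1.0 (hypothetical choice).
    return [sums[k] / counts[k] if counts[k] else 1.0
            for k in range(num_classes)]


def flag_label_issues(labels: Sequence[int],
                      pred_probs: Sequence[Sequence[float]]) -> List[int]:
    """Return indices whose given label disagrees with the class the
    model is confident about (simplified, illustrative rule)."""
    num_classes = len(pred_probs[0])
    thresholds = class_thresholds(labels, pred_probs, num_classes)
    issues = []
    for i, (y, probs) in enumerate(zip(labels, pred_probs)):
        # Classes whose probability meets that class's threshold.
        confident = [k for k in range(num_classes) if probs[k] >= thresholds[k]]
        if confident:
            best = max(confident, key=lambda k: probs[k])
            if best != y:
                issues.append(i)
    return issues


# Toy data (made up for illustration): example 2 is labeled 0,
# but the model strongly predicts class 1.
demo_labels = [0, 0, 0, 1, 1, 1]
demo_pred_probs = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.10, 0.90],
    [0.20, 0.80],
    [0.10, 0.90],
    [0.15, 0.85],
]

print(flag_label_issues(demo_labels, demo_pred_probs))
```

In practice the predicted probabilities should come from out-of-sample predictions (for example, cross-validation), otherwise a model that has memorized noisy labels will rarely disagree with them.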

Category

📈 9. Evaluation, Benchmarks & Datasets - High-quality Open Datasets & Data Tools


Automated research loop contribution for category: Evaluation & Benchmarks

- Cleanlab: Data-centric AI package for finding and fixing dataset issues
- 11,410 stars, Apache-2.0 licensed, actively maintained
- Detects label errors, outliers, and ambiguous examples

@alvinreal merged commit e365a4f into main on Apr 7, 2026
2 checks passed
@alvinreal deleted the research/add-cleanlab-2026-04-06 branch on April 7, 2026, 08:24