    Repositories list

    • al-qasida

      Public
      Python
      0100 Updated Dec 27, 2025
    • Python
      3400 Updated Dec 15, 2025
    • HTML
      0000 Updated Dec 10, 2025
    • MedExpert

      Public
      Code for the "MedExpert: An Expert-Annotated Dataset for Medical Chatbot Evaluation" paper at Machine Learning for Health (ML4H) 2025.
      Python
      0600 Updated Dec 3, 2025
    • Python
      4401 Updated Nov 27, 2025
    • Essential code for the paper *Genomic Next-Token Predictors are In-Context Learners*.
      Python
      0100 Updated Nov 16, 2025
    • Python
      0100 Updated Nov 5, 2025
    • mmBERT

      Public
      A massively multilingual modern encoder language model
      Python
      911710 Updated Oct 13, 2025
    • NSF CCRI ENS Project: Next Generation Tools for Spoken Language Science & Technology
      1000 Updated Oct 1, 2025
    • Python
      0200 Updated Sep 23, 2025
    • Code and data for the paper: "Hell or High Water: Evaluating Agentic Recovery from External Failures"
      Python
      0600 Updated Aug 14, 2025
    • Jupyter Notebook
      0200 Updated Aug 11, 2025
    • 0200 Updated Aug 6, 2025
    • State-of-the-art paired encoder and decoder models (17M-1B params)
      Python
      35301 Updated Aug 6, 2025
    • Code for the paper "FEEDBACK FRICTION: LLMs Struggle to Fully Incorporate External Feedback" (https://arxiv.org/pdf/2506.11930)
      Python
      0800 Updated Jun 16, 2025
    • Python
      1100 Updated Jun 12, 2025
    • NeoCoder

      Public
      Official implementation of our paper "Benchmarking Language Model Creativity: A Case Study on Code Generation"
      Python
      41000 Updated May 16, 2025
    • Bringing BERT into modernity via both architecture changes and scaling
      Python
      136000 Updated Apr 22, 2025
    • Code and dataset for the paper: Can LLMs Generate Tabular Summaries of Science Papers? Rethinking the Evaluation Protocol (https://arxiv.org/pdf/2504.10284)
      Python
      0120 Updated Apr 21, 2025
    • Code for paper "Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data"
      Python
      0300 Updated Apr 21, 2025
    • The repo for the paper "CLAIMCHECK: How Grounded are LLM Critiques of Scientific Papers?"
      0000 Updated Apr 20, 2025
    • Web Agent Arena
      HTML
      0000 Updated Apr 10, 2025
    • Python
      8600 Updated Apr 7, 2025
    • Python
      1010 Updated Feb 15, 2025
    • This project focuses on curating a robust analogical reasoning dataset for research and development.
      Python
      2500 Updated Dec 18, 2024
    • Web-grounded natural language instructions
      HTML
      61730 Updated Nov 25, 2024
    • Code for "RATIONALYST: Pre-training Process-Supervision for Improving Reasoning" (https://arxiv.org/pdf/2410.01044)
      Python
      53500 Updated Oct 3, 2024
    • wikicite

      Public
      Python
      0100 Updated Jul 29, 2024
    • Kreyol-MT

      Public
      Python
      0600 Updated Jun 1, 2024
    • Scripts and docs that help us run cost-effective experiments with OpenAI APIs
      Python
      2400 Updated May 28, 2024