    Repositories list

    • SE-Bench

      Public
      Official repo for "SE-Bench: Benchmarking Self-Evolution with Knowledge Internalization"
      Python
      MIT License
      31940 Updated Feb 24, 2026
    • JustRL

      Public
      [ICLR 2026 Blogpost Track Poster] JustRL: Scaling a 1.5B LLM with a Simple RL Recipe
      Python
      1025010 Updated Feb 23, 2026
    • NOSA

      Public
      The official implementation of NOSA
      Python
      01500 Updated Feb 12, 2026
    • An LLM-based agent that predicts its tasks proactively.
      Python
      Apache License 2.0
      3949950 Updated Feb 10, 2026
    • Code and models for the paper: Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts
      Python
      22410 Updated Feb 4, 2026
    • APB

      Public
      Official Implementation of APB (ACL 2025 main Oral) and Spava.
      C++
      43400 Updated Jan 30, 2026
    • ACDiT

      Public
      ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer
      Python
      MIT License
      14120 Updated Jan 29, 2026
    • KG-Infused-RAG

      Public
      Official implementation for the paper "KG-Infused RAG: Augmenting Corpus-Based RAG with External Knowledge Graphs"
      Python
      22100 Updated Jan 18, 2026
    • H-Neurons

      Public
      The official implementation of the paper: H-Neurons: On the Existence, Impact, and Origin of Hallucination-Associated Neurons in LLMs
      Python
      MIT License
      02100 Updated Jan 14, 2026
    • BlockFFN

      Public
      Source code for the paper "BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity".
      Python
      51800 Updated Jan 10, 2026
    • LLaVA-UHD

      Public
      LLaVA-UHD v3: Progressive Visual Compression for Efficient Native-Resolution Encoding in MLLMs
      Python
      Apache License 2.0
      2141570 Updated Dec 20, 2025
    • [ACL'25 Main] ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation
      Python
      47720 Updated Dec 8, 2025
    • Python
      Apache License 2.0
      6286300 Updated Nov 6, 2025
    • StateX

      Public
      The official implementation of the paper "StateX: Enhancing RNN Recall via Post-training State Expansion".
      Python
      0300 Updated Oct 24, 2025
    • AgentRM

      Public
      [ACL 2025 main] AgentRM: Enhancing Agent Generalization with Reward Modeling
      Python
      0600 Updated Sep 29, 2025
    • The code for the paper "Stuffed Mamba: Oversized States Lead to the Inability to Forget"
      Python
      0100 Updated Sep 28, 2025
    • BurstEngine is an efficient framework designed to train LLMs on long-sequence data.
      Python
      3900 Updated Sep 25, 2025
    • The code for the paper "Cost-Optimal Grouped-Query Attention for Long-Context Modeling"
      Python
      1410 Updated Sep 14, 2025
    • SIR-Bench

      Public
      Python
      Apache License 2.0
      0510 Updated Sep 12, 2025
    • Seq1F1B

      Public
      Sequence-level 1F1B schedule for LLMs.
      Python
      Other
      3.6k3810 Updated Aug 26, 2025
    • FR-Spec

      Public
      [ACL 2025 main] FR-Spec: Frequency-Ranked Speculative Sampling
      C++
      25130 Updated Jul 15, 2025
    • TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators
      Python
      Apache License 2.0
      1311541 Updated Jun 14, 2025
    • Python
      0900 Updated Jun 11, 2025
    • DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding
      Python
      MIT License
      16610 Updated Jun 10, 2025
    • Must-read Papers on Textual Adversarial Attack and Defense
      Python
      MIT License
      1941.6k30 Updated Jun 4, 2025
    • Python
      MIT License
      0000 Updated May 28, 2025
    • DIET

      Public
      Official code for "The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training"
      Python
      0100 Updated May 27, 2025
    • Migician

      Public
      [ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models
      Python
      MIT License
      48910 Updated May 20, 2025
    • ToLeaP

      Public
      Python
      MIT License
      1511 Updated May 17, 2025
    • SICOG

      Public
      Will Pre-Training Ever End? A First Step Toward Next-Generation Foundation MLLMs via Self-Improving Systematic Cognition
      Python
      GNU General Public License v3.0
      23110 Updated May 14, 2025