THUDM repositories

slime

Public

slime is an LLM post-training framework for RL Scaling.

Python

•

Apache License 2.0

•428•3.4k•115•41•Updated

Jan 19, 2026

AgentRL

Public

Scaling Agentic Reinforcement Learning with a Multi-Turn, Multi-Task Framework

Python

•

MIT License

•10•189•7•0•Updated

Jan 17, 2026

CaRR

Public

This repository contains the code and data for the paper "Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards".

Python

•

MIT License

•3•45•1•0•Updated

Jan 12, 2026

MobileRL

Public

Python

•

MIT License

•6•52•1•0•Updated

Dec 23, 2025

AgentBench

Public

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

gpt-4 llm chatgpt llm-agent

Python

•

Apache License 2.0

•222•3.1k•57•8•Updated

Nov 17, 2025

ComputerRL

Public

Python

•

Apache License 2.0

•5•12•3•0•Updated

Nov 7, 2025

PETra

Public

Python

•0•2•0•0•Updated

Nov 5, 2025

AlignBench

Public

A multi-dimensional Chinese alignment evaluation benchmark for large language models (ACL 2024)

large-language-models llm chatgpt chatglm

Python

•30•421•15•0•Updated

Oct 25, 2025

LLM4CardGame

Public

Python

•1•10•2•0•Updated

Oct 15, 2025

DeepDive

Public

DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL

Python

•19•232•2•0•Updated

Oct 2, 2025

TDRM

Public

Python

•

Apache License 2.0

•1•9•0•0•Updated

Sep 25, 2025

ReST-RL

Public

Reinforcing LLM Reasoning through Self-Training and Value-Guided Decoding

Python

•

MIT License

•0•13•0•0•Updated

Sep 18, 2025

INFTY

Public

INFTY Engine: An Optimization Toolkit to Support Continual AI

Python

•

MIT License

•9•566•0•0•Updated

Sep 13, 2025

DataSciBench

Public

DataSciBench: An LLM Agent Benchmark for Data Science

Python

•5•49•0•0•Updated

Sep 1, 2025

Android-Lab

Public

Python

•

MIT License

•21•287•2•0•Updated

Aug 18, 2025

SWE-Dev

Public

[ACL25' Findings] SWE-Dev is an SWE agent with a scalable test case construction pipeline.

Python

•

MIT License

•0•57•1•0•Updated

Jul 21, 2025

z-ai-sdk-typescript

Public

TypeScript SDK for Z.ai - Not yet released.

TypeScript

•

MIT License

•1•6•1•0•Updated

Jul 17, 2025

BiPro

Public

Code and data for the paper: BIPro: Zero-shot Chinese Poem Generation via Block Inverse Prompting Constrained Generation Framework (ACL 2025 main)

Python

•0•6•0•0•Updated

Jun 28, 2025

LongWriter

Public

[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

fine-tuning llm long-context long-text

Python

•

Apache License 2.0

•184•1.8k•28•2•Updated

Jun 24, 2025

TreeRL

Public

TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25

Python

•

Apache License 2.0

•8•86•4•0•Updated

Jun 16, 2025

WebRL

Public

Building Open LLM Web Agents with Self-Evolving Online Curriculum RL

Python

•31•495•0•0•Updated

Jun 6, 2025

AndroidGen

Public

Python

•

Apache License 2.0

•1•11•1•0•Updated

May 29, 2025

AlignMMBench

Public

code, data and model for Paper: AlignMMBench: Evaluating Chinese Multimodal Alignment in Large Vision-Language Models (ACL'25 main)

Python

•2•5•1•0•Updated

May 20, 2025

CogKit

Public

Finetuning and inference tools for the CogView4 and CogVideoX model series.

finetune video-generation text2cogvideox

Python

•

Apache License 2.0

•13•112•18•1•Updated

May 14, 2025

VisualAgentBench

Public

Towards Large Multimodal Models as Visual Foundation Agents

gpt llm-agent multimodal-large-language-models

Python

•

Apache License 2.0

•9•249•16•0•Updated

Apr 24, 2025

MoELoRA_Riemannian

Public

Source code of paper: A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models. (ICML 2025)

Python

•1•35•0•0•Updated

Apr 2, 2025

Awesome-Parameter-Efficient-Fine-Tuning-for-Foundation-Models

Public

Parameter-Efficient Fine-Tuning for Foundation Models

3•106•1•0•Updated

Mar 31, 2025

WebGLM

Public

WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)

llm chatgpt rlhf webglm

Python

•

Apache License 2.0

•136•1.6k•51•1•Updated

Mar 25, 2025

WhoIsWho

Public

KDD'23 Web-Scale Academic Name Disambiguation: the WhoIsWho Benchmark, Leaderboard, and Toolkit

data-mining name-disambiguation academic-graph

Python

•17•47•6•0•Updated

Mar 19, 2025

scholar-profiling

Public

Jupyter Notebook

•1•18•5•0•Updated

Feb 24, 2025

THUKEG

126 repositories
