
RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling


News

  • [2026.01.26] Our paper has been accepted by ICLR 2026 ✨.
  • [2025.01.26] Our project introduction has been featured on DeepWiki.
  • [2025.06.11] Our announcement post on X (formerly Twitter) received many likes.
  • [2025.06.11] Our paper was featured as #3 on Hugging Face Daily Papers.

Overview

Reinforced Rule-based Reasoning (RuleReasoner) is a simple yet effective method that enables small reasoning models (SRMs) to learn rule-based reasoning. Rather than relying on the complex training pipelines of larger models, RuleReasoner trains on a curated collection of rule-based tasks and applies domain-aware dynamic sampling, reweighting each domain's sampling probability according to historical performance. This allows SRMs to outperform frontier Large Reasoning Models (LRMs) by +4.1% on in-distribution tasks and +10.4% on out-of-distribution tasks, while also being more computationally efficient.

  • Domain-aware dynamic sampling improves training sample efficiency and balances performance across domains.
  • Comprehensive data curation produces data curricula for rule-centric applications.
  • RuleReasoner (8B and 4B) achieves performance comparable to a wide range of baselines.
  • RuleReasoner (8B and 4B) also achieves strong OOD performance across three benchmarks (subsets of rule-based reasoning): BBH, ProverQA, and BBEH.
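The sampling idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: it assumes each domain's sampling weight grows as its historical reward falls, so harder domains are trained on more often. All names (`DomainAwareSampler`, `temperature`, etc.) are hypothetical.

```python
import random
from collections import deque


class DomainAwareSampler:
    """Sketch of domain-aware dynamic sampling: domains with lower
    historical reward receive proportionally more training samples."""

    def __init__(self, domains, history_size=100, temperature=1.0):
        self.domains = list(domains)
        # Keep a sliding window of recent rewards per domain.
        self.history = {d: deque(maxlen=history_size) for d in self.domains}
        self.temperature = temperature

    def record(self, domain, reward):
        # Log the reward observed for a rollout from this domain.
        self.history[domain].append(reward)

    def weights(self):
        # Weight each domain by (1 - mean reward): lower-performing domains
        # get a higher weight. Domains with no history default to 0.5.
        raw = []
        for d in self.domains:
            h = self.history[d]
            mean = sum(h) / len(h) if h else 0.5
            raw.append((1.0 - mean) ** (1.0 / self.temperature))
        total = sum(raw)
        return [w / total for w in raw]

    def sample(self, k):
        # Draw k domains (with replacement) according to current weights.
        return random.choices(self.domains, weights=self.weights(), k=k)
```

Lowering `temperature` sharpens the distribution toward the hardest domains; raising it flattens sampling back toward uniform.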


Quick Start

Prerequisites

Running RuleReasoner requires the dependencies listed in requirements.txt.

Installation

Build RuleReasoner from the source and install dependencies:

  1. Clone the repository:

    git clone https://github.com/bigai-nlco/RuleReasoner.git
  2. Navigate to the project directory:

    cd RuleReasoner
  3. Install the dependencies:

    pip install -r requirements.txt
    pip install -e ./verl
    pip install -e .

Training

Run the training with:

./scripts/train/train_mix.sh

Evaluation

Run the evaluation with:

./scripts/eval/eval_model.sh \
    --model $MODEL_PATH \
    --datasets $DATASET_PATH \
    --output-dir $OUTPUT_DIR

Project Structure

└── RuleReasoner
    ├── LICENSE
    ├── README.md
    ├── requirements.txt
    ├── scripts
    │   ├── build_dataset.py
    │   ├── data
    │   ├── eval
    │   └── train
    ├── setup.py
    ├── src
    │   ├── __init__.py
    │   ├── data
    │   ├── globals.py
    │   ├── system_prompts.py
    │   └── utils.py
    └── verl
        └── ...

Contributing

  • Discussions: Share your insights, provide feedback, or ask questions in the discussions section.
  • Issues: Submit bugs found or log feature requests for the RuleReasoner project in the issues section.
  • Pull Requests: Review open PRs, and submit your own PRs.
Contributing Guidelines
  1. Fork the Repository: Start by forking the project repository to your own account.
  2. Clone Locally: Clone the forked repository to your local machine.
    git clone https://github.com/<YOUR-USERNAME>/RuleReasoner.git
  3. Create a New Branch: Always work on a new branch, giving it a descriptive name.
    git checkout -b new-feature-x
  4. Make Your Changes: Develop and test your changes locally.
  5. Commit Your Changes: Commit with a clear message describing your updates.
    git commit -m 'Implemented new feature x.'
  6. Push to Your Fork: Push the changes to your forked repository.
    git push origin new-feature-x
  7. Submit a Pull Request: Create a PR against the original project repository. Clearly describe the changes and their motivations.
  8. Review: Once your PR is reviewed and approved, it will be merged into the main branch.

License

RuleReasoner is distributed under the terms of the MIT License.

Citation

@inproceedings{
    liu2026rulereasoner,
    title={RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling},
    author={Yang Liu and Jiaqi Li and Zilong Zheng},
    booktitle={The Fourteenth International Conference on Learning Representations},
    year={2026},
    url={https://openreview.net/forum?id=MQV4TJyqnb}
}
