- [2026.01.26] Our paper has been accepted by ICLR 2026 ✨.
- [2025.01.26] Our project introduction has been featured on DeepWiki.
- [2025.06.11] Our post on X (formerly Twitter) has received many likes.
- [2025.06.11] We were featured as HuggingFace Daily Paper #3.
Reinforced Rule-based Reasoning (RuleReasoner) is a simple yet effective method that enables small reasoning models (SRMs) to learn rule-based reasoning. Instead of relying on the complex training pipelines used for large models, RuleReasoner combines a curated collection of rule-based tasks with domain-aware dynamic sampling, which adjusts each domain's sampling probability based on historical performance. With this approach, SRMs outperform frontier large reasoning models (LRMs) by +4.1% on in-distribution tasks and +10.4% on out-of-distribution tasks, while also being more computationally efficient.
- Domain-aware dynamic sampling improves training sample efficiency and balances performance across domains.
- Comprehensive data curation supports data curricula for rule-centric applications.
- RuleReasoner (8B and 4B) achieves performance comparable to a wide range of baselines.
- RuleReasoner (8B and 4B) also achieves strong OOD performance across three benchmarks (subsets of rule-based reasoning): BBH, ProverQA, and BBEH.
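The domain-aware dynamic sampling idea can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function names, the `floor` parameter, and the use of recent per-domain accuracy as the performance signal are assumptions made for the example. The intuition is that domains the model currently answers poorly are sampled more often in the next training batch.

```python
import random

def domain_sampling_weights(history_acc, floor=0.05):
    """Compute per-domain sampling weights from historical accuracy.

    Domains with low recent accuracy (currently harder for the model)
    receive proportionally more probability mass; `floor` keeps every
    domain sampled occasionally even once it is mastered.
    """
    raw = {d: max(1.0 - acc, floor) for d, acc in history_acc.items()}
    total = sum(raw.values())
    return {d: w / total for d, w in raw.items()}

def sample_domain(history_acc, rng=random):
    """Draw one domain according to the dynamic weights."""
    weights = domain_sampling_weights(history_acc)
    domains = list(weights)
    return rng.choices(domains, weights=[weights[d] for d in domains], k=1)[0]

# Hypothetical accuracies: the model does well on "syllogisms" but poorly
# on "deontic_logic", so the latter gets the largest sampling weight.
acc = {"syllogisms": 0.9, "first_order_logic": 0.6, "deontic_logic": 0.3}
weights = domain_sampling_weights(acc)
```

As training progresses and a domain's accuracy rises, its weight shrinks, shifting compute toward the domains that still need it.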
Running RuleReasoner requires the dependencies listed in `requirements.txt`.
Build RuleReasoner from the source and install dependencies:

- Clone the repository:

  ```shell
  git clone https://github.com/bigai-nlco/RuleReasoner.git
  ```

- Navigate to the project directory:

  ```shell
  cd RuleReasoner
  ```

- Install the dependencies:

  ```shell
  pip install -r requirements.txt
  pip install -e ./verl
  pip install -e .
  ```
Run the training with:

```shell
./scripts/train/train_mix.sh
```

Run the evaluation with:

```shell
./scripts/eval/eval_model.sh \
    --model $MODEL_PATH \
    --datasets $DATASET_PATH \
    --output-dir $OUTPUT_DIR
```

```
└── RuleReasoner
    ├── LICENSE
    ├── README.md
    ├── requirements.txt
    ├── scripts
    │   ├── build_dataset.py
    │   ├── data
    │   ├── eval
    │   └── train
    ├── setup.py
    ├── src
    │   ├── __init__.py
    │   ├── data
    │   ├── globals.py
    │   ├── system_prompts.py
    │   └── utils.py
    └── verl
        └── ...
```

- Discussions: Share your insights, provide feedback, or ask questions in the discussions section.
- Issues: Submit bugs found or log feature requests for the RuleReasoner project in the issues section.
- Pull Requests: Review open PRs, and submit your own PRs.
Contributing Guidelines
- Fork the Repository: Start by forking the project repository to your own account.
- Clone Locally: Clone the forked repository to your local machine.
  ```shell
  git clone https://github.com/<YOUR-USERNAME>/RuleReasoner.git
  ```
- Create a New Branch: Always work on a new branch, giving it a descriptive name.
  ```shell
  git checkout -b new-feature-x
  ```
- Make Your Changes: Develop and test your changes locally.
- Commit Your Changes: Commit with a clear message describing your updates.
  ```shell
  git commit -m 'Implemented new feature x.'
  ```

- Push to Your Fork: Push the changes to your forked repository.

  ```shell
  git push origin new-feature-x
  ```
- Submit a Pull Request: Create a PR against the original project repository. Clearly describe the changes and their motivations.
- Review: Once your PR is reviewed and approved, it will be merged into the main branch.
RuleReasoner is distributed under the terms of the MIT License.
```bibtex
@inproceedings{liu2026rulereasoner,
  title={RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling},
  author={Yang Liu and Jiaqi Li and Zilong Zheng},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=MQV4TJyqnb}
}
```