Code for the paper (NeurIPS 2025 Oral) "Breaking the Performance Ceiling in Reinforcement Learning requires Inference Strategies"

instadeepai/rl-inference-strategies

Breaking the Performance Ceiling in Reinforcement Learning requires Inference Strategies

This repository contains the official code for the NeurIPS 2025 Oral paper "Breaking the Performance Ceiling in Reinforcement Learning Requires Inference Strategies". The code is built as an extension of the Mava codebase.

⚙️ Setup

💾 Downloading pre-trained checkpoints

Before running the experiments, download the pre-trained model checkpoints from the following link: Download all_checkpoints.zip (~420MB). Once downloaded, extract the contents into the root directory of the code and delete the zip file:

unzip all_checkpoints.zip
rm all_checkpoints.zip

After extraction, the code directory should look like this:

RL-INFERENCE-STRATEGIES/
├── all_checkpoints/
├── base_policy_hyperparameters/
├── inference_configurations/
├── ...
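
To confirm the extraction landed in the right place, a small check like the following can be run from the repository root. This is a minimal sketch and not part of the repository: the helper name `missing_dirs` is hypothetical, and it only tests for the three directories named in the layout above, not for the contents of the checkpoints.

```python
from pathlib import Path

# Directory names taken from the layout shown above.
EXPECTED_DIRS = (
    "all_checkpoints",
    "base_policy_hyperparameters",
    "inference_configurations",
)


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected directory names that are absent under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Checkpoint layout looks complete.")
```

If `all_checkpoints` is reported as missing, the archive was probably extracted somewhere other than the root directory of the code.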

📦 Installing dependencies

We strongly recommend using uv and Python 3.12, but any other virtual environment manager can be used in a similar way.

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create a virtual environment and install all dependencies
uv sync

# If you have a GPU, install the CUDA version of JAX
uv sync --extra cuda12

# Activate the virtual environment
source .venv/bin/activate
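
Once the environment is active, it can be useful to check which backend JAX has picked up, especially after installing the CUDA extra. The snippet below is a minimal sketch and not part of the repository: `describe_backend` is a hypothetical helper that relies only on the public `jax.default_backend()` and `jax.devices()` calls.

```python
def describe_backend():
    """Return a one-line summary of the JAX backend visible in this environment."""
    try:
        import jax
    except ImportError:
        # JAX is installed by `uv sync`; reaching this branch suggests the
        # virtual environment is not active.
        return "JAX is not installed in this environment"
    backend = jax.default_backend()  # e.g. "cpu", "gpu", or "tpu"
    n_devices = len(jax.devices())
    return f"JAX {jax.__version__} on {backend} with {n_devices} device(s)"


if __name__ == "__main__":
    print(describe_backend())
```

If the summary reports `cpu` on a machine with a GPU, the CUDA extra is likely not installed or the driver is not visible to JAX.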

🧪 Run experiments

We provide launcher scripts to reproduce all the results from the paper.

🧠 Train base policies

To retrain all base policies with the parameters used in the paper, please run

python experiment_launch_scripts/train_base_policies.py

🧭 Train COMPASS policies

To retrain all COMPASS policies with the parameters used in the paper, please run

python experiment_launch_scripts/train_compass_policies.py

📈 Run inference strategy experiments

To run all stochastic evaluation experiments, please run

python experiment_launch_scripts/eval_stochastic.py

To run all COMPASS experiments, please run

python experiment_launch_scripts/eval_compass_cmaes.py

To run all SGBS experiments, please run

python experiment_launch_scripts/eval_sgbs.py

To run all online fine-tuning experiments, please run

python experiment_launch_scripts/eval_finetuning.py
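
To run the four evaluation launchers back to back, a small wrapper along these lines could be used. This is a sketch under assumptions, not something shipped with the repository: `build_commands` and `run_all` are hypothetical helpers, the script paths are the ones listed above, and the wrapper assumes it is started from the repository root with the virtual environment active. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Script paths as listed in this README.
EVAL_SCRIPTS = (
    "experiment_launch_scripts/eval_stochastic.py",
    "experiment_launch_scripts/eval_compass_cmaes.py",
    "experiment_launch_scripts/eval_sgbs.py",
    "experiment_launch_scripts/eval_finetuning.py",
)


def build_commands(scripts=EVAL_SCRIPTS, python=sys.executable):
    """Build one argv list per launcher script, using the given interpreter."""
    return [[python, script] for script in scripts]


def run_all(scripts=EVAL_SCRIPTS, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for cmd in build_commands(scripts):
        print("$", " ".join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing script.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` executes the launchers one after another. Running them sequentially keeps a single accelerator from being shared between experiments, at the cost of a longer total runtime.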

📚 Citing

If you build on this work, please cite our paper:

@inproceedings{
    chalumeau2025breaking,
    title={Breaking the Performance Ceiling in Reinforcement Learning requires Inference Strategies},
    author={Felix Chalumeau and Daniel Rajaonarivonivelomanantsoa and Ruan John de Kock and Juan Claude Formanek and Sasha Abramowitz and Omayma Mahjoub and Wiem Khlifi and Simon Verster Du Toit and Louay Ben Nessir and Refiloe Shabe and Arnol Manuel Fokam and Siddarth Singh and Ulrich Armel Mbou Sob and Arnu Pretorius},
    booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
    year={2025},
    url={https://openreview.net/forum?id=RxkCwOKVKa}
}

🙏 Acknowledgements

This work was supported with Cloud TPUs from Google's TPU Research Cloud (TRC) 🌤.
