IIG-RL-Benchmark

IIG-RL-Benchmark is a library for running game theoretical and deep RL algorithms on OpenSpiel games. Furthermore, we compute exact exploitability using the exp-a-spiel library which currently supports Phantom Tic-Tac-Toe and 3x3 Dark Hex, as well as their abrupt versions.

Paper: arxiv.org/abs/2502.08938

Play against RL: Online demo

Citation

@article{rudolph2025reevaluating,
  title={Reevaluating Policy Gradient Methods for Imperfect-Information Games},
  author={Rudolph, Max and Lichtlé, Nathan and Mohammadpour, Sobhan and Bayen, Alexandre and Kolter, J. Zico and Zhang, Amy and Farina, Gabriele and Vinitsky, Eugene and Sokota, Samuel},
  journal={arXiv preprint arXiv:2502.08938},
  year={2025}
}

Installation

Ensure that your conda is not activated to prevent spawning dual virtual environments.

# Download this repository (IIG-RL-Benchmark) and install dependencies
git clone https://github.com/nathanlct/IIG-RL-Benchmark.git
cd IIG-RL-Benchmark

# Update the exp-a-spiel (eas) dependency
git submodule init
git submodule update

# This will install eas and its dependencies. Make sure to do this from the top level repo.
pixi install
pixi shell

# Install our custom OpenSpiel (with fixes fixes for PTTT and DH3 and the addition of abrupt PTTT) (our default is python = 3.11)
# If your system isn't linux or you're not on python 3.11, choose the variant for your system (https://github.com/nathanlct/open_spiel/releases)
pip uninstall open_spiel -y
pip install https://github.com/nathanlct/open_spiel/releases/download/v1.pttt/open_spiel-1.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

# Lastly, install the other requirements
pip install -r requirements.txt

Installing exp-a-spiel on MacOS

The exp-a-spiel dependency (which is only required to compute exact exploitability of policies) supports installation on Linux. To install on OSX, use the following branch in exp-a-spiel (which has not been extensively tested): osx_install (link).

(Optional alternative) Installing exp-a-spiel via `pip`

cd IIG-RL-Benchmark
pip install ./eas

Usage

Train a PPO agent on Phantom Tic-Tac-Toe for 10M steps with an entropy coefficient of 0.05 and no exploitability computation.

python main.py algorithm=ppo game=classical_phantom_ttt max_steps=10000000 algorithm.ent_coef=0.05 compute_exploitability=False

Train an NFSP agent on Abrupt Dark Hex 3 for 10M steps and compute exact exploitability every 1M steps.

python main.py algorithm=nfsp game=abrupt_dark_hex max_steps=10000000 compute_exploitability=True compute_exploitability_every=1000000

The following algorithms are implemented: ppo, nfsp, ppg, mmd, rnad, escher, psro.

Exact exploitability is supported for the following games: classical_phantom_ttt, abrupt_phantom_ttt, classical_dark_hex, abrupt_dark_hex.

See the configs folder for all available command-line arguments.

Evaluating head2head

Input the paths and algorithm names into a yaml file like the example in head2head/example.yaml

python head2head/head2head_eval.py --agents-yaml path/to/example.yaml --save-dir <save-dir>

This command will kick off head2head evaluation for the agents listed in the eval yaml using the eas library.

Games

The following games are also available to play against our RL agents in this online demo.

Phantom Tic-Tac-Toe (PTTT)

Tic-Tac-Toe: Players take turn placing their symbol (X or O) on a 3x3 grid. The goal is to form a horizontal, vertical, or diagonal line of your symbol. X plays first.

Phantom: The opponent's moves are hidden. Selecting an occupied cell reveals it, and the player's turn continues.

Abrupt PTTT: In the abrupt variant, if a player's move reveals an opponent's symbol, their turn ends.

3x3 Dark Hex (DH3)

Hex: Players take turns placing their stones (red or blue) on a 3x3 hexagonal board. The goal is to form a chain of adjacent stones connecting both sides of your color. Red plays first.

Dark: The opponent's moves are hidden. Selecting an occupied cell reveals it, and the player's turn continues.

Abrupt DH3: In the abrupt variant, if a player's move reveals an opponent's symbol, their turn ends.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
algorithms		algorithms
configs		configs
eas @ 2d83a35		eas @ 2d83a35
head2head		head2head
tests		tests
.gitignore		.gitignore
.gitmodules		.gitmodules
README.md		README.md
main.py		main.py
pixi.toml		pixi.toml
requirements.txt		requirements.txt
setup.py		setup.py
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

IIG-RL-Benchmark

Citation

Installation

Installing exp-a-spiel on MacOS

(Optional alternative) Installing exp-a-spiel via `pip`

Usage

Evaluating head2head

Games

Phantom Tic-Tac-Toe (PTTT)

3x3 Dark Hex (DH3)

About

Uh oh!

Releases

Packages

Uh oh!

Languages

nathanlct/IIG-RL-Benchmark

Folders and files

Latest commit

History

Repository files navigation

IIG-RL-Benchmark

Citation

Installation

Installing exp-a-spiel on MacOS

(Optional alternative) Installing exp-a-spiel via pip

Usage

Evaluating head2head

Games

Phantom Tic-Tac-Toe (PTTT)

3x3 Dark Hex (DH3)

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

(Optional alternative) Installing exp-a-spiel via `pip`

Packages