Inspection and Maintenance Planning using Reinforcement Learning (IMPRL)

License: Apache 2.0 · Python 3.11 · uv · PyTorch 2.2 · black · pytest

IMPRL is a Python library for deep reinforcement learning for inspection and maintenance (I&M) planning of engineering systems. It provides a collection of environments and single-agent and multi-agent algorithm implementations.


Algorithms 🧠

Reinforcement Learning (RL) Algorithms

The following reinforcement learning algorithms are implemented:

Single-agent algorithms

| Algorithm | Formulation | Entry point |
| --- | --- | --- |
| DDQN | off-policy, value-based | `train_and_log.py` |
| JAC | off-policy, actor-critic | `train_and_log.py` |
| PPO | on-policy, actor-critic | `imprl/agents/PPO.py` |

Multi-agent algorithms

| Algorithm | Formulation | Entry point |
| --- | --- | --- |
| DDMAC / DCMAC | off-policy, actor-critic | `train_and_log.py` |
| IACC / IACC_PS | off-policy, actor-critic | `train_and_log.py` |
| MAPPO_PS | on-policy, actor-critic | `imprl/agents/MAPPO_PS.py` |
| QMIX_PS | off-policy, value-based | `train_and_log.py` |
| VDN_PS | off-policy, value-based | `train_and_log.py` |
| IAC / IAC_PS | off-policy, actor-critic | `train_and_log.py` |
| IPPO_PS | on-policy, actor-critic | `imprl/agents/IPPO_PS.py` |

Heuristic baselines

  • We provide heuristic baselines such as inspect_repair, which derives maintenance policies by optimizing inspection intervals and prioritizing component repairs.

  • Simpler policies such as do_nothing and failure_replace are also provided. These require no optimization and are most suitable for sanity checks.
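As a rough illustration of the idea behind an inspect-and-repair heuristic (not the library's actual implementation), such a policy can be reduced to two tunable knobs: the inspection interval and a repair threshold on the observed damage state. The function name, action encoding, and parameters below are illustrative assumptions:

```python
def inspect_repair_policy(t, observed_states, interval=5, repair_threshold=2):
    """Toy inspect-and-repair heuristic (illustrative only).

    Actions per component: 0 = do nothing, 1 = inspect, 2 = repair.
    Every `interval` steps, all components are inspected; otherwise,
    components whose last observed damage state is at or above
    `repair_threshold` are repaired.
    """
    if t % interval == 0:
        return [1] * len(observed_states)  # periodic system-wide inspection
    return [2 if s >= repair_threshold else 0 for s in observed_states]

# at t=3 no inspection is due, so only degraded components are repaired
actions = inspect_repair_policy(t=3, observed_states=[0, 3, 1, 4])
```

Optimizing the heuristic then amounts to a grid or random search over `interval` and `repair_threshold`, evaluating each pair by rollout cost.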

SARSOP

  • For k_out_of_n_infinite environments, it is possible to compute (near-)optimal policies using point-based value iteration algorithms for POMDPs (such as SARSOP). To enable visualization of SARSOP policies, we provide a wrapper for interfacing with SARSOP called SARSOPAgent.
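For intuition: POMDP solvers such as SARSOP plan over beliefs, i.e. probability distributions over the hidden damage state, updated by Bayes' rule after every (noisy) inspection. A minimal belief update for a single component, with made-up transition and observation matrices, looks like:

```python
def belief_update(belief, T, O, obs):
    """Bayes filter: predict with transition matrix T, weight by the
    observation likelihoods O[state][obs], then renormalize."""
    n = len(belief)
    predicted = [sum(belief[i] * T[i][j] for i in range(n)) for j in range(n)]
    unnormalized = [predicted[j] * O[j][obs] for j in range(n)]
    z = sum(unnormalized)
    return [p / z for p in unnormalized]

# illustrative 2-state component: 0 = intact, 1 = damaged
T = [[0.9, 0.1],   # intact degrades with probability 0.1
     [0.0, 1.0]]   # damage is absorbing
O = [[0.8, 0.2],   # P(obs | state): inspections are imperfect
     [0.2, 0.8]]

belief = [1.0, 0.0]                          # start certain the component is intact
belief = belief_update(belief, T, O, obs=1)  # a "damaged" reading shifts mass
```

The solver's value function is then a function of this belief rather than of the (unobservable) true state.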

Package Structure πŸ—‚οΈ

imprl/
β”œβ”€β”€ agents/                    # RL algorithms, configs, and training entry points
β”‚   β”œβ”€β”€ configs/                  # default hyperparameter configs
β”‚   β”œβ”€β”€ primitives/               # reusable networks, replay buffers, and schedulers
β”‚   β”œβ”€β”€ __init__.py               # agent registry and factory
β”‚   └── *.py                      # algorithm implementations
β”œβ”€β”€ baselines/                 # heuristic and reference policies
β”œβ”€β”€ envs/                      # environment registry, wrappers, and environment cores
β”‚   β”œβ”€β”€ game_envs/                # matrix-game environments, wrappers, and configs
β”‚   β”œβ”€β”€ structural_envs/          # k-out-of-n environments, wrappers, and configs
β”‚   └── __init__.py               # env registry and factory
β”œβ”€β”€ post_process/              # policy visualisation and result post-processing
β”‚   β”œβ”€β”€ policy_visualizer.py      # rollout and belief-space plots
β”‚   └── stats.py                  # summary statistics helpers
└── runners/                   # serial and parallel rollout utilities
    β”œβ”€β”€ parallel.py               # multiprocessing rollout helpers
    └── serial.py                 # single-process rollout and training helpers
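The registry-and-factory pattern noted for `agents/__init__.py` and `envs/__init__.py` can be sketched generically as follows (an illustration of the pattern, not the library's actual code):

```python
REGISTRY = {}

def register(name):
    """Decorator that stores a class in the registry under `name`."""
    def wrap(cls):
        REGISTRY[name] = cls
        return cls
    return wrap

def make(name, *args, **kwargs):
    """Factory: look up a registered class by name and instantiate it."""
    if name not in REGISTRY:
        raise KeyError(f"unknown name {name!r}; known: {sorted(REGISTRY)}")
    return REGISTRY[name](*args, **kwargs)

@register("DDQN")
class DDQNAgent:          # stand-in class for illustration
    def __init__(self, env=None):
        self.env = env

agent = make("DDQN")      # construct by string name, as imprl.agents.make does
```

This is what lets user code refer to agents and environments by string name (e.g. `"DDQN"`, `"k_out_of_n_infinite"`) without importing the concrete class.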

Installation πŸ“¦

1) Install uv ⚑

uv is a fast Python package and project manager. See the uv docs for more installation options.

# macOS (Homebrew)
brew install uv

# Or via script (Linux/macOS)
curl -LsSf https://astral.sh/uv/install.sh | sh

2) Create a virtual environment

From the repository root:

uv venv --python 3.11
source .venv/bin/activate  # Windows: .venv\Scripts\activate

3) Install dependencies via uv groups πŸ“¦

From the repository root, install the base runtime dependencies defined in pyproject.toml:

uv sync

# optionally install dev tools: pytest, black, ruff
uv sync --group dev

3.1) PyTorch GPU support (optional)

GPU notes (PyTorch): If you need a CUDA-enabled wheel, install a build matching your system from the official PyTorch index: https://pytorch.org/get-started/locally/

Example (Linux, CUDA 12.1):

uv pip install --index-url https://download.pytorch.org/whl/cu121 torch

Check if PyTorch detects your GPU:

python -c "import torch; print(torch.cuda.is_available())"

3.2) Installing additional packages (optional)

Use uv add <pkg> to add packages to your project and lockfile. For example, to add Jupyter Notebook:

uv add "notebook>=7.1.2,<8.0.0"

If resolution fails, relax version ranges and retry.

4) Set up wandb

For logging, the library relies on wandb. You can log in to wandb using your private API key:

wandb login
# <enter wandb API key>

Getting Started πŸš€

Creating an environment

import imprl.envs

# create an environment
env = imprl.envs.make(name="k_out_of_n_infinite", setting="hard-1-of-4_infinite")

# env uses the standard Gymnasium API.
obs, info = env.reset()

# select an action for each agent
action = [0, 1, 0, 2]

# step the environment
next_obs, reward, termination, truncation, info = env.step(action)
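The reset/step calls above generalize to a standard episode rollout. Sketched here with a toy stand-in environment so the control flow is self-contained; the real environment returned by `imprl.envs.make` would slot in the same way:

```python
import random

class ToyEnv:
    """Minimal stand-in honoring the Gymnasium-style 5-tuple step API."""
    def __init__(self, horizon=10, n_agents=4):
        self.horizon, self.n_agents, self.t = horizon, n_agents, 0

    def reset(self):
        self.t = 0
        return [0] * self.n_agents, {}

    def step(self, action):
        self.t += 1
        reward = -sum(action)                # acting incurs a cost
        truncation = self.t >= self.horizon  # finite-horizon cutoff
        return [0] * self.n_agents, reward, False, truncation, {}

env = ToyEnv()
obs, info = env.reset()
total_reward, done = 0.0, False
while not done:
    # random policy: one action per agent
    action = [random.randint(0, 2) for _ in range(env.n_agents)]
    obs, reward, termination, truncation, info = env.step(action)
    total_reward += reward
    done = termination or truncation
```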

Training a DRL agent

Train a DDQN agent on the 1-out-of-4 infinite-horizon environment.

Default configs specifying the hyperparameters and environment settings are located under imprl/agents/configs/ (e.g. imprl/agents/configs/DDQN.yaml). We use Hydra for configuration management, so you can override any default config value from the command line. For example, to disable logging to wandb, set WANDB.mode=disabled:

cd imprl
python train_and_log.py --config-name DDQN WANDB.mode=disabled

Visualizing policies

import torch
from omegaconf import OmegaConf

import imprl.envs
import imprl.agents
from imprl.post_process import PolicyVisualizer

# create an environment
env = imprl.envs.make(name="k_out_of_n_infinite", setting="hard-1-of-4_infinite")

# create an agent
device = "cuda" if torch.cuda.is_available() else "cpu"
cfg = OmegaConf.load("imprl/agents/configs/IPPO_PS.yaml")
agent = imprl.agents.make(
    "IPPO_PS", env, OmegaConf.to_container(cfg, resolve=False), device
)

# (optional) load checkpoint
# agent.load_weights(checkpoint_dir, int(checkpoint))

plotter = PolicyVisualizer(env, agent)
plotter.plot()

You can find a detailed introduction in getting_started.ipynb.

Acknowledgements πŸ™

This project builds on the abstractions in EPyMARL, and the author would like to acknowledge the insights shared in Reinforcement Learning Implementation Tips and Tricks, which informed the development of this library.

Related Work πŸ”—

  • IMP-MARL: a Suite of Environments for Large-scale Infrastructure Management Planning via MARL

    • Benchmarking scalability of cooperative MARL methods in real-world infrastructure management planning problems.
    • Environments: (Correlated and uncorrelated) k-out-of-n systems and offshore wind structural systems.
    • RL solvers: Provides wrappers for interfacing with several (MA)RL libraries such as EPyMARL, Rllib, MARLlib etc.
  • IMP-act: Benchmarking MARL for Infrastructure Management Planning at Scale with JAX

    • Large-scale road networks with up to 178 agents implemented in JAX for scalability.
    • IMP-act-JaxMARL interfaces IMP-act with multi-agent solvers in JaxMARL.
    • We also provide NumPy-based environments for compatibility with PyTorch in IMP-act-epymarl.

Citation πŸ“š

If you find this repository useful, please consider citing:

Assessing the Optimality of Decentralized Inspection and Maintenance Policies for Stochastically Degrading Engineering Systems (open access)

@inproceedings{bhustali_decentralization_2025,
  title     = "Assessing the Optimality of Decentralized Inspection and Maintenance Policies for Stochastically Degrading Engineering Systems",
  author    = "Prateek Bhustali and Andriotis, {Charalampos P.}",
  year      = "2025",
  doi       = "10.1007/978-3-031-74650-5_13",
  isbn      = "978-3-031-74649-9",
  series    = "Communications in Computer and Information Science",
  publisher = "Springer",
  pages     = "236--254",
  editor    = "Oliehoek, {Frans A.} and Manon Kok and Sicco Verwer",
  booktitle = "Artificial Intelligence and Machine Learning",
  url       = "https://link.springer.com/chapter/10.1007/978-3-031-74650-5_13",
}
