Commit 2ca8a01

Update .toml extras; fixed tests; added changelog; removed old setup.py;
1 parent 2e6dc40 commit 2ca8a01

7 files changed

Lines changed: 1384 additions & 128 deletions

CHANGELOG.md

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
+## [1.0.0] - 2026-04-02
+
+Release Highlights: Version 1.0.0 (Generated using an LLM: Google Gemini)
+🚀 Major Breaking Changes & Core Updates
+Changed to manage project with uv - change minimum Python version to 3.11.
+
+Improved README.
+
+Gymnasium Migration: Full migration from gym to gymnasium (v1.0.0 compatibility). This includes updated return values for step() and reset() (support for terminated/truncated flags).
+
+API Refactor: Significant renaming of internal functions for clarity, specifically around Markov state management (get_augmented_state, etc.) and image representation.
+
+Dependency Modernization: Upgraded numpy and random number generation to align with modern Gymnasium standards (_np_random).
+
+🛠 Environment Enhancements (RLToyEnv)
+Advanced Rendering: Added a more flexible render() function that allows for custom trajectory rollouts and "imaginary" rollouts from specific starting states.
+
+Observation Capabilities: * Improved get_image_representation to support uncertainty visualization (epistemic and aleatoric) via bar plots.
+
+Added support for setting custom dtype_s and dtype_o for state and observation spaces.
+
+Dynamics & Noise: Transition and reward noises can now be state-and-action dependent. Improved default noise profiles for continuous environments.
+
+Reward Logic: Improved reward_every_n_steps logic to work across discrete, continuous, and grid environments.
+
+🧪 Wrappers & Compatibility
+Gymnasium Wrapper: Updated wrapper to support irrelevant dimensions and image transformations.
+
+External Integration: Improved support and examples for MiniGrid, ProcGen, and Mujoco (v4) environments.
+
+Resource Management: Added close() functionality to properly release Pygame resources.
+
+📈 Tooling & Documentation
+Example Suite: Overhauled example.py with a better CLI, individual function calls, and logging toggles for image observations.
+
+Experimentation: Updated experiment configuration scripts and cleaned up Jupyter notebooks for plotting results.
+
+CI/CD: Updated GitHub workflows to support newer Python versions and fixed code coverage reporting.
+
+🐛 Bug Fixes
+Fixed issues with copy.deepcopy() by removing redundant state variables (self.P, self.R).
+
+Resolved reward bugs related to delays exceeding sequence lengths.
+
+Fixed terminal state logic for grid environments.
+
+Rectified various test failures in TestGymEnvWrapper and TestRLToyEnv.
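
The "Gymnasium Migration" entry mentions updated return values for step() and reset(). A minimal sketch of that calling convention follows. `TinyEnv` is a hypothetical stand-in written only for illustration; it is not `RLToyEnv` and does not import gymnasium, it only mirrors the shape of the Gymnasium-style return values.

```python
class TinyEnv:
    """Hypothetical stand-in that mimics Gymnasium-style step()/reset() signatures."""

    def __init__(self, max_steps=5, goal=3):
        self.max_steps = max_steps
        self.goal = goal
        self.state = 0
        self.steps = 0

    def reset(self, seed=None):
        # Gymnasium-style reset returns (observation, info), not just the observation.
        # `seed` is accepted for signature parity but unused in this deterministic sketch.
        self.state = 0
        self.steps = 0
        return self.state, {}

    def step(self, action):
        # Gymnasium-style step returns five values; the older gym API returned
        # (obs, reward, done, info) with a single done flag.
        self.state += action
        self.steps += 1
        terminated = self.state >= self.goal      # the task itself reached a terminal state
        truncated = self.steps >= self.max_steps  # an external step limit ended the episode
        reward = 1.0 if terminated else 0.0
        return self.state, reward, terminated, truncated, {}


env = TinyEnv()
obs, info = env.reset(seed=0)
total_reward = 0.0
done = False
while not done:
    obs, reward, terminated, truncated, info = env.step(1)
    total_reward += reward
    done = terminated or truncated
```

Code written against the older four-value `step()` needs to unpack five values and combine `terminated` and `truncated` itself, as the loop does.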

README.md

Lines changed: 3 additions & 0 deletions
@@ -38,6 +38,9 @@ ae = gym.make("QbertNoFrameskip-v4")
 env = GymEnvWrapper(ae, **config)
 ```
 
+## Important Note
+We are moving to package management with `uv` and away from using Ray Rllib, so some experiment / agent running functionality might break. The wrappers and toy environment should still work fine though.
+
 ## Getting started
 There are 4 parts to the package:
 1) **Toy Environments**: The base toy Environment in [`mdp_playground/envs/rl_toy_env.py`](mdp_playground/envs/rl_toy_env.py) implements the toy environment functionality, including discrete and continuous environments, and is parameterised by a `config` dict which contains all the information needed to instantiate the required toy MDP. Please see [`example.py`](example.py) for some simple examples of how to use these. For further details, please refer to the documentation in [`mdp_playground/envs/rl_toy_env.py`](mdp_playground/envs/rl_toy_env.py).

mdp_playground/envs/rl_toy_env.py

Lines changed: 11 additions & 3 deletions
@@ -2490,13 +2490,21 @@ def render(self,):
 def imagine_and_render(self, actions, state=None, render=True):
 """
 Performs steps in a deep copy of the environment with an action
-sequence and then optionally renders the resulting trajectory and returns the rendered RGB images.
+sequence and then optionally renders the resulting trajectory and
+returns the rendered RGB images. It's called "imagine" and not "rollout"
+because performing steps in a copy of the environment means that the
+original environment and its state is not affected by the actions rolled
+out here.
 If render is False, returns the observations created by stepping in the env
 using actions instead of rendered images.
 
-Currently, render_mode is hardcoded to "rgb_array" for the copied environment.
+Notes:
+1) Currently, render_mode is hardcoded to "rgb_array" for the copied environment.
 Would need to look deeper into pygame, e.g. for how to instantiate mutliple windows
-to support "human" render_mode as well.
+to support "human" render_mode as well.
+2) Ideally, the rollout and render would be separated but currently the render() is
+based on the current state of the environment, so separating is going to be harder.
+
 
 Parameters
 ----------
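
The docstring's "imagine" idea, stepping in a deep copy so the original environment is untouched, can be sketched in a few lines. `CounterEnv` and `imagine` are hypothetical names used only for illustration; they are not the `RLToyEnv` implementation, and no rendering is done here.

```python
import copy


class CounterEnv:
    """Hypothetical toy environment: the state is an integer that actions add to."""

    def __init__(self):
        self.state = 0

    def step(self, action):
        self.state += action
        return self.state


def imagine(env, actions):
    """Step through `actions` in a deep copy of `env` and return the observations."""
    env_copy = copy.deepcopy(env)  # the original env is never stepped
    return [env_copy.step(a) for a in actions]


env = CounterEnv()
env.step(2)                          # a real step: the state becomes 2
imagined = imagine(env, [1, 1, 1])   # imagined steps start from state 2
```

After the call, `env.state` is still 2, while `imagined` holds the observations the copy produced. This is the property the docstring uses to distinguish "imagine" from "rollout".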

mdp_playground/spaces/test_image_continuous.py

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ def test_image_continuous(self):
 img1 = Image.fromarray(np.squeeze(imc.generate_image(pos)), "RGB")
 if render:
 img1.show()
-img1.save("cont_state_target.pdf")
+img1.save("cont_state_target.pdf", format="PDF")
 
 # Terminal sub-spaces
 lows = np.array([2.0, 4.0])
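
The change above passes `format="PDF"` explicitly instead of letting Pillow infer the format from the file extension. A small sketch of the same call, assuming Pillow is installed, writes to an in-memory buffer, where no extension exists and the format therefore has to be named. The image and the variable names are made up for illustration.

```python
import io

from PIL import Image

# A small solid-colour RGB image standing in for the generated state image.
img = Image.new("RGB", (8, 8), color=(255, 0, 0))

buffer = io.BytesIO()
# A BytesIO object has no filename, so Pillow cannot guess the format; name it explicitly.
img.save(buffer, format="PDF")
pdf_bytes = buffer.getvalue()
```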

pyproject.toml

Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
+[project]
+name = "mdp-playground"
+version = "1.0.0"
+description = "A python package to design and debug RL agents"
+readme = "README.md"
+requires-python = ">=3.11"
+dependencies = [
+"ale-py>=0.11.2",
+"dill>=0.4.1",
+"gymnasium>=1.2.2",
+"numpy>=2.4.2",
+"scipy>=1.17.0",
+"pillow>=12.2.0", # Image processing
+]
+license = { text = "Apache License, Version 2.0" }
+
+authors = [
+{ name = "Raghu Rajan" },
+{ name = "Jessica Borja" },
+{ name = "Suresh Guttikonda" },
+{ name = "Fabio Ferreira" },
+{ name = "Jan Ole von Hartz" },
+{ name = "André Biedenkapp" },
+{ name = "Frank Hutter" }
+]
+
+maintainers = [
+{ name = "Raghu Rajan", email = "rajanr@cs.uni-freiburg.de" }
+]
+
+classifiers = [
+"Programming Language :: Python :: 3",
+"License :: OSI Approved :: Apache Software License",
+"Operating System :: OS Independent",
+"Natural Language :: English",
+"Intended Audience :: Developers",
+"Intended Audience :: Education",
+"Intended Audience :: Science/Research",
+"Topic :: Scientific/Engineering :: Artificial Intelligence",
+]
+
+[project.urls]
+Homepage = "https://github.com/automl/mdp-playground"
+"Bug Tracker" = "https://github.com/automl/mdp-playground/issues"
+
+
+[project.optional-dependencies]
+# A single consolidated extra containing all environment and analysis tools. Many of these haven't been tested yet.
+extras = [
+# "ray[default,rllib]>=2.54.1",
+# "tensorflow>=2.21.0",
+# "tensorflow-probability>=0.23.0",
+"gymnasium[atari,other]>=1.2.2",
+"mujoco>=3.1.0",
+"configspace>=1.2.2",
+"pandas>=3.0.0",
+"scipy>=1.17.1",
+"matplotlib>=3.10.8", # Plotting
+"opencv-python>=4.13.0.92", # CV2
+# "opencv-python-headless>=4.13.0.92",
+"requests>=2.31.0",
+]
+
+# hpo_analysis = [
+# "cave>=1.4.0"
+# ]
+
+[project.scripts]
+run-mdpp-experiments = "mdp_playground.scripts.run_experiments:cli"
+
+[tool.setuptools.packages.find]
+where = ["."]
+
+[tool.setuptools.package-data]
+"*" = ["*"]
+
+[build-system]
+requires = ["setuptools>=61.0"]
+build-backend = "setuptools.build_meta"
+
+[dependency-groups]
+dev = [
+"pytest>=9.0.2",
+]
+
+[tool.uv]
+package = true

setup.py

Lines changed: 0 additions & 124 deletions
This file was deleted.
