
Commit c800316

Adds native vector implementation (#599)
1 parent fa48461 commit c800316

28 files changed, +2500 -223 lines changed

.github/workflows/ci.yml

+12-12
@@ -184,44 +184,44 @@ jobs:
184184

185185
- runs-on: macos-13
186186
python: '3.9'
187-
wheel-name: 'cp39-cp39-macosx_10_15_x86_64'
187+
wheel-name: 'cp39-cp39-macosx_13_0_x86_64'
188188
arch: x86_64
189189
- runs-on: macos-13
190190
python: '3.10'
191-
wheel-name: 'cp310-cp310-macosx_10_15_x86_64'
191+
wheel-name: 'cp310-cp310-macosx_13_0_x86_64'
192192
arch: x86_64
193193
- runs-on: macos-13
194194
python: '3.11'
195-
wheel-name: 'cp311-cp311-macosx_10_15_x86_64'
195+
wheel-name: 'cp311-cp311-macosx_13_0_x86_64'
196196
arch: x86_64
197197
- runs-on: macos-13
198198
python: '3.12'
199-
wheel-name: 'cp312-cp312-macosx_10_15_x86_64'
199+
wheel-name: 'cp312-cp312-macosx_13_0_x86_64'
200200
arch: x86_64
201201
- runs-on: macos-13
202202
python: '3.13'
203-
wheel-name: 'cp313-cp313-macosx_10_15_x86_64'
203+
wheel-name: 'cp313-cp313-macosx_13_0_x86_64'
204204
arch: x86_64
205205

206206
- runs-on: macos-14
207207
python: '3.9'
208-
wheel-name: 'cp39-cp39-macosx_11_0_arm64'
208+
wheel-name: 'cp39-cp39-macosx_13_0_arm64'
209209
arch: arm64
210210
- runs-on: macos-14
211211
python: '3.10'
212-
wheel-name: 'cp310-cp310-macosx_11_0_arm64'
212+
wheel-name: 'cp310-cp310-macosx_13_0_arm64'
213213
arch: arm64
214214
- runs-on: macos-14
215215
python: '3.11'
216-
wheel-name: 'cp311-cp311-macosx_11_0_arm64'
216+
wheel-name: 'cp311-cp311-macosx_13_0_arm64'
217217
arch: arm64
218218
- runs-on: macos-14
219219
python: '3.12'
220-
wheel-name: 'cp312-cp312-macosx_11_0_arm64'
220+
wheel-name: 'cp312-cp312-macosx_13_0_arm64'
221221
arch: arm64
222222
- runs-on: macos-14
223223
python: '3.13'
224-
wheel-name: 'cp313-cp313-macosx_11_0_arm64'
224+
wheel-name: 'cp313-cp313-macosx_13_0_arm64'
225225
arch: arm64
226226

227227
runs-on: ${{ matrix.runs-on }}
@@ -240,10 +240,10 @@ jobs:
240240

241241
- name: Install ALE wheel
242242
# wildcarding doesn't work for some reason, therefore, update the project version here
243-
run: python -m pip install ale_py-0.10.2-${{ matrix.wheel-name }}.whl
243+
run: python -m pip install ale_py-0.11.0-${{ matrix.wheel-name }}.whl
244244

245245
- name: Install Gymnasium and pytest
246-
run: python -m pip install gymnasium>=1.0.0 pytest
246+
run: python -m pip install "gymnasium>=1.0.0" pytest opencv-python
247247

248248
- name: Test
249249
run: python -m pytest
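The install step above hard-codes the project version because, per its comment, wildcarding the wheel name did not work there. As a hedged illustration only, the wheel could instead be resolved by its platform tag before installing. The helper name `find_wheel` is hypothetical and is not part of this commit; the placeholder file below merely stands in for a real wheel.

```python
import glob
import os
import tempfile


def find_wheel(directory, wheel_tag):
    """Return the single ale_py wheel in `directory` whose name ends with `wheel_tag`."""
    pattern = os.path.join(directory, f"ale_py-*-{wheel_tag}.whl")
    matches = sorted(glob.glob(pattern))
    if len(matches) != 1:
        # Fail loudly rather than install an unexpected build.
        raise FileNotFoundError(f"expected one wheel for {wheel_tag}, found {matches}")
    return matches[0]


# Demonstration with an empty placeholder file instead of a real wheel.
with tempfile.TemporaryDirectory() as tmp:
    name = "ale_py-0.11.0-cp312-cp312-macosx_13_0_arm64.whl"
    open(os.path.join(tmp, name), "w").close()
    found = os.path.basename(find_wheel(tmp, "cp312-cp312-macosx_13_0_arm64"))

print(found)
```

Resolving the path this way would avoid editing the workflow on every version bump, at the cost of an extra step in the job.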

.pre-commit-config.yaml

+1-1
@@ -29,7 +29,7 @@ repos:
2929
hooks:
3030
- id: flake8
3131
args:
32-
- '--per-file-ignores=tests/python/test_atari_env.py:F811 tests/python/test_python_interface.py:F811'
32+
- '--per-file-ignores=tests/python/test_atari_env.py:F811 tests/python/test_python_interface.py:F811 src/ale/python/__init__.py:E402'
3333
- --ignore=E203,W503,E741
3434
- --max-complexity=30
3535
- --max-line-length=456

CHANGELOG.md

+31
@@ -5,6 +5,37 @@ All notable changes to this project will be documented in this file.
55
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
66
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
77

8+
## [0.11.0](https://github.com/Farama-Foundation/Arcade-Learning-Environment/compare/v0.10.2...v0.11.0) - 2025-04-26
9+
10+
This release adds an experimental built-in vectorisation environment, available through `gymnasium.make_vec("ALE/{game_name}-v5", num_envs)` or `ale_py.AtariVectorEnv("{rom_name}", num_envs)`.
11+
12+
```python
13+
import gymnasium as gym
14+
import ale_py
15+
16+
gym.register_envs(ale_py)
17+
18+
envs = gym.make_vec("ALE/Pong-v5")
19+
observations, infos = envs.reset()
20+
21+
for i in range(100):
22+
actions = envs.action_space.sample()
23+
observations, rewards, terminations, truncations, infos = envs.step(actions)
24+
25+
envs.close()
26+
```
27+
28+
Vectorisation is a crucial feature in RL: it increases the sample rate of environments by stepping multiple sub-environments at the same time.
29+
[Gymnasium](https://gymnasium.farama.org/api/vector/) provides a generalised vectorisation capability; however, it is relatively slow due to its Python implementation.
30+
For faster implementations, [EnvPool](https://github.com/sail-sg/envpool) provides C++ vectorisation that significantly increases the sample speed, but it is no longer maintained.
31+
Inspired by the `EnvPool` implementation, we've implemented an asynchronous vectorisation environment in C++ that includes the [standard Atari preprocessing](https://gymnasium.farama.org/api/wrappers/misc_wrappers/#gymnasium.wrappers.AtariPreprocessing): frame skipping, frame stacking, observation resizing, etc.
32+
33+
For full documentation of the vector environment, see [this page](https://ale.farama.org/v0.11.0/vector-environment).
34+
35+
We will continue building out this vectorisation to include [XLA](https://github.com/openxla/xla) support, improved preprocessing and auto resetting.
36+
37+
As this is an experimental feature, we wish to hear about any bugs, problems or features to add. Raise an issue on GitHub or ask a question on the [Farama Discord server](https://discord.gg/bnJ6kubTg6).
38+
839
## [0.10.2](https://github.com/Farama-Foundation/Arcade-Learning-Environment/compare/v0.10.1...v0.10.2) - 2025-02-13
940

1041
Fixed performance regression for CPP users - A single-argument `act` function was missing causing the `paddle_strength` introduced in v0.10.0 to default to zero rather than one. As Gymnasium passed this variable to act, this was only an issue for users directly interacting with `ale_interface`. For more details, see https://github.com/Farama-Foundation/Arcade-Learning-Environment/pull/595.
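To make the vectorisation idea from the 0.11.0 entry concrete, here is a minimal, standard-library sketch of stepping several sub-environments at once and returning batched results. `ToyEnv` and `step_all` are hypothetical stand-ins, not ALE classes, and Python threads do not reproduce the speed of the native C++ implementation; the sketch only illustrates the batched interface.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyEnv:
    """Hypothetical stand-in for one sub-environment (not an ALE emulator)."""

    def __init__(self, env_id):
        self.env_id = env_id
        self.t = 0

    def step(self, action):
        self.t += 1
        observation = (self.env_id, self.t)  # placeholder for an image frame
        reward = float(action)               # toy reward: echo the action
        terminated = self.t >= 10
        return observation, reward, terminated


def step_all(envs, actions, pool):
    """Step every sub-environment with its own action and batch the results."""
    # pool.map preserves input order, so index i always refers to envs[i].
    results = list(pool.map(lambda pair: pair[0].step(pair[1]), zip(envs, actions)))
    observations, rewards, terminations = zip(*results)
    return list(observations), list(rewards), list(terminations)


envs = [ToyEnv(env_id=i) for i in range(4)]
with ThreadPoolExecutor(max_workers=4) as pool:
    observations, rewards, terminations = step_all(envs, [0, 1, 2, 3], pool)

print(rewards)
```

One call advances all four sub-environments and returns one entry per sub-environment, which is the shape of interface that `envs.step(actions)` exposes in the changelog example.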

CMakeLists.txt

+6
@@ -13,6 +13,12 @@ if(SDL_SUPPORT)
1313
list(APPEND VCPKG_MANIFEST_FEATURES "sdl")
1414
endif()
1515

16+
option(BUILD_VECTOR_LIB "Build Vector Interface" OFF)
17+
if (BUILD_VECTOR_LIB)
18+
list(APPEND VCPKG_MANIFEST_FEATURES "vector")
19+
add_definitions(-DBUILD_VECTOR_LIB)
20+
endif()
21+
1622
# Set cmake module path
1723
set(CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/cmake ${CMAKE_MODULE_PATH})
1824

docs/index.md

+1
@@ -41,6 +41,7 @@ env.close()
4141
getting-started
4242
env-spec
4343
environments
44+
vector-environment
4445
multi-agent-environments
4546
faq
4647
citing

docs/vector-environment.md

+196
@@ -0,0 +1,196 @@
1+
# ALE Vector Environment Guide
2+
3+
## Introduction
4+
5+
The Arcade Learning Environment (ALE) Vector Environment provides a high-performance implementation for running multiple Atari environments in parallel. This implementation utilizes native C++ code with multi-threading to achieve significant performance improvements, especially when running many environments simultaneously.
6+
7+
The vector environment is equivalent to `FrameStackObservation(AtariPreprocessing(gym.make("ALE/{AtariGame}-v5")), stack_size=4)`.
8+
9+
## Key Features
10+
11+
- **Parallel Execution**: Run multiple Atari environments simultaneously with minimal overhead
12+
- **Standard Preprocessing**: Includes standard preprocessing steps from the Atari Deep RL literature:
13+
- Frame skipping
14+
- Observation resizing
15+
- Grayscale conversion
16+
- Frame stacking
17+
- NoOp initialization at reset
18+
- Fire reset (for games requiring the fire button to start)
19+
- Episodic life modes
20+
- **Performance Optimizations**:
21+
- Native C++ implementation
22+
- Next-step autoreset (see [blog](https://farama.org/Vector-Autoreset-Mode) for more detail)
23+
- Multi-threading for parallel execution
24+
- Thread affinity options for better performance on multi-core systems
25+
- Batch processing capabilities
26+
- **Asynchronous Operation**: Split step operation into `send` and `recv` for more flexible control flow
27+
- **Gymnasium Compatible**: Implements the Gymnasium `VectorEnv` [interface](https://gymnasium.farama.org/api/vector/)
28+
29+
## Installation
30+
31+
The vector implementation is packaged with ale-py, which can be installed from PyPI with `pip install ale-py`.
32+
33+
Optionally, users can build the project locally; this requires VCPKG, which installs OpenCV to support observation preprocessing.
34+
35+
## Basic Usage
36+
37+
### Creating a Vector Environment
38+
39+
```python
40+
from ale_py.vector_env import VectorAtariEnv
41+
42+
# Create a vector environment with 4 parallel instances of Breakout
43+
envs = VectorAtariEnv(
44+
game="Breakout",
45+
num_envs=4,
46+
)
47+
48+
# Reset all environments
49+
observations, info = envs.reset()
50+
51+
# Take random actions in all environments
52+
actions = envs.action_space.sample()
53+
observations, rewards, terminations, truncations, infos = envs.step(actions)
54+
55+
# Close the environment when done
56+
envs.close()
57+
```
58+
59+
## Advanced Configuration
60+
61+
The vector environment provides numerous configuration options:
62+
63+
```python
64+
envs = VectorAtariEnv(
65+
# Required parameters
66+
game="Breakout", # ROM name in snake_case
67+
num_envs=8, # Number of parallel environments
68+
69+
# Preprocessing parameters
70+
frame_skip=4, # Number of frames to skip (action repeat)
71+
grayscale=True, # Use grayscale observations
72+
stack_num=4, # Number of frames to stack
73+
img_height=84, # Height to resize frames to
74+
img_width=84, # Width to resize frames to
75+
76+
# Environment behavior
77+
noop_max=30, # Maximum number of no-ops at reset
78+
fire_reset=True, # Press FIRE on reset for games that require it
79+
episodic_life=False, # End episodes on life loss
80+
max_episode_steps=108000, # Max frames per episode (27000 steps * 4 frame skip)
81+
repeat_action_probability=0.0, # Sticky actions probability
82+
full_action_space=False, # Use full action space (not minimal)
83+
84+
# Performance options
85+
batch_size=0, # Number of environments to process at once (0 = use `num_envs`)
86+
num_threads=0, # Number of worker threads (0=auto)
87+
thread_affinity_offset=-1,# CPU core offset for thread affinity (-1=no affinity)
88+
seed=0, # Random seed
89+
)
90+
```
91+
92+
## Observation Format
93+
94+
The observation format from the vector environment is:
95+
96+
```
97+
observations.shape = (num_envs, stack_size, height, width)
98+
```
99+
100+
Where:
101+
- `num_envs`: Number of parallel environments
102+
- `stack_size`: Number of stacked frames (typically 4)
103+
- `height`, `width`: Image dimensions (typically 84x84)
104+
105+
This differs from the standard Gymnasium Atari environment format which uses:
106+
```
107+
observations.shape = (stack_size, height, width) # Without num_envs
108+
```
109+
110+
## Performance Considerations
111+
112+
### Number of Environments
113+
114+
Increasing the number of environments typically improves throughput until you hit CPU core limits.
115+
For optimal performance, set `num_envs` close to the number of physical CPU cores.
116+
117+
### Send/Recv vs Step
118+
119+
Using the `send`/`recv` API can allow for better overlapping of computation and environment stepping:
120+
121+
```python
122+
# Send actions to environments
123+
envs.send(actions)
124+
125+
# Do other computation here while environments are stepping
126+
127+
# Receive results when ready
128+
observations, rewards, terminations, truncations, infos = envs.recv()
129+
```
130+
131+
### Batch Size
132+
133+
The `batch_size` parameter controls how many environments are processed simultaneously by the worker threads:
134+
135+
```python
136+
# Process environments in batches of 4
137+
envs = VectorAtariEnv(game="Breakout", num_envs=16, batch_size=4)
138+
```
139+
140+
A smaller batch size can improve latency while a larger batch size can improve throughput.
141+
When passing a batch size, the info dictionary includes the environment id of each observation,
142+
which is critical because `reset` and `recv` return only the first `batch_size` observations.
143+
144+
### Thread Affinity
145+
146+
On systems with multiple CPU cores, setting thread affinity can improve performance:
147+
148+
```python
149+
# Set thread affinity starting from core 0
150+
envs = VectorAtariEnv(game="Breakout", num_envs=8, thread_affinity_offset=0)
151+
```
152+
153+
154+
## Examples
155+
156+
### Training Example with PyTorch
157+
158+
```python
159+
import torch
160+
import numpy as np
161+
from ale_py.vector_env import VectorAtariEnv
162+
163+
# Create environment
164+
envs = VectorAtariEnv(game="Breakout", num_envs=8)
165+
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
166+
167+
# Initialize model (simplified example)
168+
model = torch.nn.Sequential(
169+
torch.nn.Conv2d(4, 32, kernel_size=8, stride=4),
170+
torch.nn.ReLU(),
171+
torch.nn.Conv2d(32, 64, kernel_size=4, stride=2),
172+
torch.nn.ReLU(),
173+
torch.nn.Conv2d(64, 64, kernel_size=3, stride=1),
174+
torch.nn.ReLU(),
175+
torch.nn.Flatten(),
176+
torch.nn.Linear(3136, 512),
177+
torch.nn.ReLU(),
178+
torch.nn.Linear(512, envs.single_action_space.n)
179+
).to(device)
180+
181+
# Reset environment
182+
observations, _ = envs.reset()
183+
184+
# Training loop
185+
for step in range(1000):
186+
# Convert observations to PyTorch tensors
187+
obs_tensor = torch.tensor(observations, dtype=torch.float32, device=device) / 255.0
188+
189+
# Get actions from model
190+
with torch.no_grad():
191+
q_values = model(obs_tensor)
192+
actions = q_values.max(dim=1)[1].cpu().numpy()
193+
194+
# Step the environment
195+
observations, rewards, terminations, truncations, infos = envs.step(actions)
196+
```
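The `Linear(3136, 512)` layer in the example above depends on the spatial size left after the three convolutions. As a quick check, assuming 84x84 inputs and no padding (the `Conv2d` default), the flattened feature count can be derived with the standard output-size formula. The helper `conv_out` is illustrative only.

```python
def conv_out(size, kernel, stride, padding=0):
    """Output length of one spatial dimension after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


size = 84  # img_height == img_width == 84 in the example
for kernel, stride in [(8, 4), (4, 2), (3, 1)]:
    size = conv_out(size, kernel, stride)  # 84 -> 20 -> 9 -> 7

flat_features = 64 * size * size  # 64 output channels from the last conv
print(size, flat_features)        # 7 3136
```

If `img_height` or `img_width` is changed from 84, the first `Linear` layer's input size has to be recomputed the same way.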

pyproject.toml

+2-1
@@ -46,11 +46,12 @@ dynamic = ["version"]
4646
test = [
4747
"pytest>=7.0",
4848
"gymnasium>=1.0.0",
49+
"opencv-python>=3.0",
4950
]
5051

5152
[project.urls]
5253
homepage = "https://github.com/Farama-Foundation/Arcade-Learning-Environment"
53-
documentation = "https://github.com/Farama-Foundation/Arcade-Learning-Environment/tree/master/docs"
54+
documentation = "https://ale.farama.org"
5455
changelog = "https://github.com/Farama-Foundation/Arcade-Learning-Environment/blob/master/CHANGELOG.md"
5556

5657
[tool.setuptools]

setup.py

+1
@@ -47,6 +47,7 @@ def build_extension(self, ext):
4747
"-DSDL_DYNLOAD=ON",
4848
"-DBUILD_CPP_LIB=OFF",
4949
"-DBUILD_PYTHON_LIB=ON",
50+
"-DBUILD_VECTOR_LIB=ON",
5051
]
5152
build_args = []
5253

src/ale/CMakeLists.txt

+11
@@ -58,6 +58,17 @@ if (BUILD_CPP_LIB OR BUILD_PYTHON_LIB)
5858
add_library(ale-lib ale_interface.cpp)
5959
set_target_properties(ale-lib PROPERTIES OUTPUT_NAME ale)
6060
target_link_libraries(ale-lib PUBLIC ale)
61+
62+
if (BUILD_VECTOR_LIB)
63+
find_package(OpenCV CONFIG REQUIRED)
64+
include_directories(${OpenCV_INCLUDE_DIRS})
65+
66+
target_include_directories(ale PUBLIC external)
67+
68+
add_subdirectory(vector)
69+
70+
target_link_libraries(ale PRIVATE ${OpenCV_LIBS})
71+
endif()
6172
endif()
6273

6374
# Python Library
