docs: warn against using Gymnasium FrameStackObservation for image observations by midhunxavier · Pull Request #2258 · DLR-RM/stable-baselines3

midhunxavier · 2026-05-27T19:23:25Z

Description

Adds a warning to the custom-environment documentation explaining that Gymnasium's FrameStackObservation wrapper adds a new leading dimension to image observations (e.g. an image of shape (3, 64, 64) becomes (2, 3, 64, 64) when stacking 2 frames). SB3's is_image_space check requires exactly 3 dimensions, so the 4-dimensional stacked space is not recognized as an image space. As a result, a FlattenExtractor is silently used instead of a CNN.
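To make the shape difference concrete, here is a minimal NumPy-only sketch. It does not call Gymnasium or SB3; it only reproduces the two layouts described in this PR (a new leading axis versus stacking along the channel axis). The helper `has_three_dims` is a hypothetical stand-in for the dimension test alone, not SB3's actual is_image_space.

```python
import numpy as np

# Two channel-first RGB frames, matching the (3, 64, 64) example above.
frames = [np.zeros((3, 64, 64), dtype=np.uint8) for _ in range(2)]

# New leading axis: the layout this PR attributes to FrameStackObservation.
leading_axis = np.stack(frames, axis=0)

# Channel-axis stacking: the layout this PR attributes to VecFrameStack.
channel_axis = np.concatenate(frames, axis=0)


def has_three_dims(array):
    """Hypothetical stand-in for the dimension test only (not SB3 code)."""
    return array.ndim == 3


print(leading_axis.shape, has_three_dims(leading_axis))  # (2, 3, 64, 64) False
print(channel_axis.shape, has_three_dims(channel_axis))  # (6, 64, 64) True
```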

The note recommends using SB3's VecFrameStack instead (which stacks along the channel dimension and keeps the observation recognized as an image) and links to the issue. A changelog entry is added accordingly.

This follows @araffin's guidance in the issue to keep the is_image_space check as-is and document the pitfall instead of changing the behavior.

Motivation and Context

closes #2090

Users stacking image frames with Gymnasium's FrameStackObservation get a silent fallback to FlattenExtractor (no CNN), with no error, which is surprising and hard to debug. Documenting the recommended VecFrameStack alternative prevents this pitfall.

I have raised an issue to propose this change (required for new features and bug fixes): "[Bug]: is_image_space works poorly with Gymnasium's FrameStackObservation" (#2090). It was triaged by a maintainer (@araffin), who proposed this documentation approach.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (update in the documentation)

Checklist

LLM/code-assistant disclosure (per CONTRIBUTING.md): this documentation change was prepared with the assistance of Claude, an LLM code assistant. The approach — documenting the pitfall and recommending VecFrameStack rather than altering is_image_space — was proposed by maintainer @araffin in #2090. I have personally reviewed and verified the change.

…-RM#2090)

Gymnasium's FrameStackObservation adds a leading dimension, turning a (3, 64, 64) image into (2, 3, 64, 64). SB3's is_image_space() requires exactly 3 dimensions, so the stacked space is not recognized as an image and a FlattenExtractor is silently used instead of a CNN.

Per maintainer guidance in DLR-RM#2090, keep the is_image_space check as-is and document the pitfall instead, recommending VecFrameStack.

Adds a warning admonition on the custom env page and a changelog entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

midhunxavier wants to merge 1 commit into DLR-RM:master from midhunxavier:docs/framestack-is-image-space-note
