Env/maze env #339

Jiya126 · 2026-01-29T10:13:01Z

Summary

Adds a Maze environment in which the agent navigates to the exit cell, along with test cases and a GitHub workflow for the env.

Type of Change

Alignment Checklist

Before submitting, verify:

I have read .claude/docs/PRINCIPLES.md and this PR aligns with our principles
I have checked .claude/docs/INVARIANTS.md and no invariants are violated
I have run /pre-submit-pr (or bash .claude/hooks/lint.sh and tests) and addressed all issues

RFC Status

Not required (bug fix, docs, minor refactoring)
RFC exists: #___
RFC needed (will create before merge)

Test Plan

Follow the README, then run the test cases:

PYTHONPATH=src:envs uv run pytest tests/envs/test_maze_environment.py -v

Claude Code Review

NA

greptile-apps · 2026-01-29T10:17:04Z

Greptile Overview

Greptile Summary

This PR adds a new Maze environment to OpenEnv, implementing a gridworld navigation task where an agent must reach an exit cell while avoiding walls.

Key Changes:

Implements complete environment with MazeAction, MazeObservation, and MazeState Pydantic models
Provides MazeEnv client with WebSocket support for persistent sessions
Core maze logic derived from MIT-licensed reference implementation with proper attribution
Includes comprehensive test suite covering reset, step, state endpoints, and concurrent sessions
Integrates with GitHub Actions workflow for automated Docker builds
Rewards are computed inside the environment boundary (penalties for moves, revisits, walls; reward for reaching exit)
Follows standard OpenEnv patterns: Gymnasium-style API, container isolation, client-server separation

Architecture:

Client (client.py) and server (server/) properly separated - no cross-imports
Environment supports concurrent WebSocket sessions (SUPPORTS_CONCURRENT_SESSIONS: bool = True)
Uses standard create_app() helper with max_concurrent_envs=1 (can be increased as needed)
Includes both in-repo and standalone import patterns for flexibility
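For readers unfamiliar with the Gymnasium-style surface mentioned above, here is a minimal in-process sketch of a reset/step loop driven by `legal_actions`. Everything in it is an assumption made for illustration: `ToyMazeEnv`, `ToyObservation`, the `MOVES` action encoding, and the grid format are hypothetical and are not the PR's `MazeEnv` client or server code. The -0.05 and +10.0 figures are the move penalty and exit reward quoted in the sequence diagram; the revisit and wall penalties are left out to keep the sketch short.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical action encoding: action id -> (d_row, d_col).
MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}


@dataclass
class ToyObservation:
    position: Tuple[int, int]
    legal_actions: List[int]
    reward: float = 0.0
    done: bool = False


class ToyMazeEnv:
    """In-process stand-in with a reset/step surface (not the PR's code)."""

    def __init__(self, grid, start, exit_cell):
        self.grid = grid  # 0 = free cell, 1 = wall
        self.start = start
        self.exit_cell = exit_cell
        self.position = start
        self.step_count = 0

    def _legal_actions(self):
        rows, cols = len(self.grid), len(self.grid[0])
        legal = []
        for action, (dr, dc) in MOVES.items():
            r, c = self.position[0] + dr, self.position[1] + dc
            if 0 <= r < rows and 0 <= c < cols and self.grid[r][c] == 0:
                legal.append(action)
        return legal

    def reset(self):
        self.position = self.start
        self.step_count = 0
        return ToyObservation(self.position, self._legal_actions())

    def step(self, action):
        self.step_count += 1
        # Illegal moves leave the agent in place in this simplified sketch.
        if action in self._legal_actions():
            dr, dc = MOVES[action]
            self.position = (self.position[0] + dr, self.position[1] + dc)
        done = self.position == self.exit_cell
        reward = 10.0 if done else -0.05
        return ToyObservation(self.position, self._legal_actions(), reward, done)


def run_episode(env, max_steps=200, seed=0):
    """Random policy restricted to the legal actions in each observation."""
    rng = random.Random(seed)
    obs = env.reset()
    total = 0.0
    for _ in range(max_steps):
        obs = env.step(rng.choice(obs.legal_actions))
        total += obs.reward
        if obs.done:
            break
    return obs, total
```

In the PR itself the same loop would go through the WebSocket client rather than an in-process object; the sketch only shows the shape of the interaction.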

Confidence Score: 5/5

This PR is safe to merge with no critical issues identified
The implementation follows all OpenEnv principles and invariants: proper client-server separation, rewards inside environment, Gymnasium-style API, comprehensive tests, and no syntax errors. The maze logic is properly attributed to its MIT-licensed source.
No files require special attention

Important Files Changed

Filename	Overview
envs/maze_env/models.py	Defines Pydantic models for MazeAction, MazeObservation, and MazeState with proper type safety
envs/maze_env/client.py	Implements MazeEnv client with WebSocket support, proper serialization/deserialization methods
envs/maze_env/server/maze_env_environment.py	Environment implementation with reset/step methods, rewards computed inside environment boundary
envs/maze_env/server/app.py	FastAPI server setup using create_app helper, follows standard environment pattern
envs/maze_env/server/maze.py	Core maze game logic derived from MIT-licensed reference implementation with proper attribution
tests/envs/test_maze_environment.py	Comprehensive test suite covering reset, step, state, and WebSocket functionality

Sequence Diagram

sequenceDiagram
    participant Client as MazeEnv Client
    participant WS as WebSocket Connection
    participant Server as FastAPI Server
    participant Env as MazeEnvironment
    participant Maze as Maze Core Logic

    Client->>Server: Connect WebSocket
    Server->>Env: Create environment instance
    Env->>Maze: Initialize maze (start_cell, exit_cell)
    Maze-->>Env: Maze initialized
    Server-->>Client: Connection established

    Client->>WS: reset() message
    WS->>Server: Forward reset request
    Server->>Env: reset(seed, start_cell)
    Env->>Maze: reset(start_cell)
    Maze->>Maze: Reset agent position, clear visited cells
    Maze-->>Env: Initial state
    Env->>Env: Build MazeObservation (legal_actions, position, metadata)
    Env-->>Server: MazeObservation with reward=0
    Server-->>WS: Serialize observation
    WS-->>Client: StepResult with observation

    Client->>WS: step(MazeAction) message
    WS->>Server: Forward action
    Server->>Env: step(MazeAction)
    Env->>Maze: step(action_id)
    Maze->>Maze: Execute move, calculate reward
    Note over Maze: Penalties: -0.05 (move)<br/>-0.25 (revisit)<br/>-0.75 (wall)<br/>Reward: +10.0 (exit)
    Maze->>Maze: Check status (WIN/LOSE/PLAYING)
    Maze-->>Env: state, reward, status
    Env->>Env: Build MazeObservation with updated position
    Env->>Env: Update internal state (step_count, done)
    Env-->>Server: MazeObservation with reward, done
    Server-->>WS: Serialize observation
    WS-->>Client: StepResult

    Client->>WS: state() message
    WS->>Server: Forward state request
    Server->>Env: Get state property
    Env-->>Server: MazeState (episode_id, step_count, position, status)
    Server-->>WS: Serialize state
    WS-->>Client: MazeState

    Client->>WS: close()
    WS->>Server: Disconnect
    Server->>Env: Cleanup session
    Env-->>Server: Session closed
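The reward note in the diagram can be read as a small lookup. The sketch below is one possible reading, not the code in `maze.py`: the function name `step_reward`, its parameters, and in particular the precedence between the cases (wall, then exit, then revisit, then plain move) are assumptions, since the diagram lists the values but not how they combine.

```python
from typing import Set, Tuple

Cell = Tuple[int, int]

# Values quoted in the sequence diagram note.
PENALTY_MOVE = -0.05
PENALTY_REVISIT = -0.25
PENALTY_WALL = -0.75
REWARD_EXIT = 10.0


def step_reward(target: Cell, exit_cell: Cell, visited: Set[Cell], blocked: bool) -> float:
    """Reward for attempting to move to `target` (assumed precedence order)."""
    if blocked:
        # The move ran into a wall and was not executed.
        return PENALTY_WALL
    if target == exit_cell:
        return REWARD_EXIT
    if target in visited:
        return PENALTY_REVISIT
    return PENALTY_MOVE
```

Treating the cases as mutually exclusive keeps the per-step reward easy to reason about; an implementation that sums penalties instead would produce different totals.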

greptile-apps

2 files reviewed, 2 comments

greptile-apps · 2026-01-29T10:17:07Z

envs/maze_env/models.py

+    done: bool = False
+    current_position: List[int] = Field(default_factory=list)
+    exit_cell: List[int] = Field(default_factory=list)
+    status : str = "playing"  # e.g., "playing", "win", "lose"


Space before the colon violates PEP 8 style (Python still parses the annotation)

Suggested change

-    status : str = "playing"  # e.g., "playing", "win", "lose"
+    status: str = "playing"  # e.g., "playing", "win", "lose"
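As a quick check on how serious this finding is, the two spellings can be compared with the standard-library `ast` module. This is a minimal sketch using only the line quoted in the comment; it suggests the change is a style cleanup, since both forms parse to the same annotated assignment.

```python
import ast

with_space = 'status : str = "playing"'
without_space = 'status: str = "playing"'

# Both strings are valid Python and produce an annotated assignment node.
tree_a = ast.parse(with_space)
tree_b = ast.parse(without_space)

# ast.dump omits line and column attributes by default, so the dumps
# compare the structure of the two trees only.
same_tree = ast.dump(tree_a) == ast.dump(tree_b)
print(same_tree)
```

A linter such as the one run by the repository's lint hook would still be expected to flag the extra space, which is a reasonable motivation for the suggestion.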


greptile-apps · 2026-01-29T10:17:09Z

envs/maze_env/client.py

+            step_count=payload.get("step_count", 0),
+            done=payload.get("done", False),
+            current_position=payload.get("current_position", []),
+            exit_cell=payload.get("exit_cell", []),


Space before = violates PEP 8 style (Python still parses the keyword argument)

Suggested change

exit_cell=payload.get("exit_cell", []),

status=payload.get("status", "playing"),
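The `payload.get(..., default)` pattern in this hunk can be sketched end to end as follows. The field names and defaults come from the quoted diffs; the class name `MazeStateSketch`, the function name `parse_state`, and the use of a dataclass in place of the PR's Pydantic model are assumptions made so the example runs without third-party dependencies.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class MazeStateSketch:
    """Hypothetical stand-in for MazeState; not the PR's Pydantic model."""
    step_count: int = 0
    done: bool = False
    current_position: List[int] = field(default_factory=list)
    exit_cell: List[int] = field(default_factory=list)
    status: str = "playing"  # e.g., "playing", "win", "lose"


def parse_state(payload: Dict[str, Any]) -> MazeStateSketch:
    """Build a state object from a server payload, tolerating missing keys."""
    return MazeStateSketch(
        step_count=payload.get("step_count", 0),
        done=payload.get("done", False),
        current_position=payload.get("current_position", []),
        exit_cell=payload.get("exit_cell", []),
        status=payload.get("status", "playing"),
    )
```

Defaulting every key means an older or partial server response still deserializes, at the cost of hiding a missing field; a stricter client could raise instead.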


Jiya126 · 2026-01-30T12:58:14Z

@Darktex could you review this env?
I have not added any rendering yet. Any suggestions on how to do that?

Jiya126 added 3 commits January 29, 2026 15:04

Env addition: Maze env added to navigate to the exit cell

eb69fbf

Test cases: test file added for maze env

c6698a0

Githooks update: Updated githooks for maze env automate docker build

5cd702a

meta-cla bot added the CLA Signed label Jan 29, 2026

greptile-apps bot reviewed Jan 29, 2026

Syntax fixes

b95bc9f

Jiya126 marked this pull request as ready for review January 29, 2026 10:20

Jiya126 mentioned this pull request Jan 29, 2026

[Env Request] Maze game #105

Open
