@Jiya126 (Contributor) commented Jan 29, 2026

Summary

Adds a Maze environment in which an agent navigates to the exit cell, along with test cases and a GitHub workflow for the environment.

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation
  • New environment
  • Refactoring

Alignment Checklist

Before submitting, verify:

  • I have read .claude/docs/PRINCIPLES.md and this PR aligns with our principles
  • I have checked .claude/docs/INVARIANTS.md and no invariants are violated
  • I have run /pre-submit-pr (or bash .claude/hooks/lint.sh and tests) and addressed all issues

RFC Status

  • Not required (bug fix, docs, minor refactoring)
  • RFC exists: #___
  • RFC needed (will create before merge)

Test Plan

Follow the README and run the test cases:

PYTHONPATH=src:envs uv run pytest tests/envs/test_maze_environment.py -v

Claude Code Review

NA

@meta-cla meta-cla bot added the CLA Signed label Jan 29, 2026
@greptile-apps bot commented Jan 29, 2026

Greptile Overview

Greptile Summary

This PR adds a new Maze environment to OpenEnv, implementing a gridworld navigation task where an agent must reach an exit cell while avoiding walls.

Key Changes:

  • Implements complete environment with MazeAction, MazeObservation, and MazeState Pydantic models
  • Provides MazeEnv client with WebSocket support for persistent sessions
  • Core maze logic derived from MIT-licensed reference implementation with proper attribution
  • Includes comprehensive test suite covering reset, step, state endpoints, and concurrent sessions
  • Integrates with GitHub Actions workflow for automated Docker builds
  • Rewards are computed inside the environment boundary (penalties for moves, revisits, walls; reward for reaching exit)
  • Follows standard OpenEnv patterns: Gymnasium-style API, container isolation, client-server separation
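The reward scheme mentioned in the list above can be sketched as a small function. The numeric values are the ones given in the note of this review's sequence diagram (-0.05 per move, -0.25 for a revisit, -0.75 for a wall, +10.0 for the exit). How the PR's `maze.py` actually orders or combines these cases is not shown in this conversation, so the precedence below and the names `step_reward`, `walls`, and `visited` are assumptions for illustration only.

```python
from typing import Set, Tuple

Cell = Tuple[int, int]

# Values quoted in the sequence diagram note of this review.
REWARD_EXIT = 10.0
PENALTY_MOVE = -0.05
PENALTY_REVISIT = -0.25
PENALTY_WALL = -0.75


def step_reward(target: Cell, exit_cell: Cell, walls: Set[Cell], visited: Set[Cell]) -> float:
    """Return the reward for attempting to move into `target`.

    Assumed precedence (not confirmed by the PR): wall, then exit,
    then revisit, then ordinary move.
    """
    if target in walls:
        return PENALTY_WALL
    if target == exit_cell:
        return REWARD_EXIT
    if target in visited:
        return PENALTY_REVISIT
    return PENALTY_MOVE
```

Keeping this computation on the server side is what the review means by rewards being computed inside the environment boundary: the client only ever receives the resulting number in the observation.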

Architecture:

  • Client (client.py) and server (server/) properly separated - no cross-imports
  • Environment supports concurrent WebSocket sessions (SUPPORTS_CONCURRENT_SESSIONS: bool = True)
  • Uses standard create_app() helper with max_concurrent_envs=1 (can be increased as needed)
  • Includes both in-repo and standalone import patterns for flexibility

Confidence Score: 5/5

  • This PR is safe to merge with no critical issues identified
  • The implementation follows all OpenEnv principles and invariants: proper client-server separation, rewards inside environment, Gymnasium-style API, comprehensive tests, and no syntax errors. The maze logic is properly attributed to its MIT-licensed source.
  • No files require special attention

Important Files Changed

| Filename | Overview |
|---|---|
| envs/maze_env/models.py | Defines Pydantic models for MazeAction, MazeObservation, and MazeState with proper type safety |
| envs/maze_env/client.py | Implements MazeEnv client with WebSocket support, proper serialization/deserialization methods |
| envs/maze_env/server/maze_env_environment.py | Environment implementation with reset/step methods, rewards computed inside environment boundary |
| envs/maze_env/server/app.py | FastAPI server setup using create_app helper, follows standard environment pattern |
| envs/maze_env/server/maze.py | Core maze game logic derived from MIT-licensed reference implementation with proper attribution |
| tests/envs/test_maze_environment.py | Comprehensive test suite covering reset, step, state, and WebSocket functionality |

Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as MazeEnv Client
    participant WS as WebSocket Connection
    participant Server as FastAPI Server
    participant Env as MazeEnvironment
    participant Maze as Maze Core Logic

    Client->>Server: Connect WebSocket
    Server->>Env: Create environment instance
    Env->>Maze: Initialize maze (start_cell, exit_cell)
    Maze-->>Env: Maze initialized
    Server-->>Client: Connection established

    Client->>WS: reset() message
    WS->>Server: Forward reset request
    Server->>Env: reset(seed, start_cell)
    Env->>Maze: reset(start_cell)
    Maze->>Maze: Reset agent position, clear visited cells
    Maze-->>Env: Initial state
    Env->>Env: Build MazeObservation (legal_actions, position, metadata)
    Env-->>Server: MazeObservation with reward=0
    Server-->>WS: Serialize observation
    WS-->>Client: StepResult with observation

    Client->>WS: step(MazeAction) message
    WS->>Server: Forward action
    Server->>Env: step(MazeAction)
    Env->>Maze: step(action_id)
    Maze->>Maze: Execute move, calculate reward
    Note over Maze: Penalties: -0.05 (move)<br/>-0.25 (revisit)<br/>-0.75 (wall)<br/>Reward: +10.0 (exit)
    Maze->>Maze: Check status (WIN/LOSE/PLAYING)
    Maze-->>Env: state, reward, status
    Env->>Env: Build MazeObservation with updated position
    Env->>Env: Update internal state (step_count, done)
    Env-->>Server: MazeObservation with reward, done
    Server-->>WS: Serialize observation
    WS-->>Client: StepResult

    Client->>WS: state() message
    WS->>Server: Forward state request
    Server->>Env: Get state property
    Env-->>Server: MazeState (episode_id, step_count, position, status)
    Server-->>WS: Serialize state
    WS-->>Client: MazeState

    Client->>WS: close()
    WS->>Server: Disconnect
    Server->>Env: Cleanup session
    Env-->>Server: Session closed
```
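The reset, step, and state exchange in the diagram above can be approximated in-process, without the WebSocket and FastAPI layers. This is a toy stand-in, not the PR's `MazeEnvironment`: the class name `ToyMaze`, the action ids in `MOVES`, and the dictionary-shaped observations are all hypothetical, and rewards are left out to keep the flow visible.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple

# Hypothetical action ids mapped to (d_col, d_row); the PR's encoding is not shown here.
MOVES: Dict[int, Tuple[int, int]] = {0: (0, -1), 1: (0, 1), 2: (-1, 0), 3: (1, 0)}


@dataclass
class ToyMaze:
    grid: List[List[int]]        # 1 = wall, 0 = free cell
    start: Tuple[int, int]       # (col, row)
    exit_cell: Tuple[int, int]   # (col, row)
    position: Tuple[int, int] = (0, 0)
    step_count: int = 0
    status: str = "playing"

    def _free(self, col: int, row: int) -> bool:
        """True if (col, row) is inside the grid and not a wall."""
        in_bounds = 0 <= row < len(self.grid) and 0 <= col < len(self.grid[0])
        return in_bounds and self.grid[row][col] == 0

    def legal_actions(self) -> List[int]:
        col, row = self.position
        return [a for a, (dc, dr) in MOVES.items() if self._free(col + dc, row + dr)]

    def _observation(self) -> Dict[str, Any]:
        return {
            "position": list(self.position),
            "legal_actions": self.legal_actions(),
            "done": self.status != "playing",
        }

    def reset(self) -> Dict[str, Any]:
        """Put the agent back on the start cell and clear episode counters."""
        self.position, self.step_count, self.status = self.start, 0, "playing"
        return self._observation()

    def step(self, action: int) -> Dict[str, Any]:
        """Apply one move; blocked moves leave the position unchanged."""
        dc, dr = MOVES[action]
        col, row = self.position
        if self._free(col + dc, row + dr):
            self.position = (col + dc, row + dr)
        self.step_count += 1
        if self.position == self.exit_cell:
            self.status = "win"
        return self._observation()

    def state(self) -> Dict[str, Any]:
        return {"step_count": self.step_count, "position": list(self.position), "status": self.status}
```

In the real environment each of these calls would travel as a WebSocket message and come back as a serialized `MazeObservation` or `MazeState`; the sketch only mirrors the order of operations.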

@greptile-apps greptile-apps bot left a comment

2 files reviewed, 2 comments

done: bool = False
current_position: List[int] = Field(default_factory=list)
exit_cell: List[int] = Field(default_factory=list)
status : str = "playing" # e.g., "playing", "win", "lose"

Space before the colon violates PEP 8 style (it is still valid Python syntax)

Suggested change
status : str = "playing" # e.g., "playing", "win", "lose"
status: str = "playing" # e.g., "playing", "win", "lose"
Prompt To Fix With AI
This is a comment left during a code review.
Path: envs/maze_env/models.py
Line: 56:56

Comment:
Space before the colon violates PEP 8 style (it is still valid Python syntax)

```suggestion
    status: str = "playing"  # e.g., "playing", "win", "lose"
```

How can I resolve this? If you propose a fix, please make it concise.
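Whether the flagged line is a syntax problem or only a style problem can be checked with the standard-library `ast` module. The sketch below parses the reviewed form and the suggested form and compares their syntax trees; the helper name `parses_identically` is made up for this example.

```python
import ast

# The line as written in the PR (space before the colon).
spaced = 'status : str = "playing"'
# The form suggested in the review.
tight = 'status: str = "playing"'


def parses_identically(a: str, b: str) -> bool:
    """Return True if both snippets parse to the same AST (positions ignored)."""
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


same = parses_identically(spaced, tight)
```

Both strings parse to the same annotated assignment, so the suggestion is a formatting cleanup and does not change behavior.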

step_count=payload.get("step_count", 0),
done=payload.get("done", False),
current_position=payload.get("current_position", []),
exit_cell=payload.get("exit_cell", []),

Space before `=` in a keyword argument violates PEP 8 style (it is still valid Python syntax)

Suggested change
exit_cell=payload.get("exit_cell", []),
status=payload.get("status", "playing"),
Prompt To Fix With AI
This is a comment left during a code review.
Path: envs/maze_env/client.py
Line: 113:113

Comment:
Space before `=` in a keyword argument violates PEP 8 style (it is still valid Python syntax)

```suggestion
            status=payload.get("status", "playing"),
```

How can I resolve this? If you propose a fix, please make it concise.
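The lines under review build the state object from a JSON payload, falling back to a default for each missing key. A minimal sketch of that pattern follows, using a standard-library dataclass as a stand-in for the PR's Pydantic `MazeState`. The field names and defaults come from the snippets quoted in this review; the names `MazeStateSketch` and `parse_state` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class MazeStateSketch:
    """Stand-in for the PR's Pydantic MazeState (assumed shape)."""
    step_count: int = 0
    done: bool = False
    current_position: List[int] = field(default_factory=list)
    exit_cell: List[int] = field(default_factory=list)
    status: str = "playing"  # e.g., "playing", "win", "lose"


def parse_state(payload: Dict[str, Any]) -> MazeStateSketch:
    """Build a state object from a server payload, defaulting missing keys."""
    return MazeStateSketch(
        step_count=payload.get("step_count", 0),
        done=payload.get("done", False),
        current_position=payload.get("current_position", []),
        exit_cell=payload.get("exit_cell", []),
        status=payload.get("status", "playing"),
    )
```

With per-key defaults, a payload from an older server that omits `status` still deserializes, at the cost of silently reporting "playing" when the field is absent.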

@Jiya126 Jiya126 marked this pull request as ready for review January 29, 2026 10:20
@Jiya126 (Contributor, Author) commented Jan 30, 2026

@Darktex could you review this env?
I have not added any renders yet. Any suggestions on how to do that?
