WorldModel Gym v0.1.0

Initial public release of the WorldModel Gym benchmark platform.

Highlights

Added three procedural long-horizon benchmark environments:
- MemoryMaze (key-door sparse-reward POMDP)
- SwitchQuest (hidden-order switch chaining)
- CraftLite (resource/crafting dependency graph)
Added two observation modes, rgb and symbolic (selectable as rgb, symbolic, or both), and stable semantic episode traces.
Added deterministic evaluation harness with train/test seed tracks and continual-shift track.
Added metrics pipeline:
- success_rate, mean_return, median_steps_to_success
- achievement completion counts
- planning cost (wall_clock_ms_per_step, imagined transitions, peak memory)
- model fidelity (k-step reward prediction error)
- generalization gap and continual transfer metrics
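As a rough illustration of how the headline metrics above could be computed from per-episode records, here is a minimal sketch. The record fields (`success`, `return`, `steps`), the function names, and the definition of the generalization gap as train success rate minus test success rate are assumptions made for this example; they are not taken from the WorldModel Gym code.

```python
from statistics import mean, median


def summarize(episodes):
    """Aggregate hypothetical episode records into summary metrics.

    Each record is assumed to be a dict with keys
    'success' (bool), 'return' (float) and 'steps' (int).
    """
    if not episodes:
        raise ValueError("no episodes to summarize")
    # Steps are only counted for episodes that actually succeeded.
    success_steps = [e["steps"] for e in episodes if e["success"]]
    return {
        "success_rate": len(success_steps) / len(episodes),
        "mean_return": mean(e["return"] for e in episodes),
        "median_steps_to_success": median(success_steps) if success_steps else None,
    }


def generalization_gap(train_summary, test_summary):
    """Assumed definition: train success rate minus test success rate."""
    return train_summary["success_rate"] - test_summary["success_rate"]


train_episodes = [
    {"success": True, "return": 1.0, "steps": 10},
    {"success": True, "return": 1.0, "steps": 20},
    {"success": False, "return": 0.0, "steps": 50},
    {"success": True, "return": 1.0, "steps": 30},
]
test_episodes = [
    {"success": True, "return": 1.0, "steps": 40},
    {"success": False, "return": 0.0, "steps": 50},
]
train_summary = summarize(train_episodes)
test_summary = summarize(test_episodes)
gap = generalization_gap(train_summary, test_summary)
```

Returning `None` for `median_steps_to_success` when no episode succeeds keeps a failed run distinguishable from a run that succeeded in zero steps.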
Added planners:
- MCTS with simulation/depth budgets and tree diagnostics
- MPC-CEM with population/iteration/horizon budgets and score distributions
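To make the population/iteration/horizon budgets of the MPC-CEM planner concrete, the following is a minimal cross-entropy method sketch over a toy one-dimensional model. The names `cem_plan`, `rollout_return` and `toy_model`, the default budgets, and the toy dynamics are all hypothetical; the planner shipped in this release may be structured quite differently.

```python
import random
from statistics import mean, pstdev


def rollout_return(model, state, actions):
    """Sum of rewards obtained by applying an action sequence in the model."""
    total = 0.0
    for action in actions:
        state, reward = model(state, action)
        total += reward
    return total


def cem_plan(model, state, horizon=5, population=64, iterations=4,
             elite_frac=0.125, seed=0):
    """Return the first action of a CEM-optimized action sequence."""
    rng = random.Random(seed)
    mu = [0.0] * horizon      # per-step mean of the sampling distribution
    sigma = [1.0] * horizon   # per-step standard deviation
    n_elite = max(1, int(population * elite_frac))
    for _ in range(iterations):
        # Sample candidate action sequences from the current Gaussian.
        candidates = [
            [rng.gauss(m, s) for m, s in zip(mu, sigma)]
            for _ in range(population)
        ]
        # Keep the best-scoring sequences and refit the Gaussian to them.
        candidates.sort(key=lambda seq: rollout_return(model, state, seq),
                        reverse=True)
        elites = candidates[:n_elite]
        mu = [mean(column) for column in zip(*elites)]
        sigma = [pstdev(column) + 1e-3 for column in zip(*elites)]
    return mu[0]


def toy_model(state, action):
    """Hypothetical dynamics: move along a line, reward is -distance to 3.0."""
    action = max(-1.0, min(1.0, action))  # clip to the allowed action range
    next_state = state + action
    return next_state, -abs(next_state - 3.0)


first_action = cem_plan(toy_model, state=0.0)
```

In a model-predictive control loop, only this first action would be executed before replanning from the next observed state.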
Added baseline agent suite:
- random, greedy-oracle, planner-only oracle
- imagination agent (online world model + MPC)
- MCTS search agent (skeleton)
- model-free PPO placeholder API
Added world model baselines:
- deterministic latent model
- stochastic latent model (RSSM-style simplified)
- ensemble wrapper for uncertainty proxy
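One common way to obtain an uncertainty proxy from an ensemble is to measure disagreement between members. The sketch below assumes each member is a callable returning a scalar prediction and uses the population variance across members as the proxy; the name `ensemble_predict` and this interface are illustrative assumptions, not the wrapper's actual API.

```python
from statistics import mean, pvariance


def ensemble_predict(members, state, action):
    """Return (mean prediction, disagreement) across ensemble members.

    Disagreement is the population variance of the members' predictions,
    used here as a simple stand-in for epistemic uncertainty.
    """
    predictions = [member(state, action) for member in members]
    return mean(predictions), pvariance(predictions)


# Three hypothetical members that differ by a constant bias.
members = [
    lambda s, a: s + a,
    lambda s, a: s + a + 1.0,
    lambda s, a: s + a - 1.0,
]
prediction, disagreement = ensemble_predict(members, 0.0, 1.0)
```

Identical members would give a disagreement of zero, so the proxy only carries information when the members are trained or initialized differently.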
Added backend platform:
- FastAPI API for runs and uploads
- SQLite persistence via SQLAlchemy
- leaderboard/task/run endpoints
Added UI clients:
- Next.js dashboard (home/tasks/leaderboard/run viewer)
- Expo mobile viewer (tasks/leaderboard/run summary)
Added reproducibility stack:
- Dockerfiles and docker-compose
- CI workflow and pre-commit gates
- Makefile automation for setup/lint/test/demo/paper
Added paper artifacts:
- imported draft PDF
- LaTeX manuscript + BibTeX references

Validation

make lint passes.
make test passes.
make demo runs the benchmark and upload flow.
make paper builds the paper PDF (with a fallback build path when the TeX toolchain is unavailable).

Breaking Changes

None (first public release).
