The Future of AI Evaluation

MC-Bench changes how we evaluate AI models by challenging them to create Minecraft builds.
Docker Quick Start (recommended)
Run the ruff formatter

    make fmt

Run the ruff checker

    make check

Run the ruff checker with --fix option

    make check-fix
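
The formatter and checker targets above are often run back to back before a commit. A minimal sketch of a wrapper, assuming the Makefile lives in the directory you run it from; the function name `lint_all` is hypothetical and not part of the project:

```shell
#!/bin/sh
# Hypothetical helper (not part of MC-Bench): run the formatter first,
# then the checker. The && stops the sequence if `make fmt` fails,
# so the checker only runs on successfully formatted code.
lint_all() {
    make fmt && make check
}
```

After sourcing the snippet, call `lint_all` from the repository root. Swap `make check` for `make check-fix` if you want the checker to apply its automatic fixes as well.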