Boardgame Workbench with NVIDIA-Cosmos and Artefacts

This repository provides tools to train, run, and evaluate (test) robots that play board games. The examples use NVIDIA Cosmos-Reason (strategy), GR00T (motor control), and Isaac Sim (simulation), driving an SO101 arm, with Artefacts for test orchestration.

Install

connect-four-demo.mp4

Prerequisites

  • A machine with an NVIDIA GPU (tested on a mobile 5070 Ti with 12 GB and a desktop 4060 Ti with 16 GB)
  • uv
  • git lfs
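
A quick way to confirm the prerequisites are on your PATH before cloning (a sketch; note that installing git lfs provides the `git-lfs` binary):

```bash
# Check that each required tool resolves on PATH.
missing=0
for tool in git git-lfs uv; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "ok: $tool"
  else
    echo "MISSING: $tool"
    missing=1
  fi
done
```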

Clone and Pull LFS Objects

This repo uses git submodules (which themselves use Git LFS), so clone recursively:

git clone --recurse-submodules https://github.com/art-e-fact/connect_four-demo.git
cd connect_four-demo
git submodule foreach --recursive 'git lfs pull'

Project structure

This repository is organized as a collection of independent Python packages (e.g., cosmos-reason-node, gr00t-node, simulation, so100-driver, cosmos-visual-tester).

Instead of a monolithic workspace, we maintain strict isolation between modules. This simplifies dependency management and allows you to easily cherry-pick parts for your own projects. Because of this, we use uv with the --directory flag to run commands within the context of a specific package.

Note: All commands in this README assume you are running them from the repository root using the pattern:

uv run --directory <package_name> <command> ...
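
For example, a hypothetical wrapper makes the pattern concrete (the `echo` prints the command instead of executing it; drop it to actually run):

```bash
# Hypothetical helper for the `uv run --directory` pattern used throughout.
run_pkg() {
  local pkg="$1"; shift
  echo uv run --directory "$pkg" "$@"
}

run_pkg simulation teleop-agent --enable_cameras
```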

Test installation

Verify that everything installed correctly by running the teleop agent in simulation (uv will automatically pull in Isaac Sim):

```bash
uv run --directory simulation teleop-agent \
  --task LeIsaac-SO101-ConnectFour-Ball-v0 \
  --teleop_device=keyboard \
  --enable_cameras
```

Inference

You will need a Hugging Face account (and token) to pull the relevant models.

For Both Simulation and Real Hardware

Start the strategy server

  • Change port if needed
uv run --directory cosmos-reason-node strategy-server \
    --host 0.0.0.0 \
    --port 5556

(Add the --device cuda:0 flag if your graphics card has enough memory to run Cosmos-Reason, GR00T, and Isaac Sim on the same GPU.)

Start the policy server

  • Change model-path and port if needed
uv run --directory gr00t-node server \
    --embodiment-tag NEW_EMBODIMENT \
    --model-path tomo202/groot_n1_6_so101-isaac-connect-four-ball_checkpoint \
    --device cuda:0 \
    --host 0.0.0.0 \
    --port 5555

For Simulation

Run the simulation client

uv run --directory simulation inference \
  --task LeIsaac-SO101-ConnectFour-Ball-v0 \
  --policy_host=localhost \
  --policy_port=5555 \
  --strategy_host=localhost \
  --strategy_port=5556 

When the robot receives a new strategy from Cosmos-Reason, a new ball (alternating colours) is placed in front of the robot. If the ball is unreachable for any reason (e.g. it rolls out of bounds or isn't placed correctly), press "P" to spawn a new ball of the same colour.

With Real Hardware

Run the SO101 client

  • Change policy_port, robot.port, robot.id, lang_instruction and robot.cameras if needed
uv run --directory so100-driver so100 \
  --robot.type=so101_follower \
  --robot.port=/dev/ttyACM0 \
  --robot.id=white_follower \
  --robot.cameras="{ wrist: {type: opencv, index_or_path: /dev/video0, width: 640, height: 480, fps: 30}, front: {type: opencv, index_or_path: /dev/video2, width: 640, height: 480, fps: 30}}" \
  --policy_host=localhost \
  --policy_port=5555 \
  --strategy_host=localhost \
  --strategy_port=5556 \
  --lang_instruction="Play a game of connect four against me, and try to win!"
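
To figure out which values to pass for robot.port and the camera index_or_path entries, a quick device scan helps (v4l2-ctl comes from the v4l-utils package, which is an assumption and not part of this repo):

```bash
# List cameras and their /dev/video* nodes, if v4l2-ctl is available.
if command -v v4l2-ctl >/dev/null 2>&1; then
  v4l2-ctl --list-devices
else
  echo "v4l2-ctl not found (sudo apt install v4l-utils)"
fi
# List serial ports; the SO101 arms typically enumerate as /dev/ttyACM*.
out=$(ls /dev/ttyACM* 2>/dev/null || echo "no /dev/ttyACM* devices found")
echo "$out"
```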

Dataset Collection with the Teleop Agent

With real hardware

Assumes LeRobot SO101 is set up and configured.

  • Change flags as required.
conda activate lerobot
lerobot-record \
  --robot.type=so101_follower \
  --robot.port=/dev/ttyACM0 \
  --robot.id=follower_arm \
  --robot.cameras='{ wrist: {type: opencv, index_or_path: 4, width: 640, height: 480, fps: 30, fourcc: "MJPG"}, front: {type: opencv, index_or_path: 6, width: 640, height: 480, fps: 30, fourcc: "MJPG"}}' \
  --teleop.type=so101_leader \
  --teleop.port=/dev/ttyACM1 \
  --teleop.id=leader_arm \
  --display_data=true \
  --dataset.repo_id=<my-repo> \
  --dataset.num_episodes=20 \
  --dataset.single_task="play connect 4"

With simulation

uv run --directory simulation teleop-agent \
    --task=LeIsaac-SO101-ConnectFour-Ball-v0 \
    --teleop_device=so101leader \
    --port=/dev/ttyACM0 \
    --num_envs=1 \
    --device=cuda \
    --enable_cameras \
    --record \
    --use_lerobot_recorder \
    --lerobot_dataset_repo_id=hf-username/dataset-name

Nebius (Remote GPU)

Both Dataset Massaging with Cosmos and Training can run on Nebius GPU VMs. One-time setup:

# Install Nebius CLI & authenticate
curl -sSL https://storage.eu-north1.nebius.cloud/cli/install.sh | bash
nebius profile create --profile <unique-name-here> \
  --endpoint api.nebius.cloud \
  --federation-endpoint auth.nebius.com \
  --parent-id <project-id-from-web-console>
  # Browser will open for auth

# jq (for JSON parsing)
sudo apt install jq
# SSH key (if you don't have one)
ssh-keygen -t ed25519

# HuggingFace token (used by both scripts)
export HF_TOKEN="hf_..."
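
A small sanity check of the one-time setup before launching any job (a sketch; it only inspects your environment):

```bash
# Verify the pieces of the one-time setup described above.
status=ok
[ -n "$HF_TOKEN" ] || { echo "HF_TOKEN is not set"; status=incomplete; }
command -v jq >/dev/null 2>&1 || { echo "jq is not installed"; status=incomplete; }
[ -f "$HOME/.ssh/id_ed25519" ] || { echo "no ed25519 SSH key found"; status=incomplete; }
echo "setup: $status"
```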

Jobs run detached: you can press Ctrl+C or close your terminal and they will continue on the VM.

# Check Status
./scripts/nebius-cosmos.sh --check
./scripts/nebius-train.sh --check

# Cleanup (tear down VM + disk after a job finishes):
./scripts/nebius-cosmos.sh --cleanup
./scripts/nebius-train.sh --cleanup

Note: VMs stay running after jobs complete. Always run --cleanup when done to avoid charges.

VM state is saved in ~/.nebius-cosmos/ and ~/.nebius-train/ so cleanup works even after restarting your machine.
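
Before tearing down, you can check whether there is saved VM state to clean up (a sketch using the state paths above):

```bash
# Look for saved VM state from previous nebius-cosmos / nebius-train jobs.
for d in "$HOME/.nebius-cosmos" "$HOME/.nebius-train"; do
  if [ -d "$d" ]; then
    echo "state found: $d (run the matching --cleanup)"
  else
    echo "no state: $d"
  fi
done
```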


Dataset Massaging with Cosmos

Use Cosmos-Reason to automatically identify and extract individual demonstrations from a raw teleop recording.

Locally

# 1. Generate annotations
uv run --directory cosmos-dataset-editor cosmos-generate <hf_dataset> --output-toml my-project.toml
# 2. (Optional) Review and fix the annotations in a TUI
uv run --directory cosmos-dataset-editor cosmos-edit my-project.toml
# 3. Create the new dataset and push to HuggingFace
uv run --directory cosmos-dataset-editor cosmos-recut my-project.toml --push-to-hub
  • Use --new-dataset-id <owner>/<name> on either cosmos-generate or cosmos-recut to override the default (<source>-recut).
  • Use --model nvidia/Cosmos-Reason2-8B on the cosmos-generate step for better accuracy (requires more VRAM).
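
Putting the three steps and the optional flags together, a dry run of the full pipeline might look like this (DATASET and OUT are placeholder ids of my own; drop each `echo` to actually execute):

```bash
# Dry run of generate -> edit -> recut with the optional flags applied.
DATASET="your-user/raw-teleop"        # assumption: example source dataset id
OUT="your-user/raw-teleop-recut"      # overrides the default <source>-recut
echo uv run --directory cosmos-dataset-editor cosmos-generate "$DATASET" \
  --output-toml my-project.toml --model nvidia/Cosmos-Reason2-8B --new-dataset-id "$OUT"
echo uv run --directory cosmos-dataset-editor cosmos-edit my-project.toml
echo uv run --directory cosmos-dataset-editor cosmos-recut my-project.toml --push-to-hub
```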

On Nebius

Requires Nebius setup. Runs cosmos-generate + cosmos-recut with the 8B model, then pushes to HuggingFace.

./scripts/nebius-cosmos.sh \
    --docker-image tomolnorman/cosmos-recut:latest \
    --dataset-id   <huggingface-dataset-to-process>

(You can also build Dockerfile.cosmos yourself and push the image to your own registry.)

Optional flags:

| Flag | Default | Description |
|------|---------|-------------|
| `--new-dataset-id` | `<dataset-id>-recut` | Output dataset ID |
| `--camera-key` | auto-detect | Camera key to process |
| `--platform` | `gpu-h100-sxm` | Nebius GPU platform |
| `--preset` | `1gpu-16vcpu-200gb` | VM preset |
| `--disk-size` | `100` | Boot disk size (GiB) |

Training

Prerequisites

In addition to a Hugging Face account and token, you will need an API key from Weights & Biases for training logs.

Local (Docker)

You will need the NVIDIA Container Toolkit to pass the GPU through to Docker.

  1. Build the image:
docker build -t groot-training . # rename as you wish
  2. Run with arguments:
docker run --gpus=all --shm-size=16g \
  -e LEROBOT_DS_ID="..." \
  -e MODEL_ID="..." \
  -e HF_TOKEN=hf_... \
  -e WANDB_API_KEY=... \
  -e GLOBAL_BATCH_SIZE=1 \
  groot-training

In particular, adjust shm-size (shared memory; Docker defaults to 64 MB) and GLOBAL_BATCH_SIZE (according to how much VRAM you have).
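
As a rough starting point (my own heuristic, not an official recommendation), you could scale GLOBAL_BATCH_SIZE with available VRAM and tune from there:

```bash
# Pick an initial GLOBAL_BATCH_SIZE from available VRAM (heuristic only).
vram_gb=12   # e.g. read from nvidia-smi on your machine
if [ "$vram_gb" -ge 24 ]; then bs=4
elif [ "$vram_gb" -ge 16 ]; then bs=2
else bs=1
fi
echo "GLOBAL_BATCH_SIZE=$bs"
```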

On Nebius

Requires Nebius setup. Launches a fine-tuning run on an H100 VM, then uploads the trained model to Hugging Face.

export WANDB_API_KEY="..."

./scripts/nebius-train.sh \
    --docker-image  tomolnorman/groot-finetune:latest \
    --dataset-id    <huggingface-dataset-to-pull-from> \
    --model-id      <huggingface-model-to-push-to>

(You can also build the Dockerfile yourself and push the image to your own registry.)

Optional flags:

| Flag | Default | Description |
|------|---------|-------------|
| `--max-steps` | `10000` | Training steps |
| `--learning-rate` | `1e-4` | Learning rate |
| `--batch-size` | `64` | Global batch size |
| `--save-steps` | `2500` | Checkpoint interval |
| `--platform` | `gpu-h100-sxm` | Nebius GPU platform |
| `--preset` | `1gpu-16vcpu-200gb` | VM preset |
| `--disk-size` | `250` | Boot disk size (GiB) |

Testing

In cosmos-visual-tester/tests/test_connect_four.py you will find an example of how to run test evaluations post-simulation using Cosmos-Reason. By doing so we can keep our pytest file simple, using natural language assertions. When running the test, it will start a simulation using Cosmos-Reason (strategy) and GR00T (motor control) for a few steps and record a video. The video is then analyzed by Cosmos-Reason against the assertion statements made in the test file.

Tests run headless by default. See the test README for environment variables that can be configured.

Locally

uv run --directory cosmos-visual-tester pytest -v

With Artefacts

Although the tests can be run locally, Artefacts can orchestrate the run and automatically upload results, logs, and the recorded video to the Artefacts Dashboard, making it easy to run and parameterize your tests and to view, store, and share results across your team.

The artefacts-cli (installed via pip) and an artefacts.yaml file are required (the yaml is already in this repository).

Installation and setup

  1. Create an account at app.artefacts.com
  2. Create an organization and a project
  3. Rename the project in the artefacts.yaml file to your <org_name>/<project_name>
  4. Install the CLI (we suggest using a virtual environment or pipx):
    pip install artefacts-cli
    artefacts config add <org_name>/<project_name>
  5. You will be redirected to the dashboard (browser) to create an API key; paste it into your terminal
  6. Select N when prompted about whether to create a new artefacts.yaml (already in this repo)

Run

artefacts run test-cosmos

Results, logs, and a video will be uploaded to the dashboard.

See Docs for more information on using Artefacts.
