# MCTS demos
cargo run -p colver-core --bin mcts_demo --release -- 100
cargo run -p colver-core --bin smart_ismcts_demo --release -- 100
# DD solver benchmark
cargo run -p colver-core --bin dd_bench --release -- 1000
# Python bindings
uv sync
uv run python3 -c "import colver; env = colver.Env(); env.reset()"
# Web frontend
uv run python -m colver.web
# Docker
docker build -t colver . && docker run -p 8000:8000 colver

# Generate game replay data (COLVGM01 format, preferred)
cargo run -p colver-core --bin generate_game_data --release --features parallel -- \
--dmc-model models/dmc_final.bin --games 500000 --output data/games.bin
# V2 training (304-dim, standard architecture)
cargo run -p colver-core --bin train_belief_net --features dmc_train --release -- \
--replays data/training/games_500k.bin --epochs 200 --batch-size 512 --lr 3e-4 \
--v2 --augment --cosine-lr --warmup-epochs 10 --val-split 0.05 \
--output models/belief_net_v2.bin
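
The `--cosine-lr` and `--warmup-epochs` flags above suggest a linear warmup followed by cosine decay. The exact schedule inside `train_belief_net` is not documented here, so the following is only a minimal sketch of the common formulation. The function name `lr_at_epoch`, the per-epoch granularity, and the `min_lr` floor are assumptions for illustration; the default values mirror the V2 command.

```python
import math


def lr_at_epoch(epoch, base_lr=3e-4, total_epochs=200, warmup_epochs=10, min_lr=0.0):
    """Learning rate for a 0-indexed epoch: linear warmup, then cosine decay.

    Assumption: the real trainer may step per batch rather than per epoch
    and may use a non-zero floor; this only illustrates the general shape.
    """
    if epoch < warmup_epochs:
        # Linear ramp that reaches base_lr on the last warmup epoch.
        return base_lr * (epoch + 1) / warmup_epochs
    # Fraction of the post-warmup phase already completed, in [0, 1).
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for e in (0, 9, 10, 105, 199):
        print(f"epoch {e:3d}: lr = {lr_at_epoch(e):.2e}")
```

With a short run such as `--epochs 15 --warmup-epochs 3`, the same shape applies but the decay phase covers only 12 epochs, so the rate falls much faster per epoch.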
# V3 temporal features (380-dim)
cargo run -p colver-core --bin train_belief_net --features dmc_train --release -- \
--replays data/training/games_500k.bin --epochs 15 --batch-size 512 --lr 3e-4 \
--v3 --augment --cosine-lr --warmup-epochs 3 --output models/race_v3.bin
# V3 3-class output (observer excluded, 3 classes: left/partner/right)
# count-reg default is now 0.1; 300 epochs for overnight training
cargo run -p colver-core --bin train_belief_net --features dmc_train --release -- \
--replays data/training/games_500k.bin --epochs 300 --batch-size 512 --lr 3e-4 \
--cosine-lr --warmup-epochs 10 --seed 42 --v2 --augment \
--count-reg 0.1 --output models/belief_v3.bin
# Architecture variants: --variant cross_attn | suit_shared | var_mlp | aux_loss
# Variable MLP: --variant var_mlp --num-layers 3 --hidden 256
# Card count regularization: --count-reg 0.1
# suit_shared doesn't need --augment (equivariant by construction)

cargo run --bin belief_eval --release -- --model models/belief_net_v2.bin \
--replays data/training/games_500k.bin --mode MODE --games 5000

Modes: offline (accuracy/CE/calibration), match (IS-DD NN vs heuristic), diagnose (per-card predictions), scenario (hand-crafted tests), per_trick (per-trick accuracy), ablation (input block importance), ensemble (multi-model averaging via --model m1.bin,m2.bin).
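
To run several modes back to back, the command line can be generated rather than typed by hand. The helper below is a hypothetical convenience script, not part of the repository: it only assembles the flags shown above (`--model`, `--replays`, `--mode`, `--games`) and prints the resulting commands without executing them. The name `belief_eval_command` and the default paths are assumptions copied from the example invocation.

```python
import shlex

# The seven modes listed above.
MODES = ["offline", "match", "diagnose", "scenario", "per_trick", "ablation", "ensemble"]


def belief_eval_command(mode, model="models/belief_net_v2.bin",
                        replays="data/training/games_500k.bin", games=5000):
    """Return the argv list for one belief_eval run (hypothetical helper)."""
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {MODES}")
    return [
        "cargo", "run", "--bin", "belief_eval", "--release", "--",
        "--model", model,          # for ensemble, pass "m1.bin,m2.bin"
        "--replays", replays,
        "--mode", mode,
        "--games", str(games),
    ]


if __name__ == "__main__":
    # Dry run: print each command so it can be inspected or piped to a shell.
    for mode in MODES:
        print(shlex.join(belief_eval_command(mode)))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute a command; the dry-run form is kept here so the sketch has no side effects.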
# Rust training (candle, ~474 steps/s on 4090)
cargo run -p colver-core --bin train_dmc --features dmc_train --release -- \
--num-envs 256 --steps 35000000 \
--bid-model models/bid_nn_final.bin \
--nn-bid-start 0.75 --nn-bid-end 0.95 --nn-bid-anneal-steps 20000000 \
--eval-freq 1000000 \
--eval-random-matches 100 \
--eval-isdd-matches 10 --eval-isdd-time-ms 20 \
--eval-checkpoint models/dmc_35.bin --eval-checkpoint-matches 50
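
The `--nn-bid-start`, `--nn-bid-end`, and `--nn-bid-anneal-steps` flags describe a value that moves from 0.75 to 0.95 over the first 20M steps. Neither the interpolation shape nor the precise meaning of the value is stated here. The sketch below assumes a linear ramp that holds at the end value afterwards, and the function name `nn_bid_fraction` is made up for illustration.

```python
def nn_bid_fraction(step, start=0.75, end=0.95, anneal_steps=20_000_000):
    """Annealed value at a given training step.

    Assumption: linear interpolation from `start` to `end` over
    `anneal_steps`, then constant. The real trainer may differ.
    """
    # Clamp progress to [0, 1] so steps past the anneal window stay at `end`.
    t = min(max(step / anneal_steps, 0.0), 1.0)
    return start + (end - start) * t


if __name__ == "__main__":
    for step in (0, 10_000_000, 20_000_000, 35_000_000):
        print(f"step {step:>10,}: {nn_bid_fraction(step):.3f}")
```

Under this assumption, the last 15M of the 35M steps in the command above would run at the end value.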
# Python training (slower, ~140 steps/s)
PYTHONPATH=scripts/training uv run python scripts/training/train_dmc.py --num-envs 256 --steps 20000000
# Evaluation
uv run python scripts/analysis/eval_dmc.py models/dmc_final.pt \
--games 200 --baseline smart --time-ms 20 --both-sides
# Export PyTorch weights to Rust binary format
python scripts/export/export_dmc_weights.py models/dmc_final.pt models/dmc_final.bin

# Bid a Dede (v2, default): 3×512, DD solver + 24× suit augmentation
cargo run -p colver-core --bin train_bid_nn --features dmc_train --release -- \
--hidden 512 --layers 3 --steps 20000000 --pool-file data/pools/dd_2.5M.bin
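
The 24× factor matches the number of ways to relabel four suits (4! = 24). How `train_bid_nn` applies the augmentation internally is not shown here, so the following is only a sketch of the idea: every permutation of suit labels is applied consistently to the hand and to the trump suit. The suit letters, the `(rank, suit)` tuple layout, and the function name `augment_deal` are assumptions for illustration.

```python
from itertools import permutations

SUITS = ("S", "H", "D", "C")  # hypothetical suit labels


def augment_deal(hand, trump):
    """Yield every suit-relabelled copy of (hand, trump).

    `hand` is a list of (rank, suit) tuples and `trump` is a suit label.
    The same relabelling is applied to both, so the strategic content of
    the deal is unchanged.
    """
    for perm in permutations(SUITS):
        relabel = dict(zip(SUITS, perm))
        new_hand = [(rank, relabel[suit]) for rank, suit in hand]
        yield new_hand, relabel[trump]


if __name__ == "__main__":
    hand = [("J", "S"), ("9", "S"), ("A", "H"), ("10", "D"), ("K", "C")]
    variants = list(augment_deal(hand, trump="S"))
    print(len(variants))
```

Each solved deal then contributes 24 training samples instead of one, which is why a pre-solved pool stretches further than its raw size suggests.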
# Bid a Doudou (v1, legacy): 2×256, DouZero self-play
cargo run -p colver-core --bin train_bid_nn --features dmc_train --release -- \
--num-envs 64 --steps 5000000 --pool-size 1000000

Phase 1: pre-solves deal pool (1M deals x 4 suits). Phase 2: trains Dueling DQN with PER + opponent diversity. BidNet::load auto-detects hidden size (tries 256, 512, 1024).
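
For readers unfamiliar with the Dueling DQN head mentioned above, the standard formulation splits the network output into a state value V(s) and per-action advantages A(s, a), then recombines them with the mean advantage subtracted. The sketch below shows only that textbook aggregation step in plain Python. Whether `train_bid_nn` uses mean or max centering is not stated here, and the function name is hypothetical.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) using mean-advantage centering.

    Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')

    Subtracting the mean makes the split identifiable: the average of the
    returned Q-values always equals `state_value`.
    """
    if not advantages:
        raise ValueError("advantages must contain at least one action")
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


if __name__ == "__main__":
    # Three hypothetical bid actions with a shared state value.
    print(dueling_q_values(10.0, [1.0, 2.0, 3.0]))
```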
# Generate training data
cargo run -p colver-core --bin generate_value_data --release --features nn -- \
10000 data/training/value_train.bin --fast
# Train (PyTorch)
python scripts/training/train_value_net.py --data data/training/value_train.bin --output models/value_net.bin
# Evaluate
cargo run -p colver-core --bin nn_experiment --release --features nn -- \
models/value_net.bin 50 --data data/training/value_train.bin

Published as colver via GitHub Actions with trusted publishing.
git tag v0.2.1 && git push origin v0.2.1
# Builds wheels for: x86_64-linux, aarch64-linux, x86_64-macos, aarch64-macos, x86_64-windows

All binaries: `cargo run -p colver-core --bin NAME --release -- ARGS`

| Binary | Feature | Description |
|---|---|---|
| `bench` | — | Performance benchmark (~1.3M rollouts/sec) |
| `mcts_demo` | `rand` | MCTS vs random demo |
| `smart_ismcts_demo` | `rand` | Smart IS-MCTS vs random + vs naive |
| `oracle_experiment` | `rand` | Bid achievability with perfect-info MCTS |
| `bidding_experiment` | `rand` | Head-to-head bidding strategies |
| `match_experiment` | `rand` | Full match play (first to 2000 pts) |
| `bid_tournament` | `rand` | Round-robin parameterized bidding |
| `bid_debug` | `rand` | Side-by-side bidding printout |
| `bid_compare` | `rand` | Bidding comparison with DD oracle |
| `bid_nn_eval` | `rand` | Evaluate bid NN vs heuristic bidders |
| `bid_nn_tournament` | `rand` | NN bid round-robin across play methods |
| `strength_experiment` | `rand` | Rollout policy comparison, D×I sweep |
| `maxi_diagnose` | `rand` | Maxi vs DMC play-by-play diagnostic |
| `v2_tournament` | `rand` | V2 bidding fine-tune tournament |
| `dd_bench` | — | DD solver benchmark |
| `dd_calibrate` | `rand` | DD bidding calibration |
| `isdd_sweep` | `rand` | IS-DD parameter sweep (count/time/soft) |
| `generate_belief_data` | `rand` | Belief training data (COLVBL01 format) |
| `generate_game_data` | `rand` | Game replays (COLVGM01 format) |
| `generate_value_data` | `nn` | NN value function training data |
| `train_belief_net` | `dmc_train` | Belief network training |
| `train_bid_nn` | `dmc_train` | NN bidding training |
| `train_dmc` | `dmc_train` | DMC card play training |
| `belief_eval` | `rand` | Belief network evaluation (7 modes) |
| `nn_experiment` | `nn` | NN value function evaluation |

Recommended web configs based on sweep (200 deals, vs DouDou35 DMC play model):
- 20ms time-limited + soft inference: ~48% vs DouDou35, ~230ms/deal
- 50ms time-limited: ~57%, 515ms/deal (higher quality)
- Gains plateau sharply after D=8 determinizations
- Soft inference is worth it at D≥16 (+3.5% for 7% more compute)
Note: The default play model is now DouDou50 (411-dim canonical ResNet, 50M steps). DouDou35 (415-dim legacy, 35M steps) remains available for backward compatibility.