Skip to content

SaridakisStamatisChristos/sudoku_dlx

sudoku_dlx

Fast Sudoku solver & generator using Algorithm X / Dancing Links (DLX) with Python bitsets and an incremental cover model. Exposes timing & search stats for reproducible benchmarking. Runs in the browser via Pyodide (demo linked below).

© 2025 Stamatis-Christos Saridakis — MIT. Core algorithm: exact cover (Knuth). This implementation is original and bitset-based.

CI Docs Codecov License: MIT PyPI version GitHub Pages

Features

  • Exact-cover Sudoku with DLX (no pointer structs; compact bitset encoding).
  • Stats exposed: elapsed ms, node visits, backtracks.
  • Generator: deterministic with --seed; removes clues while preserving unique solvability near a target clue count.
  • Difficulty rater: quick heuristic in [0, 10] from givens + search effort.
  • CLI (sudoku-dlx) + typed API + tests + CI + web demo.

Note: Current generator aims for unique puzzles near target_givens; minimality and symmetry options are planned (see Roadmap).

Install (dev)

git clone https://github.com/SaridakisStamatisChristos/sudoku_dlx.git
cd sudoku_dlx
python -m venv .venv && source .venv/bin/activate  # Windows: .\.venv\Scripts\activate
pip install -e ".[dev]"
pre-commit install
pytest -q

Difficulty (v2)

Deterministic score in [0,10] combining:

  • empties (gaps), DLX nodes, backtracks, and fill pressure. Invariant under isomorphisms; avoids timing for stability.

CLI

sudoku-dlx --help

# Solve (single grid as 81 chars; '.' or '0' for blanks)
sudoku-dlx solve --grid "..3.2.6..9..3.5..1..18.64..81.29..7....8....67..82.5......."
sudoku-dlx solve --file puzzles.txt               # a file with 9 lines of 9 chars
sudoku-dlx solve --grid "<81chars>" --pretty --stats

# Rate difficulty (0..10)
sudoku-dlx rate  --grid "<81chars>"

# Canonicalize (dedupe isomorphic puzzles)
sudoku-dlx canon --grid "<81chars>"  # D4 × bands/stacks × inner row/col × digit relabel
# Produces a stable 81-char string for deduping datasets.

# Batch tools
sudoku-dlx gen-batch --out puzzles.txt --count 1000 --givens 30 --symmetry rot180 --minimal
sudoku-dlx rate-file --in puzzles.txt --csv ratings.csv
python bench/bench_file.py --in puzzles.txt

## Batch controls & parallel
Generate with bounds and multiple processes:
```bash
sudoku-dlx gen-batch --out puzzles.txt --count 1000 --givens 30 \
  --min-givens 28 --max-givens 40 --parallel 8

JSON output for ratings & sampling in stats:

sudoku-dlx rate-file  --in puzzles.txt --json > scores.ndjson
sudoku-dlx stats-file --in puzzles.txt --limit 5000 --sample 1000 --json stats.json

Dedupe a file of puzzles (fast)

sudoku-dlx dedupe --in puzzles.txt --out unique.txt

Generate a unique puzzle (deterministic with seed)

sudoku-dlx gen --seed 123 --givens 30 # ~target clue count (approx) sudoku-dlx gen --seed 123 --givens 30 --pretty

Analyze (valid/solvable/unique/difficulty/stats/canonical)

sudoku-dlx check --grid "<81chars>" sudoku-dlx check --grid "<81chars>" --json > report.json

Explain (human-style)

sudoku-dlx explain --grid "<81chars>" --json

Strategies include:

- singles: naked single, hidden single (row/col/box)

- locked candidates: pointing (box→line), claiming (line→box)

- pairs: naked pair, hidden pair (row/col/box)

- triples: naked triple, hidden triple (row/col/box)

- fish: X-Wing (rows & columns)

- advanced: Swordfish (rows & columns), Simple Coloring (Rule 2)

Deterministic steps for reproducible tutorials.

Dataset stats

sudoku-dlx stats-file --in puzzles.txt --json stats.json --csv diff_hist.csv

prints a compact JSON summary to stdout and writes optional files using v2 difficulty:

{

"count": 1000, "valid_pct": 100.0, "solvable_pct": 100.0, "unique_pct": 100.0,

"givens_mean": 29.4, "difficulty_mean": 4.2, "difficulty_p90": 6.8, ...

}

Advanced generator flags:

Minimal & symmetry (slower; strict guarantee)

sudoku-dlx gen --seed 123 --givens 28 --minimal sudoku-dlx gen --seed 123 --givens 28 --minimal --symmetry rot180

Trace & Visualize

sudoku-dlx solve --grid "<81chars>" --trace out.json

Open web/visualizer.html and load out.json


## Formats & batch explain
Convert between txt/csv/jsonl:
```bash
sudoku-dlx convert --in puzzles.txt --out puzzles.csv

Explain many puzzles to NDJSON:

sudoku-dlx explain-file --in puzzles.txt --out steps.ndjson --max-steps 200

Export to DIMACS CNF:

sudoku-dlx to-cnf --grid "<81chars>" --out puzzle.cnf

Cross-check with SAT (optional)

Install the optional extra:

pip install -e ".[sat]"

Then verify solutions using an independent SAT solver:

sudoku-dlx solve --grid "<81chars>" --crosscheck sat

What this gives you

Strict minimality: after generation, removing any single clue breaks uniqueness.

Symmetry enforcement: rot180 removals are paired; mix keeps pairs adjacent but allows singles too.

CI-safe tests: fast settings, strong assertions.

Library (typed API)

from sudoku_dlx import from_string, to_string, is_valid, solve, generate, rate

# Parse and validate
g = from_string("53..7....6..195....98....6.8...6...34..8.3..17...2...6.6....28....419..5....8..79")
assert is_valid(g)

# Solve with stats
res = solve(g)
if res is not None:
    print("Solved (ms, nodes, backtracks):", res.stats.ms, res.stats.nodes, res.stats.backtracks)
    print("Solution (81 chars):", to_string(res.grid))

# Generate a unique puzzle near a target clue count
p = generate(seed=42, target_givens=30)

# Difficulty score in [0,10]
print("difficulty:", rate(p))

Stats

solve() returns SolveResult(stats=Stats(ms, nodes, backtracks)). These are also used by the rater.

Difficulty rating

from sudoku_dlx import rate, from_string
g = from_string("53..7....6..195...." + "."*63)
print(rate(g))  # e.g., 3.8

Roadmap

  • Minimality guarantee and symmetry knobs for the generator
  • Isomorph class canonicalization (reject equivalent puzzles)
  • Optional DLX step recorder + visualizer
  • MUS/MCS-style minimality certificates

License

MIT — see LICENSE.

Publish to PyPI

This repository is already structured as a Python package (src layout, metadata in pyproject.toml). To publish a new version on PyPI:

  1. Update pyproject.toml with the new version and adjust the changelog/release notes.

  2. Make sure the build backend is installed, then build the distribution artifacts:

    python -m pip install --upgrade build twine
    python -m build  # creates dist/*.tar.gz and dist/*.whl
  3. Upload the artifacts with Twine:

    python -m twine upload dist/*
  4. Tag the release in Git and push the tag so GitHub releases stay in sync.

The CI workflow already runs tests against multiple Python versions and uploads coverage reports to Codecov; the pages workflow deploys the static demo from the web/ directory.

Automated releases (no auto-PyPI)

On pushing a tag like v0.2.1, GitHub Actions will:

  • run tests against 3.10–3.12,
  • build wheels/sdist, and
  • attach artifacts to the GitHub Release (no PyPI upload).

Release gates:

  • twine check validates the built metadata.
  • tag vX.Y.Z must equal sudoku_dlx.__version__ (build fails if not).
  • installs the wheel and imports the package before attaching.

Manual publish (when you’re ready)

  1. Create a token on PyPI (or TestPyPI).
  2. Add a repo secret:
    • Settings → Secrets and variables → Actions
    • New secrets: PYPI_API_TOKEN (and/or TEST_PYPI_API_TOKEN)
  3. From Actions tab, run publish:
    • Choose pypi or testpypi
    • Optionally set ref (leave blank to use default branch HEAD)
  4. The workflow builds and uploads the current code to the chosen index.