sudoku_dlx

Fast Sudoku solver & generator using Algorithm X / Dancing Links (DLX) with Python bitsets and an incremental cover model. Exposes timing & search stats for reproducible benchmarking. Runs in the browser via Pyodide (demo linked below).

© 2025 Stamatis-Christos Saridakis — MIT. Core algorithm: exact cover (Knuth). This implementation is original and bitset-based.

Features

Exact-cover Sudoku with DLX (no pointer structs; compact bitset encoding).
Stats exposed: elapsed ms, node visits, backtracks.
Generator: deterministic with --seed; removes clues while preserving unique solvability near a target clue count.
Difficulty rater: quick heuristic in [0, 10] from givens + search effort.
CLI (sudoku-dlx) + typed API + tests + CI + web demo.

Note: Current generator aims for unique puzzles near target_givens; minimality and symmetry options are planned (see Roadmap).

Install (dev)

git clone https://github.com/SaridakisStamatisChristos/sudoku_dlx.git
cd sudoku_dlx
python -m venv .venv && source .venv/bin/activate  # Windows: .\.venv\Scripts\activate
pip install -e ".[dev]"
pre-commit install
pytest -q

Difficulty (v2)

Deterministic score in [0,10] combining:

empties (gaps), DLX nodes, backtracks, and fill pressure. Invariant under isomorphisms; avoids timing for stability.

CLI

sudoku-dlx --help

# Solve (single grid as 81 chars; '.' or '0' for blanks)
sudoku-dlx solve --grid "..3.2.6..9..3.5..1..18.64..81.29..7....8....67..82.5......."
sudoku-dlx solve --file puzzles.txt               # a file with 9 lines of 9 chars
sudoku-dlx solve --grid "<81chars>" --pretty --stats

# Rate difficulty (0..10)
sudoku-dlx rate  --grid "<81chars>"

# Canonicalize (dedupe isomorphic puzzles)
sudoku-dlx canon --grid "<81chars>"  # D4 × bands/stacks × inner row/col × digit relabel
# Produces a stable 81-char string for deduping datasets.

# Batch tools
sudoku-dlx gen-batch --out puzzles.txt --count 1000 --givens 30 --symmetry rot180 --minimal
sudoku-dlx rate-file --in puzzles.txt --csv ratings.csv
python bench/bench_file.py --in puzzles.txt

## Batch controls & parallel
Generate with bounds and multiple processes:
```bash
sudoku-dlx gen-batch --out puzzles.txt --count 1000 --givens 30 \
  --min-givens 28 --max-givens 40 --parallel 8

JSON output for ratings & sampling in stats:

sudoku-dlx rate-file  --in puzzles.txt --json > scores.ndjson
sudoku-dlx stats-file --in puzzles.txt --limit 5000 --sample 1000 --json stats.json

Dedupe a file of puzzles (fast)

sudoku-dlx dedupe --in puzzles.txt --out unique.txt

Generate a unique puzzle (deterministic with seed)

sudoku-dlx gen --seed 123 --givens 30 # ~target clue count (approx) sudoku-dlx gen --seed 123 --givens 30 --pretty

Analyze (valid/solvable/unique/difficulty/stats/canonical)

sudoku-dlx check --grid "<81chars>" sudoku-dlx check --grid "<81chars>" --json > report.json

Explain (human-style)

sudoku-dlx explain --grid "<81chars>" --json

Strategies include:

- singles: naked single, hidden single (row/col/box)

- locked candidates: pointing (box→line), claiming (line→box)

- pairs: naked pair, hidden pair (row/col/box)

- triples: naked triple, hidden triple (row/col/box)

- fish: X-Wing (rows & columns)

- advanced: Swordfish (rows & columns), Simple Coloring (Rule 2)

Deterministic steps for reproducible tutorials.

Dataset stats

sudoku-dlx stats-file --in puzzles.txt --json stats.json --csv diff_hist.csv

prints a compact JSON summary to stdout and writes optional files using v2 difficulty:

{

"count": 1000, "valid_pct": 100.0, "solvable_pct": 100.0, "unique_pct": 100.0,

"givens_mean": 29.4, "difficulty_mean": 4.2, "difficulty_p90": 6.8, ...

}

Advanced generator flags:

Minimal & symmetry (slower; strict guarantee)

sudoku-dlx gen --seed 123 --givens 28 --minimal sudoku-dlx gen --seed 123 --givens 28 --minimal --symmetry rot180

Trace & Visualize

sudoku-dlx solve --grid "<81chars>" --trace out.json

Open web/visualizer.html and load out.json


## Formats & batch explain
Convert between txt/csv/jsonl:
```bash
sudoku-dlx convert --in puzzles.txt --out puzzles.csv

Explain many puzzles to NDJSON:

sudoku-dlx explain-file --in puzzles.txt --out steps.ndjson --max-steps 200

Export to DIMACS CNF:

sudoku-dlx to-cnf --grid "<81chars>" --out puzzle.cnf

Cross-check with SAT (optional)

Install the optional extra:

pip install -e ".[sat]"

Then verify solutions using an independent SAT solver:

sudoku-dlx solve --grid "<81chars>" --crosscheck sat

What this gives you

Strict minimality: after generation, removing any single clue breaks uniqueness.

Symmetry enforcement: rot180 removals are paired; mix keeps pairs adjacent but allows singles too.

CI-safe tests: fast settings, strong assertions.

Library (typed API)

from sudoku_dlx import from_string, to_string, is_valid, solve, generate, rate

# Parse and validate
g = from_string("53..7....6..195....98....6.8...6...34..8.3..17...2...6.6....28....419..5....8..79")
assert is_valid(g)

# Solve with stats
res = solve(g)
if res is not None:
    print("Solved (ms, nodes, backtracks):", res.stats.ms, res.stats.nodes, res.stats.backtracks)
    print("Solution (81 chars):", to_string(res.grid))

# Generate a unique puzzle near a target clue count
p = generate(seed=42, target_givens=30)

# Difficulty score in [0,10]
print("difficulty:", rate(p))

Stats

solve() returns SolveResult(stats=Stats(ms, nodes, backtracks)). These are also used by the rater.

Difficulty rating

from sudoku_dlx import rate, from_string
g = from_string("53..7....6..195...." + "."*63)
print(rate(g))  # e.g., 3.8

Roadmap

Minimality guarantee and symmetry knobs for the generator
Isomorph class canonicalization (reject equivalent puzzles)
Optional DLX step recorder + visualizer
MUS/MCS-style minimality certificates

License

MIT — see LICENSE.

Publish to PyPI

This repository is already structured as a Python package (src layout, metadata in pyproject.toml). To publish a new version on PyPI:

Update pyproject.toml with the new version and adjust the changelog/release notes.

Make sure the build backend is installed, then build the distribution artifacts:

python -m pip install --upgrade build twine
python -m build  # creates dist/*.tar.gz and dist/*.whl

Upload the artifacts with Twine:
```
python -m twine upload dist/*
```
Tag the release in Git and push the tag so GitHub releases stay in sync.

The CI workflow already runs tests against multiple Python versions and uploads coverage reports to Codecov; the pages workflow deploys the static demo from the web/ directory.

Automated releases (no auto-PyPI)

On pushing a tag like v0.2.1, GitHub Actions will:

run tests against 3.10–3.12,
build wheels/sdist, and
attach artifacts to the GitHub Release (no PyPI upload).

Release gates:

twine check validates the built metadata.
tag vX.Y.Z must equal sudoku_dlx.__version__ (build fails if not).
installs the wheel and imports the package before attaching.

Manual publish (when you’re ready)

Create a token on PyPI (or TestPyPI).
Add a repo secret:
- Settings → Secrets and variables → Actions
- New secrets: PYPI_API_TOKEN (and/or TEST_PYPI_API_TOKEN)
From Actions tab, run publish:
- Choose pypi or testpypi
- Optionally set ref (leave blank to use default branch HEAD)
The workflow builds and uploads the current code to the chosen index.

Name		Name	Last commit message	Last commit date
Latest commit History 122 Commits
.github		.github
bench		bench
docs		docs
notebooks		notebooks
scripts		scripts
src/sudoku_dlx		src/sudoku_dlx
tests		tests
tools		tools
web		web
CHANGELOG.md		CHANGELOG.md
CITATION.cff		CITATION.cff
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.md		README.md
SECURITY.md		SECURITY.md
codecov.yml		codecov.yml
mkdocs.yml		mkdocs.yml
mypy.ini		mypy.ini
pyproject.toml		pyproject.toml
pytest.ini		pytest.ini

License

SaridakisStamatisChristos/sudoku_dlx

Folders and files

Latest commit

History

Repository files navigation

sudoku_dlx

Features

Install (dev)

Difficulty (v2)

CLI

Dedupe a file of puzzles (fast)

Generate a unique puzzle (deterministic with seed)

Analyze (valid/solvable/unique/difficulty/stats/canonical)

Explain (human-style)

Strategies include:

- singles: naked single, hidden single (row/col/box)

- locked candidates: pointing (box→line), claiming (line→box)

- pairs: naked pair, hidden pair (row/col/box)

- triples: naked triple, hidden triple (row/col/box)

- fish: X-Wing (rows & columns)

- advanced: Swordfish (rows & columns), Simple Coloring (Rule 2)

Deterministic steps for reproducible tutorials.

Dataset stats

prints a compact JSON summary to stdout and writes optional files using v2 difficulty:

{

"count": 1000, "valid_pct": 100.0, "solvable_pct": 100.0, "unique_pct": 100.0,

"givens_mean": 29.4, "difficulty_mean": 4.2, "difficulty_p90": 6.8, ...

}

Advanced generator flags:

Minimal & symmetry (slower; strict guarantee)

Trace & Visualize

Open web/visualizer.html and load out.json

Cross-check with SAT (optional)

Library (typed API)

Stats

Difficulty rating

Roadmap

License

Publish to PyPI

Automated releases (no auto-PyPI)

Manual publish (when you’re ready)

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages