A UCI-compatible chess engine written in Rust.
Ferrite implements the core ideas behind modern computer chess: bitboard representation, magic bitboard move generation, alpha-beta search with pruning heuristics, PeSTO tapered evaluation, and Syzygy endgame tablebase support. It communicates via the Universal Chess Interface (UCI) protocol, making it compatible with any standard chess GUI.
- Bitboard representation β 8 bitboards (6 piece types + 2 colors) encode the entire position using CPU-native
u64operations - Magic bitboards β O(1) slider attack lookups via precomputed hash tables with collision-free magic numbers
- 16-bit move encoding β compact
ChessMove(u16)for cache-friendly move lists and single-integer comparison - Zobrist hashing β O(1) incremental hash updates for transposition table and repetition detection
- PeSTO tapered evaluation β separate midgame/endgame piece-square tables blended by game phase
- Iterative deepening negamax β alpha-beta pruning with time-controlled deepening
- Null Move Pruning (NMP) β skip-turn heuristic with zugzwang guard
- Late Move Reductions (LMR) β search late quiet moves at reduced depth first
- Quiescence search β resolve captures at leaf nodes to avoid the horizon effect
- Move ordering β hash move, MVV-LVA captures, killer moves, history heuristic
- Transposition table β Zobrist-indexed with depth-preferred replacement and aging
- Syzygy endgame tablebases β perfect play for positions with 5 or fewer pieces
- BK tactical test suite β 24-position test suite with EPD parser and SAN converter
- Rust toolchain (edition 2024)
git clone https://github.com/yourusername/ferrite.git
cd ferrite
# Debug build
make build
# Optimized release build (LTO + single codegen unit)
make release
# Run the engine (release mode)
make run| Target | Command | Description |
|---|---|---|
build |
cargo build |
Debug build |
release |
cargo build --release |
Optimized release build |
run |
cargo run --release |
Run engine in UCI mode |
test |
cargo test |
Run all unit + integration tests |
bench |
cargo bench --bench ... |
Run Criterion benchmarks |
clippy |
cargo clippy -- -D warnings |
Lint with Clippy |
fmt |
cargo fmt |
Format code |
fmt-check |
cargo fmt -- --check |
Check formatting |
clean |
cargo clean |
Remove build artifacts |
ci |
fmt-check clippy test bench |
Full CI pipeline |
Ferrite speaks UCI. Point any UCI-compatible GUI at the compiled binary:
- Arena β Engines > Install New Engine > select
target/release/ferrite - CuteChess β Tools > Settings > Engines > Add > set command to the binary path
- Lichess (via lichess-bot) β configure
engine.dirandengine.nameinconfig.yml
UCI "go" command
β
βΌ
ββββββββββββββββββββββββ
β Iterative Deepening β βββ depth 1, 2, 3, ...
β (search.rs:125) β each iteration warms TT for next
ββββββββββ¬ββββββββββββββ
β
βΌ
ββββββββββββββββββββββββ ββββββββββββββββ
β Negamax + Ξ±-Ξ² ββββββΊβ TT Probe β hit? β return stored score
β (search.rs:190) β ββββββββββββββββ
β β ββββββββββββββββ
β ββββββΊβ Syzygy β β€5 pieces? β perfect score
β β ββββββββββββββββ
β β ββββββββββββββββ
β if can_null && ββββββΊβ Null Move β skip turn; still β₯ Ξ²? β prune
β depth β₯ 3 β β Pruning β
β β ββββββββββββββββ
β β ββββββββββββββββ
β for each move: ββββββΊβ Move Order β hash β MVV-LVA β killers
β β β β β history β quiet
β if late + quiet: β ββββββββββββββββ
β LMR (depth-2) β
β re-search if >Ξ± β
ββββββββββ¬ββββββββββββββ
β depth == 0
βΌ
ββββββββββββββββββββββββ
β Quiescence Search β βββ resolve captures to avoid horizon effect
β (search.rs:374) β stand-pat + capture-only search
ββββββββββ¬ββββββββββββββ
β
βΌ
ββββββββββββββββββββββββ
β PeSTO Evaluation β βββ tapered MG/EG piece-square tables
β (evaluation.rs:32) β
ββββββββββββββββββββββββ
UCI command βββΊ parse position/FEN βββΊ Board struct (bitboards + Zobrist)
β
"go depth N" βββΊ SearchState.reset() βββΊ iterative_deepening(1..N)
β
βΌ
negamax(board, depth, Ξ±, Ξ²)
β
βΌ
"bestmove e2e4" βββΊ UCI output
A chess board has 64 squares. A u64 has 64 bits. This is not a coincidence β it's the foundation of every competitive chess engine.
BitBoard(u64) (src/board/bitboard.rs) wraps a single u64 where each bit represents a square. Setting bit 0 means "A1 is in this set", bit 63 means "H8 is in this set". This is the LERF (Little-Endian Rank-File) mapping:
A1=0, B1=1, ... H1=7
A2=8, B2=9, ... H2=15
...
A8=56, B8=57, ... H8=63
From any square index: rank = index >> 3, file = index & 7. Shifting left by 8 moves "one rank up" β a natural fit for pawn pushes.
Why this matters: Questions like "Where can this knight move?" become a single bitwise AND:
let targets = KNIGHT_ATTACKS[sq] & !own_pieces; // one AND + one NOT on u64sThe engine uses 8 bitboards total: 6 piece types (pawn, knight, bishop, rook, queen, king) + 2 colors (white, black). Any positional query is a combination of bitwise operations on these.
Brian Kernighan's bit trick powers the iterator β clearing the lowest set bit in O(1):
let idx = self.0.trailing_zeros() as u8; // find lowest set bit
self.0 &= self.0 - 1; // clear itThis iterates over N set bits in exactly N steps, with no branches beyond the loop termination check.
ChessMove(u16) (src/board/chessmove.rs) packs a full chess move into 16 bits:
Bit layout: src(6) | dst(6) | promo(2) | is_promo(1) | reserved(1)
bits 0..5: source square (0-63)
bits 6..11: destination square (0-63)
bits 12..13: promotion piece (0=Knight, 1=Bishop, 2=Rook, 3=Queen)
bit 14: is_promotion flag
bit 15: reserved
Why 16 bits? Two reasons:
- Cache efficiency β A move list of 256 moves fits in 512 bytes (< 8 cache lines). With 32-bit moves, that doubles. In a search that examines millions of positions, this matters.
- Integer comparison β Equality checks (
a == b) compile to a singlecmpinstruction. No field-by-field comparison needed.
The Copy trait is derived, so moves are passed by value with zero overhead β they're just integers.
(src/board/attacks.rs, src/board/magic.rs)
Leaper attacks (knight, king, pawn) are simple: precompute a 64-entry array for each piece type at startup. KNIGHT_ATTACKS[sq] gives the attack bitboard for a knight on square sq regardless of board state. Knight and king attacks depend only on the piece's square, not on other pieces.
Slider attacks (bishop, rook, queen) depend on which squares between the slider and the board edge are occupied by other pieces ("blockers"). A naive implementation ray-traces in each direction until hitting a blocker β O(7) operations per direction. Magic bitboards reduce this to O(1) via a precomputed hash table:
index = (occupied & mask).wrapping_mul(magic) >> shift
attacks = TABLE[offset + index]
The key insight: for a rook on a given square, only the pieces between the rook and the edge matter. Pieces on the edge itself don't change the attack (the ray ends there regardless). Excluding edge squares from the occupancy mask reduces the number of relevant bits β for example, a rook in the center has 10 relevant bits (not 14), giving a table of 1,024 entries instead of 16,384.
Finding magic numbers uses the carry-rippler trick to enumerate all 2^N subsets of a mask (occ = (occ - mask) & mask), then tries sparse random candidates from an XorShift64 PRNG until finding one that produces zero collisions:
let magic = rng.sparse_random(); // rng.next() & rng.next() & rng.next()
// Sparse numbers (few bits set) work better as magic multipliersThe PRNG uses a fixed seed (0x12345678_9ABCDEF0) for deterministic initialization β every run of the engine produces the same magic numbers, making debugging reproducible.
Zobrist hashing assigns a random 64-bit key to every (piece, color, square) combination, plus keys for side-to-move, castling rights, and en passant file. A position's hash is the XOR of all applicable keys.
Why XOR? It's its own inverse: hash ^= key; hash ^= key; restores the original hash. This means:
- Making a move = XOR out the piece from source square, XOR it in at destination, XOR the side key. O(1) per move, regardless of board complexity.
- Unmaking a move = apply the same XORs (self-inverse).
Components: piece[6][2][64] + side + castling[16] + ep[8] = 781 random keys, generated by a XorShift64 PRNG with fixed seed 0x3243F6A8885A308D for determinism. Keys are lazily initialized via LazyLock.
(src/evaluation.rs, src/pst.rs)
The evaluation function uses the PeSTO (Piece-Square Tables Only) approach β a well-known evaluation framework that assigns each piece a positional bonus/penalty depending on which square it occupies.
Material values differ between midgame and endgame:
| Piece | Midgame | Endgame |
|---|---|---|
| Pawn | 82 | 94 |
| Knight | 337 | 281 |
| Bishop | 365 | 297 |
| Rook | 477 | 512 |
| Queen | 1025 | 936 |
Piece-square tables provide 64-entry bonus/penalty arrays for each piece type, separately for midgame and endgame. For example, knights are rewarded for being centralized in the midgame, while kings are penalized for leaving the back rank.
Tapered evaluation blends the two:
phase = sum of PHASE_WEIGHT[piece] for all pieces on board
= 0*pawns + 1*knights + 1*bishops + 2*rooks + 4*queens
score = (mg_score * phase + eg_score * (24 - phase)) / 24
With all minor/major pieces present, phase = 24 (full midgame). As pieces are traded, phase decreases toward 0 (pure endgame). This elegantly handles the transition: king safety dominates in the midgame, while king activity and pawn structure matter in the endgame.
PST indexing quirk: PeSTO tables store values with a8=index 0, but Ferrite uses A1=0 (LERF). The fix: White reads table[sq ^ 56] (flips rank), Black reads table[sq] directly.
The search starts at depth 1 and increases: 1, 2, 3, ... up to max_depth. This seems wasteful, but has two critical benefits:
- Time control β The last completed iteration's result is always valid. If time runs out mid-iteration, we use the previous result.
- TT warmup β Each iteration populates the transposition table, making the next iteration dramatically faster (TT hits provide instant score lookups).
A soft time limit (50% of allocated time) prevents starting an iteration that likely won't finish.
Negamax is a simplification of minimax: instead of alternating between maximizing and minimizing, always maximize from the current player's perspective by negating the child's score.
Alpha-beta pruning skips branches that cannot possibly improve the result. If we've found a move scoring 5, and a sibling branch already guarantees our opponent can force a score of 3 in a different line, we don't need to explore that branch further (beta cutoff).
"If I skip my turn and my position is still great, then with a real move it must be even better."
The engine makes a "null move" (passes the turn) and searches at reduced depth (depth - 3). If the reduced search still returns a score >= beta, we assume the real search would too, and prune the whole subtree.
Zugzwang guard: NMP is disabled when the side to move has only pawns and a king (king + pawns positions are the most common zugzwang scenarios where being forced to move is a disadvantage).
Moves are ordered so that the best-looking ones come first. Moves later in the list are statistically less likely to be good. LMR exploits this:
- After searching the first 3 moves at full depth, subsequent quiet (non-capture, non-check) moves are searched at depth - 2 first.
- If a reduced-depth search finds a score above alpha, the move is re-searched at full depth.
- Captures, checks, killer moves, and moves while in check are never reduced.
This typically reduces the search tree by 30-50% with minimal impact on playing strength.
At leaf nodes (depth 0), simply evaluating the position can be misleading β what if we're about to lose a queen on the next move? This is the horizon effect.
Quiescence search extends the search by examining all captures (and all evasions when in check) until the position is "quiet." The stand-pat heuristic uses the static evaluation as a lower bound: if the position is already good enough, we don't need to search further captures.
Alpha-beta pruning is most effective when the best move is searched first. Move ordering is therefore critical to search performance. The priority hierarchy:
| Priority | Source | Score | Rationale |
|---|---|---|---|
| 1 | Hash move (TT) | 100,000 | Best move from previous search of this position |
| 2 | Captures (MVV-LVA) | 10,000+ | Most Valuable Victim, Least Valuable Attacker |
| 3 | Promotions | 9,000 | Creating a new queen is almost always good |
| 4 | Killer move #1 | 8,000 | Quiet move that caused a beta cutoff at this ply |
| 5 | Killer move #2 | 7,000 | Second-best quiet cutoff move at this ply |
| 6 | History score | 0-16,384 | Quiet moves that frequently cause cutoffs |
| 7 | Other quiet | 0 | Remaining moves |
MVV-LVA (Most Valuable Victim, Least Valuable Attacker): Captures are scored by victim_value * 10 - attacker_index. Capturing a queen with a pawn (QxP: 90010 - 0 = 9000) scores higher than capturing a pawn with a queen (PxQ: 10010 - 4 = 996). This encourages winning captures and penalizes trades that lose material.
Killer heuristic: Two slots per ply store quiet moves that caused beta cutoffs. When searching a sibling position at the same depth, these "killer moves" are tried before other quiet moves. The intuition: if a move refuted one position, it might refute a nearby position too.
History heuristic: A 6x64 table indexed by [piece][destination_square] accumulates depth^2 bonuses whenever a quiet move causes a beta cutoff. This builds a per-search "reputation" for effective quiet moves. The score is capped at 16,384 to prevent overflow.
The transposition table (TT) is a hash map indexed by Zobrist hash, storing the result of previously searched positions. In chess, many different move sequences lead to the same position (transpositions), and the TT avoids re-searching them.
Entry structure (per position):
| Field | Type | Purpose |
|---|---|---|
key |
u64 |
Full Zobrist hash for collision detection |
depth |
u8 |
Search depth that produced this result |
score |
Score |
Evaluation score |
flag |
TTFlag |
Exact, LowerBound (beta cutoff), or UpperBound |
best_move |
Option<ChessMove> |
Best move found (used for move ordering) |
age |
u8 |
Search generation for aging |
Sizing: The table uses power-of-2 sizing so that the index computation is a fast bitwise AND (hash as usize & mask) instead of an expensive modulo operation. Default size is 64 MB, configurable via the UCI Hash option (1-4096 MB).
Replacement policy: Depth-preferred with aging. An entry is replaced if:
- The slot is empty (
key == 0) - It's the same position (update with newer data)
- The new search is deeper (
depth >= entry.depth) - The existing entry is stale (
entry.age != current_generation)
Mate score adjustment: Mate scores are ply-dependent (mate-in-3 from the root is different from mate-in-3 from ply 5). When storing, scores are adjusted to be relative to the root; when probing, they're adjusted back to the current ply.
Syzygy tablebases contain precomputed perfect-play results for all positions with a given number of pieces (up to 5 in this engine). When the search reaches a position with 5 or fewer pieces, it probes the tablebase for an authoritative Win/Draw/Loss result instead of searching further.
Bridge implementation: Ferrite's Board type is different from shakmaty's Chess type, so the bridge converts via FEN string:
Board β FEN string β shakmaty::Fen β shakmaty::Chess β Syzygy probe
This FEN round-trip has negligible cost because it only runs for positions with 5 or fewer pieces β by definition, simple positions with short FEN strings.
WDL scoring:
| Tablebase Result | Engine Score |
|---|---|
| Win | +20,000 |
| Cursed Win | +100 |
| Draw | 0 |
| Blessed Loss | -100 |
| Loss | -20,000 |
Tablebases are loaded from disk via the UCI SyzygyPath option. The engine ships with 3-4-5 piece tables in endgame/syzgy-3-4-5/.
All benchmarks run on a release build (cargo build --release with LTO and single codegen unit) using Criterion.rs.
| Benchmark | Time (mean) | Description |
|---|---|---|
search_depth_3_startpos |
1.39 ms | Depth 3 from starting position |
search_depth_4_startpos |
2.66 ms | Depth 4 from starting position |
search_depth_3_kiwipete |
11.31 ms | Depth 3 from KiwiPete (48 legal moves) |
| Benchmark | Time (mean) | Description |
|---|---|---|
movegen_startpos |
360 ns | Generate all legal moves from startpos |
movegen_kiwipete |
777 ns | Generate all legal moves from KiwiPete |
| Benchmark | Time (mean) | Description |
|---|---|---|
eval_startpos |
100 ns | Evaluate starting position |
eval_middlegame |
105 ns | Evaluate complex middlegame (Italian Game) |
eval_endgame |
98 ns | Evaluate K+R vs K endgame |
eval_complex |
110 ns | Evaluate Sicilian Dragon middlegame |
| Name | FEN | Description |
|---|---|---|
| Startpos | rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1 |
Initial position (20 moves) |
| KiwiPete | r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq - |
48 legal moves, complex tactics |
| Endgame | 8/5k2/8/8/8/8/4K3/4R3 w - - 0 1 |
K+R vs K |
| Complex | r1bq1rk1/pp2ppbp/2np2p1/2n5/P3PP2/N1P2N2/1PB3PP/R1B1QRK1 b - - |
Sicilian Dragon middlegame |
Ferrite has 98 tests across three test targets:
# Run all tests
make test
# Run only unit tests
cargo test --lib
# Run only the BK tactical suite
cargo test --test bk_suite| Module | Tests | Covers |
|---|---|---|
bitboard |
10 | Construction, popcount, iteration, bitwise ops |
board |
13 | FEN parsing, make/unmake, castling, en passant, promotion, hash |
chessmove |
5 | Encoding/decoding, roundtrip for all 64x64 squares |
attacks |
12 | Leaper + slider tables, blockers, all-squares verification |
movegen |
11 | Perft depths 1-4, KiwiPete, Position3, iterator masks |
square |
4 | LERF mapping, rank/file roundtrip, display |
piece |
3 | Color flip, index mapping |
zobrist |
3 | Non-zero keys, uniqueness, XOR cancellation |
evaluation |
5 | Startpos near-zero, material advantage, endgame phase |
search |
11 | Mate-in-1, depth completion, TT speedup, draw detection, PV |
tt |
4 | Store/probe, miss, mate adjustment, replacement policy |
syzygy |
2 | Invalid path, piece count guard |
uci |
11 | Position parsing, go params, time allocation, promotions |
bk_suite |
3 | EPD parser, SAN conversion, 24-position tactical suite |
Move generation is verified against known perft results (total leaf nodes at a given depth):
| Depth | Startpos | KiwiPete | Position 3 |
|---|---|---|---|
| 1 | 20 | 48 | 14 |
| 2 | 400 | 2,039 | 191 |
| 3 | 8,902 | 97,862 | 2,812 |
| 4 | 197,281 | β | β |
| Command | Description |
|---|---|
uci |
Identify engine, list options, print uciok |
isready |
Synchronize; responds readyok |
ucinewgame |
Reset board, clear TT |
position startpos [moves ...] |
Set position from starting position |
position fen <FEN> [moves ...] |
Set position from FEN string |
go depth <N> |
Search to fixed depth |
go movetime <ms> |
Search for fixed time |
go wtime/btime/winc/binc [...] |
Search with time control |
go infinite |
Search until stop |
stop |
Halt search, return best move found |
setoption name Hash value <MB> |
Set transposition table size (1-4096 MB) |
setoption name SyzygyPath value <path> |
Load Syzygy tablebases from directory |
d / print |
Print current board (debug) |
quit |
Exit engine |
| Option | Type | Default | Range | Description |
|---|---|---|---|---|
Hash |
spin | 64 | 1-4096 | TT size in MB |
SyzygyPath |
string | <empty> |
β | Path to Syzygy tablebase dir |
> uci
< id name chess-engine
< id author yourname
< option name Hash type spin default 64 min 1 max 4096
< option name SyzygyPath type string default <empty>
< uciok
> isready
< readyok
> position startpos moves e2e4 e7e5
> go depth 6
< info depth 1 score cp 5 nodes 27 time 1 nps 27000 pv d2d4
< info depth 2 score cp 20 nodes 178 time 1 nps 178000 pv d2d4 d7d5
< ...
< info depth 6 score cp 15 nodes 45231 time 24 nps 1884625 pv g1f3 b8c6
< bestmove g1f3
> quit
| Crate | Version | Purpose |
|---|---|---|
shakmaty |
0.27 | Chess position type for Syzygy bridge |
shakmaty-syzygy |
0.25 | Syzygy endgame tablebase probing |
arrayvec |
0.7 | Stack-allocated move lists (no heap allocation) |
criterion |
0.5 | Benchmarking framework (dev-dependency) |
- PeSTO evaluation tables from PeSTO's Evaluation Function β empirically tuned piece-square tables by Ronald Friederich
- Syzygy tablebases by Ronald de Man β endgame truth tables probed via
shakmaty-syzygy - BK test suite β classic 24-position tactical benchmark by Brat-Ko