Pre-compute spatial graph statistics during `Data` initialization #65

orien · 2026-01-02T02:34:49Z

Context

The spatial entropy calculation uses graph statistics (average degree and starting positions) to compute password strength for keyboard patterns. Previously, these statistics were calculated on-demand for every spatial match by iterating through the adjacency graph data, performing map/compact/inject operations each time.

Since adjacency graphs are immutable after loading, these statistics can be computed once during initialisation and reused, eliminating redundant calculations.
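For illustration, the on-demand calculation described above can be sketched roughly as follows. This is a minimal sketch, not the library's code: `TOY_GRAPH` and both method names are hypothetical, and it assumes each key maps to a list of neighbor slots with `nil` for empty slots, and that the number of starting positions equals the number of keys in the graph.

```ruby
# Hypothetical toy adjacency graph: each key maps to a list of neighbor
# slots, where nil marks a slot with no neighboring key.
TOY_GRAPH = {
  "a" => [nil, "b", "c"],
  "b" => ["a", nil, "c"],
  "c" => ["a", "b", nil, nil]
}.freeze

# Average number of non-nil neighbors per key, using the
# map/compact/inject pattern mentioned in the description.
def average_degree(graph)
  degrees = graph.map { |_key, neighbors| neighbors.compact.size }
  degrees.inject(0, :+).to_f / graph.size
end

# Assumption: a spatial pattern may start on any key in the graph.
def starting_positions(graph)
  graph.size
end

puts average_degree(TOY_GRAPH)
puts starting_positions(TOY_GRAPH)
```

Running both methods for every spatial match repeats the same full pass over the graph, even though the graph never changes.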

Changes

Data class (lib/zxcvbn/data.rb):

- Added `compute_graph_stats` private method that pre-computes average degree and starting positions for all adjacency graphs during initialisation
- Stores results in `@graph_stats` hash with structure: `{ graph_name => { average_degree:, starting_positions: } }`
- Exposes statistics via `graph_stats` reader
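A minimal sketch of how such pre-computation might look is shown here. `SketchData` is a hypothetical stand-in rather than the actual `Data` class (which also loads dictionaries and builds tries), and the graph layout and the definition of starting positions are assumptions.

```ruby
# Hypothetical stand-in for the Data class; only the graph statistics
# part is modeled here.
class SketchData
  attr_reader :adjacency_graphs, :graph_stats

  def initialize(adjacency_graphs)
    @adjacency_graphs = adjacency_graphs
    # Computed once, at construction time, then only read.
    @graph_stats = compute_graph_stats
  end

  private

  # Builds { graph_name => { average_degree:, starting_positions: } }.
  def compute_graph_stats
    @adjacency_graphs.each_with_object({}) do |(name, graph), stats|
      degrees = graph.values.map { |neighbors| neighbors.compact.size }
      stats[name] = {
        average_degree: degrees.sum.to_f / graph.size,
        # Assumption: one starting position per key in the graph.
        starting_positions: graph.size
      }
    end
  end
end

data = SketchData.new({ "toy" => { "a" => [nil, "b"], "b" => ["a", nil] } })
p data.graph_stats["toy"]
```

Because the statistics live in a plain hash keyed by graph name, later lookups are constant-time hash reads rather than passes over the graph.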

Math module (lib/zxcvbn/math.rb):

- Updated `average_degree_for_graph` to retrieve pre-computed value from `data.graph_stats`
- Updated `starting_positions_for_graph` to retrieve pre-computed value from `data.graph_stats`
- Eliminated repeated map/compact/inject operations on graph data
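The lookup side could then be as small as the sketch below. `StatsData`, `SketchMath`, and `SketchScorer` are hypothetical names used only for illustration, and the statistics values are made up; only the two method names and the `data.graph_stats` reader come from the description above.

```ruby
# Hypothetical minimal data object exposing pre-computed statistics.
StatsData = Struct.new(:graph_stats)

# Sketch of a mixin that reads the cached values instead of
# re-deriving them from the adjacency graph on each call.
module SketchMath
  def average_degree_for_graph(graph_name)
    data.graph_stats[graph_name][:average_degree]
  end

  def starting_positions_for_graph(graph_name)
    data.graph_stats[graph_name][:starting_positions]
  end
end

# Hypothetical host class that supplies the `data` reader the mixin expects.
class SketchScorer
  include SketchMath

  attr_reader :data

  def initialize(data)
    @data = data
  end
end

# Illustrative values only, not real keyboard statistics.
stats = { "toy" => { average_degree: 2.0, starting_positions: 3 } }
scorer = SketchScorer.new(StatsData.new(stats))
puts scorer.average_degree_for_graph("toy")
puts scorer.starting_positions_for_graph("toy")
```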

Performance

Benchmark results (1000 iterations across 10 passwords with spatial patterns):

Before: 0.097ms per password
After: 0.088ms per password
Improvement: 9.3% faster
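As a sanity check on the arithmetic, and as a rough idea of how a per-password figure could be measured, consider the sketch below. The `ms_per_password` helper is hypothetical; it assumes the per-password time is total elapsed time divided by iterations times password count, which the description does not state explicitly.

```ruby
require "benchmark"

# Hypothetical harness: runs the given block for every password,
# `iterations` times, and returns the mean time per call in milliseconds.
def ms_per_password(passwords, iterations)
  elapsed = Benchmark.realtime do
    iterations.times { passwords.each { |password| yield password } }
  end
  elapsed * 1000.0 / (iterations * passwords.size)
end

# Relative improvement from the reported figures.
before_ms = 0.097
after_ms = 0.088
improvement = (before_ms - after_ms) / before_ms * 100
puts improvement.round(1)

# Example call with a trivial placeholder workload.
sample = ms_per_password(%w[qwerty asdfgh], 10) { |password| password.reverse }
puts sample
```

The reported 9.3% is the reduction relative to the "before" time, that is (0.097 - 0.088) / 0.097.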

The improvement comes from eliminating repeated iterations over adjacency graph data during entropy calculations. All 262 tests pass, confirming correctness is maintained.

Instead of calculating average degree and starting positions for each spatial match, pre-compute these statistics once when loading adjacency graphs. This avoids repeated map/inject operations on graph data during password matching.

Performance improvement: 0.097ms -> 0.088ms per password (9.3% faster)

Added test coverage for previously untested Math module methods:

lg (logarithm base 2):
- Powers of 2 (exact values)
- Non-power-of-2 values (with tolerance)
- Decimal values (negative logs)

nCk (combinations):
- Edge cases (k > n, k = 0)
- Small combinations (n=5)
- Larger values (poker hands: 52 choose 5)
- Symmetry property verification
- Basic edge cases

Test count increased from 18 to 27 examples.
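To make the listed cases concrete, here is a minimal sketch of the two functions with the properties the tests describe. The implementations are assumptions written for illustration, not the Math module's actual code; only the names `lg` and `nCk` and the tested properties come from the commit message.

```ruby
# Base-2 logarithm (assumed implementation).
def lg(n)
  Math.log2(n)
end

# Binomial coefficient "n choose k" via the multiplicative formula
# (assumed implementation). After step d the running value equals
# C(n, d), so the integer division is always exact.
def nCk(n, k)
  return 0 if k > n
  return 1 if k.zero?

  result = 1
  (1..k).each do |d|
    result = result * n / d
    n -= 1
  end
  result
end

puts lg(8)        # power of 2
puts lg(0.5)      # decimal input gives a negative log
puts nCk(5, 2)    # small combination
puts nCk(52, 5)   # poker hands
```

Non-power-of-2 inputs such as `lg(10)` return inexact floats, which is why a tolerance is appropriate for those comparisons.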

Created test suite for previously untested Data class covering:

Initialization:
- Dictionary loading (5 expected dictionaries)
- Adjacency graph loading (4 expected graphs)
- Trie building for all dictionaries
- Graph statistics pre-computation

Graph statistics:
- Verification of average_degree values
- Verification of starting_positions values
- Correctness checks for qwerty and keypad

Ranked dictionaries:
- Word ranking verification
- Common password frequency checks

Custom word lists:
- Dictionary addition via add_word_list
- Trie generation for custom dictionaries
- Word searchability via tries
- Empty list handling

Test count increased from 271 to 291 examples (20 new tests).

orien added 3 commits January 2, 2026 09:34

orien merged commit 3231418 into master Jan 2, 2026
18 checks passed

orien deleted the precompute-spatial-graph-data branch January 15, 2026 05:28


2 participants
