orien commented Jan 2, 2026

Context

The spatial entropy calculation uses two graph statistics (average degree and starting positions) to estimate the strength of keyboard patterns. Previously, these statistics were recalculated on demand for every spatial match by iterating through the adjacency graph data and running map/compact/inject operations each time.

Since adjacency graphs are immutable after loading, these statistics can be computed once during initialisation and reused, eliminating redundant calculations.

Changes

Data class (lib/zxcvbn/data.rb):

  • Added compute_graph_stats private method that pre-computes average degree and starting positions for all adjacency graphs during initialisation
  • Stores results in @graph_stats hash with structure: { graph_name => { average_degree:, starting_positions: } }
  • Exposes statistics via graph_stats reader

Math module (lib/zxcvbn/math.rb):

  • Updated average_degree_for_graph to retrieve pre-computed value from data.graph_stats
  • Updated starting_positions_for_graph to retrieve pre-computed value from data.graph_stats
  • Eliminated repeated map/compact/inject operations on graph data
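
The changes above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `ToyData` and `TOY_GRAPHS` are hypothetical names, and the sketch assumes that average degree means the mean number of non-nil neighbours per key and that starting positions means the number of keys in the graph.

```ruby
# Toy adjacency graph (hypothetical data): each key maps to its neighbour
# slots, and nil marks a slot with no neighbour.
TOY_GRAPHS = {
  'toy' => {
    'a' => ['b', nil, 'c'],
    'b' => ['a', nil, nil],
    'c' => ['a', nil, nil]
  }
}.freeze

# Hypothetical stand-in for the Data class described above.
class ToyData
  attr_reader :adjacency_graphs, :graph_stats

  def initialize(adjacency_graphs)
    @adjacency_graphs = adjacency_graphs
    # Computed once here, because the graphs do not change after loading.
    @graph_stats = compute_graph_stats
  end

  private

  # Builds { graph_name => { average_degree:, starting_positions: } }.
  def compute_graph_stats
    @adjacency_graphs.each_with_object({}) do |(name, graph), stats|
      # Assumption: degree of a key = number of non-nil neighbour slots.
      total_degree = graph.values.sum { |neighbours| neighbours.compact.size }
      stats[name] = {
        average_degree: total_degree.to_f / graph.size,
        starting_positions: graph.size
      }
    end
  end
end

data = ToyData.new(TOY_GRAPHS)
stats = data.graph_stats['toy']
```

With this shape, a lookup such as `data.graph_stats['toy'][:average_degree]` is a pair of hash reads instead of a pass over the whole graph for every spatial match.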

Performance

Benchmark results (1000 iterations across 10 passwords with spatial patterns):

  • Before: 0.097ms per password
  • After: 0.088ms per password
  • Improvement: 9.3% faster

The improvement comes from eliminating repeated iterations over adjacency graph data during entropy calculations. All 262 existing tests pass, which indicates that behaviour is unchanged.
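
For readers who want to reproduce a measurement of this kind, here is a rough timing sketch. It is not the benchmark script used for the figures above: `ms_per_password` and `percent_faster` are hypothetical helpers, the password list is an example rather than the PR's set, and the block body is a placeholder for the call being measured.

```ruby
# Hypothetical helper: average wall-clock milliseconds per password,
# measured with the monotonic clock so system clock changes do not matter.
def ms_per_password(passwords, iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times do
    passwords.each { |password| yield password }
  end
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  elapsed * 1000.0 / (iterations * passwords.size)
end

# Percentage reduction in time per password.
def percent_faster(before_ms, after_ms)
  (before_ms - after_ms) / before_ms * 100.0
end

# Example spatial patterns (an assumption, not the PR's password list).
passwords = %w[qwerty asdfgh zxcvbn]

# Placeholder workload; swap in the strength check being measured.
per_password = ms_per_password(passwords, 1000) { |password| password.reverse }

# Reproduces the arithmetic behind the reported figure.
improvement = percent_faster(0.097, 0.088).round(1)
```

The last line shows how the reported percentage follows from the two timings: (0.097 - 0.088) / 0.097 is roughly 9.3%.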

orien added 3 commits January 2, 2026 09:34
Instead of calculating average degree and starting positions for each spatial
match, pre-compute these statistics once when loading adjacency graphs.

This avoids repeated map/inject operations on graph data during password
matching, improving performance by approximately 9.3%.

Performance improvement: 0.097ms -> 0.088ms per password (9.3% faster)

Added test coverage for previously untested Math module methods:

lg (logarithm base 2):
- Powers of 2 (exact values)
- Non-power-of-2 values (with tolerance)
- Decimal values (negative logs)

nCk (combinations):
- Edge cases (k > n, k = 0)
- Small combinations (n=5)
- Larger values (poker hands: 52 choose 5)
- Symmetry property verification
- Basic edge cases
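
The properties listed for lg and nCk can be illustrated with a small standalone sketch. These definitions are assumptions written for illustration (a base-2 logarithm and an iterative binomial coefficient), and the library's own implementations may differ.

```ruby
# Base-2 logarithm.
def lg(n)
  Math.log2(n)
end

# Number of ways to choose k items from n (binomial coefficient).
def nCk(n, k)
  return 0 if k > n
  return 1 if k.zero?

  result = 1
  (1..k).each do |d|
    # After step d, result holds C(original n, d), so the division is exact.
    result = result * n / d
    n -= 1
  end
  result
end

lg(8)       # => 3.0   (power of 2, exact)
lg(0.5)     # => -1.0  (decimal input gives a negative log)
nCk(2, 3)   # => 0     (k > n)
nCk(5, 0)   # => 1     (k = 0)
nCk(5, 2)   # => 10
nCk(52, 5)  # => 2598960 (five-card poker hands)
```

The symmetry property corresponds to `nCk(n, k) == nCk(n, n - k)`, for example `nCk(10, 3) == nCk(10, 7)`.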

Test count increased from 18 to 27 examples.

Created test suite for previously untested Data class covering:

Initialization:
- Dictionary loading (5 expected dictionaries)
- Adjacency graph loading (4 expected graphs)
- Trie building for all dictionaries
- Graph statistics pre-computation

Graph statistics:
- Verification of average_degree values
- Verification of starting_positions values
- Correctness checks for qwerty and keypad

Ranked dictionaries:
- Word ranking verification
- Common password frequency checks

Custom word lists:
- Dictionary addition via add_word_list
- Trie generation for custom dictionaries
- Word searchability via tries
- Empty list handling

Test count increased from 271 to 291 examples (20 new tests).
orien merged commit 3231418 into master Jan 2, 2026
18 checks passed
orien deleted the precompute-spatial-graph-data branch January 15, 2026 05:28