Releases · coregx/coregex

Release list

v0.12.2: fix alternation patterns misrouted to ReverseSuffixSet

kolkov released this 16 Feb 14:13

v0.12.2

f6f6a76

Fixed

Alternation patterns misrouted to ReverseSuffixSet (Issue #116) — Patterns like [cgt]gggtaaa|tttaccc[acg] (alternation without .* prefix) were incorrectly routed to UseReverseSuffixSet strategy, producing wrong match boundaries. Fix: added isSafeForReverseSuffix guard.
matchStartZero optimization too aggressive — Restricted to .* prefix only via hasDotStarPrefix(). Was previously enabled for all unanchored patterns, causing wrong match start for patterns like [^\s]+\.txt.

Fixes #116

Assets 2

v0.12.1: Bidirectional DFA fallback, bounded repetitions fix

kolkov released this 15 Feb 16:14

v0.12.1

2516b6a

Performance

DFA bidirectional fallback for BoundedBacktracker — When BoundedBacktracker can't handle large inputs (exceeds 32M entry limit), use forward DFA + reverse DFA instead of PikeVM. O(n) total vs PikeVM's O(n×states). (\w{2,8})+ on 6MB: 654ms → 184ms (3.6x vs stdlib).
Digit-run skip optimization — For \d+-leading patterns (IP addresses, version numbers), skip entire digit run on DFA failure instead of advancing one byte at a time.

Bug Fixes

Bounded repetitions blocked ReverseSuffix strategy (#115) — isSafeForReverseSuffix didn't recognize OpRepeat{min>=1} as a wildcard, blocking UseReverseSuffix for patterns with bounded repetitions. Fix: 2500ms → 0.5ms (5000x) on 100KB no-match.
CompositeSequenceDFA overmatching for bounded patterns — Bare character classes like \w (maxMatch=1) were treated as unbounded by the DFA. \w\w on "000" returned "000" instead of "00".
AVX2 Teddy assembly correctness (#74) — Fixed teddySlimAVX2_2 returning position -1 for valid candidates in short haystacks.

Benchmarks (regex-bench, AMD EPYC, 6MB input)

Pattern	Go stdlib	coregex	Rust regex	vs stdlib
inner_literal	231 ms	0.25 ms	0.31 ms	926x
suffix	234 ms	0.89 ms	1.09 ms	263x
ip	507 ms	2.16 ms	12.05 ms	235x
char_class	560 ms	41 ms	50 ms	13.6x
word_repeat	654 ms	184 ms	49 ms	3.6x

Extreme benchmarks (6MB no-match): ip 2542x, suffix 1945x, phone 863x, inner 598x vs stdlib.

Assets 2

v0.12.0: Rust-inspired optimizations

kolkov released this 06 Feb 01:15

v0.12.0

a30fd70

Performance

Anti-quadratic guard for reverse suffix/inner/suffix-set searches — prevents O(n²) degradation on high false-positive suffix workloads, falls back to PikeVM when quadratic detected
Lazy DFA 4x loop unrolling — process 4 state transitions per inner loop iteration, check special states between batches
Prefilter IsFast() gate — skip reverse search optimizations when fast SIMD-backed prefix prefilter already exists
DFA cache clear & continue — on cache overflow, clear and fall back to PikeVM for current search instead of permanently disabling DFA

Fixed

OnePass DFA capture limit — tighten from 17 to 16 capture groups (uint32 slot mask = 32 bits)

Benchmark (AMD EPYC, regex-bench)

Pattern	coregex	vs stdlib	vs Rust
suffix	0.91ms	257x	1.4x faster
email	0.70ms	383x	1.9x faster
ip	2.19ms	225x	5.5x faster
uri	0.76ms	340x	1.2x faster
multiline_php	0.60ms	171x	1.2x faster
anchored_php	0.03ms	~1x	12.0x faster

Assets 2

v0.11.9: Fix missing first-byte prefilter in FindAll

kolkov released this 01 Feb 21:33

v0.11.9

8b528fa

Fixed

Missing first-byte prefilter in FindAll state-reusing path (#107)
- findIndicesBoundedBacktrackerAtWithState was missing anchoredFirstBytes O(1) check
- Pattern ^/.*[\w-]+\.php (without $) took 377ms instead of 40µs on 6MB input
- Fix: 377ms → 40µs (9000x improvement for non-matching anchored patterns)

Full Changelog

v0.11.8...v0.11.9

Assets 2

v0.11.8: Fix UseAnchoredLiteral regression

kolkov released this 01 Feb 20:41

v0.11.8

f0f527d

Fixed

Critical regression in UseAnchoredLiteral strategy (#107)
- FindIndices* and findIndicesAtWithState were missing UseAnchoredLiteral case
- Pattern ^/.*[\w-]+\.php$ fell through to slow NFA path
- Regression: 0.01ms → 408ms (40,000x slower)
- Fix: 408ms → 0.5ms (O(1) anchored literal matching restored)

Full Changelog

v0.11.7...v0.11.8

Assets 2

v0.11.7: FindAll optimization - 1.08x faster than stdlib

kolkov released this 01 Feb 19:50

v0.11.7

1480f40

Fixed

FindAll now uses optimized state-reusing path

FindAll was using slow per-match loop instead of optimized findAllIndicesStreaming
Results for (\w{2,8})+ on 6MB: 2179ms → 834ms (2.6x faster)
Now 1.08x faster than stdlib (was 2.4x slower in regex-bench)

Full Changelog

See CHANGELOG.md

Assets 2

v0.11.6: PikeVM 6MB optimization - 1.68x faster than stdlib

kolkov released this 01 Feb 18:56

v0.11.6

fce1691

Performance

Major PikeVM optimization achieving 1.68x speedup over stdlib for large inputs (was 2.2x slower).

Key Changes

Windowed BoundedBacktracker (V12): Search in 914KB windows before PikeVM fallback
SlotTable architecture: Rust-style per-state slot storage
Dynamic slot sizing: 0 (IsMatch), 2 (Find), full (Captures)
Lightweight searchThread: 16 bytes (was 40+ bytes)

Benchmark Results

Pattern (\w{2,8})+ vs stdlib:

Size	Speedup
10KB	1.68x faster
50KB	1.88x faster
100KB	2.04x faster
1MB	1.67x faster
6MB	1.68x faster

6MB improvement: 1900ms → 628ms (3x faster)

Full Changelog

See CHANGELOG.md

Assets 2

v0.11.5: Fix checkHasWordBoundary catastrophic slowdown

kolkov released this 01 Feb 09:46

v0.11.5

de173be

Summary

Fixes catastrophic performance regression in patterns with \w{n,m} quantifiers (Issue #105).

Before: 3 minutes 22 seconds on 79KB input (7,000,000x slower than stdlib)
After: 3.6 µs on 79KB input (8.6x faster than stdlib)

Changes

Fixed

checkHasWordBoundary catastrophic slowdown (Issue #105)
- Root cause: O(N*M) complexity from scanning all NFA states per byte
- Fix: Use NewBuilderWithWordBoundary(), add hasWordBoundary guards, anchored prefilter verification

Performance

DFA state lookup: map → slice — 42% CPU time eliminated
Literal extraction from capture/repeat groups — better prefilters
- =($\w...){2} now extracts =$ (2 bytes) instead of just =

Benchmarks (79KB input)

Stage	Time	vs stdlib
Before fix	3m 22s	7,000,000x slower
After fix	3.6 µs	8.6x faster

Credits

@danslo for root cause analysis and fix suggestions

Full Changelog: v0.11.4...v0.11.5

Contributors

danslo

Assets 2

v0.11.4: FindAll multiline optimization

kolkov released this 16 Jan 15:59

v0.11.4

8baa0ef

Fixed

FindAll/FindAllIndex now use UseMultilineReverseSuffix strategy (Issue #102)
- FindIndicesAt() was missing dispatch for UseMultilineReverseSuffix
- IsMatch/Find were fast (1µs), but FindAll was slow (78ms) — 100x gap vs Rust
- After fix: FindAll on 6MB with 2000 matches: ~1ms (was 78ms)

Performance

Operation	Before	After	Improvement
FindAll (6MB, 2000 matches)	78ms	~1ms	78x faster
vs Rust gap	100x slower	~1.3x slower	Near parity!

Changed

Updated golang.org/x/sys v0.39.0 → v0.40.0

Full Changelog: v0.11.3...v0.11.4

Assets 2

v0.11.3: Prefix fast path 319-552x speedup

kolkov released this 16 Jan 14:45

v0.11.3

43efbbd

Performance

Pattern (?m)^/.*\.php now 319-552x faster than stdlib (was 3.5-5.7x in v0.11.1)

Operation	coregex	stdlib	Speedup
IsMatch	182 ns	100 µs	552x
Find	240 ns	81 µs	338x
CountAll	58 µs	18.7 ms	319x

Algorithm

Suffix prefilter finds .php candidates (SIMD memmem)
SIMD backward scan to find line start (bytes.LastIndexByte)
O(1) prefix byte check (/ at line start)
Skip-to-next-line on mismatch (avoids O(n²) worst case)
DFA fallback for complex patterns without extractable prefix

Changes

MultilineReverseSuffixSearcher.prefixBytes for O(1) verification
SetPrefixLiterals() extracts prefix from pattern
findLineStart() uses SIMD bytes.LastIndexByte
Skip-to-next-line: on prefix mismatch, jump to next \n position

Fixes #99

Assets 2

Uh oh!

Uh oh!

Releases: coregx/coregex

Release list

v0.12.2: fix alternation patterns misrouted to ReverseSuffixSet

Fixed

Uh oh!

v0.12.1: Bidirectional DFA fallback, bounded repetitions fix

Performance

Bug Fixes

Benchmarks (regex-bench, AMD EPYC, 6MB input)

Uh oh!

v0.12.0: Rust-inspired optimizations

Performance

Fixed

Benchmark (AMD EPYC, regex-bench)

Uh oh!

v0.11.9: Fix missing first-byte prefilter in FindAll

Fixed

Full Changelog

Uh oh!

v0.11.8: Fix UseAnchoredLiteral regression

Fixed

Full Changelog

Uh oh!

v0.11.7: FindAll optimization - 1.08x faster than stdlib

Fixed

Full Changelog

Uh oh!

v0.11.6: PikeVM 6MB optimization - 1.68x faster than stdlib

Performance

Key Changes

Benchmark Results

Full Changelog

Uh oh!

v0.11.5: Fix checkHasWordBoundary catastrophic slowdown

Summary

Changes

Fixed

Performance

Benchmarks (79KB input)

Credits

Contributors

Uh oh!

v0.11.4: FindAll multiline optimization

Fixed

Performance

Changed

Uh oh!

v0.11.3: Prefix fast path 319-552x speedup

Performance

Algorithm

Changes

Uh oh!