Releases · coregx/coregex

v0.3.0 - Replace and Split

kolkov released this 27 Nov 17:17

v0.3.0

d400834

What's New

Added

Replace functions: Full stdlib-compatible replacement API
- ReplaceAll() / ReplaceAllString() - replace with template expansion
- ReplaceAllLiteral() / ReplaceAllLiteralString() - literal replacement
- ReplaceAllFunc() / ReplaceAllStringFunc() - replace with callback
Split function: Split(s string, n int) - split string by regex
Template expansion: $0-$9 backreference support in replacement
FindAllIndex: FindAllIndex() / FindAllStringIndex() for batch index retrieval

Technical

Pre-allocation optimization for replacement buffers
Proper $$ escape handling (literal $)
Empty match handling to prevent infinite loops

Performance

Case-insensitive 32KB: ~200x faster than stdlib
Case-insensitive 1KB: ~90x faster than stdlib

Full Changelog: v0.2.1...v0.3.0

v0.2.1: Documentation Hotfix

kolkov released this 27 Nov 15:28

v0.2.1

d73558d

Documentation hotfix for v0.2.0 - updated README.md with correct performance numbers (263x) and feature table.

v0.2.0: Capture Groups Support

kolkov released this 28 Nov 07:21

v0.2.0

92eb84b

What's New

Capture Groups Support

Full submatch extraction via PikeVM:

FindSubmatch() / FindStringSubmatch() - returns all capture groups
FindSubmatchIndex() / FindStringSubmatchIndex() - returns group positions
NumSubexp() - returns number of capture groups

Example

```go
re := coregex.MustCompile(`(\w+)@(\w+)\.(\w+)`)
match := re.FindStringSubmatch("user@example.com")
// match[0] = "user@example.com"
// match[1] = "user"
// match[2] = "example"
// match[3] = "com"
```

Performance

Pattern	Size	vs stdlib
Case-insensitive	32KB	263x faster
Case-insensitive	1KB	92x faster
Case-sensitive	1KB	3.5x faster

Technical Details

NFA StateCapture state type for group boundaries
Thread-local capture tracking in PikeVM with copy-on-write semantics
Captures follow Thompson's construction as epsilon transitions
DFA path unchanged - captures only allocated when requested

v0.1.4 - Documentation Update

kolkov released this 27 Nov 14:12

v0.1.4

19e062d

What's Changed

Documentation Updates

Fixed broken benchmark/ link in README
Updated CHANGELOG with release notes for v0.1.1 through v0.1.4
Updated performance claims to reflect 143x speedup on case-insensitive patterns
Updated current version references throughout README

Performance Highlights

143x faster than stdlib on case-insensitive patterns ((?i)...)
DFA prefilter working correctly after v0.1.3 cache fix

Full Changelog: v0.1.3...v0.1.4

v0.1.3 - Critical DFA Performance Fix

kolkov released this 27 Nov 13:52

v0.1.3

66acc1b

What's Fixed

Critical DFA Cache Bug

Problem: Start state ID was being overwritten by cache, causing EVERY DFA search to fall back to slow NFA
Impact: 200x performance regression in v0.1.0-v0.1.2 when using prefilter optimization
Solution: Preserve pre-assigned state IDs (StartState=0) in cache
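
The release notes do not include the patched code, so the following is a purely hypothetical sketch of one way a state cache can honour a pre-assigned ID. Only the name `StartState` and its value 0 come from the notes; `stateCache`, `insertFixed`, and `insert` are invented for illustration. The start state is registered under its fixed ID, and automatically assigned IDs begin after it.

```go
package main

import "fmt"

type StateID int

// StartState is the pre-assigned ID mentioned in the release notes.
const StartState StateID = 0

// stateCache maps a state key to its ID (hypothetical structure).
type stateCache struct {
	ids  map[string]StateID
	next StateID
}

func newStateCache() *stateCache {
	// Automatic IDs start after StartState so they can never collide with it.
	return &stateCache{ids: map[string]StateID{}, next: StartState + 1}
}

// insertFixed registers key under an ID chosen by the caller and keeps it.
func (c *stateCache) insertFixed(key string, id StateID) {
	c.ids[key] = id
	if id >= c.next {
		c.next = id + 1
	}
}

// insert returns the existing ID for key, or assigns the next free one.
func (c *stateCache) insert(key string) StateID {
	if id, ok := c.ids[key]; ok {
		return id
	}
	id := c.next
	c.next++
	c.ids[key] = id
	return id
}

func main() {
	c := newStateCache()
	c.insertFixed("start", StartState)
	fmt.Println(c.insert("s1"), c.insert("start"), c.insert("s2"))
}
```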

Leftmost-Longest Semantics

Fixed DFA search to properly implement leftmost-longest match semantics
Now correctly returns first match position with greedy extension
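
For readers unfamiliar with the term, the difference between leftmost-first and leftmost-longest matching can be seen with the standard library, which defaults to leftmost-first and switches via `Regexp.Longest()`. This only illustrates the semantics named above; how coregex selects between them is not stated in these notes. The helper `firstVsLongest` is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstVsLongest returns the match under leftmost-first semantics and then
// under leftmost-longest semantics for the same pattern and input.
func firstVsLongest(pattern, s string) (first, longest string) {
	re := regexp.MustCompile(pattern)
	first = re.FindString(s)
	re.Longest() // later searches on re prefer the longest leftmost match
	longest = re.FindString(s)
	return first, longest
}

func main() {
	first, longest := firstVsLongest(`a(|b)`, "ab")
	fmt.Printf("leftmost-first=%q leftmost-longest=%q\n", first, longest)
}
```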

Performance Improvements

Pattern Type	Before Fix	After Fix	Improvement
Literal (32KB)	887,129 ns	4,375 ns	202x faster
Case-insensitive (32KB)	842,422 ns	5,883 ns	143x faster vs stdlib

Changelog

fix: DFA cache start state registration
fix: Leftmost-longest semantics in searchAt() and findWithPrefilter()
docs: Updated README with accurate benchmark data

Full Changelog: v0.1.2...v0.1.3

v0.1.2 - Strategy Selection & Match Bounds Fixes

kolkov released this 27 Nov 13:21

v0.1.2

59957a2

Fixes

Strategy Selection Priority

Check for good literals BEFORE checking NFA size
Patterns with literals now use DFA+prefilter even if NFA < 20 states
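
The ordering change can be sketched as a small decision function. This is an assumption-laden illustration: the 20-state threshold comes from the note above, while the strategy names, the `selectStrategy` function, and the fallback choices for patterns without literals are invented and may not match the library.

```go
package main

import "fmt"

type Strategy string

const (
	UseNFA           Strategy = "nfa"
	UseDFA           Strategy = "dfa"
	UseDFAPrefilter  Strategy = "dfa+prefilter"
	smallNFAStateCap          = 20 // threshold quoted in the release notes
)

// selectStrategy checks for usable literals first, so a small NFA no longer
// prevents the prefilter path from being chosen.
func selectStrategy(hasGoodLiterals bool, nfaStates int) Strategy {
	if hasGoodLiterals {
		return UseDFAPrefilter
	}
	if nfaStates < smallNFAStateCap {
		return UseNFA // assumed fallback for tiny patterns
	}
	return UseDFA // assumed fallback for everything else
}

func main() {
	fmt.Println(selectStrategy(true, 8))
	fmt.Println(selectStrategy(false, 8))
	fmt.Println(selectStrategy(false, 200))
}
```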

Match Bounds Corrections

Complete prefilter matches now return correct bounds (was returning only first byte)
DFA matches now return correct start position (was always 0)

Testing

All tests pass
O(n) complexity verified for unanchored patterns

Full Changelog: v0.1.1...v0.1.2

v0.1.1 - Critical Hotfix: O(n²) PikeVM Bug

kolkov released this 27 Nov 10:10

v0.1.1

44ad961

🔴 Critical Bug Fix

This hotfix resolves a critical performance bug in PikeVM unanchored search.

The Bug

PikeVM had O(n²) time complexity for unanchored patterns due to restarting search at each position.

Impact (before fix):

Input Size	stdlib	coregex	Slowdown
16B	3.5 ns	3,768 ns	1,061x
32B	40 ns	11,797 ns	295x
1KB	263 ns	10.7 ms	40,775x
32KB	3.3 ms	11.2 sec	3,400,000x

The Fix

Implemented Thompson's parallel NFA simulation:

Add new start threads at each position (simulates `.*?` prefix)
Process all active threads in single O(n) pass
Implement leftmost-longest match semantics
Zero allocations in hot path

Performance after fix:

Consistent ~50-70 MB/s throughput for worst-case patterns
Linear O(n) time complexity verified by benchmarks
Zero allocations (0 B/op, 0 allocs/op)

Files Changed

`nfa/pikevm.go` - Core fix
`nfa/pikevm_bench_test.go` - Complexity verification benchmarks

Upgrade

```bash
go get github.com/coregx/coregex@v0.1.1
```

Full Changelog: v0.1.0...v0.1.1

v0.1.0 - Initial Release

kolkov released this 27 Nov 08:59

v0.1.0

1010435

coregex v0.1.0

Production-grade regex engine for Go with SIMD optimizations.

Features

Multi-engine architecture (NFA/DFA/Meta) with intelligent strategy selection
SIMD primitives (AVX2/SSSE3): memchr, memmem, Teddy multi-pattern search
Literal extraction and automatic prefilter selection
Lazy DFA with on-demand state construction
5-50x faster than stdlib for patterns with literals
88% test coverage, 0 linter issues

Installation

```bash
go get github.com/coregx/coregex
```

Quick Start

```go
import "github.com/coregx/coregex"

re := coregex.MustCompile(`\w+@\w+\.\w+`)
match := re.Find([]byte("email: test@example.com"))
```

Status

⚠️ Experimental - API may change in future versions.

See README for full documentation.
