Thank you for considering contributing to ahocorasick! This document outlines the development workflow and guidelines.
This project uses Git-Flow branching model for development.
main # Production-ready code (tagged releases)
└─ develop # Integration branch for next release
├─ feature/* # New features
├─ bugfix/* # Bug fixes
└─ hotfix/* # Critical fixes from main
- main: Production-ready code. Only releases are merged here.
- develop: Active development branch. All features merge here first.
- feature/*: New features. Branch from
develop, merge back todevelop. - bugfix/*: Bug fixes. Branch from
develop, merge back todevelop. - hotfix/*: Critical production fixes. Branch from
main, merge to bothmainanddevelop.
# Create feature branch from develop
git checkout develop
git pull origin develop
git checkout -b feature/my-new-feature
# Work on your feature...
git add .
git commit -m "feat: add my new feature"
# When done, merge back to develop
git checkout develop
git merge --squash feature/my-new-feature # Squash merge for clean history
git commit -m "feat: my new feature (squashed)"
git branch -d feature/my-new-feature
git push origin develop# Create bugfix branch from develop
git checkout develop
git pull origin develop
git checkout -b bugfix/fix-issue-123
# Fix the bug...
git add .
git commit -m "fix: resolve issue #123"
# Merge back to develop
git checkout develop
git merge --squash bugfix/fix-issue-123 # Squash merge for clean history
git commit -m "fix: resolve issue #123 (squashed)"
git branch -d bugfix/fix-issue-123
git push origin develop# Create release branch from develop
git checkout develop
git pull origin develop
git checkout -b release/v0.2.0
# Update version numbers, CHANGELOG, etc.
git add .
git commit -m "chore: prepare release v0.2.0"
# Merge to main and tag
git checkout main
git merge --no-ff release/v0.2.0
git tag -a v0.2.0 -m "Release v0.2.0"
# Merge back to develop
git checkout develop
git merge --no-ff release/v0.2.0
# Delete release branch
git branch -d release/v0.2.0
# Push everything
git push origin main develop --tags# Create hotfix branch from main
git checkout main
git pull origin main
git checkout -b hotfix/critical-bug
# Fix the bug...
git add .
git commit -m "fix: critical production bug"
# Merge to main and tag
git checkout main
git merge --no-ff hotfix/critical-bug
git tag -a v0.1.1 -m "Hotfix v0.1.1"
# Merge to develop
git checkout develop
git merge --no-ff hotfix/critical-bug
# Delete hotfix branch
git branch -d hotfix/critical-bug
# Push everything
git push origin main develop --tagsFollow Conventional Commits specification:
<type>(<scope>): <description>
[optional body]
[optional footer]
- feat: New feature
- fix: Bug fix
- docs: Documentation changes
- style: Code style changes (formatting, etc.)
- refactor: Code refactoring
- test: Adding or updating tests
- chore: Maintenance tasks (build, dependencies, etc.)
- perf: Performance improvements
feat: add Unicode property class support
fix: correct epsilon-closure in NFA compilation
docs: update README with performance benchmarks
refactor: simplify DFA state cache implementation
test: add fuzz tests for memchr edge cases
chore: update go.mod to Go 1.25.4
perf: optimize Teddy SIMD prefilter for AVX2-
Format code:
go fmt ./...
-
Run linter:
golangci-lint run
-
Run tests:
go test ./... -
Run tests with race detector:
go test -race ./... -
All-in-one (use pre-release script):
bash scripts/pre-release-check.sh
- Code is formatted (
go fmt ./...) - Linter passes (
golangci-lint run- 0 issues) - All tests pass (
go test ./...) - Race detector passes (
go test -race ./...) - New code has tests (minimum 70% coverage)
- Documentation updated (if applicable)
- Commit messages follow conventions
- No sensitive data (credentials, tokens, etc.)
- Benchmarks added for performance-critical code
- Go 1.25 or later
- golangci-lint
- GCC or Clang (for race detector)
- Optional: WSL2 with Go (for Windows users without GCC)
# Clone repository
git clone https://github.com/coregx/ahocorasick.git
cd ahocorasick
# Download dependencies
go mod download
# Install golangci-lint
# See: https://golangci-lint.run/welcome/install/# Run all tests
go test ./...
# Run with coverage
go test -cover ./...
# Run with race detector
go test -race ./...
# Run benchmarks
go test -bench=. -benchmem ./simd/
go test -bench=. -benchmem ./prefilter/# Run linter
golangci-lint run
# Run with verbose output
golangci-lint run -v
# Verify config
golangci-lint config verifyahocorasick/
├── .github/ # GitHub workflows and templates
│ ├── CODEOWNERS # Code ownership
│ └── workflows/ # CI/CD pipelines
├── ahocorasick.go # Package entry point
├── automaton.go # Search API (Find, IsMatch, FindAll)
├── builder.go # Pattern builder
├── byteclasses.go # Alphabet compression
├── match.go # Match types and semantics
├── nfa.go # Trie + failure links
├── *_test.go # Tests and benchmarks
├── .golangci.yml # Linter configuration
├── CHANGELOG.md # Version history
├── CONTRIBUTING.md # This file
├── LICENSE # MIT License
└── README.md # Main documentation
- Check if issue exists, if not create one
- Discuss approach in the issue
- Create feature branch from
develop - Implement feature with tests
- Update documentation
- Run quality checks (
bash scripts/pre-release-check.sh) - Create pull request to
develop - Wait for code review
- Address feedback
- Merge when approved
- Follow Go conventions and idioms
- Write self-documenting code
- Add comments for complex logic (especially SIMD/assembly code)
- Keep functions small and focused
- Use meaningful variable names
- Optimize for clarity first, performance second (except in hot paths)
- Public types/functions:
PascalCase(e.g.,Compile,Memchr) - Private types/functions:
camelCase(e.g.,epsilonClosure,selectPrefilter) - Constants:
PascalCasewith context prefix (e.g.,UseNFA,UseDFA) - Test functions:
Test*(e.g.,TestLazyDFAFind) - Benchmark functions:
Benchmark*(e.g.,BenchmarkMemchr) - Example functions:
Example*(e.g.,ExampleCompile)
- Always check and handle errors
- Use descriptive error variables (
ErrCacheFull,ErrInvalidPattern) - Return errors immediately, don't wrap unnecessarily
- Validate inputs before processing
- Use table-driven tests when appropriate
- Test both success and error cases
- Include test data in
testdata/if needed - Test edge cases (empty input, boundaries, alignment)
- Add fuzz tests for parsers and matchers
- Compare with stdlib regexp for correctness
- Benchmarks are mandatory for performance-critical code
- Trie construction followed by failure link computation
- BFS traversal for failure links (ensures correct order)
- Match propagation along failure links for overlapping patterns
- O(n + m) time complexity (n = text length, m = total pattern length)
- Use ByteClasses for alphabet compression
- Consider contiguous state layout for cache efficiency
- Minimize allocations in hot paths
// Benchmarks should include:
// - Various input sizes (1KB, 64KB, 1MB)
// - Different pattern counts (4, 16, 64, 256)
// - Memory allocation tracking
// - Throughput calculation (MB/s)
func BenchmarkFind(b *testing.B) {
ac, _ := NewBuilder().
AddStrings(patterns).
Build()
b.SetBytes(int64(len(haystack)))
for i := 0; i < b.N; i++ {
_ = ac.Find(haystack, 0)
}
}- Check existing issues and discussions
- Ask questions in GitHub Issues
- See coregex for integration examples
- Multi-pattern matching: O(n) regardless of pattern count
- Throughput: 100+ MB/s for typical workloads
- Zero allocations in IsMatch hot path
- Comparable to Rust's aho-corasick crate
By contributing, you agree that your contributions will be licensed under the MIT License.
Thank you for contributing to ahocorasick!