Version: 1.0.0
Last Updated: 2024-12-01
Target Audience: AI coding agents (Claude, GPT-4, etc.)
NornicDB is an enterprise-grade, high-performance graph database that prioritizes:
- Neo4j Compatibility - Drop-in replacement with zero code changes
- Performance - 3-52x faster than Neo4j across benchmarks
- Correctness - 90%+ test coverage, regression prevention mandatory
- Maintainability - Clean architecture, separation of concerns, DRY principles
- Documentation - Every public API fully documented with real-world examples
Every change must demonstrate measurable improvements without significant regressions:
- Performance: Benchmark before/after. Show ops/sec improvements.
- Memory: Profile memory usage. No >10% increases without justification.
- Code Quality: Maintain or improve test coverage (90% minimum).
- Compatibility: All Neo4j compatibility tests must pass.
Example:
// ❌ BAD: No proof of improvement
func optimizeQuery() { /* new algorithm */ }
// ✅ GOOD: Benchmarked and documented
// BenchmarkQueryOptimization shows 2.3x speedup:
// Before: 4,252 ops/sec
// After: 9,780 ops/sec
// Memory: -15% (reduced allocations)
func optimizeQuery() { /* new algorithm with proof */ }MANDATORY WORKFLOW for all bugs:
- Write failing test - Reproduce the exact bug condition
- Verify test fails - Confirm it catches the bug
- Fix the bug - Implement minimal fix
- Verify test passes - Confirm fix works
- Add regression tests - Prevent future occurrences
Example from codebase:
// File: pkg/cypher/aggregation_bugs_test.go
// BUG #1: WHERE ... IS NOT NULL combined with WITH aggregation returns empty results
func TestBug_WhereIsNotNullWithAggregation(t *testing.T) {
// 1. Setup test data that triggers bug
setupAggregationTestData(t, store)
// 2. Execute query that fails in production
result, err := exec.Execute(ctx, `
MATCH (f:File)
WHERE f.extension IS NOT NULL
WITH f.extension as ext, COUNT(f) as count
RETURN ext, count
`, nil)
// 3. Assert expected behavior (this WILL fail before fix)
require.NoError(t, err)
assert.Equal(t, int64(2), extCounts[".ts"])
assert.Equal(t, int64(3), extCounts[".md"])
}See .agents/testing-patterns.md for detailed testing guidelines.
Hard limit: 2500 lines per file
When a file approaches 2500 lines:
- Identify separation of concerns - What are the distinct responsibilities?
- Define clear contracts - What interfaces connect the pieces?
- Extract cohesive modules - Group related functionality
- Update imports - Maintain clean dependency graph
Example refactoring:
// Before: executor.go (3752 lines) ❌
// - Query parsing
// - Execution logic
// - Result formatting
// - Caching
// - Optimization
// After: Split into focused modules ✅
// executor.go (800 lines) - Core execution orchestration
// parser.go (600 lines) - Query parsing & AST
// optimizer.go (500 lines) - Query optimization
// cache.go (400 lines) - Result caching
// formatter.go (300 lines) - Result formattingSee .agents/refactoring-guidelines.md for detailed strategies.
Leverage function types for dependency injection:
// ✅ GOOD: Function interface enables runtime swapping
type EmbeddingFunc func(text string) ([]float32, error)
type SearchEngine struct {
embedder EmbeddingFunc // Inject different implementations
}
// Production: Use GPU-accelerated embeddings
engine := &SearchEngine{
embedder: gpuEmbeddings.Generate,
}
// Testing: Use mock embeddings
testEngine := &SearchEngine{
embedder: func(text string) ([]float32, error) {
return []float32{0.1, 0.2, 0.3}, nil
},
}
// Development: Use cached embeddings
devEngine := &SearchEngine{
embedder: cachedEmbeddings.GetOrGenerate,
}Real example from codebase:
// pkg/storage/types.go - Storage interface for DI
type Engine interface {
CreateNode(node *Node) error
GetNode(id NodeID) (*Node, error)
// ... more methods
}
// Multiple implementations:
// - MemoryEngine (testing)
// - BadgerEngine (production)
// - MockEngine (unit tests)See .agents/functional-patterns.md for advanced patterns.
Every public API must have:
- Package-level documentation - What does this package do?
- Type documentation - What does this type represent?
- Function documentation - What does this do? When to use it?
- 1-3 Real-world examples - Show actual usage
- ELI12 explanation - Complex concepts explained simply
Example from codebase:
// Package cypher provides Neo4j-compatible Cypher query execution for NornicDB.
//
// This package implements a Cypher query parser and executor that supports
// the core Neo4j Cypher query language features. It enables NornicDB to be
// compatible with existing Neo4j applications and tools.
//
// Supported Cypher Features:
// - MATCH: Pattern matching with node and relationship patterns
// - CREATE: Creating nodes and relationships
// - MERGE: Upsert operations with ON CREATE/ON MATCH clauses
// [... more features ...]
//
// Example Usage:
//
// // Create executor with storage backend
// storage := storage.NewMemoryEngine()
// executor := cypher.NewStorageExecutor(storage)
//
// // Execute Cypher queries
// result, err := executor.Execute(ctx, "CREATE (n:Person {name: 'Alice'})", nil)
//
// ELI12 (Explain Like I'm 12):
//
// Think of Cypher like asking questions about a social network:
// 1. MATCH: "Find all people named Alice"
// 2. CREATE: "Add a new person named Bob"
// 3. Relationships: "Find who Alice knows"
package cypherSee .agents/documentation-standards.md for complete guide.
Minimum 90% coverage for all new code:
# Run coverage report
go test ./... -coverprofile=coverage.out
go tool cover -func=coverage.out
# View HTML coverage report
go tool cover -html=coverage.outCoverage targets by package type:
- Core logic (cypher, storage): 95%+ coverage
- Utilities (cache, pool): 90%+ coverage
- Integration (bolt, server): 85%+ coverage
- UI/CLI: 70%+ coverage (focus on business logic)
What to test:
✅ Always test:
- Happy path (normal usage)
- Error conditions (invalid input, not found, etc.)
- Edge cases (empty, nil, boundary values)
- Concurrency (race conditions, deadlocks)
- Regression cases (all bugs get tests)
❌ Don't waste time testing:
- Third-party library internals
- Generated code (unless custom logic added)
- Trivial getters/setters (unless they have logic)
See .agents/testing-patterns.md for comprehensive testing guide.
Identify and extract common patterns:
// ❌ BAD: Repeated error handling
func GetUser(id string) (*User, error) {
if id == "" {
return nil, fmt.Errorf("invalid id: empty")
}
// ... logic
}
func GetPost(id string) (*Post, error) {
if id == "" {
return nil, fmt.Errorf("invalid id: empty")
}
// ... logic
}
// ✅ GOOD: Extract common validation
func validateID(id string) error {
if id == "" {
return fmt.Errorf("invalid id: empty")
}
return nil
}
func GetUser(id string) (*User, error) {
if err := validateID(id); err != nil {
return nil, err
}
// ... logic
}Real example from codebase:
// pkg/cypher/helpers.go - Shared helper functions
func toInt64(v interface{}) (int64, error) { /* ... */ }
func toString(v interface{}) (string, error) { /* ... */ }
func toBool(v interface{}) (bool, error) { /* ... */ }
// Used throughout cypher package instead of repeating type conversionsClear boundaries between layers:
┌─────────────────────────────────────┐
│ API Layer (bolt, http, mcp) │ ← Protocol handling
├─────────────────────────────────────┤
│ Query Layer (cypher) │ ← Query parsing & execution
├─────────────────────────────────────┤
│ Storage Layer (storage) │ ← Data persistence
├─────────────────────────────────────┤
│ Infrastructure (cache, pool) │ ← Cross-cutting concerns
└─────────────────────────────────────┘
Each layer has clear responsibilities:
- API Layer: Protocol translation, authentication, request/response formatting
- Query Layer: Cypher parsing, query planning, execution orchestration
- Storage Layer: CRUD operations, indexing, transactions
- Infrastructure: Caching, connection pooling, metrics
Example contract:
// Storage layer exposes clean interface
type Engine interface {
CreateNode(node *Node) error
GetNode(id NodeID) (*Node, error)
// ... more methods
}
// Query layer depends on interface, not implementation
type Executor struct {
storage Engine // Can swap implementations
}
// API layer depends on query layer, not storage
type BoltServer struct {
executor *Executor // Clean dependency chain
}See .agents/architecture-patterns.md for detailed architecture.
Use established patterns and conventions:
- Go conventions: Follow Effective Go
- Neo4j compatibility: Match Neo4j Cypher semantics exactly
- Bolt protocol: Implement Neo4j Bolt protocol specification
- Graph algorithms: Use standard graph theory algorithms
- Testing: Follow Go testing best practices (testify, table-driven tests)
Example - Table-driven tests:
func TestNodeValidation(t *testing.T) {
tests := []struct {
name string
node *Node
wantErr bool
}{
{
name: "valid node",
node: &Node{ID: "123", Labels: []string{"Person"}},
wantErr: false,
},
{
name: "empty ID",
node: &Node{ID: "", Labels: []string{"Person"}},
wantErr: true,
},
{
name: "nil labels",
node: &Node{ID: "123", Labels: nil},
wantErr: false, // Labels are optional
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
err := ValidateNode(tt.node)
if tt.wantErr {
assert.Error(t, err)
} else {
assert.NoError(t, err)
}
})
}
}Prioritize compatibility without sacrificing performance:
✅ Support:
- Neo4j Bolt protocol (industry standard)
- Cypher query language (open standard)
- JSON import/export (universal format)
- Standard graph algorithms (shortest path, PageRank, etc.)
- Multiple embedding providers (Ollama, OpenAI, local GGUF)
❌ Avoid:
- Vendor lock-in (proprietary formats)
- Single-provider dependencies (must support alternatives)
- Non-standard query syntax (unless significant benefit)
Example - Multi-provider embeddings:
// Support multiple embedding providers
type EmbeddingProvider interface {
Generate(text string) ([]float32, error)
}
// Implementations:
// - OllamaProvider (local)
// - OpenAIProvider (cloud)
// - LocalGGUFProvider (offline)
// - MockProvider (testing)nornicdb/
├── cmd/
│ └── nornicdb/ # Main application entry point
├── pkg/
│ ├── bolt/ # Neo4j Bolt protocol server
│ ├── cypher/ # Cypher query parser & executor
│ ├── storage/ # Storage engine (memory, badger)
│ ├── embed/ # Embedding generation (GPU-accelerated)
│ ├── search/ # Vector search (HNSW index)
│ ├── inference/ # Auto-relationship inference
│ ├── cache/ # Query result caching
│ ├── pool/ # Connection pooling
│ ├── config/ # Configuration management
│ ├── audit/ # Audit logging
│ └── auth/ # Authentication
├── docs/ # Documentation
│ ├── getting-started/ # Installation & quick start
│ ├── api-reference/ # API documentation
│ ├── user-guides/ # Usage examples
│ ├── architecture/ # System design
│ └── performance/ # Benchmarks
├── docker/ # Docker build files
├── scripts/ # Build & deployment scripts
├── ui/ # Web UI (optional)
└── .agents/ # AI agent development guides
├── testing-patterns.md
├── refactoring-guidelines.md
├── functional-patterns.md
├── documentation-standards.md
├── architecture-patterns.md
├── performance-optimization.md
├── bug-fix-workflow.md
└── cypher-compatibility.md
# Update dependencies
go mod tidy
# Run full test suite
go test ./... -v
# Check test coverage
go test ./... -coverprofile=coverage.out
go tool cover -func=coverage.out
# Run benchmarks (baseline)
go test ./pkg/cypher -bench=. -benchmem > before.txt# Run tests for package you're working on
go test ./pkg/cypher -v
# Run specific test
go test ./pkg/cypher -run TestBug_WhereIsNotNull -v
# Watch mode (requires entr or similar)
find . -name "*.go" | entr -c go test ./pkg/cypher -v
# Check for race conditions
go test ./... -race# Format code
go fmt ./...
# Run linter
golangci-lint run
# Run all tests
go test ./... -v
# Check coverage (must be ≥90%)
go test ./... -coverprofile=coverage.out
go tool cover -func=coverage.out | grep total
# Run benchmarks (compare with baseline)
go test ./pkg/cypher -bench=. -benchmem > after.txt
benchcmp before.txt after.txt
# Build binary
go build -o nornicdb ./cmd/nornicdb<type>(<scope>): <subject>
<body>
<footer>
Types:
feat: New featurefix: Bug fixperf: Performance improvementrefactor: Code refactoringtest: Adding testsdocs: Documentation changeschore: Maintenance tasks
Example:
fix(cypher): resolve WHERE IS NOT NULL with aggregation bug
The WHERE clause with IS NOT NULL was incorrectly filtering out all
results when combined with WITH aggregation. This was caused by the
filter being applied after the aggregation instead of before.
Changes:
- Apply WHERE filters before WITH aggregation
- Add regression test TestBug_WhereIsNotNullWithAggregation
- Update executor to preserve filter order
Fixes #123
Benchmark: No performance impact (same ops/sec)
Coverage: +2.3% (added 15 test cases)
┌─────────────┐
│ E2E (5%) │ ← Full system tests
├─────────────┤
│ Integration │ ← Multi-component tests
│ (15%) │
├─────────────┤
│ Unit │ ← Single-function tests
│ (80%) │
└─────────────┘
For every new feature or bug fix:
- Unit tests for core logic (80% of tests)
- Integration tests for component interaction (15% of tests)
- End-to-end tests for critical paths (5% of tests)
- Benchmark tests for performance-critical code
- Race condition tests (
go test -race) - Error path tests (all error returns tested)
- Edge case tests (nil, empty, boundary values)
- Regression tests (all bugs get permanent tests)
func TestFeatureName(t *testing.T) {
// Setup
store := storage.NewMemoryEngine()
exec := NewStorageExecutor(store)
ctx := context.Background()
t.Run("happy path", func(t *testing.T) {
// Test normal usage
})
t.Run("error: invalid input", func(t *testing.T) {
// Test error handling
})
t.Run("edge case: empty result", func(t *testing.T) {
// Test edge cases
})
t.Run("concurrent access", func(t *testing.T) {
// Test concurrency
})
}See .agents/testing-patterns.md for comprehensive testing guide.
Every performance optimization must include:
- Baseline benchmark - Before optimization
- Optimized benchmark - After optimization
- Comparison - Show improvement percentage
- Memory profile - Show memory impact
- Justification - Why this optimization matters
Example:
// BenchmarkQueryExecution measures query execution performance
func BenchmarkQueryExecution(b *testing.B) {
store := setupBenchmarkData()
exec := NewStorageExecutor(store)
ctx := context.Background()
b.ResetTimer()
for i := 0; i < b.N; i++ {
_, err := exec.Execute(ctx, "MATCH (n:Person) RETURN n LIMIT 100", nil)
if err != nil {
b.Fatal(err)
}
}
}
// Results:
// Before: 4,252 ops/sec, 2.3 MB/op
// After: 9,780 ops/sec, 1.8 MB/op
// Improvement: 2.3x faster, 22% less memory- Algorithmic improvements - O(n²) → O(n log n)
- Memory allocations - Reduce allocations in hot paths
- Concurrency - Parallelize independent operations
- Caching - Cache expensive computations
- I/O optimization - Batch operations, reduce syscalls
See .agents/performance-optimization.md for detailed strategies.
MANDATORY PROCESS - No exceptions:
// Create failing test that reproduces the exact bug
func TestBug_DescriptiveName(t *testing.T) {
// Setup conditions that trigger bug
store := storage.NewMemoryEngine()
setupBugConditions(t, store)
// Execute operation that fails
result, err := performBuggyOperation()
// Assert expected behavior (this WILL fail before fix)
assert.NoError(t, err)
assert.Equal(t, expectedValue, result)
}# Run test - it MUST fail
go test ./pkg/cypher -run TestBug_DescriptiveName -v
# Output should show failure:
# --- FAIL: TestBug_DescriptiveName (0.00s)
# Expected: 2 rows
# Got: 0 rows// Implement minimal fix
// Add comments explaining WHY the bug occurred
func fixedFunction() {
// BUG FIX: Previously applied filter after aggregation,
// causing all results to be filtered out. Now apply
// filter before aggregation as per Cypher semantics.
applyFilters() // ← Moved before aggregation
performAggregation()
}# Run test - it MUST pass now
go test ./pkg/cypher -run TestBug_DescriptiveName -v
# Output should show success:
# --- PASS: TestBug_DescriptiveName (0.00s)// Add variations to prevent similar bugs
func TestBug_DescriptiveName_Variations(t *testing.T) {
t.Run("with ORDER BY", func(t *testing.T) { /* ... */ })
t.Run("with LIMIT", func(t *testing.T) { /* ... */ })
t.Run("with multiple filters", func(t *testing.T) { /* ... */ })
}See .agents/bug-fix-workflow.md for detailed workflow.
- Testing Patterns - Comprehensive testing guide
- Refactoring Guidelines - How to refactor large files
- Functional Patterns - Advanced Go patterns
- Documentation Standards - Documentation requirements
- Architecture Patterns - System design principles
- Performance Optimization - Optimization strategies
- Bug Fix Workflow - Bug fixing process
- Cypher Compatibility - Neo4j compatibility guide
- Effective Go - Go best practices
- Neo4j Cypher Manual - Cypher reference
- Neo4j Bolt Protocol - Protocol specification
- Go Testing - Testing package documentation
- Read this AGENTS.md thoroughly
- Explore
pkg/storage/types.go- Core data structures - Read
pkg/cypher/executor.go- Query execution - Run test suite:
go test ./... -v - Read existing tests to understand patterns
- Find a "good first issue" or documentation improvement
- Follow the development workflow
- Write tests first (TDD)
- Submit PR with benchmarks and coverage report
- Study
pkg/cypher/- Complex query execution - Understand
pkg/storage/badger_engine.go- Persistence - Learn
pkg/embed/- GPU-accelerated embeddings - Contribute a performance optimization
Before submitting any code:
- All tests pass:
go test ./... -v - Coverage ≥90%:
go tool cover -func=coverage.out - No race conditions:
go test ./... -race - Code formatted:
go fmt ./... - Linter passes:
golangci-lint run - Benchmarks run:
go test -bench=. -benchmem - Documentation updated (if public API changed)
- Examples added (if new feature)
- CHANGELOG.md updated
- Commit message follows format
- Correctness: Does it work? Are there edge cases?
- Tests: 90%+ coverage? All error paths tested?
- Performance: Benchmarks provided? No regressions?
- Documentation: Public APIs documented? Examples included?
- Style: Follows Go conventions? DRY principle?
- Architecture: Separation of concerns? Clean interfaces?
- "Add test for error case"
- "Extract this repeated logic"
- "Document this public function"
- "Show benchmark results"
- "This file is >2500 lines, please refactor"
- "Add ELI12 explanation for this complex concept"
- Test Coverage: ≥90% for all packages
- Benchmark Performance: No regressions >5%
- Memory Usage: No increases >10% without justification
- File Size: No files >2500 lines
- Documentation: 100% of public APIs documented
- Neo4j Compatibility: 100% of supported Cypher features work identically
- Bolt Protocol: Full protocol compliance
- Import/Export: Neo4j JSON format compatibility
- Query Execution: 3-52x faster than Neo4j (maintain or improve)
- Memory Footprint: 100-500 MB vs 1-4 GB for Neo4j
- Cold Start: <1s vs 10-30s for Neo4j
- Documentation: Check
docs/directory first - Examples: See
docs/user-guides/complete-examples.md - Tests: Read existing tests for patterns
- Issues: Search GitHub issues for similar problems
- Architecture: Read
.agents/architecture-patterns.md
Remember: Quality over speed. Take time to write tests, document code, and prove improvements. NornicDB's reputation depends on reliability and performance.
Happy coding! 🚀