Commit ae4961e

ruvnet and claude authored
Feat/ruvector postgres v2 (#82)
* feat(postgres): Add RuVector Postgres v2 implementation plan

  Complete specification for RuVector Postgres v2 with:

  Architecture:
  - PostgreSQL extension (pgrx) with hybrid architecture
  - SQL handles ACID/joins, RuVector engine handles vectors/graphs/learning
  - Backward compatible with pgvector SQL surface
  - Shared memory IPC with bounded contracts (64KB inline, 16MB shared)

  4-Phase Implementation:
  - Phase 1: pgvector-compatible search (1a: function-based, 1b: Index AM)
  - Phase 2: Tiered storage with compression and exactness GUC
  - Phase 3: Graph engine with Cypher and SQL join keys
  - Phase 4: Dynamic mincut integrity gating (key differentiator)

  Key Technical Details:
  - lambda_cut: Minimum cut value via Stoer-Wagner (PRIMARY integrity metric)
  - lambda2: Algebraic connectivity (OPTIONAL drift signal) - DIFFERENT from mincut!
  - Contracted operational graph (~1000 nodes) - never compute on full similarity graph
  - Hysteresis model with consecutive samples and cooldown
  - Operation risk classification (Low/Medium/High)
  - MVCC visibility with incremental paging API
  - WAL replay with idempotency and LSN ordering
  - Partition map versioning and epoch fencing for cluster mode

  Files:
  - 00-overview.md: Architecture, consistency contract, benchmark spec
  - 01-sql-schema.md: SQL schema and types
  - 02-background-workers.md: IPC contract, mincut worker
  - 03-index-access-methods.md: Index AM specification
  - 04-integrity-events.md: Events, hysteresis, operation classes
  - 05-phase1-pgvector-compat.md: Phase 1a/1b incremental path
  - 06-phase2-tiered-storage.md: Tiered storage with GUC exactness
  - 07-phase3-graph-cypher.md: Graph engine with SQL joins
  - 08-phase4-integrity-control.md: Mincut gating with Stoer-Wagner
  - 09-migration-guide.md: Migration from pgvector
  - 10-consistency-replication.md: Consistency and replication model

  🤖 Generated with [Claude Code](https://claude.com/claude-code)
  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(postgres): Rewrite v2 overview with compelling framing

  Replace technical executive summary with clear explanation of why RuVector matters:
  - From symptom monitoring to causal monitoring
  - Mincut as leading indicator, not metric
  - Algorithm becomes control signal (control plane, not analytics)
  - Failure mode class change: cascading → graceful degradation
  - Explainable operations via witness edges

  Key message: "We're not making vector search faster. We're making vector infrastructure survivable."

  🤖 Generated with [Claude Code](https://claude.com/claude-code)
  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(postgres): Add hybrid search, multi-tenancy, and self-healing specs

  Three high-impact additions to RuVector Postgres v2:

  ## 11-hybrid-search.md - BM25 + Vector Fusion
  - Single query combines semantic and keyword search
  - Proper BM25 implementation (not just ts_rank)
  - Fusion algorithms: RRF (default), linear, learned
  - Integrity-aware degradation (stress → single branch)
  - Parallel branch execution
  - GUC configuration

  ## 12-multi-tenancy.md - First-Class Tenant Isolation
  - SET ruvector.tenant_id for transparent scoping
  - Isolation levels: shared, partition, dedicated
  - Automatic promotion based on vector count
  - Per-tenant integrity (stress in one doesn't affect others)
  - Per-tenant contracted graphs
  - Resource quotas and rate limiting
  - Fair scheduling (no noisy neighbors)
  - RLS integration for defense in depth

  ## 13-self-healing.md - Automated Remediation
  - Completes the control loop: sensor → actuator
  - Problem classification from witness edges:
    - Hotspot congestion
    - Centroid skew
    - Replication lag
    - Maintenance contention
    - Index fragmentation
    - Memory pressure
  - Built-in strategies:
    - Rebalance partitions
    - Pause maintenance jobs
    - Throttle ingestion
    - Scale read replicas (K8s)
    - Compact fragmented indexes
  - Safety: reversible actions, blast radius limits
  - Learning: outcome tracking, strategy weight updates
  - The key insight: "We built the sensor. Now we build the actuator."

  🤖 Generated with [Claude Code](https://claude.com/claude-code)
  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(intelligence): Add self-learning intelligence layer with v3 features

  Comprehensive intelligence system for Claude Code hooks:

  Core Features (v2):
  - VectorMemory with @ruvector/core native HNSW (150x faster)
  - Hyperbolic distance (Poincaré ball) for hierarchical embeddings
  - ReasoningBank with Q-learning and pattern decay (7-day half-life)
  - Confidence Calibration tracking (predicted vs actual accuracy)
  - A/B Testing with 10% holdout for measuring intelligence lift
  - Feedback Loop for tracking suggestion follow-through
  - Active Learning for identifying uncertain states

  v3 Improvements:
  - Error Pattern Learning (Rust E0xxx, TypeScript TSxxxx, npm errors)
  - File Sequence Learning (tracks which files are edited together)
  - Test Suggestion Triggers (suggests cargo test after source edits)
  - Hive-Mind swarm coordination (11 agents, 38 edges)

  Pretrained from memory.db:
  - 7,697 commands processed
  - 4,023 vector memories
  - 117 Q-table states with decay metadata
  - 8,520 calibration samples

  Anti-overfitting measures:
  - Q-values capped at 0.8, floored at -0.5
  - Decaying learning rate: 0.3/sqrt(count)
  - Pattern decay with timestamps

  🤖 Generated with [Claude Code](https://claude.com/claude-code)
  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(intelligence): Fix Q-table lookups - learning now has real effect

  Three critical bugs were preventing the intelligence layer from using learned patterns:

  1. State format mismatch: CLI used spaces ("editing rs in project") but Q-table used underscores ("edit_rs_in_project")
     - Fixed in cli.js: all states now use underscore format
  2. stateKey() hyphen normalization: Function converted hyphens to underscores, but Q-table keys had hyphens (e.g. "ruvector-core")
     - Fixed regex: /[^a-z0-9-]+/g preserves hyphens
  3. A/B testing control group: 10% random sessions ignored learning
     - Reduced holdout to 5% with persistent session assignment
     - Added INTELLIGENCE_MODE=treatment env override for development

  Result: Agent recommendations now show 80% confidence for Rust files using learned Q-values, instead of 0% with random selection.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)
  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(hooks): Display intelligence guidance to Claude in foreground

  Critical fix: PreToolUse hooks were running in background (&) which meant Claude never saw the intelligence output.

  Now:
  - PreToolUse: Foreground execution (Claude sees guidance)
    - pre-edit: Shows recommended agent + confidence + similar edits
    - pre-command: Shows command patterns + suggestions
    - Added 3s timeout to prevent blocking
  - PostToolUse: Background execution (async learning)
    - post-edit: Records success/failure, learns patterns
    - post-command: Captures errors, updates Q-values
  - SessionStart: New hook shows learned patterns at session start
    - Displays pattern count, memory stats
    - Shows top 3 learned state-action pairs with Q-values

  Claude now receives self-learning guidance like:
  "🧠 Intelligence Analysis:
   📁 ruvector-core/lib.rs
   🤖 Recommended: rust-developer (80% confidence)
   📚 3 similar past edits found"

  🤖 Generated with [Claude Code](https://claude.com/claude-code)
  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
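The stateKey() fix described in the commit message can be illustrated with a small shell sketch. This is not the code from cli.js: the function name `state_key`, the lowercasing step, and the use of `_` as the replacement character are assumptions made for illustration. Only the character class `[^a-z0-9-]+` comes from the commit message.

```shell
#!/bin/bash
# Hypothetical shell analogue of the stateKey() normalization described above.
# Assumption: input is lowercased first, then every run of characters outside
# a-z, 0-9 and '-' collapses to a single underscore, so hyphens survive.
state_key() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed -E 's/[^a-z0-9-]+/_/g'
}

# Example: spaces become underscores, the hyphen in the crate name is kept
state_key "edit rs in ruvector-core"
echo ""
```

With this shape, a lookup key built from a crate name such as "ruvector-core" keeps its hyphen, which is the property the commit message says the Q-table keys depend on.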
1 parent ecf4483 commit ae4961e

45 files changed

Lines changed: 604676 additions & 42 deletions

.claude/hooks/bench-runner.sh

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
#!/bin/bash
# Benchmark runner for RuVector
# Runs criterion benchmarks and stores timestamped results for later baseline comparison

# pipefail makes `cargo bench ... | tee` report cargo's failure instead of tee's success
set -eo pipefail

CRATE="${1:-all}"
BASELINE_DIR="/workspaces/ruvector/.claude-flow/metrics/benchmarks"
mkdir -p "$BASELINE_DIR"

cd /workspaces/ruvector

echo "📊 RuVector Benchmark Runner"
echo "============================"
echo ""

run_bench() {
    local crate=$1
    local bench_name=$2 # benchmark label (currently unused)
    local output_file="$BASELINE_DIR/${crate}-$(date +%Y%m%d-%H%M%S).txt"

    echo "🏃 Running: cargo bench -p $crate"

    # Run benchmark and capture output
    if cargo bench -p "$crate" -- --noplot 2>&1 | tee /tmp/bench-output.txt; then
        # Extract timing info from criterion output (no match is not an error)
        grep -E "time:" /tmp/bench-output.txt | head -10 || true

        # Store raw output
        cp /tmp/bench-output.txt "$output_file"
        echo ""
        echo "📁 Results saved to: $output_file"
    else
        echo "⚠️ Benchmark failed for $crate"
    fi
}

case "$CRATE" in
    "all")
        echo "Running all available benchmarks..."
        echo ""

        # Core benchmarks
        if [ -d "crates/ruvector-bench" ]; then
            run_bench "ruvector-bench" "core"
        fi

        # MinCut benchmarks
        if [ -d "crates/ruvector-mincut" ]; then
            run_bench "ruvector-mincut" "mincut"
        fi

        # Attention benchmarks
        if [ -d "crates/ruvector-attention" ]; then
            run_bench "ruvector-attention" "attention"
        fi
        ;;

    "core"|"ruvector-bench")
        run_bench "ruvector-bench" "core"
        ;;

    "mincut"|"ruvector-mincut")
        run_bench "ruvector-mincut" "mincut"
        ;;

    "attention"|"ruvector-attention")
        run_bench "ruvector-attention" "attention"
        ;;

    "graph"|"ruvector-graph")
        run_bench "ruvector-graph" "graph"
        ;;

    "quick")
        echo "Running quick sanity benchmarks..."
        cargo bench -p ruvector-bench -- --noplot "insert" 2>&1 | tail -10
        ;;

    *)
        # Arbitrary crate names are not handled, so they are not advertised here
        echo "Usage: $0 [all|core|mincut|attention|graph|quick]"
        echo ""
        echo "Available benchmark crates:"
        echo " core/ruvector-bench - Core vector operations"
        echo " mincut - Min-cut algorithms"
        echo " attention - Attention mechanisms"
        echo " graph - Graph operations"
        echo " quick - Fast sanity check"
        exit 1
        ;;
esac

echo ""
echo "✅ Benchmarks complete"
echo "📁 Results in: $BASELINE_DIR/"
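One subtlety in `run_bench` is that an `if` over `cargo bench ... | tee ...` tests the exit status of the whole pipeline, which by default is the status of the last command (`tee`). The sketch below is not part of the commit. It uses a hypothetical helper, `pipeline_status`, with `false` standing in for a failing `cargo bench`, to show how `set -o pipefail` changes what the `if` sees.

```shell
#!/bin/bash
# Hypothetical demo: does `if failing_cmd | tee ...` notice the failure?
# `false` stands in for a failing `cargo bench` invocation.
pipeline_status() {
    (
        # Run in a subshell so the option change does not leak to the caller
        if [ "$1" = "pipefail" ]; then set -o pipefail; else set +o pipefail; fi
        if false | tee /dev/null; then
            echo "reported success"
        else
            echo "reported failure"
        fi
    )
}

pipeline_status default   # the if-branch only sees tee's exit status
pipeline_status pipefail  # the if-branch sees the failing command's status
```

Under `pipefail`, a pipeline such as `grep ... | head` can also fail when `grep` finds no match, so such lines may need an explicit `|| true` if the script also uses `set -e`.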

.claude/hooks/crate-context.sh

Lines changed: 104 additions & 0 deletions
@@ -0,0 +1,104 @@
#!/bin/bash
# Load crate-specific context for intelligent code assistance
# Outputs relevant examples, tests, and documentation paths as JSON

set -e

FILE="$1"
if [ -z "$FILE" ]; then
    echo "Usage: $0 <file_path>"
    exit 1
fi

cd /workspaces/ruvector

# Detect crate from file path (stays empty when the file is outside crates/)
CRATE_DIR=$(echo "$FILE" | grep -oP "crates/[^/]+" | head -1 || echo "")
CRATE_NAME=""

if [ -n "$CRATE_DIR" ]; then
    CRATE_NAME=$(basename "$CRATE_DIR")
fi

echo "{"
echo " \"file\": \"$FILE\","
echo " \"crate\": \"$CRATE_NAME\","

# Find related test files
# ${CRATE_DIR:+...} expands to nothing when no crate was detected, so find never scans /tests
echo " \"tests\": ["
TESTS=$(find "${CRATE_DIR:+$CRATE_DIR/tests}" -name "*.rs" 2>/dev/null | head -5 | while read -r f; do echo " \"$f\","; done | sed '$ s/,$//')
echo "$TESTS"
echo " ],"

# Find related examples
echo " \"examples\": ["
EXAMPLES=$(find "${CRATE_DIR:+$CRATE_DIR/examples}" -name "*.rs" 2>/dev/null | head -5 | while read -r f; do echo " \"$f\","; done | sed '$ s/,$//')
if [ -z "$EXAMPLES" ]; then
    # Check examples/ directory at root
    case "$CRATE_NAME" in
        "ruvector-core"|"ruvector-wasm")
            EXAMPLES=$(find "examples/wasm" "examples/wasm-react" \( -name "*.ts" -o -name "*.tsx" \) 2>/dev/null | head -3 | while read -r f; do echo " \"$f\","; done | sed '$ s/,$//')
            ;;
        "ruvector-graph"*)
            EXAMPLES=$(find "examples" -path "*graph*" -name "*.rs" 2>/dev/null | head -3 | while read -r f; do echo " \"$f\","; done | sed '$ s/,$//')
            ;;
        "ruvector-mincut"*)
            EXAMPLES=$(find "examples/mincut" -name "*.rs" 2>/dev/null | head -3 | while read -r f; do echo " \"$f\","; done | sed '$ s/,$//')
            ;;
    esac
fi
echo "$EXAMPLES"
echo " ],"

# Find related documentation
echo " \"docs\": ["
DOCS=$(find "$CRATE_DIR" -name "*.md" 2>/dev/null | head -5 | while read -r f; do echo " \"$f\","; done | sed '$ s/,$//')
if [ -z "$DOCS" ]; then
    case "$CRATE_NAME" in
        "ruvector-postgres"*)
            DOCS=$(find "docs/postgres" -name "*.md" 2>/dev/null | head -5 | while read -r f; do echo " \"$f\","; done | sed '$ s/,$//')
            ;;
        "rvlite")
            DOCS=$(find "crates/rvlite/docs" -name "*.md" 2>/dev/null | head -5 | while read -r f; do echo " \"$f\","; done | sed '$ s/,$//')
            ;;
    esac
fi
echo "$DOCS"
echo " ],"

# Key dependencies (first five entries after the [dependencies] header)
echo " \"key_deps\": ["
if [ -n "$CRATE_DIR" ] && [ -f "$CRATE_DIR/Cargo.toml" ]; then
    grep -E "^\[dependencies\]" -A 20 "$CRATE_DIR/Cargo.toml" 2>/dev/null | grep -E "^[a-z]" | head -5 | while read -r line; do
        DEP=$(echo "$line" | cut -d'=' -f1 | tr -d ' ')
        echo " \"$DEP\","
    done | sed '$ s/,$//'
fi
echo " ],"

# Suggest related commands
echo " \"commands\": {"
case "$CRATE_NAME" in
    "ruvector-core"|"ruvector-bench")
        echo " \"test\": \"cargo test -p $CRATE_NAME\","
        echo " \"bench\": \"cargo bench -p ruvector-bench\","
        echo " \"check\": \"cargo check -p $CRATE_NAME\""
        ;;
    "rvlite"|"ruvector-wasm"|"ruvector-graph-wasm"|"ruvector-gnn-wasm")
        echo " \"build\": \"wasm-pack build --target web --release\","
        echo " \"test\": \"wasm-pack test --headless --chrome\","
        echo " \"size\": \".claude/hooks/wasm-size-check.sh $CRATE_NAME\""
        ;;
    "ruvector-postgres")
        echo " \"build\": \"cargo pgrx package\","
        echo " \"test\": \"cargo pgrx test\","
        echo " \"run\": \"cargo pgrx run\""
        ;;
    *)
        echo " \"test\": \"cargo test -p $CRATE_NAME\","
        echo " \"check\": \"cargo check -p $CRATE_NAME\""
        ;;
esac
echo " }"

echo "}"
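The hook builds each JSON array by echoing items with trailing commas and trimming the last comma with `sed`, and it does not escape quotes or backslashes in paths. As a minimal sketch of an alternative, the hypothetical helper `json_array` below (not part of the commit) reads one item per line from stdin and prints a single-line JSON array, escaping only backslashes and double quotes. Control characters are out of scope here.

```shell
#!/bin/bash
# Hypothetical helper: turn newline-separated items on stdin into a JSON array.
# Only backslashes and double quotes are escaped; control characters are not handled.
json_array() {
    local first=1 line
    printf '['
    while IFS= read -r line; do
        line=${line//\\/\\\\}   # escape backslashes first
        line=${line//\"/\\\"}   # then escape double quotes
        if [ "$first" -eq 1 ]; then first=0; else printf ', '; fi
        printf '"%s"' "$line"
    done
    printf ']\n'
}

# Example usage with two made-up paths
printf '%s\n' "crates/demo/tests/a.rs" "crates/demo/tests/b.rs" | json_array
```

Because the separator is written before every item except the first, no trailing-comma cleanup is needed, and an empty input yields `[]`.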

.claude/hooks/post-rust-edit.sh

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
#!/bin/bash
# Post-edit hook for Rust files in RuVector
# Runs format check, clippy, and optional benchmarks

set -e

FILE="$1"
RUN_BENCH="${2:-false}"

if [ -z "$FILE" ]; then
    echo "Usage: $0 <file_path> [run_bench]"
    exit 1
fi

EXT="${FILE##*.}"
if [ "$EXT" != "rs" ]; then
    exit 0 # Not a Rust file
fi

cd /workspaces/ruvector

# Detect crate (stays empty when the file is outside crates/)
CRATE_DIR=$(echo "$FILE" | grep -oP "crates/[^/]+" | head -1 || echo "")
CRATE_NAME=""

if [ -n "$CRATE_DIR" ]; then
    CRATE_NAME=$(basename "$CRATE_DIR")
fi

echo "🦀 Post-edit checks for: $FILE"

# 1. Format check (don't auto-fix, just report)
echo ""
echo "📐 Checking format..."
if cargo fmt --check -- "$FILE" 2>/dev/null; then
    echo " ✅ Format OK"
else
    echo " ⚠️ Format issues detected (run: cargo fmt)"
fi

# 2. Quick clippy check for the crate
if [ -n "$CRATE_NAME" ]; then
    echo ""
    echo "📎 Running clippy for $CRATE_NAME..."
    # Short-format diagnostics look like "path:line:col: warning: ...", so match
    # the level after a "path: " prefix as well as at the start of the line
    CLIPPY_OUT=$(cargo clippy -p "$CRATE_NAME" --message-format=short 2>&1 | grep -E "(^|: )(warning|error)" | head -5)
    if [ -z "$CLIPPY_OUT" ]; then
        echo " ✅ No clippy warnings"
    else
        echo "$CLIPPY_OUT"
    fi
fi

# 3. Check for test file and suggest tests
TEST_FILE="${FILE%.rs}_test.rs"
if [ -f "$TEST_FILE" ]; then
    echo ""
    echo "🧪 Test file exists: $TEST_FILE"
fi

# 4. WASM size check for wasm crates
if echo "$FILE" | grep -qE "wasm|rvlite"; then
    echo ""
    echo "📏 WASM crate modified - consider running:"
    # Point at the edited crate; fall back to rvlite when no crate was detected
    echo " cd crates/${CRATE_NAME:-rvlite} && wasm-pack build --release"
    echo " ls -lh pkg/*.wasm"
fi

# 5. Optional benchmark for performance-critical crates
if [ "$RUN_BENCH" = "true" ]; then
    case "$CRATE_NAME" in
        "ruvector-core"|"ruvector-bench")
            echo ""
            echo "📊 Running benchmarks..."
            cargo bench -p ruvector-bench -- --noplot 2>&1 | tail -20
            ;;
        "ruvector-mincut")
            echo ""
            echo "📊 Running mincut benchmarks..."
            cargo bench -p ruvector-mincut -- --noplot 2>&1 | tail -20
            ;;
        "ruvector-attention")
            echo ""
            echo "📊 Running attention benchmarks..."
            cargo bench -p ruvector-attention -- --noplot 2>&1 | tail -20
            ;;
    esac
fi

# Store metrics
METRICS_DIR="/workspaces/ruvector/.claude-flow/metrics"
mkdir -p "$METRICS_DIR"

# Record edit in metrics (one JSON object per line)
echo "{\"file\": \"$FILE\", \"crate\": \"$CRATE_NAME\", \"timestamp\": \"$(date -Iseconds)\"}" >> "$METRICS_DIR/edit-log.jsonl"

echo ""
echo "✅ Post-edit checks complete"
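Each run of this hook appends one JSON object per line to `edit-log.jsonl`, with `file`, `crate`, and `timestamp` fields. Assuming exactly that record shape, the log could be summarized with standard text tools. The helper name `edits_per_crate` is hypothetical and not part of the commit. A real JSON parser such as `jq` would be more robust if the record format ever changes.

```shell
#!/bin/bash
# Hypothetical helper: count logged edits per crate, most-edited first.
# Assumes each line contains a field written exactly as "crate": "<name>".
edits_per_crate() {
    local log_file=$1
    sed -n 's/.*"crate": "\([^"]*\)".*/\1/p' "$log_file" \
        | sed 's/^$/(none)/' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $2 "\t" $1 }'
}

# Example usage against the path used by the hook, if the log exists
LOG="/workspaces/ruvector/.claude-flow/metrics/edit-log.jsonl"
if [ -f "$LOG" ]; then
    edits_per_crate "$LOG"
fi
```

Edits to files outside `crates/` are logged with an empty crate name, which the sketch reports as `(none)`.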

.claude/hooks/rust-check.sh

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
#!/bin/bash
# Rust-specific pre-edit hook for RuVector
# Detects crate context and runs cargo check for the crate

set -e

FILE="$1"
if [ -z "$FILE" ]; then
    echo "Usage: $0 <file_path>"
    exit 1
fi

EXT="${FILE##*.}"
if [ "$EXT" != "rs" ]; then
    exit 0 # Not a Rust file
fi

cd /workspaces/ruvector

# Detect which crate this file belongs to
CRATE_DIR=$(echo "$FILE" | grep -oP "crates/[^/]+" | head -1 || echo "")
CRATE_NAME=""

if [ -n "$CRATE_DIR" ]; then
    CRATE_NAME=$(basename "$CRATE_DIR")
    echo "🦀 Crate: $CRATE_NAME"

    # Show crate-specific context
    case "$CRATE_NAME" in
        "ruvector-core")
            echo " 📊 Core vector engine (HNSW, SIMD, quantization)"
            echo " 📦 Key: VectorStore, HnswIndex, Distance metrics"
            ;;
        "rvlite")
            echo " 🌐 WASM standalone DB (SQL/SPARQL/Cypher)"
            echo " 📦 Key: RvLite, SqlExecutor, CypherParser"
            echo " ⚠️ Size target: <3MB gzipped"
            ;;
        "ruvector-wasm")
            echo " 🌐 WASM bindings for ruvector-core"
            echo " 📦 Key: WasmVectorStore, IndexedDB storage"
            ;;
        "ruvector-graph"|"ruvector-graph-wasm"|"ruvector-graph-node")
            echo " 🕸️ Graph database with Cypher support"
            echo " 📦 Key: GraphStore, CypherQuery, HyperEdge"
            ;;
        "ruvector-gnn"|"ruvector-gnn-wasm"|"ruvector-gnn-node")
            echo " 🧠 Graph Neural Networks (GCN, GraphSAGE, GAT)"
            echo " 📦 Key: GnnLayer, MessagePassing, Aggregation"
            ;;
        "ruvector-postgres")
            echo " 🐘 PostgreSQL extension (pgvector compatible)"
            echo " 📦 Key: pgrx, SQL functions, background workers"
            ;;
        "sona")
            echo " 🎓 ReasoningBank with 9 RL algorithms"
            echo " 📦 Key: Trajectory, Verdict, LoRA, EWC++"
            ;;
        "ruvector-mincut"|"ruvector-mincut-wasm"|"ruvector-mincut-node")
            echo " ✂️ Subpolynomial dynamic min-cut algorithm"
            echo " 📦 Key: ContractedGraph, LambdaCut, SparseCertificate"
            ;;
        "ruvector-attention"|"ruvector-attention-wasm"|"ruvector-attention-node")
            echo " 👁️ 39+ attention mechanisms"
            echo " 📦 Key: MultiHeadAttention, GeometricAttention"
            ;;
        "ruvector-tiny-dancer"|"ruvector-tiny-dancer-wasm"|"ruvector-tiny-dancer-node")
            echo " 💃 FastGRNN neural router for agents"
            echo " 📦 Key: Router, FastGRNN, CircuitBreaker"
            ;;
        "ruvector-cli")
            echo " ⌨️ CLI and MCP server"
            echo " 📦 Key: Commands, MCP protocol, REST API"
            ;;
        *)
            echo " 📦 Crate: $CRATE_NAME"
            ;;
    esac

    # Quick cargo check for the specific crate
    echo ""
    echo "🔍 Running cargo check -p $CRATE_NAME..."
    # Capture the output so the branch reflects cargo's exit status, not head's
    CHECK_STATUS=0
    CHECK_OUT=$(cargo check -p "$CRATE_NAME" --message-format=short 2>&1) || CHECK_STATUS=$?
    printf '%s\n' "$CHECK_OUT" | head -10
    if [ "$CHECK_STATUS" -eq 0 ]; then
        echo "✅ Cargo check passed"
    else
        echo "⚠️ Check for warnings/errors above"
    fi
fi

# Check for WASM-related files
if echo "$FILE" | grep -qE "wasm|rvlite"; then
    echo ""
    echo "📏 WASM file detected - size considerations apply"
    echo " Target: <3MB gzipped for rvlite"
fi

echo ""
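These hooks derive the crate from the file path with `grep -oP "crates/[^/]+"`, which depends on a grep built with PCRE support. The sketch below shows the same idea with bash parameter expansion only. The function name `crate_from_path` is hypothetical and not part of the commit. Unlike the grep pattern, it requires `crates` to be a whole path component and the crate name to be followed by at least one more path segment.

```shell
#!/bin/bash
# Hypothetical helper: print the crate name for a path under crates/<name>/...,
# or print nothing when the path does not contain such a component.
crate_from_path() {
    local path="/$1"   # leading slash lets relative and absolute paths match alike
    case "$path" in
        */crates/*/*) ;;   # crates/<name>/<something> is present
        *) return 0 ;;     # not inside a crate: print nothing
    esac
    local rest=${path#*/crates/}    # drop everything up to and including the first "/crates/"
    printf '%s\n' "${rest%%/*}"     # keep only the first remaining path component
}

# Example usage
crate_from_path "crates/ruvector-core/src/lib.rs"
crate_from_path "/workspaces/ruvector/crates/rvlite/src/lib.rs"
```

Keeping the detection in one function would also let the three hooks that repeat this logic share a single definition.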
