Roadmap note: convert vague bullets without acceptance criteria into checkbox tasks. Format:
- [ ] <Task> (Target: <Q/Year>).
Production-Ready — Core graph query optimization (cost-based algorithm selection, constrained path finding, traversal algorithm selection, adaptive optimization, parallel traversal, structural plan reuse) is functional. Distributed graph query execution across shards is implemented. EXPLAIN endpoint (POST /api/v1/graph/query/explain) for dry-run plan inspection is now available (Issue: #1816).
- Graph query optimizer with cost-based algorithm selection
- Constrained path finding (min/max length, required/forbidden nodes and edges)
- Traversal algorithm selection: BFS, DFS, Dijkstra, A*, Bidirectional
- Query plan generation with cost estimates
- Query plan explanation and alternative strategy reporting
- Execution statistics tracking for adaptive optimization
- Query plan caching
- Path validation and constraint checking
- Integration with GraphIndexManager for graph operations
- Integration with AQL for graph query execution
- Query plan reuse across structurally similar queries
- Parallel multi-source BFS/DFS for large graphs (Issue: #1808)
- Adaptive cost model: EMA-based per-algorithm learning, enabled by default
- Adaptive plan selection using execution feedback (cost model learning) (Issue: #1812)
- Cost model calibration from real execution feedback (Issue: #2386)
- Property graph schema-aware optimizer hints (Issue: #1819)
- Distributed graph query execution across shards (Issue: #1826)
- Incremental graph query execution on live updates (Issue: #1825)
- Plan cache eviction with size and TTL controls (Issue: #1827)
- Graph query result streaming for large path sets (Issue: #1822)
- Integration with analytics module for graph algorithm reuse (Issue: #1821)
- Parallel multi-source traversal for large fan-out queries — fan_out_threshold + intra-frontier parallelism (Issue: #1811)
- Subgraph isomorphism queries (pattern matching) (Issue: #2390)
- EXPLAIN HTTP endpoint (`POST /api/v1/graph/query/explain`) for all query types (Issue: #1816)
- Query Rewriting for Graph Optimization (Issue: #250): `GraphQueryRewriter` with predicate pushdown, CSE, join reordering, materialized view utilisation, and query decomposition for parallelism (`include/graph/graph_query_rewriter.h`, `src/graph/graph_query_rewriter.cpp`)
- [~] GPU-accelerated BFS/DFS for massive graphs (`graph/gpu_traversal.cpp`, CPU fallback active; real CUDA kernels under `THEMIS_ENABLE_CUDA`; focused tests GPU-01..19 added 2026-05-11)
- EXPLAIN output in AQL for graph query plans (Issue: #1816)
- Scheduled Semantic Graph Edge Refresh with vector similarity scoring, temporal decay, and ACID batch updates (Issue: #FEATURE/ScheduledGraphEdgeRefresh)
- [I] GPU-accelerated BFS/DFS for massive graphs (Issue: #1829)
- Ontology integration: `OntologyManager` + semantic path constraints (Target: Q3 2026)
  - Affected: `include/graph/ontology_manager.h`, `src/graph/ontology_manager.cpp`, `include/graph/path_constraints.h`, `src/graph/path_constraints.cpp`
  - Expected behavior: `PathConstraints::addSemanticConstraint(ontology, ruleset)` checks edge and node types against the OWL-lite concept hierarchy and returns a list of violations
  - Errors: unknown concept IDs → unconstrained (WARN); parsing errors → `Status::Error`
  - Tests: OM-01..OM-12 (`tests/graph/test_ontology_manager.cpp`) + SC-01..SC-10 (`tests/graph/test_path_constraints_semantic.cpp`)
  - Perf: ≤ 5 µs per edge constraint check; ontology load ≤ 100 ms for 10 000 concepts
  - Detail: `src/graph/FUTURE_ENHANCEMENTS.md` → Ontology-based Semantic Constraints
- [~] Knowledge Graph Reasoning with AI/ML + LoRA (Target: Q4 2026 – Q3 2027)
  - Affected: `include/graph/knowledge_graph_reasoner.h`, `src/graph/knowledge_graph_reasoner.cpp`, `include/rag/knowledge_graph_retriever.h`, `src/rag/knowledge_graph_retriever.cpp`, integration with `src/llm/multi_lora_manager.cpp`
  - Expected behavior: Horn-clause forward chaining + explanation chains + incremental CDC trigger; the LoRA adapter provides a soft plausibility score per inference edge; pattern recognition in graph paths
  - Errors: rule contradiction → `ConflictError`; LoRA adapter not loaded → deterministic fallback rules
  - Tests: KGR-01..KGR-20 (`test_knowledge_graph_reasoner.cpp`)
  - Perf: forward chaining over 1 M edges ≤ 2 s cold; ≤ 50 ms incremental
  - Detail: `src/graph/FUTURE_ENHANCEMENTS.md` → Knowledge Graph Reasoning with Ontology & ML/LoRA
- Graph query optimizer with cost-based algorithm selection (`graph/query_optimizer.cpp`)
- Constrained path finding (min/max length, required/forbidden nodes and edges)
- Traversal algorithm selection: BFS, DFS, Dijkstra, A*, Bidirectional
- Query plan generation with cost estimates and explanation output
- Execution statistics tracking for adaptive optimization
- Query plan caching (`graph/plan_cache.cpp`)
- Path validation and constraint checking
- Integration with GraphIndexManager for graph operations
- Integration with AQL for graph query execution
- Parallel multi-source BFS/DFS for large graphs (`graph/parallel_traversal.cpp`, Target: Q2 2026) (Issue: #1833)
- Query plan reuse across structurally similar queries (Target: Q2 2026)
- Adaptive cost model: EMA per algorithm, confidence-weighted blending into cost estimates
- Advanced cost model calibration from real execution feedback (Target: Q3 2026)
- Subgraph isomorphism queries (pattern matching)
- Distributed graph query execution across shards
- Plan cache eviction with size and TTL controls
- Temporal graph query optimization (time-ranged traversals)
- [~] GPU-accelerated BFS/DFS for massive graphs (`graph/gpu_traversal.cpp`, CPU fallback active; real CUDA kernels planned for `THEMIS_ENABLE_CUDA`)
  - Progress 2026-05-11 (unit tests): Added 19 focused tests (GPU-01..GPU-19) in `tests/graph/test_gpu_traversal.cpp` covering load, BFS/DFS correctness, depth limits, forbidden vertices, max_results truncation, cyclic-graph termination, disconnected graphs, error cases (unknown vertex, pre-load calls), and `used_cpu_fallback` verification; dedicated CMake target `test_gpu_traversal_focused`.
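The list above mentions plan cache eviction with size and TTL controls. As a rough illustration of how the two limits can interact, here is a minimal sketch. `ToyPlanCache` and all of its members are hypothetical names, not the contents of `graph/plan_cache.cpp`; least-recently-used eviction is an assumption, and the clock is passed in explicitly to keep the behaviour deterministic.

```cpp
#include <chrono>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical plan cache: LRU eviction when full, TTL expiry on lookup.
class ToyPlanCache {
public:
    using Clock = std::chrono::steady_clock;

    ToyPlanCache(std::size_t max_entries, Clock::duration ttl)
        : max_entries_(max_entries), ttl_(ttl) {}

    void put(const std::string& key, const std::string& plan, Clock::time_point now) {
        erase(key);  // overwrite semantics
        if (max_entries_ == 0) return;
        while (lru_.size() >= max_entries_) {  // size control: drop least recently used
            index_.erase(lru_.back().key);
            lru_.pop_back();
        }
        lru_.push_front(Entry{key, plan, now});
        index_[key] = lru_.begin();
    }

    // Returns true and fills plan_out on a fresh hit; expired entries are removed.
    bool get(const std::string& key, Clock::time_point now, std::string& plan_out) {
        auto it = index_.find(key);
        if (it == index_.end()) return false;
        if (now - it->second->inserted > ttl_) {  // TTL control
            erase(key);
            return false;
        }
        lru_.splice(lru_.begin(), lru_, it->second);  // mark as most recently used
        plan_out = it->second->plan;
        return true;
    }

    std::size_t size() const { return lru_.size(); }

private:
    struct Entry {
        std::string key;
        std::string plan;
        Clock::time_point inserted;
    };

    void erase(const std::string& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return;
        lru_.erase(it->second);
        index_.erase(it);
    }

    std::size_t max_entries_;
    Clock::duration ttl_;
    std::list<Entry> lru_;
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

Checking the TTL lazily on lookup avoids a background sweeper thread; a real cache may well prefer periodic sweeps to bound memory held by entries that are never read again.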
- `ScheduledGraphEdgeRefreshEngine` class with `RefreshPolicy` config interface and background scheduler (`include/graph/scheduled_edge_refresh.h`, `src/graph/scheduled_edge_refresh.cpp`, Target: Q4 2026)
  - Affected files: `include/graph/scheduled_edge_refresh.h`, `src/graph/scheduled_edge_refresh.cpp`, `include/index/graph_index.h`, `src/index/graph_index.cpp`
  - Runtime: background thread wakes at `refresh_interval`; synchronous `triggerRefresh()` also available
  - Error handling: safety gate aborts batch; commit failure logged; invalid policy throws `std::invalid_argument`
  - Tests: `tests/graph/test_scheduled_edge_refresh.cpp` (45+ tests: unit, integration, regression, ChangeFeed, anomaly)
  - Performance target: cycle completes in O(V·K) per vertex for candidate discovery; brute-force for ≤10k nodes, ANN for larger graphs
  - Compatibility: no breaking changes to `GraphIndexManager` public API; `createWriteBatch()` is an additive method
- Vector similarity scoring: cosine, dot-product, Euclidean (Target: Q4 2026)
- Temporal decay factor for edge relevance scoring (exponential half-life, Target: Q4 2026)
- Centrality-based edge weight (inverse log-degree dampening, Target: Q4 2026)
- ACID batch transactions with rollback on safety-gate violations (`createWriteBatch()` on GraphIndexManager, Target: Q4 2026)
- Audit trail for all edge mutations (in-memory ring buffer, 10k entries, Target: Q4 2026)
- Anomaly detection: `removal_rate` + `anomaly_high_removal_rate` in `RefreshStats`; `anomaly_threshold_removal_rate` in `RefreshPolicy` (Target: Q4 2026)
- Changefeed integration: `setChangefeed()` → `recordEvent()` per mutation with key prefix `graph_edge_refresh:` (Target: Q4 2026)
- Integration tests: large graph (50+ nodes), cluster-embedding scenario, regression (stable graph), changefeed event verification (Target: Q4 2026)
- Integration with acceleration module for ANN/GNN top-k candidate edges: `setANNIndex(IAnnIndex*)`, `rebuildANNIndex()`, ANN path in `discoverCandidateEdges()` when vertex count > `policy.ann_min_vertices` (Target: Q1 2027)
- CEP event emission for edge mutations via `analytics/cep_engine`: `setCEPEventCallback(std::function<void(themisdb::analytics::Event)>)`, `EDGE_CREATE`/`EDGE_DELETE` events emitted after successful batch commit (Target: Q1 2027)
- Bilingual documentation EN (`docs/scheduled_edge_refresh.md`) and DE (`docs/de/scheduled_edge_refresh.md`) including anomaly detection + Changefeed sections (Target: Q4 2026)
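The scoring items above (cosine similarity, exponential half-life decay, inverse log-degree dampening) can be illustrated with a small sketch. The function names, the exact dampening formula `1 / log2(2 + degree)`, and the choice to multiply the three factors are assumptions made for illustration; the roadmap does not specify how the engine combines them.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Cosine similarity of two equally sized embeddings; 0 if either is a zero vector.
double cosineSimilarity(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double dot = 0.0, norm_a = 0.0, norm_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        norm_a += a[i] * a[i];
        norm_b += b[i] * b[i];
    }
    if (norm_a == 0.0 || norm_b == 0.0) return 0.0;
    return dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}

// Exponential half-life decay: the weight halves every `half_life` time units.
double temporalDecay(double age, double half_life) {
    assert(half_life > 0.0);
    return std::exp2(-age / half_life);
}

// Inverse log-degree dampening, assumed here as 1 / log2(2 + degree).
double degreeDampening(std::size_t degree) {
    return 1.0 / std::log2(2.0 + static_cast<double>(degree));
}

// Hypothetical combined relevance: product of the three factors.
double edgeRelevance(const std::vector<double>& a, const std::vector<double>& b,
                     double age, double half_life, std::size_t degree) {
    return cosineSimilarity(a, b) * temporalDecay(age, half_life) * degreeDampening(degree);
}
```

With identical embeddings, an age equal to one half-life, and a vertex degree of 2, the three factors are 1, 0.5, and 0.5.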
- `GraphQueryRewriter` class (`include/graph/graph_query_rewriter.h`, `src/graph/graph_query_rewriter.cpp`, Target: Q2 2026)
  - Affected files: `include/graph/graph_query_rewriter.h`, `src/graph/graph_query_rewriter.cpp`, `tests/test_graph_query_rewriter.cpp`
  - Runtime: pure JSON-plan transformer; no database I/O; thread-safe (stateless rules); fixed-point iteration (max 5 passes)
  - Rules: `PREDICATE_PUSHDOWN`/`PRUNE_EARLY` (promotes vertex filters to prune conditions for early BFS/DFS branch cutting), `COMMON_SUBEXPRESSION` (replaces duplicate traversals with LET-scoped refs), `JOIN_REORDERING` (swaps traversal_join operands by heuristic cardinality), `MATERIALIZED_VIEW` (tags sub-graph traversals for precomputed materialisation), `QUERY_DECOMPOSITION` (splits multi-start traversals into independent parallel subqueries)
  - Error handling: `addCustomRule(nullptr)` throws `std::invalid_argument`; `rewrite_time_limit_ms` provides a wall-clock escape hatch
  - Tests: `tests/test_graph_query_rewriter.cpp` (38 unit tests covering all acceptance criteria; standalone target `test_graph_query_rewriter`)
  - Performance: O(n) per rule pass where n = JSON plan nodes; total rewrite for typical AQL plan < 1 ms
  - Compatibility: additive new class; no changes to `GraphQueryOptimizer` public API
- Common subexpression elimination for graph traversal plans
- Predicate pushdown to graph traversal layer (prune early)
- Join reordering for graph traversal patterns based on estimated selectivity
- Materialized view utilisation tags for frequently accessed subgraphs
- Query decomposition for multi-start traversal parallelism
- `estimateSpeedup()` heuristic for pre/post rewrite plan comparison
- `explainRewrites()` human-readable transformation summary
- Custom rule hook via `addCustomRule()`
- Selective rule enablement via `RewriteConfig::enabled_rules`
- Wall-clock time budget via `RewriteConfig::rewrite_time_limit_ms`
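The rewriter is described as a fixed-point iteration with a pass limit, a wall-clock budget, and a custom rule hook that rejects null rules. The sketch below shows that control flow on a toy plan. `ToyPlan`, `ToyRewriter`, and `dropAdjacentDuplicate` are hypothetical stand-ins; the real class operates on JSON plans and its signatures are not given in this roadmap.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy plan: a flat list of operator names (the real rewriter works on JSON plans).
struct ToyPlan {
    std::vector<std::string> ops;
};

// A rule mutates the plan and reports whether it changed anything.
using RewriteRule = std::function<bool(ToyPlan&)>;

class ToyRewriter {
public:
    void addCustomRule(RewriteRule rule) {
        if (!rule) throw std::invalid_argument("rule must not be null");
        rules_.push_back(std::move(rule));
    }

    // Applies all rules repeatedly until nothing changes, the pass limit is hit,
    // or the time budget is exhausted. Returns the number of passes executed.
    int rewrite(ToyPlan& plan, int max_passes = 5,
                std::chrono::milliseconds time_limit = std::chrono::milliseconds(1000)) const {
        const auto start = std::chrono::steady_clock::now();
        int passes = 0;
        while (passes < max_passes) {
            ++passes;
            bool changed = false;
            for (const auto& rule : rules_) {
                changed = rule(plan) || changed;
            }
            if (!changed) break;  // fixed point reached
            if (std::chrono::steady_clock::now() - start > time_limit) break;
        }
        return passes;
    }

private:
    std::vector<RewriteRule> rules_;
};

// Example rule: removes one adjacent duplicate operator per invocation.
bool dropAdjacentDuplicate(ToyPlan& plan) {
    auto it = std::adjacent_find(plan.ops.begin(), plan.ops.end());
    if (it == plan.ops.end()) return false;
    plan.ops.erase(it);
    return true;
}
```

For a plan `scan, scan, scan, filter`, the example rule fires in passes one and two, and the third pass detects the fixed point.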
- `OntologyManager`: JSON/YAML loader, `isA()` transitive concept hierarchy, `allowedEdgeTypes()` (Target: Q3 2026)
  - Affected: `include/graph/ontology_manager.h`, `src/graph/ontology_manager.cpp`
  - Runtime: immutable after `build()`; thread-safe read-only; LRU cache for `isA()` (1 000 entries)
  - Error handling: unknown concept IDs → unconstrained (WARN); parsing errors → `Status::Error`
  - Tests: `tests/graph/test_ontology_manager.cpp` (OM-01..OM-12)
  - Perf: load ≤ 100 ms for 10 000 concepts; `isA()` ≤ 5 µs incl. cache lookup
- `PathConstraints::addSemanticConstraint()`: OWL-lite path validation, prune-first BFS (Target: Q4 2026)
  - Affected: `include/graph/path_constraints.h`, `src/graph/path_constraints.cpp`
  - Runtime: `validateSemanticPath()` iterates over all path edges and returns a list of violations
  - Tests: `tests/graph/test_path_constraints_semantic.cpp` (SC-01..SC-10)
  - Detail: `src/graph/FUTURE_ENHANCEMENTS.md` → Ontology-based Semantic Constraints
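A transitive `isA()` check over a concept hierarchy can be sketched as a graph walk over parent links. `ToyOntology` and `addSubConceptOf` are hypothetical names; the sketch omits the LRU cache and the loader, and it simply returns `false` for unknown concepts, whereas the roadmap says the planned constraint layer treats unknown concept IDs as unconstrained.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical concept hierarchy with multiple inheritance.
class ToyOntology {
public:
    void addSubConceptOf(const std::string& child, const std::string& parent) {
        parents_[child].push_back(parent);
    }

    // True if `concept_id` equals `ancestor` or reaches it via parent links.
    bool isA(const std::string& concept_id, const std::string& ancestor) const {
        std::unordered_set<std::string> visited;  // guards against cycles
        std::vector<std::string> stack{concept_id};
        while (!stack.empty()) {
            std::string current = std::move(stack.back());
            stack.pop_back();
            if (current == ancestor) return true;
            if (!visited.insert(current).second) continue;
            auto it = parents_.find(current);
            if (it == parents_.end()) continue;
            for (const auto& parent : it->second) stack.push_back(parent);
        }
        return false;
    }

private:
    std::unordered_map<std::string, std::vector<std::string>> parents_;
};
```

Because the hierarchy is described as immutable after `build()`, results of such a walk are safe to memoise, which is presumably what the planned LRU cache is for.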
- `KnowledgeGraphReasoner`: Horn-clause forward chaining + `InferenceStore` + explanation chains (Target: Q4 2026)
  - Affected: `include/graph/knowledge_graph_reasoner.h`, `src/graph/knowledge_graph_reasoner.cpp`
  - Runtime: `infer(subjectId, depth)` → `InferenceChain`; `explain(factId)` → proof trace as a triple sequence
  - Error handling: rule contradiction → `ConflictError`; circular proof → depth limit with `CycleDetected`
  - Tests: `tests/graph/test_knowledge_graph_reasoner.cpp` (KGR-01..KGR-22)
  - Perf: 1 M edges cold ≤ 2 s; incremental CDC ≤ 50 ms
- Incremental CDC trigger: `KnowledgeGraphReasoner::onCDCEvent()` for forward chaining on edge inserts (Target: Q1 2027)
LoRA-Adapter-Integration:
applyLoRAScore()— Soft-Plausibility-Scoring viaMultiLoRAManagerfür Mustererkennung (Target: Q2 2027)- Affected: Integration mit
src/llm/multi_lora_manager.cpp - Runtime: Graph-Kontext → LoRA-Adapter-Selektion → Konfidenzwert (0.0–1.0) pro Inferenzkante
- Guard:
THEMIS_ENABLE_LLM; deterministischer Regel-Fallback wenn LoRA nicht geladen - Perf: LoRA-Scoring 1 000 Kanten ≤ 500 ms
- Progress 2026-05-11:
KnowledgeGraphReasonersupports optional directMultiLoRAManagerinjection (setMultiLoRAManager(...)).applyLoRAScore()now uses manager-backed adapter metadata (scale + premise-complexity penalty) when no explicit scorer callback is injected; deterministic fallback remains active when no adapter is available. Test KGR-23 verifies the bridge path.
- Affected: Integration mit
- RAG integration: `KnowledgeGraphRetriever` uses `KnowledgeGraphReasoner` for multi-hop reasoning (Target: Q3 2027)
  - Affected: `include/rag/knowledge_graph_retriever.h`, `src/rag/knowledge_graph_retriever.cpp`
  - Detail: `src/graph/FUTURE_ENHANCEMENTS.md` → Knowledge Graph Reasoning with Ontology & ML/LoRA
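Horn-clause forward chaining, as planned for `KnowledgeGraphReasoner`, repeatedly fires rules whose premises are all known until no new fact appears. The sketch below is a propositional simplification with hypothetical names (`HornRule`, `forwardChain`): facts are plain strings rather than graph triples, and explanation chains, conflict detection, and depth limits are left out.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// A propositional Horn rule: if all premises hold, the conclusion holds.
struct HornRule {
    std::vector<std::string> premises;
    std::string conclusion;
};

// Derives all facts reachable from `facts` under `rules` (naive fixed point).
std::set<std::string> forwardChain(std::set<std::string> facts,
                                   const std::vector<HornRule>& rules) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (const auto& rule : rules) {
            if (facts.count(rule.conclusion) > 0) continue;  // already derived
            const bool all_hold = std::all_of(
                rule.premises.begin(), rule.premises.end(),
                [&facts](const std::string& p) { return facts.count(p) > 0; });
            if (all_hold) {
                facts.insert(rule.conclusion);
                changed = true;
            }
        }
    }
    return facts;
}
```

The loop terminates because each iteration either adds a new conclusion or stops, and the set of possible conclusions is finite. An incremental variant would only re-check rules whose premises mention the newly inserted fact.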
- [I] Unit tests coverage > 80% (Issue: #1830)
- Integration tests (query optimizer, constrained path finding, AQL integration)
- Performance benchmarks (traversal latency vs graph size) (Issue: #1831)
- [I] Security audit (query injection via path constraints) (Issue: #1832)
- Documentation complete
- API stability guaranteed for graph query optimizer and path finder
Scientific basis: Yadav et al. 2023 (TIES-Merging, NeurIPS); Stoudenmire & Schwab 2016 (TN for ML); Rajaraman & Ullman 2011 (LSH)
- `TensorFingerprint`: 128-element MinHash + core-norm vector + total_norm + max_rank
- `FingerprintGraphConfig`: `similarity_threshold`, `num_hash_funcs`, `num_bands`, `max_candidates`, `top_k`
- `SimilarTensorResult`: `{tensor_id, similarity, tenant, collection, field}`
- `TensorFingerprintGraph`: node/edge graph with LSH-based nearest-neighbour search
- `DeduplicationConfig`: `similarity_threshold=0.999`, `delta_eps`, `delta_max_rank`, `allow_full_storage_fallback`
- `StoredTensorRecord`: `{tensor_id, reference_id, is_canonical, compressed_bytes, saved_bytes, similarity_to_reference}`
- `DeduplicationStats`: `{total_tensors, canonical_tensors, delta_tensors, total_bytes_stored, bytes_saved, dedup_ratio}`
- `TensorDeduplicationManager`: write path (store + delta), read path (retrieve), stats
- `TensorFingerprintGraph::computeFingerprint()`: FNV-1a-based MinHash (128 functions), core-norm quantisation, total-norm scaling
- `TensorFingerprintGraph::insertIntoBuckets()`: LSH banding (b=32 bands, r=4 rows/band by default)
- `TensorFingerprintGraph::insert()`: fingerprint → LSH → candidate set → Jaccard approximation → edge insertion
- `TensorFingerprintGraph::remove()`: removes node + edges from adjacency list; O(neighbours) cleanup
- `TensorFingerprintGraph::findSimilar()`: LSH lookup + Jaccard ranking + top_k truncation
- `TensorFingerprintGraph::neighbours()`: direct adjacency list lookup
- `TensorDeduplicationManager::store()`: fingerprint graph query → delta-encode if reference found → canonical otherwise
- `TensorDeduplicationManager::computeDelta()`: dense subtraction + TT-recompression of residual
- `TensorDeduplicationManager::getStats()`: atomic counters for bytes_stored, bytes_saved, dedup_ratio
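The MinHash and LSH banding steps listed above can be sketched as follows, using 128 hash functions split into 32 bands of 4 rows. How `computeFingerprint()` derives tokens from a tensor train and how it seeds FNV-1a are not stated here, so the token input, the seeding scheme, and all function names in this block are assumptions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

constexpr std::size_t kNumHashes = 128;
constexpr std::size_t kNumBands = 32;
constexpr std::size_t kRowsPerBand = kNumHashes / kNumBands;  // 4

// 64-bit FNV-1a over the 8 bytes of `value`; XOR-ing a seed into the
// offset basis to obtain a hash family is an assumption of this sketch.
std::uint64_t fnv1a(std::uint64_t value, std::uint64_t seed) {
    std::uint64_t h = 14695981039346656037ULL ^ seed;
    for (int i = 0; i < 8; ++i) {
        h ^= (value >> (8 * i)) & 0xffULL;
        h *= 1099511628211ULL;
    }
    return h;
}

using Signature = std::array<std::uint64_t, kNumHashes>;

// MinHash: for each hash function keep the minimum hash over all tokens.
Signature minHash(const std::vector<std::uint64_t>& tokens) {
    Signature sig;
    sig.fill(std::numeric_limits<std::uint64_t>::max());
    for (std::uint64_t token : tokens) {
        for (std::size_t i = 0; i < kNumHashes; ++i) {
            const std::uint64_t h = fnv1a(token, i);
            if (h < sig[i]) sig[i] = h;
        }
    }
    return sig;
}

// LSH banding: each band of rows is folded into one bucket key.
std::array<std::uint64_t, kNumBands> bandKeys(const Signature& sig) {
    std::array<std::uint64_t, kNumBands> keys{};
    for (std::size_t band = 0; band < kNumBands; ++band) {
        std::uint64_t key = band;  // band index keeps bands in separate key spaces
        for (std::size_t row = 0; row < kRowsPerBand; ++row) {
            key = fnv1a(sig[band * kRowsPerBand + row], key);
        }
        keys[band] = key;
    }
    return keys;
}

// Jaccard estimate: fraction of signature positions that agree.
double estimateJaccard(const Signature& a, const Signature& b) {
    std::size_t equal = 0;
    for (std::size_t i = 0; i < kNumHashes; ++i) {
        if (a[i] == b[i]) ++equal;
    }
    return static_cast<double>(equal) / static_cast<double>(kNumHashes);
}
```

Two tensors become candidates for each other when at least one of their 32 band keys matches; the Jaccard estimate (or an exact similarity check) then ranks those candidates.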
- Invalid `FingerprintGraphConfig` (`num_hash_funcs` not divisible by `num_bands`) → `std::invalid_argument`
- Null dependencies in `TensorDeduplicationManager` → `std::invalid_argument`
- `remove()` returns false for unknown `tensor_id`
- `findSimilar` on empty graph returns empty vector
- `allow_full_storage_fallback = true`: no data loss when reference not loadable
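The divisibility rule in the first item follows from LSH banding: every band must receive the same whole number of signature rows. A minimal validation sketch, using a hypothetical `ToyFingerprintGraphConfig` with only the two relevant fields, might look like this; rejecting `num_bands == 0` is an extra assumption of the sketch.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical subset of the configuration; defaults match b=32, r=4.
struct ToyFingerprintGraphConfig {
    std::size_t num_hash_funcs = 128;
    std::size_t num_bands = 32;
};

// Returns rows per band, or throws if the signature cannot be split evenly.
std::size_t rowsPerBand(const ToyFingerprintGraphConfig& config) {
    if (config.num_bands == 0 || config.num_hash_funcs % config.num_bands != 0) {
        throw std::invalid_argument("num_hash_funcs must be divisible by num_bands");
    }
    return config.num_hash_funcs / config.num_bands;
}
```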
- 20 unit tests (TFG-01..TFG-20) in `tests/graph/test_tensor_fingerprint_graph.cpp`
- 5 deduplication manager tests (TDM-01..TDM-05) in `tests/graph/test_tensor_fingerprint_graph.cpp`
- Fingerprint + LSH insert ≤ 10ms per tensor (Target: Q2 2027)
  - Profiling baseline: sequential MinHash over 128 hash functions on 8-mode train
  - Progress 2026-05-10 (Bloom-filter optimization): Added `lsh_nonempty_` presence set to `TensorFingerprintGraph`. `lshCandidates()` now performs an O(1) `lsh_nonempty_.count(bucket_key)` check before `lsh_buckets_.find()`, skipping empty bands without map lookup overhead. `insertIntoBuckets()` and `removeFromBuckets()` maintain `lsh_nonempty_` in sync.
  - Progress 2026-05-10 (Performance benchmark): `BM_FingerprintInsert` in `benchmarks/bench_tensor_fingerprint.cpp` measures insert latency for mode sizes 8/16/32/64 at 200 iterations each (`THEMIS_BUILD_BENCHMARKS=ON`).
- Graph query ≤ 50ms for 100K nodes (Target: Q2 2027)
  - Profiling baseline: LSH band scan over 32 bands × 100K total entries
  - Progress 2026-05-10 (Bloom-filter optimization): Same `lsh_nonempty_` gate eliminates redundant map lookups in `lshCandidates()` for sparse bucket distributions, improving `findSimilar()` throughput for large graphs.
  - Progress 2026-05-10 (Performance benchmark): `BM_FindSimilar_100K` in `benchmarks/bench_tensor_fingerprint.cpp` measures `findSimilar()` latency for the fixed 100K-node acceptance target.
- LSH bucket cleanup on remove/update to prevent stale candidate IDs (2026-05-07)
  - `TensorFingerprintGraph::removeFromBuckets()` now removes tensor IDs from all band buckets both on `remove()` and on the overwrite path in `insert()`.
  - Regression tests TFG-22 and TFG-23 verify no stale IDs are returned via `findSimilar()` after delete/update.
- Candidate hard cap in LSH lookup to bound worst-case query cost (2026-05-11)
  - `TensorFingerprintGraph::lshCandidates()` now stops within-band enumeration as soon as `max_candidates` is reached (instead of collecting full buckets before the outer-loop break), and returns immediately when `max_candidates=0`.
  - Regression test TFG-29 (`TFG29_MaxCandidatesHardCapBoundedResolverCalls`) verifies bounded resolver invocations and bounded result size.
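The hard cap described above amounts to checking the limit inside the per-bucket loop rather than only between bands. A minimal sketch follows; `collectCandidates` and its bucket representation are hypothetical and much simpler than the real `lshCandidates()`.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Collects distinct tensor IDs from the matching band buckets, stopping as
// soon as `max_candidates` IDs have been gathered.
std::vector<std::string> collectCandidates(
        const std::vector<std::vector<std::string>>& band_buckets,
        std::size_t max_candidates) {
    std::vector<std::string> candidates;
    if (max_candidates == 0) return candidates;  // nothing requested
    std::unordered_set<std::string> seen;
    for (const auto& bucket : band_buckets) {
        for (const auto& tensor_id : bucket) {
            if (!seen.insert(tensor_id).second) continue;  // duplicate across bands
            candidates.push_back(tensor_id);
            if (candidates.size() >= max_candidates) return candidates;  // hard cap
        }
    }
    return candidates;
}
```

Returning from inside the inner loop is what bounds the work per query: a single oversized bucket can no longer force a full enumeration before the cap takes effect.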
- Exact TT-cosine similarity verification for edge creation (replace Jaccard approximation) (Target: Q2 2027)
  - Progress 2026-05-07: `TensorFingerprintGraph::insert()` and `findSimilar()` now use `TensorTrainDecomposer::cosineSimilarity()` for exact compressed-domain verification/ranking; `NodeEntry` stores the inserted `TTTrain` for candidate checks; tests TFG-03 + TFG-21 verify edge creation with near-1.0 cosine and exact score parity.
  - Progress 2026-05-07 (integration hardening): Added `setTrainLoadFn()` + config `cache_trains_in_memory=false` to resolve candidate TT trains externally when node cache is not retained in memory; tests TFG-24/TFG-25 cover resolver path and safe no-loader behavior.
  - Progress 2026-05-07 (storage wiring): `TensorDeduplicationManager` now wires `setTrainLoadFn()` to `TensorNetworkStorageEngine::getCompressed()` + dequantize so exact-similarity checks can resolve candidate TT trains without in-memory cache (integration test TDM-09 with `cache_trains_in_memory=false`).
  - Progress 2026-05-07 (recovery bootstrap): `TensorFingerprintGraph` now supports export/import of persisted fingerprint-node metadata to rebuild LSH buckets after restart (`exportPersistedNodes()`/`importPersistedNodes()`; test TFG-26).
  - Progress 2026-05-07 (edge persistence): Added persisted edge export/import for adjacency re-hydration (`exportPersistedEdges()`/`importPersistedEdges()`), including robust import behavior for dangling/duplicate directed edges (test TFG-27).
  - Progress 2026-05-07 (snapshot lifecycle): Added one-shot full snapshot APIs (`exportPersistedGraph()`/`importPersistedGraph()`) to atomically restore nodes + adjacency in one call; import filters self/dangling/duplicate edges (test TFG-28).
  - O(d·n·r³) per candidate pair, bounded by `max_candidates=1000`
- CDC changefeed integration for incremental graph updates (Target: Q2 2027)
  - Progress 2026-05-07: `TensorNetworkStorageEngine` now exposes `TensorWriteObserverFn`/`TensorDeleteObserverFn` callback types + `setWriteObserverFn()`/`setDeleteObserverFn()` setters. Observers are invoked outside the write lock after successful `put()`/`remove()`. Wiring to `TensorFingerprintGraph::insert()`/`remove()` verified by tests TNSE-OBS-01..03.
  - Progress 2026-05-07 (TDM mapping wiring): `TensorDeduplicationManager` now formalises canonical `tensor_id` ↔ `TensorFieldKey` mapping and wires storage observers so external canonical-key writes update mapped graph nodes and canonical-key deletes remove mapped graph nodes + dedup records (tests TDM-10/TDM-11).
- Expected ≥ 40% storage reduction for LLM weight repositories (Target: Q2 2027)
  - Benchmark: 100 Transformer block weight sets with shared FFN matrices
  - Progress 2026-05-10 (benchmark): `BM_StorageReductionRatio` in `benchmarks/bench_tensor_fingerprint.cpp` measures deduplication effectiveness for 10 and 100 weight-set configurations with 80% shared blocks; reports `dedup_ratio`, `savings_pct`, `canonical_tensors`, and `delta_tensors` as Google Benchmark counters.
- `GraphIndex` persistence for the fingerprint graph (Target: Q2 2027)
  - Progress 2026-05-07 (TDM snapshot/restore): `TensorDeduplicationManager::snapshotGraph()`/`restoreGraph()` serialize the full `PersistedFingerprintGraphSnapshot` (all nodes + edges) to/from the storage backend via new `TensorNetworkStorageEngine::putRawMetadata()`/`getRawMetadata()` helpers (namespaced under `__tfgmeta__:`). After a process restart the fingerprint graph can be fully restored without re-inserting any tensor data. Tests TDM-12 (node+edge round-trip) and TDM-13 (findSimilar on restored graph) added.
  - Progress 2026-05-08 (restore state hardening): Dedup snapshots now also persist `StoredTensorRecord` entries, byte counters, and canonical key mappings so a restored `TensorDeduplicationManager` continues to serve `getRecord()`/`getStats()` and external canonical deletes still remove mapped graph nodes (tests TDM-12/TDM-14). `restoreGraph()` remains backward-compatible with older graph-only snapshot blobs.
  - Progress 2026-05-08 (restore diagnostics hardening): Added explicit debug/warn diagnostics for dedup snapshot parse failures (magic/version mismatch, malformed lengths, trailing bytes) and for fallback behavior to legacy graph-only payload parsing; expanded malformed-payload regression coverage for bad magic/version, invalid graph lengths, embedded-graph corruption, and trailing bytes (TDM-15).
  - Progress 2026-05-08 (incremental recovery journal): Added persisted post-snapshot mutation journaling for tensor upserts/deletes so `restoreGraph()` now replays incremental graph/record changes after the last full snapshot; overwrite/delete paths also keep byte counters consistent during replay and external canonical deletes (tests TDM-16/TDM-17).
  - Progress 2026-05-08 (journal compaction): Incremental mutation journals now compact superseded repeated updates for the same tensor before persistence and after replay so snapshot-following overwrite streams stay bounded by latest per-tensor state instead of growing linearly with every overwrite (test TDM-18).
  - Progress 2026-05-08 (journal key lifecycle hardening): Mutation journals now persist under a dedicated metadata namespace key (`__tfgmeta__:wal:<snapshot>`) while restore keeps backward compatibility with legacy `<snapshot>::wal` payloads and normalizes legacy data back into the namespaced key after replay (test TDM-19).
  - Progress 2026-05-08 (journal corruption reset hardening): Restore now clears invalid namespaced/legacy mutation journal payloads in metadata on parse failure, emits explicit reset diagnostics, and safely continues from the base snapshot without replaying corrupt journal bytes (test TDM-20).
  - Progress 2026-05-08 (journal write compaction hardening): Mutation journal persistence now avoids redundant metadata rewrites when the compacted namespaced payload is unchanged and only clears legacy journal keys when they still contain data, reducing unnecessary write churn during repeated no-op canonical updates (test TDM-21).
  - Progress 2026-05-08 (storage-backed per-entry journal integration): `TensorDeduplicationManager` now auto-wires its per-entry journal hooks to `TensorNetworkStorageEngine` raw metadata keys (`__tfgjournal__:<snapshot>:<tensor_id>`), so post-snapshot upsert/delete journaling rewrites only the affected tensor entry instead of the entire monolithic blob; restore remains backward-compatible by falling back to the legacy blob journal when no per-entry entries exist, and per-entry replay takes precedence when both journal formats coexist for the same snapshot (tests TDM-22/TDM-24).
  - Progress 2026-05-10 (GraphIndex-backed journal wiring): Added free function `wireGraphIndexJournalHooks(TensorDeduplicationManager&, GraphIndexManager&, const std::string&)` in `include/graph/tensor_deduplication_manager.h` / `src/graph/tensor_deduplication_manager.cpp`. Persists per-entry payloads via GraphIndex edge fields and enumerates through `outAdjacency()` anchored at `__tfgj_anchor__:<snap>`. Behavioral contract verified by test TDM-25 (`TDM25_GraphIndexJournalHooksPersistAndReplay`) with end-to-end replay via real RocksDB + `GraphIndexManager`.
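The journal compaction step described above keeps only the latest mutation per tensor, so the journal size is bounded by the number of distinct tensors touched since the last snapshot. The sketch below illustrates the idea; `JournalEntry`, `JournalOp`, and `compactJournal` are hypothetical and do not reflect the persisted journal format.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

enum class JournalOp { Upsert, Delete };

// Hypothetical journal record: which tensor, what happened, and in which order.
struct JournalEntry {
    std::string tensor_id;
    JournalOp op;
    std::uint64_t seq;
};

// Keeps only the last entry per tensor_id, preserving relative order.
std::vector<JournalEntry> compactJournal(const std::vector<JournalEntry>& journal) {
    std::unordered_map<std::string, std::size_t> last_index;
    for (std::size_t i = 0; i < journal.size(); ++i) {
        last_index[journal[i].tensor_id] = i;  // later entries supersede earlier ones
    }
    std::vector<JournalEntry> compacted;
    compacted.reserve(last_index.size());
    for (std::size_t i = 0; i < journal.size(); ++i) {
        if (last_index[journal[i].tensor_id] == i) compacted.push_back(journal[i]);
    }
    return compacted;
}
```

Replaying the compacted journal yields the same final state as replaying the full one, provided each entry carries the complete latest state for its tensor, which is the assumption this sketch makes.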
- `research/papers/tensor_networks_themisdb.md`: P6 (TIES-Merging), P7 (Stoudenmire), P9 (LSH) entries
- `research/best_practices/tensor_train_storage.md`: delta encoding guidelines
Acceptance Criteria:
- Fingerprint + LSH insert ≤ 10ms per tensor
- Similar-tensor graph query ≤ 50ms for 100K nodes
- ≥ 40% storage reduction for LLM weight repositories with shared Transformer blocks
- 29 TFG + 25 TDM = 54 tests passing (TFG-01..29 + TDM-01..25)
- Adaptive plan selection using execution feedback is now active; `selectAlgorithm` uses learned EMA costs when confidence > 0, falling back to static depth heuristics otherwise
- Advanced cost model calibration from real execution feedback is implemented: `calibrateFromHistory()` re-seeds EMA models from batch history and computes cost accuracy metrics (`mean_estimated_ms`, `mean_absolute_error_ms`, `cost_ratio`) when `ExecutionStats::estimated_cost_ms` is populated (automatic in all `execute*` methods)
- Incremental query execution is BFS-only; DFS/Dijkstra/A* incremental modes are planned
- Incremental query execution is not thread-safe (same as the optimizer itself)
- Incremental query HTTP API (`POST /graph/query/incremental`, `DELETE /graph/query/incremental/:handle`, `POST /graph/changes`) is exposed via `GraphApiHandler`; edge mutations via `POST /graph/edge` and `DELETE /graph/edge/:id` automatically notify registered queries on success
- Subgraph isomorphism (pattern matching) is implemented via `executeSubgraphIsomorphism` (VF2-style backtracking)
- Cross-shard edge following (edges whose endpoints reside on different shards) requires caller-side coordination; the current distributed query model executes intra-shard queries in parallel and returns the globally cheapest result
- Distributed graph query introduces shard-aware plan nodes (new plan format, backward-compatible with single-node)
- Subgraph isomorphism query syntax will extend AQL graph traversal syntax (additive)
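The adaptive cost model is described as a per-algorithm EMA whose learned cost is blended into the static estimate by confidence, with the static heuristic used when confidence is zero. The sketch below illustrates that idea for one algorithm. `ToyEmaCostModel`, the smoothing factor, and the confidence formula `n / (n + 5)` are assumptions; the optimizer's actual weighting is not specified in this roadmap.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-algorithm cost learner.
class ToyEmaCostModel {
public:
    explicit ToyEmaCostModel(double alpha) : alpha_(alpha) {
        assert(alpha > 0.0 && alpha <= 1.0);
    }

    // Feeds one measured execution time into the moving average.
    void observe(double actual_ms) {
        ema_ms_ = (samples_ == 0) ? actual_ms
                                  : alpha_ * actual_ms + (1.0 - alpha_) * ema_ms_;
        ++samples_;
    }

    // Assumed confidence: 0 with no samples, approaching 1 as samples accumulate.
    double confidence() const {
        const double n = static_cast<double>(samples_);
        return n / (n + 5.0);
    }

    // Confidence-weighted blend of learned and static cost estimates.
    double estimate(double static_ms) const {
        const double c = confidence();
        return c * ema_ms_ + (1.0 - c) * static_ms;
    }

    double ema() const { return ema_ms_; }

private:
    double alpha_;
    double ema_ms_ = 0.0;
    std::size_t samples_ = 0;
};
```

With no observations the estimate equals the static heuristic, which matches the fallback behaviour noted above; each further observation shifts weight towards the learned cost.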
As of: 2026-04-20 – Source: src/UNUSED_FUNCTIONS_REPORT.md
`LocalShardGraphExecutor` – executes graph traversals locally on a shard; tested in `test_graph_distributed`. Action: add a ROADMAP ticket for production integration or mark it as CANDIDATE_FOR_REMOVAL.