
Roadmap note: Convert vague bullets without acceptance criteria into checkbox tasks. Format: - [ ] <Task> (Target: <Q/Year>).

Storage Module Roadmap

Current Status

v1.8.0 – Production-grade persistent storage layer built on RocksDB with MVCC, WAL, BlobDB, multi-model key schema, backup/PITR, compression, field-level encryption, columnar format support, NVMe optimizations, Reed-Solomon erasure coding, distributed 2PC transactions, and Write-Optimized Merge (WOM) Tree.

Completed ✅

  • RocksDBWrapper – MVCC transactions, WAL, BlobDB, multi-path SSTables, async I/O

  • MVCCStore – multi-version concurrency control with snapshot isolation

  • WALStorage – write-ahead log management and replay (wal_storage.cpp)

  • KeySchema – unified multi-model key encoding (relational, document, graph, vector, timeseries)

  • StorageEngine – high-level abstraction with dependency injection (evaluator, encryption, key provider, index manager)

  • BackupManager – incremental and full backup with checksum verification

  • BackupManager – in-memory backup scheduling (scheduleBackup, cancelScheduledBackup, listScheduledBackups); cloud upload/restore routing via THEMIS_ENABLE_S3/AZURE/GCS compile-time flags (uploadBackupToCloud, restoreFromCloud)

  • PITRManager – point-in-time recovery via WAL replay and snapshot restore

  • Blob storage backends: INLINE, RocksDB BlobDB, Filesystem, S3, Azure Blob, WebDAV

  • BlobRedundancyManager – RAID-1 mirror across multiple backends

  • BlobRedundancyManager – Reed-Solomon erasure coding (PARITY mode via ErasureCodingBackend); configurable RS(k,m) with copy-then-delete shard migration

  • BlobRedundancyManager – RocksDBBlobListener (createRocksDBListener()) reacts to SST file deletions and marks affected locations unhealthy

  • CompressionStrategy – pluggable per-table compression (Snappy, Zstd, LZ4, Brotli, None)

  • ColumnarFormat – columnar storage for analytical workloads

  • BatchWriteOptimizer – adaptive batching to reduce write amplification

  • SecuritySignature + SecuritySignatureManager – field-level AES-GCM encryption and HMAC-SHA256 tamper detection; full RocksDB key-range iteration via iterateRange for listAllSignatures()

  • StorageAuditLogger – structured audit trail for all storage operations

  • IndexMaintenance – background index rebuild, optimize, and consistency checks

  • CompactionManager – manual and scheduled RocksDB compaction control

  • DiskSpaceMonitor – real-time disk quota monitoring and alerting

  • DatabaseConnectionManager – connection pooling and lifecycle management

  • TransactionRetryManager – exponential backoff retry for failed transactions

  • MergeOperators – custom RocksDB merge operators (counters, list appends)

  • RaftMVCCBridge – integration between Raft consensus log and MVCC storage

  • HLC (Hybrid Logical Clock) – causally consistent timestamps across nodes

  • HistoryManager – version history and change tracking per key

  • NLPMetadataExtractor – automatic metadata extraction for ingested documents

  • CompressedStorage – transparent compression/decompression layer

  • BaseEntity – common base type for all storage-layer entities

  • RocksDBWrapper::getApproximateSize() returns real on-disk SST file size via the rocksdb.total-sst-files-size property, with fallback to GetApproximateSizes

  • DistributedTransactionManager – storage-layer Two-Phase Commit (2PC) coordinator with IDistributedShardParticipant; ManagerSharedState shared ownership for safe concurrent shard lifecycle

  • NVMeManager – io_uring async I/O (Linux ≥ 5.1), multi-queue NVMe, ZNS zone management, Direct I/O flag recommendation; RocksDBWrapper NVMe integration (enable_nvme_optimizations config)

  • WomTree – Write-Optimized Merge (WOM) Tree: Bε-tree alternative to LSM; write amplification 2–5× vs 10–30× for LSM; lazy buffer propagation; full put/get/remove/scan/compact API

  • Production-mode safety flag (THEMIS_PRODUCTION_MODE) preventing no-op encryption defaults

  • StreamingIngestManager — ring-buffer + flush-thread → single WriteBatch (Issue: #4571) (2026-04-12) — usage sketch after this list

    • include/storage/streaming_ingest_manager.h + src/storage/streaming_ingest_manager.cpp
    • flush_interval=10 ms, max_buffer=1M events, max_batch=65,536; OverflowPolicy::BLOCK/DROP
    • API: ingest(key, value) / ingestBatch() / flush() / stats()
    • 10 focused tests (SM-01…SM-10) in tests/test_streaming_ingest_manager.cpp
  • ColumnarCache — LRU in-memory columnar cache for analytics acceleration (Issue: #4572) (2026-04-12) — usage sketch after this list

    • include/storage/columnar_cache.h + src/storage/columnar_cache.cpp
    • LRU eviction + PinGuard RAII; SegmentDType (Int64/Double/String/Bool); byteSize() accounting
    • on_evict callback; hit/miss counters; default ctor delegates to Config{}
    • 12 focused tests (CC-01…CC-12) in tests/test_columnar_cache.cpp
  • AdaptiveCompactionScheduler – load-aware compaction scheduling with IO-sample history and impact prediction (adaptive_compaction.h/.cpp, namespace themis)

  • MVCCChainPruner – background MVCC version-chain pruning for old snapshot cleanup (mvcc_chain_pruner.h/.cpp)

  • VectorIndexBackend – IVectorIndexBackend interface + InMemoryVectorIndex flat-scan implementation (vector_index_backend.h/.cpp, namespace themis::storage)

  • ZeroCopyBlobTransfer – mmap + sendfile zero-copy blob transfer; MmapBlobView RAII view (zero_copy_blob_transfer.h/.cpp, namespace themis::storage)

  • EncryptedBlobBackend – AES-256-GCM encryption wrapper for any IBlobStorageBackend; IEncryptionKeyProvider interface + StaticKeyProvider (encrypted_blob_backend.h/.cpp, namespace themis::storage)

  • OnlineSchemaMigration – zero-downtime DDL via SchemaMigrator; supports add/drop columns, rename, type change, add/drop indexes, partition (online_schema_migration.h/.cpp, namespace themis::storage)

  • SchemaDeadWeightDetector – detects unused schema fields and stale indexes; GdprFieldRegistry for PII field tracking (schema_dead_weight_detector.h/.cpp, namespace themis::storage)

  • StorageLayoutAdvisor – recommends layout type (row vs columnar vs tiered vs vector) based on CollectionAccessStats and SchemaInfo (storage_layout_advisor.h/.cpp, namespace themis::storage)

  • IndexAnalyzer – per-index analyze function with tier-aware thresholds, cron scheduling, and AI/ML advisor hook (v1.9.0)

    • include/storage/index_analyzer.h, src/storage/index_analyzer.cpp, config/index_analyze.yaml
    • 15 focused tests (IA-01…IA-15) in tests/test_index_analyzer.cpp (IndexAnalyzerFocusedTests)
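
A minimal usage sketch for the StreamingIngestManager listed above. Only ingest()/ingestBatch()/flush()/stats() appear in the roadmap; the namespace and the exact return type of stats() are assumptions here, not confirmed API:

```cpp
#include "storage/streaming_ingest_manager.h"

// Namespace is an assumption; the header path is from the roadmap.
using themis::storage::StreamingIngestManager;

void ingestSensorEvent(StreamingIngestManager& mgr) {
    // Events land in the ring buffer; the flush thread drains them into
    // a single RocksDB WriteBatch every flush_interval (default 10 ms).
    mgr.ingest("sensor:42", R"({"t":1714000000,"v":21.5})");

    // Synchronous drain, e.g. before a read-your-writes check in tests.
    mgr.flush();

    auto s = mgr.stats();  // buffered/flushed/dropped counters (assumed)
    (void)s;
}
```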
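
And a sketch of the ColumnarCache pin workflow. pin() and the key shape are hypothetical, used only to illustrate the PinGuard RAII contract from the bullet above; the real accessor name may differ:

```cpp
#include "storage/columnar_cache.h"

using themis::storage::ColumnarCache;  // namespace assumed

void scanHotSegment(ColumnarCache& cache) {
    // A PinGuard keeps the segment resident while in scope; LRU
    // eviction skips pinned entries.
    if (auto pin = cache.pin("orders/amount/seg-0017")) {
        // ... scan the pinned column segment ...
    }  // PinGuard destructor releases the pin; segment is evictable again
}
```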

In Progress 🚧

(none)

Planned Features 📋

Short-term (Next 3-6 months)

  • IndexAnalyzer — hot/warm/cold-aware index analysis with cron scheduling and AI/ML intervention (v1.9.0) ✅ — advisor-hook sketch after this list

    • Inputs: RocksDBWrapper instance + YAML config (config/index_analyze.yaml)
    • Outputs: IndexAnalysisReport per index: fragmentation_pct, recommendation (NONE / UPDATE_STATS / REORGANIZE / PARTIAL_REBUILD / FULL_REBUILD), optional ai_recommendation
    • Affected files: include/storage/index_analyzer.h, src/storage/index_analyzer.cpp, config/index_analyze.yaml, tests/test_index_analyzer.cpp
    • Tier thresholds: hot (10/20/35%), warm (18/32/50%), cold (30/50/70%); all configurable per-tier and per-index via YAML
    • Cron: uses existing CronExpression; background scheduler thread fires analyzeAll() at configured times
    • AI/ML hook: IIndexAnalysisAdvisor interface; rule-based recommendation passed to advisor->advise() which may override it; disabled by default
    • Errors: cron parse failure → scheduler logs error + retries in 60s; null advisor skipped gracefully
    • Tests: IA-01…IA-15 in tests/test_index_analyzer.cpp (IndexAnalyzerFocusedTests); pure-logic tests require no live RocksDB
    • Perf: scheduler thread wakes only at cron fire time; analyzeAll() wall-clock overhead ≤ 50 ms for ≤ 100 indices
  • Tiered storage (hot/warm/cold) with automatic data migration based on access patterns (Target: v1.6.0) ✅ — config sketch after this list

    • Inputs: RocksDB key-space with per-key last-access timestamps tracked by AccessTracker
    • Outputs: transparent reads/writes regardless of tier; background TierMigrationWorker moves data between NVMe (hot), SATA (warm), and object storage (cold)
    • Affected files: new tiered_storage.cpp, extend StorageEngine, DiskSpaceMonitor
    • Migration policy: age-based (hot→warm after 30 days, warm→cold after 90 days) and access-based (zero-read in window → demote); configurable via TieredStorageConfig
    • Errors: migration failure rolls back and retries with exponential backoff; partial migration never leaves data inconsistent (copy-then-delete)
    • Tests: unit tests for policy evaluation; integration test verifying read-after-migrate returns original data
    • Perf: migration background overhead < 5% of sustained write throughput; cold-tier read latency documented in config (object storage SLA dependent)
  • GCS (Google Cloud Storage) blob backend (blob_backend_gcs.cpp) (Target: v1.6.0) ✅

    • Inputs/outputs: same IBlobBackend interface as existing S3/Azure backends
    • Affected files: new blob_backend_gcs.cpp, ../include/storage/blob_backend_gcs.h; register in BlobStorageManager
    • Auth: ADC (Application Default Credentials) via GOOGLE_APPLICATION_CREDENTIALS; fail-closed if credentials absent
    • Errors: GCS API errors mapped to StorageError; retry on transient 5xx/429 with jitter backoff
    • Tests: integration test against GCS emulator (fake-gcs-server)
  • NVMe Optimizations via NVMeManager (Target: v1.6.0) ✅

    • Inputs: NVMeConfig with device_path, queue_depth, feature flags
    • Outputs: NVMeCapabilities report; adjusted RocksDB Direct I/O flags; background-thread count recommendation; io_uring async I/O submit/poll; ZNS zone reset/finish/wp-query
    • Affected files: src/storage/nvme_manager.cpp, include/storage/nvme_manager.h; RocksDBWrapper::Config extended with enable_nvme_optimizations, nvme_* fields; configureOptions() applies NVMe flags
    • Constraints: all Linux-specific code (#ifdef linux); io_uring requires THEMIS_ENABLE_IO_URING compile flag; graceful fallback to pread/pwrite when unavailable
    • Errors: io_uring setup failure → disable io_uring and log WARN; ZNS unavailable → log WARN and skip; direct I/O unsupported filesystem → fall back to buffered I/O
    • Tests: tests/test_nvme_manager.cpp — 20+ focused unit tests (lifecycle, capabilities, Direct I/O flags, background threads, async I/O fallback, ZNS no-ops, RocksDB integration)
    • CI: .github/workflows/02-feature-modules_storage_nvme-manager-ci.yml
    • Perf target: 30–50% lower p99 latency vs. buffered I/O on NVMe with io_uring + Direct I/O enabled
  • Erasure coding redundancy mode in BlobRedundancyManager (space-efficient alternative to RAID-1) (Target: v1.7.0) ✅ — overhead arithmetic after this list

    • Implemented as RedundancyMode::PARITY using ErasureCodingBackend (Reed-Solomon)
    • Algorithm: RS(k,m) with configurable data/parity shards; RS(4,2) default (1.5× overhead vs 3× for RAID-1)
    • Affected files: blob_redundancy_manager.cpp (PARITY path); erasure_coding_backend.cpp for RS encode/decode
    • Errors: reconstruction failure if fewer than k shards available
    • Tests: tests/test_erasure_coding_backend.cpp — RS(10,4)/RS(6,3)/RS(4,2) encode/decode, multi-shard fault tolerance, BlobRedundancyManager PARITY mode integration
    • CI: .github/workflows/02-feature-modules_storage_erasure-coding-blob-storage-ci.yml
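
The IIndexAnalysisAdvisor hook from the IndexAnalyzer item above, sketched as a concrete advisor. Only the interface name, the advise()-override semantics, and the recommendation levels come from the roadmap; the exact signatures, field names, and enum spelling are assumptions:

```cpp
// Assumed shapes — see include/storage/index_analyzer.h for the real ones.
enum class Recommendation { NONE, UPDATE_STATS, REORGANIZE,
                            PARTIAL_REBUILD, FULL_REBUILD };

struct IndexAnalysisReport {
    double fragmentation_pct;  // per-index fragmentation estimate
    // ... further fields per the roadmap (recommendation, ai_recommendation)
};

struct IIndexAnalysisAdvisor {
    virtual ~IIndexAnalysisAdvisor() = default;
    // Receives the rule-based recommendation and may override it.
    virtual Recommendation advise(const IndexAnalysisReport& report,
                                  Recommendation rule_based) = 0;
};

// Hypothetical advisor: softens FULL_REBUILD while fragmentation is
// still moderate; otherwise accepts the rule-based recommendation.
struct ConservativeAdvisor : IIndexAnalysisAdvisor {
    Recommendation advise(const IndexAnalysisReport& r,
                          Recommendation rule_based) override {
        if (rule_based == Recommendation::FULL_REBUILD &&
            r.fragmentation_pct < 60.0)
            return Recommendation::PARTIAL_REBUILD;
        return rule_based;
    }
};
```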
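
A hypothetical shape for TieredStorageConfig, mirroring the age- and access-based policy described in the tiered-storage item; every field name here is an assumption, not the shipped struct:

```cpp
#include <chrono>

// Illustrative only — encodes the documented policy (hot→warm after
// 30 days, warm→cold after 90 days, zero-read demotion, < 5% overhead).
struct TieredStorageConfig {
    std::chrono::hours hot_to_warm_age{24 * 30};   // demote after 30 days
    std::chrono::hours warm_to_cold_age{24 * 90};  // demote after 90 days
    std::chrono::hours zero_read_window{24 * 14};  // no reads in window → demote
    double max_migration_write_share = 0.05;       // background budget < 5%
};
```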
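
The space-overhead claims in the erasure-coding item follow directly from the shard arithmetic: RS(k,m) stores k data shards plus m parity shards, each 1/k of the blob size, and tolerates any m shard losses (reconstruction needs any k of the k+m shards, matching the error condition above). The 3× RAID-1 figure corresponds to a three-way mirror:

```latex
\text{overhead of } RS(k,m) \;=\; \frac{k+m}{k},
\qquad
RS(4,2):\ \frac{4+2}{4} = 1.5\times,
\qquad
RS(10,4):\ \frac{10+4}{10} = 1.4\times .
```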

Long-term (6-12 months)

  • Distributed transactions across shards via two-phase commit (2PC) with Raft coordination (Target: v1.7.0) ✅ — control-flow sketch after this list
    • Implemented (v1.7.0): DistributedTransactionManager + IDistributedShardParticipant + DistributedTransaction
      • Storage-layer 2PC coordinator (Phase 1 PREPARE / Phase 2 COMMIT|ABORT)
      • ManagerSharedState shared ownership: transactions safely outlive their manager
      • shared_ptr participant references: no use-after-free on concurrent unregisterShard()
    • Inputs: multi-shard operations spanning separate RocksDB instances
    • Outputs: atomic commit or rollback across all participant shards
    • Affected files: distributed_transaction_manager.h/.cpp; coordinates via existing RaftMVCCBridge
    • Tests: 27 unit tests in tests/test_distributed_transactions.cpp
    • Perf: 2PC round-trip adds ≤ 2× single-shard latency on same-datacenter nodes
  • Vectorized execution support in ColumnarFormat (SIMD batch processing) (Target: v2.0.0) ✅ — AVX2 sketch after this list
    • Inputs: columnar scan batches of up to 8,192 rows
    • Outputs: filtered/projected result batches processed via AVX2 SIMD instructions
    • Affected files: new include/storage/simd_filter.h, src/storage/simd_filter.cpp; no ABI change to ColumnarFormat public API
    • Dispatch: AVX-512 → AVX2 → NEON → scalar; runtime CPU feature detection (memoised)
    • SIMDColumnFilter::scan() integrates zone-map early-out on ColumnSegment
    • Tests: 25 tests in tests/test_simd_columnar_filter.cpp (SIMDColumnarFilterFocusedTests); scalar-reference parity verified for all 6 ops + all 4 numeric types
    • Perf: throughput SLO gated by THEMIS_RUN_PERF_TESTS=1 (SF-25)
  • Native Parquet export from columnar format (Target: v2.0.0) ✅
    • Inputs: ColumnarFormat table scan result (vector<vector<ColumnSegment>>)
    • Outputs: Apache Parquet v2 file compatible with Spark, DuckDB, Pandas; PAR1 magic, Thrift binary FileMetaData
    • Affected files: new include/storage/storage_parquet_exporter.h, src/storage/storage_parquet_exporter.cpp
    • Supports INT32, INT64, FLOAT32, FLOAT64, BOOL, STRING column types with native Parquet type mapping
    • With ARROW_ENABLED: delegates to Apache Arrow / Parquet C++ library; without it: portable Thrift binary Parquet v2 writer
    • Errors: ERR_EXPORT_FORMAT_INVALID for unknown types; ERR_EXPORT_IO_ERROR for file write failures; ERR_EXPORT_CONFIG_INVALID for mismatched schema
    • Tests: 15 tests in tests/test_storage_parquet_exporter.cpp (StorageParquetExporterFocusedTests)
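
A control-flow sketch of the storage-layer 2PC described in the distributed-transactions item. The type names come from the roadmap; the participant method names and signatures are assumptions:

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Assumed participant surface; only the interface name is documented.
struct IDistributedShardParticipant {
    virtual ~IDistributedShardParticipant() = default;
    virtual bool prepare(uint64_t txn_id) = 0;  // Phase 1: PREPARE vote
    virtual void commit(uint64_t txn_id) = 0;   // Phase 2: COMMIT
    virtual void abort(uint64_t txn_id) = 0;    // Phase 2: ABORT
};

// shared_ptr participants match the roadmap's no-use-after-free note.
bool twoPhaseCommit(
    uint64_t txn_id,
    const std::vector<std::shared_ptr<IDistributedShardParticipant>>& shards) {
    for (const auto& s : shards) {          // Phase 1: all must vote yes
        if (!s->prepare(txn_id)) {
            for (const auto& a : shards) a->abort(txn_id);
            return false;                   // atomic rollback
        }
    }
    for (const auto& s : shards) s->commit(txn_id);
    return true;                            // atomic commit
}
```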
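
And the AVX2 leg of a SIMD column filter, as one plausible shape of what SIMDColumnFilter::scan() dispatches to; the real kernel (zone-map early-out, 6 ops, 4 numeric types) is richer than this greater-than example:

```cpp
#include <immintrin.h>  // AVX2
#include <cstddef>
#include <cstdint>

// out_mask[i] = 1 where values[i] > threshold. Illustrative kernel only.
void filterGreaterThanAVX2(const int64_t* values, size_t n,
                           int64_t threshold, uint8_t* out_mask) {
    const __m256i t = _mm256_set1_epi64x(threshold);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {  // 4 int64 lanes per 256-bit vector
        __m256i v = _mm256_loadu_si256(
            reinterpret_cast<const __m256i*>(values + i));
        __m256i gt = _mm256_cmpgt_epi64(v, t);
        // Sign bit of each 64-bit lane → 4-bit mask.
        int bits = _mm256_movemask_pd(_mm256_castsi256_pd(gt));
        for (int lane = 0; lane < 4; ++lane)
            out_mask[i + lane] = static_cast<uint8_t>((bits >> lane) & 1);
    }
    for (; i < n; ++i)            // scalar tail, reference semantics
        out_mask[i] = values[i] > threshold ? 1 : 0;
}
```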

Implementation Phases

Phase 1: Core RocksDB Layer (Status: Completed ✅)

  • RocksDBWrapper with MVCC, WAL, BlobDB, multi-path SSTables, async I/O
  • MVCCStore snapshot isolation
  • WALStorage management and replay
  • KeySchema multi-model encoding
  • StorageEngine with dependency injection
  • MergeOperators for counter and list operations
  • HLC hybrid logical clocks

Phase 2: Backup, Recovery & Blob Storage (Status: Completed ✅)

  • BackupManager incremental and full backups
  • PITRManager point-in-time recovery
  • Blob backends: Filesystem, RocksDB BlobDB, S3, Azure Blob, WebDAV
  • BlobRedundancyManager RAID-1 mirror

Phase 3: Compression, Optimization & Security (Status: Completed ✅)

  • CompressionStrategy pluggable per-table compression
  • ColumnarFormat analytical storage
  • BatchWriteOptimizer adaptive write batching
  • CompactionManager manual and scheduled compaction
  • SecuritySignature field-level AES-GCM encryption
  • SecuritySignatureManager HMAC-SHA256 tamper detection
  • StorageAuditLogger structured audit trail
  • Production-mode safety flag

Phase 4: Operational & Advanced Features (Status: Completed ✅)

  • DiskSpaceMonitor quota monitoring
  • IndexMaintenance background optimization
  • TransactionRetryManager exponential backoff
  • DatabaseConnectionManager connection pooling
  • RaftMVCCBridge Raft-to-MVCC integration
  • HistoryManager version tracking
  • NLPMetadataExtractor automatic metadata

Phase 5: Tiered Storage & Distributed Transactions (Status: Completed ✅ — v1.7.0)

  • Tiered storage (hot/warm/cold) with age- and access-based migration policies
  • GCS blob backend
  • NVMe Optimizations: NVMeManager with io_uring, multi-queue, ZNS, Direct I/O (CI: nvme-manager-ci.yml)
  • Erasure coding in BlobRedundancyManager (PARITY mode via ErasureCodingBackend, RS(k,m)) (CI: erasure-coding-blob-storage-ci.yml)
  • 2PC distributed transactions (DistributedTransactionManager, v1.7.0)

Phase 6: Write-Optimized Storage (Status: Completed ✅ — v1.8.0)

  • WomTree – Write-Optimized Merge (WOM) Tree: Bε-tree alternative to LSM for write-heavy workloads
    • Write amplification 2–5× (vs. 10–30× for LSM)
    • Lazy buffer propagation — reduced compaction overhead
    • Configurable fanout, leaf capacity, buffer size
    • Full put/get/remove/scan/compact API with thread safety
    • Write-amplification and read-hit-ratio metrics in Stats
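
A minimal WomTree usage sketch based on the put/get/remove/scan/compact API above; the namespace, header path, and optional-style get() return are assumptions:

```cpp
#include "storage/wom_tree.h"  // header path assumed

void womExample(themis::storage::WomTree& tree) {
    tree.put("user:1001", "alice");        // buffered near the root node
    if (auto v = tree.get("user:1001")) {  // optional-like return assumed
        // ... value found ...
    }
    tree.remove("user:1001");
    tree.compact();  // push lazy buffers down the Bε-tree
}
```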

Phase 6b: PERF-D6 – Concurrent Write Controller (Status: Completed ✅ — v2.0.0)

  • ConcurrentWriteController – bounded FIFO write-concurrency semaphore (Target: v2.0.0) (Issue: PERF-D6) — implemented in include/storage/concurrent_write_controller.h, src/storage/concurrent_write_controller.cpp
  • ConcurrentWriteControllerConfig – max_concurrent_writes (0 = hw_concurrency/2), max_queue_depth (0 = unlimited), acquire_timeout (0 = blocking)
  • WriteGuard – RAII slot holder; move-only; destructor wakes the next FIFO waiter
  • acquire() – blocking acquire; FIFO wakeup via std::promise/std::future chain; raises std::runtime_error on queue-full, timeout, or shutdown
  • tryAcquire() – non-blocking; returns std::nullopt if at capacity
  • shutdown() – unblocks all waiters atomically (no deadlock on destruction)
  • EWMA statistics: avg_wait_us updated on every acquire (α ≈ 0.1, integer-scaled)
  • P99 latency: sliding window of last 128 wait times; sorted on read via getStats()
  • Lifetime max and total acquired / rejected counters (lock-free atomics)
  • Throughput: ≥ 50k acquires/s (measured: ~616k ops/s, 8 threads) ✅
  • CV regression guard: 10-thread stress test (AC-D6-21) verifies CV < 20 % — eliminates the thundering-herd variance that caused D-6
  • Tests: 25 tests in tests/test_concurrent_write_controller.cpp (ConcurrentWriteControllerFocusedTests)
    • AC-D6-1–20: unit tests (config, guard lifecycle, FIFO, stats, timeout, shutdown)
    • AC-D6-21: 10-thread stress CV < 20 % regression guard
    • AC-D6-22–24: move semantics, idempotent release, unlimited queue
    • AC-D6-25: throughput SLO (gated by THEMIS_RUN_PERF_TESTS=1)
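
A usage sketch for the controller above, assuming the documented acquire()/tryAcquire()/WriteGuard semantics; the avg_wait_us EWMA presumably follows the standard update avg ← avg + α·(x − avg) with α ≈ 0.1:

```cpp
#include "storage/concurrent_write_controller.h"

using themis::storage::ConcurrentWriteController;  // namespace assumed

void blockingWrite(ConcurrentWriteController& ctl) {
    // FIFO wait for a slot; throws std::runtime_error on queue-full,
    // timeout, or shutdown (per the roadmap).
    auto guard = ctl.acquire();
    // ... perform the RocksDB write while holding the slot ...
}   // WriteGuard destructor releases the slot and wakes the next waiter

void opportunisticWrite(ConcurrentWriteController& ctl) {
    if (auto g = ctl.tryAcquire()) {  // std::nullopt when at capacity
        // ... best-effort write without queueing ...
    }
}
```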

Phase 5.5: Build System Audit & Stub Fixes (Status: Completed ✅ — March 2026)

  • All src/storage/*.cpp files verified registered in cmake build system (main CMakeLists.txt + StorageEnhancements.cmake + BlobStorage.cmake)
  • 21+ focused standalone test targets added in tests/CMakeLists.txt: StorageEngineDI, StorageEngineProd, StorageAuditLogger, StorageFuzz, StorageLatencyBench, BlobStorage, BlobTransferCheckpoint, CompressionStrategy, TieredStorage, WalStorage, WalManager, WalArchiving, WalBackupManager, WalChaos, WalManifestCorruption, WalReplication, WalReplicationIntegration, WalGrpcApply, MvccStore, MvccHistory, MvccWalIntegration, NVMeFocusedTests, ErasureCodingFocusedTests, RocksDBSizeCalculationFocusedTests, BlobRedundancyEventListenerFocusedTests
  • RocksDBWrapper::getApproximateSize() returns real on-disk SST size (rocksdb.total-sst-files-size) — was returning 0 (CI: rocksdb-size-calculation-ci.yml)
  • SecuritySignatureManager::listAllSignatures() iterates full RocksDB key-range via iterateRange — was stub (CI: security-signature-rocksdb-iteration-ci.yml)
  • BlobRedundancyManager::createRocksDBListener() returns working RocksDBBlobListener — was returning error stub (CI: blob-redundancy-event-listener-ci.yml)

Phase 7: Streaming Blob Write Path (Status: Completed ✅ — v2.0.0, PERF-D5)

  • RocksDBWrapper::putBlob() — streaming write path for large blobs (Issue PERF-D5)
    • Blobs ≥ blob_streaming_threshold_bytes (default 64 KB) are split into blob_chunk_size_bytes (default 128 KB) chunks
    • Parallel chunk encoding via std::async thread pool (blob_streaming_threads, default 4)
    • All chunks + manifest committed atomically via single WriteBatch::Write() — bypasses per-write transaction overhead
    • Internal key scheme: manifest __tmbs_m__:<key>, chunks __tmbs_c__:<key>:<6-digit-index>
    • Backward compatible: blobs below threshold fall back to regular put()
  • RocksDBWrapper::getBlob() — transparent reassembly from manifest + chunk keys via MultiGet
  • RocksDBWrapper::delBlob() — atomic manifest + chunk deletion via WriteBatch
  • New Config fields: enable_blob_streaming, blob_streaming_threshold_bytes, blob_chunk_size_bytes, blob_streaming_threads
  • tests/test_blob_streaming.cpp — 16 focused tests (BlobStreamingFocusedTests): round-trip, boundary, parallel, integrity, delete, overwrite, fallback, custom chunk size, single thread, zero-length, non-aligned
  • benchmarks/bench_comprehensive.cpp StoreLargeBlobs_1MB updated to use putBlob() with 8-thread encoding
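
The chunk arithmetic and internal key scheme above, made concrete. Helper names are illustrative, not the shipped API:

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

std::string manifestKey(const std::string& key) {
    return "__tmbs_m__:" + key;                     // manifest record
}

std::string chunkKey(const std::string& key, uint32_t index) {
    char idx[8];
    std::snprintf(idx, sizeof(idx), "%06u", index); // 6-digit chunk index
    return "__tmbs_c__:" + key + ":" + idx;
}

// With the 128 KB default chunk size, a 1 MB blob yields
// ceil(1'048'576 / 131'072) = 8 chunks plus one manifest, all
// committed atomically in a single WriteBatch.
uint64_t chunkCount(uint64_t blob_bytes, uint64_t chunk_bytes = 131072) {
    return (blob_bytes + chunk_bytes - 1) / chunk_bytes;
}
```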

Phase 8: Tensor-Native Storage Engine (QTC) (Status: [~] In Progress — Phase 1 complete 2026-05-05)

Scientific basis: Oseledets 2011 (DOI: 10.1137/090752142); Khoromskij 2011; Dettmers et al. 2023 (NF4)

Phase 8.1 — Design / API Contract (Target: Q3 2026) ✅

  • TensorTrainConfig – eps, max_rank, dtype, svd_power_iterations parameters
  • TTCore — 3-D core tensor struct (r_left × n × r_right) with row-major data layout
  • TTTrain — full TT-decomposition: cores, mode_sizes, originalNorm, achievedEps, serialize()/deserialize()
  • TensorStorageConfig — engine configuration: tt_config, quant_type, version_retention, min_compression_ratio
  • TensorFieldKey — logical address {tenant, collection, field} + TensorFieldKeyHash
  • ITensorStorageBackend — abstraction interface for RocksDB / in-memory backends
  • Key schema defined: __ttn__:<tenant>:<collection>:<field>:G<k>:<version> (meta + core keys)
  • StorageLayoutAdvisor integration: new TENSOR_NETWORK layout type (Target: Q3 2026)

Phase 8.2 — Core Implementation (Target: Q3 2026) ✅

  • TensorTrainDecomposer::decompose() — TT-SVD (Oseledets 2011, Algorithm 1)
    • Per-step threshold: δ = ε · ‖T‖_F / √(d-1) — global error bound guaranteed
    • Internal Golub-Reinsch bidiagonalisation SVD (LAPACK dgesdd via THEMIS_USE_LAPACK_SVD=ON, Q3 2026)
    • Input: dense float32/float64; Output: TTTrain with achieved_eps
  • TensorTrainDecomposer::round() — TT-Rounding (Algorithm 2)
  • TensorTrainDecomposer::innerProduct() / frobeniusNorm() / cosineSimilarity() — transfer-matrix algorithm O(d·n·r³)
  • TTQuantizer::quantize(INT8) — symmetric per-core channel-wise scaling, 1 byte/element + 4 bytes scale
  • TTQuantizer::quantize(NF4) — 16-entry lookup table (Dettmers 2023 Table 1), 2 values/byte
  • TTQuantizer::dequantize() — inverse INT8 / NF4 round-trip
  • InMemoryTensorBackend — testing KV-store (no RocksDB required in unit tests)
  • TensorNetworkStorageEngine::put() — decompose + quantise + persist (all cores per version)
  • TensorNetworkStorageEngine::get() / getVersion() — load + dequantise + reconstruct
  • TensorNetworkStorageEngine::getCompressed() — returns QuantizedTrain without decompression
  • TensorNetworkStorageEngine::compact() — version pruning by retention threshold
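
Why the per-step threshold above guarantees the global bound: TT-SVD performs d−1 sequential truncated SVDs whose errors are mutually orthogonal, so they accumulate in quadrature (Oseledets 2011). With every step truncated at δ = ε·‖T‖_F/√(d−1):

```latex
\|T - \tilde{T}\|_F
\;\le\; \sqrt{\sum_{k=1}^{d-1} \delta^2}
\;=\; \sqrt{d-1}\cdot\frac{\varepsilon\,\|T\|_F}{\sqrt{d-1}}
\;=\; \varepsilon\,\|T\|_F .
```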

Phase 8.3 — Error Handling & Edge Cases (Target: Q3 2026) ✅

  • Shape mismatch → std::invalid_argument in decompose() and put()
  • Null backend → std::invalid_argument in TensorNetworkStorageEngine constructor
  • min_compression_ratio guard — fall back to raw storage when TT compression not beneficial
  • Version retention: old core keys purged on configurable version_retention threshold
  • Zero tensor, constant tensor — handled without division-by-zero in error bound calculation

Phase 8.4 — Tests (Target: Q3 2026) ✅

  • 16 unit tests (TTD-01..TTD-16) in tests/storage/test_tensor_train_decomposer.cpp
    • TTD-01..05: TT-SVD correctness, compression ratio, core shapes
    • TTD-06..08: inner product, cosine similarity (identical, zero-norm)
    • TTD-09..12: serialisation, F64, invalid input, max_rank
    • TTD-13..16: INT8 / NF4 quantisation round-trip, QuantizedTrain serialisation, bytesPerElement
  • 8 storage engine tests (TNS-01..TNS-08) in tests/storage/test_tensor_train_decomposer.cpp

Phase 8.5 — Performance & Hardening (Target: Q4 2026)

  • LAPACK dgesdd integration (THEMIS_USE_LAPACK_SVD=ON CMake option) (Target: Q3 2026)
    • Inputs: unfolding matrices up to 4096 × 4096; Outputs: exact singular values
    • Required for: TT-SVD of 10⁶-element 6D tensor ≤ 500ms CPU, ≤ 80ms GPU (acceptance criterion)
  • CUDA cuSOLVER SVD path (THEMIS_ENABLE_CUDA) (Target: Q4 2026)
    • Inputs: float32 unfolding matrices; Outputs: GPU singular values
    • Guard: graceful CPU fallback when CUDA not available
    • Perf target: ≤ 80ms for 10⁶-element 6D tensor
  • RocksDB backend (RocksDBTensorBackend : ITensorStorageBackend) (Target: Q4 2026)
    • Put: core bytes stored in column-family cf_tensor_cores
    • Get: MultiGet for parallel core loading
    • Compaction: use RocksDB DeleteRange for version purging
  • Compression benchmark: ≥ 10× for LLM attention matrices at ε ≤ 1% (Target: Q4 2026)
  • Reconstruction error benchmark: ≤ ε × ‖T‖ for all test cases (Target: Q4 2026)

Phase 8.6 — Documentation & Acceptance (Target: Q4 2026)

  • research/papers/tensor_networks_themisdb.md — DOI + BibTeX for all 9 papers
  • research/best_practices/tensor_train_storage.md — implementation guidelines
  • API Stability declaration for TensorTrainDecomposer, TTQuantizer, TensorNetworkStorageEngine (Target: Q4 2026)

Acceptance Criteria:

  • TT-SVD ≤ 500ms CPU / ≤ 80ms GPU for 10⁶-element 6D tensor
  • Compression ≥ 10× at ε ≤ 1% for typical LLM attention matrices
  • Reconstruction error ≤ ε × ‖T‖_F (verified in TTD-04)
  • Tests: 16 TTD + 8 TNS = 24 tests passing

Production Readiness Checklist

  • Unit test coverage for core storage paths
  • Integration tests with real RocksDB instance
  • Backup and PITR restore validation
  • Encryption enabled in all production deployments (THEMIS_PRODUCTION_MODE)
  • Audit logging for all write operations
  • Performance benchmarks for tiered storage migration
  • NVMe optimizations with graceful fallback on non-NVMe hosts
  • Erasure coding (Reed-Solomon) for space-efficient blob redundancy
  • Distributed 2PC transactions for cross-shard atomicity
  • Write-Optimized Merge Tree for write-heavy workloads
  • Streaming blob write path (putBlob / getBlob / delBlob) for high-throughput 1 MB+ blob storage (PERF-D5, v2.0.0)
  • RocksDBWrapper::putBatch(vector<KeyValuePair>) — atomic N-key WriteBatch commit; eliminates per-write MVCC overhead for OLTP bulk-write paths (B3, 2026-04-20)
  • Chaos/fault-injection tests for blob backend failover (Target: v2.0.0)

Known Issues & Limitations

  • NLPMetadataExtractor depends on an external NLP model; slow startup if model is not pre-warmed
  • Tiered storage uses flat filesystem files per key; for large datasets a more efficient store (e.g. RocksDB column-family per tier) is recommended
  • [R-1] rocksdb_wrapper.cpp::close(): TOCTOU race — db_lifecycle_mutex_ released before busy-wait; new OperationGuards can start after the lock is released; db_.reset() races with active operations → use-after-free.
  • [R-2] rocksdb_wrapper.cpp::putBlob(): Multi-chunk blob WriteBatch bypasses TransactionDB MVCC; concurrent snapshot-reads can observe partially written blobs.
  • [R-5] rocksdb_wrapper.cpp: write_options_->sync = false default — acknowledged writes can be lost on power failure.

Breaking Changes

  • StorageEngine::createDefault() factory is deprecated; use the DI constructor with explicit IExpressionEvaluatorPtr, IFieldEncryptionPtr, IKeyProviderPtr, and IIndexManagerPtr to avoid insecure defaults in production
  • KeySchema key format v1.5.0+ (prefix-based) is not backward-compatible with keys created before v1.5.0; migration required for existing data

Latent Symbols (Unused-Functions Audit)

As of: 2026-04-20 – Source: src/UNUSED_FUNCTIONS_REPORT.md

🧪 NUR_TESTS (implemented, no production caller)

  • AdaptiveCompactionScheduler – adaptively plans and schedules RocksDB compactions; tests exist

    Action: add a ROADMAP ticket for production integration, or mark as CANDIDATE_FOR_REMOVAL.