Roadmap note: convert vague bullets without acceptance criteria into checkbox tasks. Format:
- [ ] <Task> (Target: <Q/Year>).
v1.8.0 – Production-grade persistent storage layer built on RocksDB with MVCC, WAL, BlobDB, multi-model key schema, backup/PITR, compression, field-level encryption, columnar format support, NVMe optimizations, Reed-Solomon erasure coding, distributed 2PC transactions, and Write-Optimized Merge (WOM) Tree.
- `RocksDBWrapper` – MVCC transactions, WAL, BlobDB, multi-path SSTables, async I/O
- `MVCCStore` – multi-version concurrency control with snapshot isolation
- `WALStorage` – write-ahead log management and replay (`wal_storage.cpp`)
- `KeySchema` – unified multi-model key encoding (relational, document, graph, vector, timeseries)
- `StorageEngine` – high-level abstraction with dependency injection (evaluator, encryption, key provider, index manager)
- `BackupManager` – incremental and full backup with checksum verification
- `BackupManager` – in-memory backup scheduling (`scheduleBackup`, `cancelScheduledBackup`, `listScheduledBackups`); cloud upload/restore routing via `THEMIS_ENABLE_S3`/`AZURE`/`GCS` compile-time flags (`uploadBackupToCloud`, `restoreFromCloud`)
- `PITRManager` – point-in-time recovery via WAL replay and snapshot restore
- Blob storage backends: INLINE, RocksDB BlobDB, Filesystem, S3, Azure Blob, WebDAV
- `BlobRedundancyManager` – RAID-1 mirror across multiple backends
- `BlobRedundancyManager` – Reed-Solomon erasure coding (`PARITY` mode via `ErasureCodingBackend`); configurable RS(k,m) with copy-then-delete shard migration
- `BlobRedundancyManager` – `RocksDBBlobListener` (`createRocksDBListener()`) reacts to SST file deletions and marks affected locations unhealthy
- `CompressionStrategy` – pluggable per-table compression (Snappy, Zstd, LZ4, Brotli, None)
- `ColumnarFormat` – columnar storage for analytical workloads
- `BatchWriteOptimizer` – adaptive batching to reduce write amplification
- `SecuritySignature` + `SecuritySignatureManager` – field-level AES-GCM encryption and HMAC-SHA256 tamper detection; full RocksDB key-range iteration via `iterateRange` for `listAllSignatures()`
- `StorageAuditLogger` – structured audit trail for all storage operations
- `IndexMaintenance` – background index rebuild, optimize, and consistency checks
- `CompactionManager` – manual and scheduled RocksDB compaction control
- `DiskSpaceMonitor` – real-time disk quota monitoring and alerting
- `DatabaseConnectionManager` – connection pooling and lifecycle management
- `TransactionRetryManager` – exponential backoff retry for failed transactions
- `MergeOperators` – custom RocksDB merge operators (counters, list appends)
- `RaftMVCCBridge` – integration between Raft consensus log and MVCC storage
- `HLC` (Hybrid Logical Clock) – causally consistent timestamps across nodes
- `HistoryManager` – version history and change tracking per key
- `NLPMetadataExtractor` – automatic metadata extraction for ingested documents
- `CompressedStorage` – transparent compression/decompression layer
- `BaseEntity` – common base type for all storage-layer entities
- `RocksDBWrapper` – `getApproximateSize()` returns real on-disk SST file size via the `rocksdb.total-sst-files-size` property with fallback to `GetApproximateSizes`
- `DistributedTransactionManager` – storage-layer Two-Phase Commit (2PC) coordinator with `IDistributedShardParticipant`; `ManagerSharedState` shared ownership for safe concurrent shard lifecycle
- `NVMeManager` – io_uring async I/O (Linux ≥ 5.1), multi-queue NVMe, ZNS zone management, Direct I/O flag recommendation; `RocksDBWrapper` NVMe integration (`enable_nvme_optimizations` config)
- `WomTree` – Write-Optimized Merge (WOM) Tree: Bε-tree alternative to LSM; write amplification 2–5× vs 10–30× for LSM; lazy buffer propagation; full put/get/remove/scan/compact API
- Production-mode safety flag (`THEMIS_PRODUCTION_MODE`) preventing no-op encryption defaults
- `StreamingIngestManager` — ring-buffer + flush-thread → single `WriteBatch` (Issue: #4571) (2026-04-12)
  - `include/storage/streaming_ingest_manager.h` + `src/storage/streaming_ingest_manager.cpp`
  - flush_interval = 10 ms, max_buffer = 1 M events, max_batch = 65 536; `OverflowPolicy::BLOCK`/`DROP`
  - API: `ingest(key, value)` / `ingestBatch()` / `flush()` / `stats()`
  - 10 focused tests (SM-01…SM-10) in `tests/test_streaming_ingest_manager.cpp`
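A minimal usage sketch of the ingest path described above. The method names (`ingest`, `ingestBatch`, `flush`, `stats`) come from the list; the namespace, exact signatures, and stats fields are assumptions.

```cpp
#include <string>
#include <utility>
#include <vector>
#include "storage/streaming_ingest_manager.h"  // header path from the bullet above

// Hedged sketch; themis::storage namespace and argument types are assumed.
void ingest_example(themis::storage::StreamingIngestManager& mgr) {
    // Single-event path: the call copies the event into the ring buffer and returns;
    // the flush thread drains the buffer into one RocksDB WriteBatch every ~10 ms.
    mgr.ingest("sensor/42", R"({"t":1713600000,"v":21.5})");

    // Batch path: amortises buffer synchronisation across many events.
    std::vector<std::pair<std::string, std::string>> events;
    for (int i = 0; i < 1000; ++i)
        events.emplace_back("sensor/" + std::to_string(i), "{\"v\":0}");
    mgr.ingestBatch(events);

    mgr.flush();           // force a synchronous drain, e.g. before shutdown
    auto s = mgr.stats();  // counters such as buffered / flushed / dropped (assumed fields)
    (void)s;
}
```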
- `ColumnarCache` — LRU in-memory columnar cache for analytics acceleration (Issue: #4572) (2026-04-12)
  - `include/storage/columnar_cache.h` + `src/storage/columnar_cache.cpp`
  - LRU eviction + `PinGuard` RAII; `SegmentDType` (Int64/Double/String/Bool); `byteSize()` accounting; `on_evict` callback; hit/miss counters; default ctor delegates to `Config{}`
  - 12 focused tests (CC-01…CC-12) in `tests/test_columnar_cache.cpp`
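A hedged sketch of the pin-aware LRU semantics listed above. `get()`/`put()` signatures and the optional-like return are assumptions; only `PinGuard`, `byteSize()`, and `on_evict` are named in the list.

```cpp
#include "storage/columnar_cache.h"

// Illustrative only, not the ColumnarCache API.
void cache_example(themis::storage::ColumnarCache& cache) {
    // Lookup returns a PinGuard; while the guard is alive the segment cannot be evicted,
    // so an analytical scan can safely read the columnar data without copying it.
    if (auto pin = cache.get(/*collection=*/"orders", /*column=*/"amount", /*segment=*/0)) {
        // ... scan the pinned segment ...
    } else {
        // Miss: load the segment from ColumnarFormat and insert it. put() charges
        // byteSize() against the configured capacity; the least-recently-used
        // unpinned segment is evicted (firing on_evict) when the budget is exceeded.
        // cache.put("orders", "amount", 0, segment);
    }
}
```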
- `AdaptiveCompactionScheduler` – load-aware compaction scheduling with IO-sample history and impact prediction (`adaptive_compaction.h/.cpp`, `namespace themis`)
- `MVCCChainPruner` – background MVCC version-chain pruning for old snapshot cleanup (`mvcc_chain_pruner.h/.cpp`)
- `VectorIndexBackend` – `IVectorIndexBackend` interface + `InMemoryVectorIndex` flat-scan implementation (`vector_index_backend.h/.cpp`, `namespace themis::storage`)
- `ZeroCopyBlobTransfer` – mmap + sendfile zero-copy blob transfer; `MmapBlobView` RAII view (`zero_copy_blob_transfer.h/.cpp`, `namespace themis::storage`)
- `EncryptedBlobBackend` – AES-256-GCM encryption wrapper for any `IBlobStorageBackend`; `IEncryptionKeyProvider` interface + `StaticKeyProvider` (`encrypted_blob_backend.h/.cpp`, `namespace themis::storage`)
- `OnlineSchemaMigration` – zero-downtime DDL via `SchemaMigrator`; supports add/drop columns, rename, type change, add/drop indexes, partition (`online_schema_migration.h/.cpp`, `namespace themis::storage`)
- `SchemaDeadWeightDetector` – detects unused schema fields and stale indexes; `GdprFieldRegistry` for PII field tracking (`schema_dead_weight_detector.h/.cpp`, `namespace themis::storage`)
- `StorageLayoutAdvisor` – recommends layout type (row vs columnar vs tiered vs vector) based on `CollectionAccessStats` and `SchemaInfo` (`storage_layout_advisor.h/.cpp`, `namespace themis::storage`)
- `IndexAnalyzer` – per-index analyze function with tier-aware thresholds, cron scheduling, and AI/ML advisor hook (v1.9.0)
  - `include/storage/index_analyzer.h`, `src/storage/index_analyzer.cpp`, `config/index_analyze.yaml`
  - 15 focused tests (IA-01…IA-15) in `tests/test_index_analyzer.cpp` (`IndexAnalyzerFocusedTests`)
(none)
- `IndexAnalyzer` — hot/warm/cold-aware index analysis with cron scheduling and AI/ML intervention (v1.9.0); advisor sketch after this list
  - Inputs: `RocksDBWrapper` instance + YAML config (`config/index_analyze.yaml`)
  - Outputs: `IndexAnalysisReport` per index: `fragmentation_pct`, `recommendation` (NONE / UPDATE_STATS / REORGANIZE / PARTIAL_REBUILD / FULL_REBUILD), optional `ai_recommendation`
  - Affected files: `include/storage/index_analyzer.h`, `src/storage/index_analyzer.cpp`, `config/index_analyze.yaml`, `tests/test_index_analyzer.cpp`
  - Tier thresholds: hot (10/20/35%), warm (18/32/50%), cold (30/50/70%); all configurable per-tier and per-index via YAML
  - Cron: uses existing `CronExpression`; background scheduler thread fires `analyzeAll()` at configured times
  - AI/ML hook: `IIndexAnalysisAdvisor` interface; rule-based recommendation passed to `advisor->advise()`, which may override it; disabled by default
  - Errors: cron parse failure → scheduler logs error + retries in 60s; null advisor skipped gracefully
  - Tests: IA-01…IA-15 in `tests/test_index_analyzer.cpp` (`IndexAnalyzerFocusedTests`); pure-logic tests require no live RocksDB
  - Perf: scheduler thread wakes only at cron fire time; `analyzeAll()` wall-clock overhead ≤ 50 ms for ≤ 100 indices
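A hedged sketch of how a custom advisor could plug into the hook described above. `IIndexAnalysisAdvisor`, the report field `fragmentation_pct`, and the recommendation values come from the list; the exact `advise()` signature, the `IndexRecommendation` enum name, and the `setAdvisor()` registration call are assumptions.

```cpp
#include <memory>
#include "storage/index_analyzer.h"

namespace example {

// Overrides the rule-based decision conservatively: avoid full rebuilds unless
// fragmentation is extreme, otherwise keep whatever the rules recommended.
class ConservativeAdvisor : public themis::storage::IIndexAnalysisAdvisor {
public:
    themis::storage::IndexRecommendation advise(
            const themis::storage::IndexAnalysisReport& report,
            themis::storage::IndexRecommendation rule_based) override {
        using R = themis::storage::IndexRecommendation;
        if (rule_based == R::FULL_REBUILD && report.fragmentation_pct < 80.0)
            return R::PARTIAL_REBUILD;
        return rule_based;
    }
};

}  // namespace example

// Registration (assumed setter name):
//   analyzer.setAdvisor(std::make_shared<example::ConservativeAdvisor>());
```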
- Tiered storage (hot/warm/cold) with automatic data migration based on access patterns (Target: v1.6.0)
  - Inputs: RocksDB key-space with per-key last-access timestamps tracked by `AccessTracker`
  - Outputs: transparent reads/writes regardless of tier; background `TierMigrationWorker` moves data between NVMe (hot), SATA (warm), and object storage (cold)
  - Affected files: new `tiered_storage.cpp`; extend `StorageEngine`, `DiskSpaceMonitor`
  - Migration policy: age-based (hot→warm after 30 days, warm→cold after 90 days) and access-based (zero-read in window → demote); configurable via `TieredStorageConfig` (policy sketch below)
  - Errors: migration failure rolls back and retries with exponential backoff; partial migration never leaves data inconsistent (copy-then-delete)
  - Tests: unit tests for policy evaluation; integration test verifying read-after-migrate returns original data
  - Perf: migration background overhead < 5% of sustained write throughput; cold-tier read latency documented in config (object storage SLA dependent)
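Since this is a planned feature, the following is only an illustrative sketch of the age- and access-based demotion rules stated above; the `TieredStorageConfig` field names and the `decideTier()` helper are assumptions.

```cpp
#include <chrono>

enum class Tier { HOT, WARM, COLD };

// Assumed config shape mirroring the policy bullets: age thresholds plus a zero-read window.
struct TieredStorageConfig {
    std::chrono::hours hot_to_warm_age{24 * 30};   // hot -> warm after 30 days
    std::chrono::hours warm_to_cold_age{24 * 90};  // warm -> cold after 90 days
    std::chrono::hours zero_read_window{24 * 14};  // unread this long -> demote one tier
};

Tier decideTier(Tier current,
                std::chrono::system_clock::time_point created,
                std::chrono::system_clock::time_point last_read,
                std::chrono::system_clock::time_point now,
                const TieredStorageConfig& cfg) {
    const auto age  = std::chrono::duration_cast<std::chrono::hours>(now - created);
    const auto idle = std::chrono::duration_cast<std::chrono::hours>(now - last_read);

    // Age-based rule first ...
    Tier t = current;
    if (age >= cfg.warm_to_cold_age)      t = Tier::COLD;
    else if (age >= cfg.hot_to_warm_age)  t = Tier::WARM;

    // ... then access-based demotion by at most one tier per evaluation.
    if (idle >= cfg.zero_read_window && t == Tier::HOT)       t = Tier::WARM;
    else if (idle >= cfg.zero_read_window && t == Tier::WARM) t = Tier::COLD;
    return t;
}
```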
- GCS (Google Cloud Storage) blob backend (`blob_backend_gcs.cpp`) (Target: v1.6.0)
  - Inputs/outputs: same `IBlobBackend` interface as existing S3/Azure backends
  - Affected files: new `blob_backend_gcs.cpp`, `../include/storage/blob_backend_gcs.h`; register in `BlobStorageManager`
  - Auth: ADC (Application Default Credentials) via `GOOGLE_APPLICATION_CREDENTIALS`; fail-closed if credentials absent
  - Errors: GCS API errors mapped to `StorageError`; retry on transient 5xx/429 with jitter backoff (retry sketch below)
  - Tests: integration test against GCS emulator (`fake-gcs-server`)
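A minimal sketch of the "retry transient 5xx/429 with jitter backoff" rule above. The `isTransient()` classification, retry budget, and base delay are illustrative assumptions, not the planned backend's actual constants.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <random>
#include <thread>

bool isTransient(int http_status) {
    return http_status == 429 || (http_status >= 500 && http_status <= 599);
}

// Runs op() until it returns a non-transient status or the retry budget is exhausted.
// Full jitter: sleep a uniform random duration in [0, base * 2^attempt].
int callWithJitterBackoff(const std::function<int()>& op, int max_retries = 5) {
    static thread_local std::mt19937_64 rng{std::random_device{}()};
    const auto base = std::chrono::milliseconds(100);

    int status = op();
    for (int attempt = 0; attempt < max_retries && isTransient(status); ++attempt) {
        const auto cap = base * (1LL << attempt);
        std::uniform_int_distribution<int64_t> dist(0, cap.count());
        std::this_thread::sleep_for(std::chrono::milliseconds(dist(rng)));
        status = op();
    }
    return status;  // caller maps any remaining error to StorageError
}
```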
- NVMe Optimizations via `NVMeManager` (Target: v1.6.0) ✅
  - Inputs: `NVMeConfig` with device_path, queue_depth, feature flags
  - Outputs: `NVMeCapabilities` report; adjusted RocksDB Direct I/O flags; background-thread count recommendation; io_uring async I/O submit/poll; ZNS zone reset/finish/wp-query
  - Affected files: `src/storage/nvme_manager.cpp`, `include/storage/nvme_manager.h`; `RocksDBWrapper::Config` extended with `enable_nvme_optimizations`, `nvme_*` fields; `configureOptions()` applies NVMe flags
  - Constraints: all Linux-specific code (`#ifdef linux`); io_uring requires the `THEMIS_ENABLE_IO_URING` compile flag; graceful fallback to pread/pwrite when unavailable (fallback sketch below)
  - Errors: io_uring setup failure → disable io_uring and log WARN; ZNS unavailable → log WARN and skip; direct I/O unsupported filesystem → fall back to buffered I/O
  - Tests: `tests/test_nvme_manager.cpp` — 20+ focused unit tests (lifecycle, capabilities, Direct I/O flags, background threads, async I/O fallback, ZNS no-ops, RocksDB integration)
  - CI: `.github/workflows/02-feature-modules_storage_nvme-manager-ci.yml`
  - Perf target: 30–50% lower p99 latency vs. buffered I/O on NVMe with io_uring + Direct I/O enabled
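A standalone sketch of the "io_uring with graceful pread fallback" behaviour, using liburing directly; it illustrates the pattern only and is not the `NVMeManager` implementation. A recent kernel and `-luring` are required for the io_uring path.

```cpp
#include <cstddef>
#include <unistd.h>
#ifdef THEMIS_ENABLE_IO_URING
#include <liburing.h>
#endif

// Reads `len` bytes at `offset` from `fd` into `buf`; returns bytes read or a negative errno.
ssize_t asyncReadOrFallback(int fd, void* buf, size_t len, off_t offset) {
#ifdef THEMIS_ENABLE_IO_URING
    io_uring ring;
    if (io_uring_queue_init(/*entries=*/8, &ring, /*flags=*/0) == 0) {
        io_uring_sqe* sqe = io_uring_get_sqe(&ring);
        io_uring_prep_read(sqe, fd, buf, len, offset);
        io_uring_submit(&ring);

        io_uring_cqe* cqe = nullptr;
        int rc = io_uring_wait_cqe(&ring, &cqe);
        ssize_t result = (rc == 0) ? cqe->res : rc;
        if (rc == 0) io_uring_cqe_seen(&ring, cqe);
        io_uring_queue_exit(&ring);
        if (result >= 0) return result;
        // otherwise fall through to the buffered path (logged as WARN in THEMIS)
    }
#endif
    return ::pread(fd, buf, len, offset);  // portable fallback
}
```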
- Erasure coding redundancy mode in `BlobRedundancyManager` (space-efficient alternative to RAID-1) (Target: v1.7.0) ✅
  - Implemented as `RedundancyMode::PARITY` using `ErasureCodingBackend` (Reed-Solomon)
  - Algorithm: RS(k,m) with configurable data/parity shards; RS(4,2) default (1.5× overhead vs 3× for RAID-1); overhead arithmetic illustrated below
  - Affected files: `blob_redundancy_manager.cpp` (PARITY path); `erasure_coding_backend.cpp` for RS encode/decode
  - Errors: reconstruction failure if fewer than k shards available
  - Tests: `tests/test_erasure_coding_backend.cpp` — RS(10,4)/RS(6,3)/RS(4,2) encode/decode, multi-shard fault tolerance, BlobRedundancyManager PARITY mode integration
  - CI: `.github/workflows/02-feature-modules_storage_erasure-coding-blob-storage-ci.yml`
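A small illustration of the RS(k,m) trade-off quoted above: storage overhead is (k+m)/k and any m lost shards can be tolerated, versus 3× overhead for a three-way RAID-1 mirror. The helper names are illustrative only.

```cpp
#include <cstdio>

struct RsProfile {
    int k;  // data shards
    int m;  // parity shards
    double overhead() const { return double(k + m) / k; }  // stored bytes per payload byte
    int toleratedLosses() const { return m; }               // shards that may fail
};

int main() {
    for (RsProfile p : {RsProfile{4, 2}, RsProfile{6, 3}, RsProfile{10, 4}}) {
        std::printf("RS(%d,%d): %.2fx overhead, tolerates %d lost shards\n",
                    p.k, p.m, p.overhead(), p.toleratedLosses());
    }
    // Prints e.g.: RS(4,2): 1.50x overhead, tolerates 2 lost shards
    // (a 3-way RAID-1 mirror costs 3.00x for the same 2-failure tolerance)
}
```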
- Distributed transactions across shards via two-phase commit (2PC) with Raft coordination (Target: v1.7.0) ✅
  - Implemented (v1.7.0): `DistributedTransactionManager` + `IDistributedShardParticipant` + `DistributedTransaction`
  - Storage-layer 2PC coordinator (Phase 1 PREPARE / Phase 2 COMMIT|ABORT); coordinator flow sketched below
  - `ManagerSharedState` shared ownership: transactions safely outlive their manager; `shared_ptr` participant references: no use-after-free on concurrent `unregisterShard()`
  - Inputs: multi-shard operations spanning separate RocksDB instances
  - Outputs: atomic commit or rollback across all participant shards
  - Affected files: `distributed_transaction_manager.h/.cpp`; coordinates via existing `RaftMVCCBridge`
  - Tests: 27 unit tests in `tests/test_distributed_transactions.cpp`
  - Perf: 2PC round-trip adds ≤ 2× single-shard latency on same-datacenter nodes
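A conceptual sketch of the two-phase flow described above. `IDistributedShardParticipant` is named in the list; its method names (`prepare`, `commit`, `abort`) and this free-standing coordinator function are assumptions for illustration, not the `DistributedTransactionManager` API.

```cpp
#include <memory>
#include <string>
#include <vector>

struct IDistributedShardParticipant {
    virtual ~IDistributedShardParticipant() = default;
    virtual bool prepare(const std::string& txn_id) = 0;  // Phase 1: vote yes/no, persist intent
    virtual void commit(const std::string& txn_id) = 0;   // Phase 2a
    virtual void abort(const std::string& txn_id) = 0;    // Phase 2b
};

// Returns true if the transaction committed on all shards, false if it was aborted.
bool twoPhaseCommit(const std::string& txn_id,
                    const std::vector<std::shared_ptr<IDistributedShardParticipant>>& shards) {
    // Phase 1: PREPARE on every participant; a single "no" vote aborts everyone.
    for (const auto& shard : shards) {
        if (!shard->prepare(txn_id)) {
            for (const auto& s : shards) s->abort(txn_id);
            return false;
        }
    }
    // Phase 2: all voted yes. In THEMIS the decision is made durable via the Raft log,
    // so COMMIT can be (re)delivered to every participant until acknowledged.
    for (const auto& shard : shards) shard->commit(txn_id);
    return true;
}
```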
- Vectorized execution support in `ColumnarFormat` (SIMD batch processing) (Target: v2.0.0) ✅
  - Inputs: columnar scan batches of up to 8,192 rows
  - Outputs: filtered/projected result batches processed via AVX2 SIMD instructions
  - Affected files: new `include/storage/simd_filter.h`, `src/storage/simd_filter.cpp`; no ABI change to `ColumnarFormat` public API
  - Dispatch: AVX-512 → AVX2 → NEON → scalar; runtime CPU feature detection (memoised); AVX2 filter sketch below
  - `SIMDColumnFilter::scan()` integrates zone-map early-out on `ColumnSegment`
  - Tests: 25 tests in `tests/test_simd_columnar_filter.cpp` (`SIMDColumnarFilterFocusedTests`); scalar-reference parity verified for all 6 ops + all 4 numeric types
  - Perf: throughput SLO gated by `THEMIS_RUN_PERF_TESTS=1` (SF-25)
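A standalone illustration of the vectorised-predicate idea (value > threshold over an int64 column) with a scalar fallback, in the spirit of the AVX2 path above; it is not the `SIMDColumnFilter` implementation (no zone maps, no runtime dispatch).

```cpp
#include <cstdint>
#include <vector>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

// Appends the indices of all elements with values[i] > threshold to `out`.
void filterGreaterThan(const int64_t* values, size_t n, int64_t threshold,
                       std::vector<uint32_t>& out) {
    size_t i = 0;
#if defined(__AVX2__)
    const __m256i thr = _mm256_set1_epi64x(threshold);
    for (; i + 4 <= n; i += 4) {
        __m256i v   = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(values + i));
        __m256i cmp = _mm256_cmpgt_epi64(v, thr);                 // per-lane 64-bit signed compare
        int mask = _mm256_movemask_pd(_mm256_castsi256_pd(cmp));  // one bit per 64-bit lane
        while (mask) {
            int lane = __builtin_ctz(mask);                        // GCC/Clang builtin
            out.push_back(static_cast<uint32_t>(i + lane));
            mask &= mask - 1;
        }
    }
#endif
    for (; i < n; ++i)                       // scalar tail / non-AVX2 fallback
        if (values[i] > threshold) out.push_back(static_cast<uint32_t>(i));
}
```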
- Native Parquet export from columnar format (Target: v2.0.0) ✅
  - Inputs: `ColumnarFormat` table scan result (`vector<vector<ColumnSegment>>`)
  - Outputs: Apache Parquet v2 file compatible with Spark, DuckDB, Pandas; PAR1 magic, Thrift binary FileMetaData
  - Affected files: new `include/storage/storage_parquet_exporter.h`, `src/storage/storage_parquet_exporter.cpp`
  - Supports INT32, INT64, FLOAT32, FLOAT64, BOOL, STRING column types with native Parquet type mapping
  - With `ARROW_ENABLED`: delegates to Apache Arrow / Parquet C++ library; without it: portable Thrift binary Parquet v2 writer
  - Errors: `ERR_EXPORT_FORMAT_INVALID` for unknown types; `ERR_EXPORT_IO_ERROR` for file write failures; `ERR_EXPORT_CONFIG_INVALID` for mismatched schema
  - Tests: 15 tests in `tests/test_storage_parquet_exporter.cpp` (`StorageParquetExporterFocusedTests`)
- `RocksDBWrapper` with MVCC, WAL, BlobDB, multi-path SSTables, async I/O
- `MVCCStore` snapshot isolation
- `WALStorage` management and replay
- `KeySchema` multi-model encoding
- `StorageEngine` with dependency injection
- `MergeOperators` for counter and list operations
- `HLC` hybrid logical clocks
- `BackupManager` incremental and full backups
- `PITRManager` point-in-time recovery
- Blob backends: Filesystem, RocksDB BlobDB, S3, Azure Blob, WebDAV
- `BlobRedundancyManager` RAID-1 mirror
- `CompressionStrategy` pluggable per-table compression
- `ColumnarFormat` analytical storage
- `BatchWriteOptimizer` adaptive write batching
- `CompactionManager` manual and scheduled compaction
- `SecuritySignature` field-level AES-GCM encryption
- `SecuritySignatureManager` HMAC-SHA256 tamper detection
- `StorageAuditLogger` structured audit trail
- Production-mode safety flag
- `DiskSpaceMonitor` quota monitoring
- `IndexMaintenance` background optimization
- `TransactionRetryManager` exponential backoff
- `DatabaseConnectionManager` connection pooling
- `RaftMVCCBridge` Raft-to-MVCC integration
- `HistoryManager` version tracking
- `NLPMetadataExtractor` automatic metadata
- Tiered storage (hot/warm/cold) with age- and access-based migration policies
- GCS blob backend
- NVMe Optimizations: `NVMeManager` with io_uring, multi-queue, ZNS, Direct I/O (CI: nvme-manager-ci.yml)
- Erasure coding in `BlobRedundancyManager` (`PARITY` mode via `ErasureCodingBackend`, RS(k,m)) (CI: erasure-coding-blob-storage-ci.yml)
- 2PC distributed transactions (`DistributedTransactionManager`, v1.7.0)
- `WomTree` – Write-Optimized Merge (WOM) Tree: Bε-tree alternative to LSM for write-heavy workloads (buffer-propagation sketch below)
  - Write amplification 2–5× (vs. 10–30× for LSM)
  - Lazy buffer propagation — reduced compaction overhead
  - Configurable fanout, leaf capacity, buffer size
  - Full put/get/remove/scan/compact API with thread safety
  - Write-amplification and read-hit-ratio metrics in `Stats`
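A toy sketch of the "lazy buffer propagation" idea behind a Bε-tree: writes land in a per-node message buffer and are only pushed one level down when the buffer fills, so each key is rewritten far fewer times than in LSM compaction. Deliberately simplified (no splits, no deletes); this is not the `WomTree` implementation.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node {
    bool leaf = false;
    std::map<std::string, std::string> buffer;    // pending messages (inner nodes)
    std::map<std::string, std::string> records;   // materialised data (leaves)
    std::vector<std::unique_ptr<Node>> children;
    std::vector<std::string> pivots;              // pivots[i] = first key routed to children[i+1]
};

constexpr size_t kBufferCapacity = 64;

Node* childFor(Node& n, const std::string& key) {
    size_t i = 0;
    while (i < n.pivots.size() && key >= n.pivots[i]) ++i;
    return n.children[i].get();
}

// Push buffered messages one level down; recurse only into children whose buffers overflow.
void flush(Node& n) {
    if (n.leaf) return;
    for (auto& [k, v] : n.buffer) {
        Node* c = childFor(n, k);
        (c->leaf ? c->records : c->buffer)[k] = v;
    }
    n.buffer.clear();
    for (auto& c : n.children)
        if (!c->leaf && c->buffer.size() > kBufferCapacity) flush(*c);
}

// put(): amortised O(1) buffer insert at the root; downward propagation is deferred (lazy).
void put(Node& root, std::string key, std::string value) {
    if (root.leaf) { root.records[std::move(key)] = std::move(value); return; }
    root.buffer[std::move(key)] = std::move(value);
    if (root.buffer.size() > kBufferCapacity) flush(root);
}
```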
- `ConcurrentWriteController` – bounded FIFO write-concurrency semaphore (Target: v2.0.0) (Issue: PERF-D6) — implemented in `include/storage/concurrent_write_controller.h`, `src/storage/concurrent_write_controller.cpp`; usage sketch after this list
  - `ConcurrentWriteControllerConfig` – `max_concurrent_writes` (0 = hw_concurrency/2), `max_queue_depth` (0 = unlimited), `acquire_timeout` (0 = blocking)
  - `WriteGuard` – RAII slot holder; move-only; destructor wakes the next FIFO waiter
  - `acquire()` – blocking acquire; FIFO wakeup via `std::promise`/`std::future` chain; raises `std::runtime_error` on queue-full, timeout, or shutdown
  - `tryAcquire()` – non-blocking; returns `std::nullopt` if at capacity
  - `shutdown()` – unblocks all waiters atomically (no deadlock on destruction)
  - EWMA statistics: `avg_wait_us` updated on every acquire (α ≈ 0.1, integer-scaled)
  - P99 latency: sliding window of last 128 wait times; sorted on read via `getStats()`
  - Lifetime max and total acquired / rejected counters (lock-free atomics)
  - Throughput: ≥ 50k acquires/s (measured: ~616k ops/s, 8 threads) ✅
  - CV regression guard: 10-thread stress test (AC-D6-21) verifies CV < 20 % — eliminates the thundering-herd variance that caused D-6
  - Tests: 25 tests in `tests/test_concurrent_write_controller.cpp` (`ConcurrentWriteControllerFocusedTests`)
    - AC-D6-1–20: unit tests (config, guard lifecycle, FIFO, stats, timeout, shutdown)
    - AC-D6-21: 10-thread stress CV < 20 % regression guard
    - AC-D6-22–24: move semantics, idempotent release, unlimited queue
    - AC-D6-25: throughput SLO (gated by `THEMIS_RUN_PERF_TESTS=1`)
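A hedged usage sketch based on the names listed above (`acquire`, `tryAcquire`, `WriteGuard`, `getStats`); exact signatures and return types are assumptions.

```cpp
#include <optional>
#include <stdexcept>
#include "storage/concurrent_write_controller.h"

void write_path_example(themis::storage::ConcurrentWriteController& ctrl) {
    // Blocking path: waits FIFO for a slot; throws std::runtime_error on queue-full,
    // timeout, or shutdown. The WriteGuard releases the slot and wakes the next
    // waiter when it goes out of scope.
    try {
        auto guard = ctrl.acquire();
        // ... perform the RocksDB write while holding the slot ...
    } catch (const std::runtime_error&) {
        // overloaded or shutting down: surface backpressure to the caller
    }

    // Non-blocking path: std::nullopt when all slots are taken.
    if (auto guard = ctrl.tryAcquire()) {
        // ... write ...
    }

    auto stats = ctrl.getStats();  // avg_wait_us (EWMA), p99 wait, acquired/rejected counters
    (void)stats;
}
```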
- All `src/storage/*.cpp` files verified registered in the CMake build system (main `CMakeLists.txt` + `StorageEnhancements.cmake` + `BlobStorage.cmake`)
- 21+ focused standalone test targets added in `tests/CMakeLists.txt`: StorageEngineDI, StorageEngineProd, StorageAuditLogger, StorageFuzz, StorageLatencyBench, BlobStorage, BlobTransferCheckpoint, CompressionStrategy, TieredStorage, WalStorage, WalManager, WalArchiving, WalBackupManager, WalChaos, WalManifestCorruption, WalReplication, WalReplicationIntegration, WalGrpcApply, MvccStore, MvccHistory, MvccWalIntegration, NVMeFocusedTests, ErasureCodingFocusedTests, RocksDBSizeCalculationFocusedTests, BlobRedundancyEventListenerFocusedTests
- `RocksDBWrapper::getApproximateSize()` returns real on-disk SST size (`rocksdb.total-sst-files-size`) — was returning 0 (CI: rocksdb-size-calculation-ci.yml)
- `SecuritySignatureManager::listAllSignatures()` iterates full RocksDB key-range via `iterateRange` — was stub (CI: security-signature-rocksdb-iteration-ci.yml)
- `BlobRedundancyManager::createRocksDBListener()` returns working `RocksDBBlobListener` — was returning error stub (CI: blob-redundancy-event-listener-ci.yml)
- `RocksDBWrapper::putBlob()` — streaming write path for large blobs (Issue PERF-D5); chunk-key sketch below
  - Blobs ≥ `blob_streaming_threshold_bytes` (default 64 KB) are split into `blob_chunk_size_bytes` (default 128 KB) chunks
  - Parallel chunk encoding via `std::async` thread pool (`blob_streaming_threads`, default 4)
  - All chunks + manifest committed atomically via single `WriteBatch::Write()` — bypasses per-write transaction overhead
  - Internal key scheme: manifest `__tmbs_m__:<key>`, chunks `__tmbs_c__:<key>:<6-digit-index>`
  - Backward compatible: blobs below threshold fall back to regular `put()`
- `RocksDBWrapper::getBlob()` — transparent reassembly from manifest + chunk keys via `MultiGet`
- `RocksDBWrapper::delBlob()` — atomic manifest + chunk deletion via `WriteBatch`
- New Config fields: `enable_blob_streaming`, `blob_streaming_threshold_bytes`, `blob_chunk_size_bytes`, `blob_streaming_threads`
- `tests/test_blob_streaming.cpp` — 16 focused tests (BlobStreamingFocusedTests): round-trip, boundary, parallel, integrity, delete, overwrite, fallback, custom chunk size, single thread, zero-length, non-aligned
- `benchmarks/bench_comprehensive.cpp` `StoreLargeBlobs_1MB` updated to use `putBlob()` with 8-thread encoding
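A sketch of the chunk-splitting arithmetic and key scheme described above. The `__tmbs_m__` / `__tmbs_c__` prefixes and the 6-digit chunk index come from the list; the helper names here are illustrative, not the internal API.

```cpp
#include <cstdio>
#include <string>
#include <vector>

std::string manifestKey(const std::string& key) { return "__tmbs_m__:" + key; }

std::string chunkKey(const std::string& key, size_t index) {
    char suffix[8];
    std::snprintf(suffix, sizeof(suffix), "%06zu", index);  // 6-digit, zero-padded index
    return "__tmbs_c__:" + key + ":" + suffix;
}

// A 1 MB blob with the default 128 KB chunk size yields ceil(1048576 / 131072) = 8 chunks.
std::vector<std::string> chunkKeysFor(const std::string& key, size_t blob_bytes,
                                      size_t chunk_bytes = 128 * 1024) {
    const size_t chunks = (blob_bytes + chunk_bytes - 1) / chunk_bytes;
    std::vector<std::string> keys;
    keys.reserve(chunks);
    for (size_t i = 0; i < chunks; ++i) keys.push_back(chunkKey(key, i));
    return keys;
}
```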
Scientific basis: Oseledets 2011 (DOI: 10.1137/090752142); Khoromskij 2011; Dettmers et al. 2023 (NF4)
- `TensorTrainConfig` — `eps`, `max_rank`, `dtype`, `svd_power_iterations` parameters
- `TTCore` — 3-D core tensor struct (r_left × n × r_right) with row-major data layout
- `TTTrain` — full TT-decomposition: `cores`, `mode_sizes`, `originalNorm`, `achievedEps`, `serialize()`/`deserialize()`
- `TensorStorageConfig` — engine configuration: `tt_config`, `quant_type`, `version_retention`, `min_compression_ratio`
- `TensorFieldKey` — logical address `{tenant, collection, field}` + `TensorFieldKeyHash`
- `ITensorStorageBackend` — abstraction interface for RocksDB / in-memory backends
- Key schema defined: `__ttn__:<tenant>:<collection>:<field>:G<k>:<version>` (meta + core keys)
- `StorageLayoutAdvisor` integration: new `TENSOR_NETWORK` layout type (Target: Q3 2026)
- `TensorTrainDecomposer::decompose()` — TT-SVD (Oseledets 2011, Algorithm 1)
  - Per-step threshold: δ = ε · ‖T‖_F / √(d−1) — global error bound guaranteed (derivation sketched below)
  - Internal Golub-Reinsch bidiagonalisation SVD (LAPACK `dgesdd` via `THEMIS_USE_LAPACK_SVD=ON`, Q3 2026)
  - Input: dense float32/float64; Output: `TTTrain` with `achieved_eps`
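Why the per-step threshold yields the global bound: under the standard TT-SVD argument the d−1 truncation errors are mutually orthogonal and therefore add in quadrature, so capping each step at δ gives the stated ε bound.

```latex
% Sketch of the error bound, assuming the per-step truncation errors E_k are mutually orthogonal.
\[
\|T - \widetilde{T}\|_F^2 \;\le\; \sum_{k=1}^{d-1} \|E_k\|_F^2
\;\le\; (d-1)\,\delta^2
\quad\text{with}\quad
\delta = \frac{\varepsilon\,\|T\|_F}{\sqrt{d-1}}
\;\;\Longrightarrow\;\;
\|T - \widetilde{T}\|_F \;\le\; \varepsilon\,\|T\|_F .
\]
```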
- `TensorTrainDecomposer::round()` — TT-Rounding (Algorithm 2)
- `TensorTrainDecomposer::innerProduct()` / `frobeniusNorm()` / `cosineSimilarity()` — transfer-matrix algorithm O(d·n·r³)
- `TTQuantizer::quantize(INT8)` — symmetric per-core channel-wise scaling, 1 byte/element + 4 bytes scale (sketch after this list)
- `TTQuantizer::quantize(NF4)` — 16-entry lookup table (Dettmers 2023 Table 1), 2 values/byte
- `TTQuantizer::dequantize()` — inverse INT8 / NF4 round-trip
- `InMemoryTensorBackend` — testing KV-store (no RocksDB required in unit tests)
- `TensorNetworkStorageEngine::put()` — decompose + quantise + persist (all cores per version)
- `TensorNetworkStorageEngine::get()` / `getVersion()` — load + dequantise + reconstruct
- `TensorNetworkStorageEngine::getCompressed()` — returns `QuantizedTrain` without decompression
- `TensorNetworkStorageEngine::compact()` — version pruning by retention threshold
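An illustration of the symmetric INT8 scheme quoted above (1 byte/element plus a float scale): scale = max|x| / 127, q = round(x / scale), x' = q · scale. This is a standalone sketch, not the `TTQuantizer` implementation; channel-wise scaling would simply repeat it per channel.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

struct QuantizedCore {
    std::vector<int8_t> q;  // 1 byte per element
    float scale = 1.0f;     // 4 bytes per core (or per channel)
};

QuantizedCore quantizeInt8(const std::vector<float>& x) {
    float amax = 0.0f;
    for (float v : x) amax = std::max(amax, std::fabs(v));
    QuantizedCore out;
    out.scale = (amax > 0.0f) ? amax / 127.0f : 1.0f;  // avoid div-by-zero for all-zero cores
    out.q.reserve(x.size());
    for (float v : x)
        out.q.push_back(static_cast<int8_t>(std::lround(v / out.scale)));  // values stay in [-127, 127]
    return out;
}

std::vector<float> dequantizeInt8(const QuantizedCore& c) {
    std::vector<float> x;
    x.reserve(c.q.size());
    for (int8_t q : c.q) x.push_back(q * c.scale);
    return x;
}
```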
- Shape mismatch → `std::invalid_argument` in `decompose()` and `put()`
- Null backend → `std::invalid_argument` in `TensorNetworkStorageEngine` constructor
- `min_compression_ratio` guard — fall back to raw storage when TT compression not beneficial
- Version retention: old core keys purged on configurable `version_retention` threshold
- Zero tensor, constant tensor — handled without division-by-zero in error bound calculation
- 16 unit tests (TTD-01..TTD-16) in `tests/storage/test_tensor_train_decomposer.cpp`
  - TTD-01..05: TT-SVD correctness, compression ratio, core shapes
  - TTD-06..08: inner product, cosine similarity (identical, zero-norm)
  - TTD-09..12: serialisation, F64, invalid input, max_rank
  - TTD-13..16: INT8 / NF4 quantisation round-trip, QuantizedTrain serialisation, bytesPerElement
- 8 storage engine tests (TNS-01..TNS-08) in `tests/storage/test_tensor_train_decomposer.cpp`
- LAPACK `dgesdd` integration (`THEMIS_USE_LAPACK_SVD=ON` CMake option) (Target: Q3 2026)
  - Inputs: unfolding matrices up to 4096 × 4096; Outputs: exact singular values
  - Required for: TT-SVD of 10⁶-element 6D tensor ≤ 500ms CPU, ≤ 80ms GPU (acceptance criterion)
- CUDA cuSOLVER SVD path (`THEMIS_ENABLE_CUDA`) (Target: Q4 2026)
  - Inputs: float32 unfolding matrices; Outputs: GPU singular values
  - Guard: graceful CPU fallback when CUDA not available
  - Perf target: ≤ 80ms for 10⁶-element 6D tensor
- RocksDB backend (`RocksDBTensorBackend : ITensorStorageBackend`) (Target: Q4 2026)
  - Put: core bytes stored in column-family `cf_tensor_cores`
  - Get: `MultiGet` for parallel core loading
  - Compaction: use RocksDB `DeleteRange` for version purging
- Compression benchmark: ≥ 10× for LLM attention matrices at ε ≤ 1% (Target: Q4 2026)
- Reconstruction error benchmark: ≤ ε × ‖T‖ for all test cases (Target: Q4 2026)
- `research/papers/tensor_networks_themisdb.md` — DOI + BibTeX for all 9 papers
- `research/best_practices/tensor_train_storage.md` — implementation guidelines
- API Stability declaration for `TensorTrainDecomposer`, `TTQuantizer`, `TensorNetworkStorageEngine` (Target: Q4 2026)
Acceptance Criteria:
- TT-SVD ≤ 500ms CPU / ≤ 80ms GPU for 10⁶-element 6D tensor
- Compression ≥ 10× at ε ≤ 1% for typical LLM attention matrices
- Reconstruction error ≤ ε × ‖T‖_F (verified in TTD-04)
- Tests: 16 TTD + 8 TNS = 24 tests passing
- Unit test coverage for core storage paths
- Integration tests with real RocksDB instance
- Backup and PITR restore validation
- Encryption enabled in all production deployments (`THEMIS_PRODUCTION_MODE`)
- Audit logging for all write operations
- Performance benchmarks for tiered storage migration
- NVMe optimizations with graceful fallback on non-NVMe hosts
- Erasure coding (Reed-Solomon) for space-efficient blob redundancy
- Distributed 2PC transactions for cross-shard atomicity
- Write-Optimized Merge Tree for write-heavy workloads
- Streaming blob write path (`putBlob`/`getBlob`/`delBlob`) for high-throughput 1 MB+ blob storage (PERF-D5, v2.0.0)
- `RocksDBWrapper::putBatch(vector<KeyValuePair>)` — atomic N-key WriteBatch commit; eliminates per-write MVCC overhead for OLTP bulk-write paths (B3, 2026-04-20)
- Chaos/fault-injection tests for blob backend failover (Target: v2.0.0)
- `NLPMetadataExtractor` depends on an external NLP model; slow startup if the model is not pre-warmed
- Tiered storage uses flat filesystem files per key; for large datasets a more efficient store (e.g. RocksDB column-family per tier) is recommended
- [R-1] `rocksdb_wrapper.cpp::close()`: TOCTOU race — `db_lifecycle_mutex_` released before busy-wait; new `OperationGuard`s can start after the lock is released; `db_.reset()` races with active operations → use-after-free.
- [R-2] `rocksdb_wrapper.cpp::putBlob()`: multi-chunk blob `WriteBatch` bypasses TransactionDB MVCC; concurrent snapshot-reads can observe partially written blobs.
- [R-5] `rocksdb_wrapper.cpp`: `write_options_->sync = false` default — acknowledged writes can be lost on power failure.
- `StorageEngine::createDefault()` factory is deprecated; use the DI constructor with explicit `IExpressionEvaluatorPtr`, `IFieldEncryptionPtr`, `IKeyProviderPtr`, and `IIndexManagerPtr` to avoid insecure defaults in production
- `KeySchema` key format v1.5.0+ (prefix-based) is not backward-compatible with keys created before v1.5.0; migration required for existing data
As of 2026-04-20 – Source: src/UNUSED_FUNCTIONS_REPORT.md
- `AdaptiveCompactionScheduler` – plans and controls RocksDB compactions adaptively; tests exist. Action: add a ROADMAP ticket for production integration or mark as CANDIDATE_FOR_REMOVAL.