Roadmap note: convert vague bullets without acceptance criteria into checkbox tasks. Format:
- [ ] <Task> (Target: <quarter/year>).
v1.6.0 – AdaLoRA (adaptive rank pruning), LoRAAdapterMerger (TIES + linear), and LoRA+ (asymmetric LR) added. Full LoRA fine-tuning toolchain including checkpoint/resume, adapter versioning, quantization (QLoRA), and multi-GPU training.
- LegalAutoLabeler – automated training sample generation from legal documents via NLP modality extraction
  - `labelAll()`, `labelDocument()`, `labelQuery()` APIs
  - AQL query executor wired to `labelAll()` and `labelQuery()` for DB document-ID fetch (v1.7.0) — `auto_labeler.cpp`
  - Low-confidence sample flagging and human-review queue
  - `updateSampleConfidence()` for recording human review decisions
  - German (`de`) and multi-language support
- IncrementalLoRATrainer – full LoRA lifecycle (train, evaluate, deploy, rollback); see the usage sketch after this list
  - INITIAL and INCREMENTAL training modes
  - Checkpoint save and resume (`resumeFromCheckpoint`)
  - Adapter version management (`deployVersion`, `rollbackVersion`, `listVersions`)
  - Configurable LoRA rank, alpha, learning rate
  - Training progress callback (epoch, step, loss)
- KnowledgeGraphEnricher – AQL graph traversal context enrichment
  - `findRelatedProvisions()`, `findRelatedCaseLaw()`, `findSimilarDocuments()`
  - `findSimilarDocuments()` wired to `VectorIndexManager` for real cosine-similarity search via `setVectorIndex()` (v1.6.0) — `knowledge_graph_enricher.h/.cpp`, `tests/test_kge_vector_search.cpp`
  - Custom AQL query registration for domain-specific traversals
- Pimpl pattern for ABI stability across all three components
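A hypothetical lifecycle sketch for the trainer component. Header path, class, and method names come from this roadmap; the config field names, `TrainingResult` members, and the free function are assumptions (namespace qualification omitted for brevity):

```cpp
// Hedged sketch, not the shipped API: field names and return types are assumed.
#include "training/incremental_lora_trainer.h"
#include <vector>

void runLifecycle(IncrementalLoRATrainer& trainer,
                  const std::vector<TrainingSample>& samples) {
    IncrementalTrainingConfig cfg;
    cfg.lora_rank      = 16;                          // configurable rank
    cfg.lora_alpha     = 32.0f;                       // configurable alpha
    cfg.learning_rate  = 1e-4f;                       // configurable learning rate
    cfg.checkpoint_dir = "/var/themis/checkpoints";   // enables LoRACheckpointManager

    TrainingResult result = trainer.train(samples, cfg);  // INITIAL or INCREMENTAL
    if (result.success) {
        trainer.deployVersion(result.adapter_version);    // activate the new adapter
    } else {
        trainer.rollbackVersion();                        // restore last good version
    }
}
```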
- LoRA Checkpoint Manager with SHA-256 integrity validation (Target: Q1 2026) — `lora_checkpoint_manager.h/.cpp` (b2342851)
- Training Sample Provenance and Lineage Tracking (Target: Q1 2026) — `provenance_tracker.h/.cpp` (b2342851)
- Knowledge Graph Enrichment LRU Cache (Target: Q1 2026) — `EnrichmentLRUCache` in `knowledge_graph_enricher.cpp` (b2342851)
- ContentModality enum for multi-modality sample tracking (Target: Q1 2026) — `auto_labeler.h` (b2342851)
- Confidence-Threshold Auto-Calibration via isotonic regression (Target: Q1 2026) — `ConfidenceCalibrator` in `training_pipeline.h/.cpp` (b2342851)
- Multi-modality parser (`ModalityDetector`, `TextClauseExtractor`, `TableExtractor`, `CitationExtractor`, `OCRExtractor`) (Target: Q1 2026) — `modality_parser.h/.cpp` (b2342851)
- Real LoRA weight manipulation in `IncrementalLoRATrainer` (Target: Q1 2026) — replaced `computeSimulatedLoss()` with `LoRALayer` + `AdamOptimizer` forward/backward/step; CUDA/HIP via `GPULoRALayer`; binary checkpoint serialization for B and A matrices (`incremental_lora_trainer.cpp`)
- Dedicated LoRA adapter weight manipulation layer `LoRAAdapter` (Target: v1.7.0) — `lora_adapter.h/.cpp`; real forward pass (input @ B @ A × scaling), additive single/batch weight updates, Kaiming-B/zero-A init, export/import for checkpoint integration; 39 unit tests (`test_training_lora_adapter.cpp`); see the forward-pass sketch after this list
- Multi-GPU distributed training coordination (Target: Q2 2026) — `IncrementalTrainingConfig.num_gpus`/`gpu_ids`/`sync_steps` fields; `MultiGPULoRATrainer` wired in `incremental_lora_trainer.cpp`; data-parallel sharding, all-reduce gradient sync; device-count mismatch → `std::runtime_error`; fallback to single GPU on init failure; `TrainingResult.gpus_used` field
- Model quantization configuration (Target: Q2 2026) — `TrainingQuantizationType` enum (NONE/FP16/INT8/NF4), `QuantizationConfig` struct, `IncrementalTrainingConfig.quantization` field; validated in `validateHyperparameters()`; INT8 and NF4 activate `QLoRALayer` in the CPU training path (base weights frozen/compressed, only LoRA adapters A and B trained in full precision); NONE/FP16 use the standard `LoRALayer`
- Training metrics tracking (Target: Q2 2026) — `EpochMetrics` (per-epoch loss/accuracy/lr/elapsed), `TrainingMetrics` (step_losses, epoch_metrics, best_train_loss, best_val_loss, total_elapsed_seconds); `IncrementalLoRATrainer::getMetrics()` public API; metrics reset at the start of each `train()` call
- `LoRACheckpointManager` integration in `IncrementalLoRATrainer` (Target: Q2 2026) — `IncrementalTrainingConfig.checkpoint_dir` field; when set, each `saveCheckpoint()` call delegates to `LoRACheckpointManager::save()` for atomic writes, SHA-256 integrity, and rolling-window rotation (3 checkpoints by default)
- AdaLoRA adaptive rank allocation (Target: Q2 2026) — `ada_lora_adapter.h/.cpp`; importance scoring via B/A norm products; `reallocateRanks()` proportional budget distribution; active-rank forward pass; 36 tests (`test_ada_lora_adapter.cpp`); `AdaLoRAFocusedTests` CMake target
- LoRAAdapterMerger linear + TIES merging (Target: Q2 2026) — `lora_adapter_merger.h/.cpp`; `mergeLinear()` weighted ΔW sum + SVD factorisation; `mergeTIES()` Trim–Resolve–Merge (Yadav et al.); `*All()` batch overloads; 32 tests (`test_lora_adapter_merger.cpp`); `LoRAMergerFocusedTests` CMake target
- LoRA+ asymmetric learning rates (Target: Q2 2026) — `IncrementalTrainingConfig::lora_plus_lambda`; when > 1.0, B uses `lr*λ` and A uses `lr` (Hayou et al., 2024); dual `AdamOptimizer` instances in `IncrementalLoRATrainer::Impl`
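A minimal CPU sketch of the forward pass described above: `y = x + (x @ B @ A) * scaling`, with Kaiming-initialised B and zero-initialised A so the adapter starts as an exact no-op. The matrix layout, the `alpha / rank` scaling rule, and all names are illustrative assumptions, not the shipped `LoRAAdapter`; under the LoRA+ rule above, the optimizer would additionally step B with `lr*λ` and A with `lr`.

```cpp
#include <cmath>
#include <random>
#include <vector>

struct LoRASketch {
    int in_dim, rank, out_dim;
    std::vector<float> B;  // in_dim x rank, Kaiming-uniform init
    std::vector<float> A;  // rank x out_dim, zero init (adapter starts as no-op)
    float scaling;         // commonly alpha / rank (assumed here)

    LoRASketch(int in, int r, int out, float alpha)
        : in_dim(in), rank(r), out_dim(out),
          B(in * r), A(r * out, 0.0f), scaling(alpha / r) {
        std::mt19937 rng(42);
        float bound = std::sqrt(6.0f / in);  // Kaiming-uniform bound for fan_in
        std::uniform_real_distribution<float> u(-bound, bound);
        for (auto& w : B) w = u(rng);
    }

    // delta(x) = (x @ B @ A) * scaling; the caller adds this to the frozen base output
    std::vector<float> delta(const std::vector<float>& x) const {
        std::vector<float> h(rank, 0.0f), y(out_dim, 0.0f);
        for (int i = 0; i < in_dim; ++i)
            for (int r2 = 0; r2 < rank; ++r2)
                h[r2] += x[i] * B[i * rank + r2];
        for (int r2 = 0; r2 < rank; ++r2)
            for (int o = 0; o < out_dim; ++o)
                y[o] += h[r2] * A[r2 * out_dim + o] * scaling;
        return y;
    }
};
```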
- Automated hyperparameter search (LoRA rank, learning rate sweep) (Target: Q2 2026) — `HyperparamSearchConfig`, `HyperparamResult`, `HyperparamTrialResult`, `HyperparamSearchCallback` in `training_pipeline.h`; `runHyperparamSearch()` in `training_pipeline.cpp`; 9 tests in `tests/test_training_pipeline_e2e.cpp`; see the call sketch after this list
  - Subsystems: `src/training/training_pipeline.cpp` (new `HyperparamSearch` inner class), `ConfidenceCalibrator`
  - Inputs: `HyperparamSearchConfig{rank_candidates, lr_candidates, max_trials, budget_seconds}`; validation split fraction
  - Outputs: `HyperparamResult{best_rank, best_lr, best_val_loss, trial_log}`; best config auto-applied to pipeline
  - Constraints: deterministic trial ordering (seeded random); concurrent trials capped at `num_gpus`
  - Errors: no improvement after `max_trials` → return best seen; budget exceeded → early-stop, return best so far
  - Tests: unit — mock trainer, verify trial scheduling; integration — sweep over 3 rank values on a synthetic dataset
  - Perf: trial overhead (excluding training) ≤ 50 ms/trial; total sweep for a 9-trial 3×3 grid ≤ 3× single-train time
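A hypothetical call sketch for `runHyperparamSearch()`. Struct and field names are taken from this roadmap; the pipeline handle and the method's placement on it are assumptions:

```cpp
#include "training/training_pipeline.h"
#include <cstdio>

void sweep(TrainingPipeline& pipeline) {
    HyperparamSearchConfig cfg;
    cfg.rank_candidates = {8, 16, 32};            // 3 ranks ...
    cfg.lr_candidates   = {1e-4f, 3e-4f, 1e-3f};  // ... x 3 learning rates = 9 trials
    cfg.max_trials      = 9;
    cfg.budget_seconds  = 3600;                   // budget exceeded -> early-stop

    HyperparamResult result = pipeline.runHyperparamSearch(cfg);
    // The best configuration is auto-applied to the pipeline; trial_log records
    // every trial for offline inspection.
    std::printf("best rank=%d lr=%g val_loss=%g (%zu trials)\n",
                result.best_rank, result.best_lr, result.best_val_loss,
                result.trial_log.size());
}
```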
- Adapter serving integration with LLM inference layer (Target: Q3 2026) — `ILLMRouter` abstract interface + `DeployResult` in `include/training/adapter_serving.h`; `setLLMRouter(ILLMRouter*)` on `IncrementalLoRATrainer`; `deployVersionEx()`/`rollbackVersionEx()` propagate weights to the router; 29 focused tests in `test_training_phase2.cpp`
- [?] Support for additional legal jurisdictions beyond German law
- [?] Active learning loop (auto-select most informative samples for labeling)
- [?] Training data deduplication and near-duplicate filtering
- [?] Evaluation metrics dashboard (validation loss curves, accuracy)
- [?] Export labeled datasets in standard formats (JSONL, Hugging Face datasets)
Full research: `research/ADALORA_TT_BRIDGE_RESEARCH.md`
Specification: `include/training/adalora_tt_bridge.h`
- Phase 1 (Q2 2027): Core conversion AdaLoRA ↔ TT — `AdaLoraTTBridge::exportToTT()`/`importFromTT()`
  - Mathematical basis: G₀[0,:,i] = P[:,i]·√λᵢ, G₁[i,:,0] = Q[:,i]·√λᵢ (bijective for 2D matrices); see the derivation after this list
  - QR sign-normalisation + orthogonality validation (‖P^T·P − I‖_F < ε_orth = 1e-4)
  - Round-trip error < machine epsilon; 15+ unit tests
  - Acceptance: lossless for active_rank ≤ 64; `std::invalid_argument` for rank > max_tt_rank
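Why the core construction above is lossless, assuming AdaLoRA's SVD-style parametrisation ΔW = P·diag(λ)·Qᵀ: contracting the two TT cores recovers exactly that product,

    ΔW[a,b] = Σᵢ G₀[0,a,i]·G₁[i,b,0] = Σᵢ (P[a,i]·√λᵢ)·(Q[b,i]·√λᵢ) = Σᵢ λᵢ·P[a,i]·Q[b,i] = (P·diag(λ)·Qᵀ)[a,b].

For λᵢ > 0 the map {P, λ, Q} ↔ {G₀, G₁} is invertible up to the per-column sign of √λᵢ, which is exactly the ambiguity the QR sign-normalisation step pins down.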
- Phase 2 (Q2 2027): Storage integration — `store()`/`loadAdapter()` via `TensorNetworkStorageEngine`
  - Key schema: `__lora_adapters__:<tenant>:<adapter>:<layer>:G<0|1>` (a key-builder sketch follows this list)
  - `LoRACheckpointManager` backend `TT_STORAGE`; adapter-load latency target ≤ 11 ms (7B, r=64)
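A hypothetical helper composing the TT-core storage key shown above; the separator and escaping rules of the shipped engine are assumptions:

```cpp
#include <string>

// Builds "__lora_adapters__:<tenant>:<adapter>:<layer>:G<0|1>" per the key schema.
std::string ttCoreKey(const std::string& tenant, const std::string& adapter,
                      const std::string& layer, int core /* 0 or 1 */) {
    return "__lora_adapters__:" + tenant + ":" + adapter + ":" + layer +
           ":G" + std::to_string(core);
}
```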
- Phase 3 (Q3 2027): Deduplication + serving — `TensorFingerprintGraph` integration
  - `findSimilarAdapters()` for FLARE live adapter switch ≤ 15 ms
  - `GgmlTensorBridge::mapAdapter()` zero-copy mmap path
  - Expected: ≥ 40% storage reduction for 100 domain-related adapters
- Phase 4 (Q4 2027): Unified rank control — `roundAndReallocate()`
  - TT-rounding as a globally optimal alternative to AdaLoRA greedy pruning
  - Comparison study: AdaLoRA pruning vs. TT-SVD vs. combined (target: ≥ 2% better Frobenius-optimal rank cut)
- [?] Reinforcement learning from human feedback (RLHF) training loop
- [?] Multi-modal training samples (text + table + chart)
- Domain adaptation beyond legal (medical, financial) — `DomainType` LEGAL/MEDICAL/FINANCIAL in `auto_labeler.h`; domain-specific keyword extraction for medical/financial domains in `auto_labeler.cpp`
- Federated learning for privacy-preserving cross-institution training (Target: Q2 2026) — `LoRAFederationCoordinator` + `IncrementalLoRATrainer::exportGradient()`/`applyGlobalDelta()` in the `distributed_knowledge` and `training` modules
- Model distillation from large to small adapters (Target: Q2 2026) — `FederatedDistillationCoordinator` in `distributed_knowledge/federated_distillation_coordinator.h/.cpp`; teacher submits DP-protected soft labels; student nodes receive them via a `registerStudent()` callback; `PolicyGate` + rollback trigger + audit hook; FDF-01..10 tests in `tests/test_federated_distillation_coordinator.cpp`
- LegalAutoLabeler – NLP modality extraction from legal documents
  - `labelAll()`, `labelDocument()`, `labelQuery()` public APIs
  - `labelAll()` and `labelQuery()` fetch document IDs from the DB via the AQL query executor (`executeAql()`); offline/nullptr-engine fallback for tests
  - Low-confidence sample flagging and human-review queue with `updateSampleConfidence()`
- IncrementalLoRATrainer – full LoRA lifecycle (train, evaluate, deploy, rollback)
  - INITIAL and INCREMENTAL training modes with configurable rank/alpha/lr
  - Checkpoint save and resume (`resumeFromCheckpoint()`)
  - Adapter version management (`deployVersion`, `rollbackVersion`, `listVersions`)
- KnowledgeGraphEnricher – AQL graph traversal context enrichment (`findRelatedProvisions`, `findRelatedCaseLaw`)
- Confidence-threshold filtering for automatic sample acceptance
- Pimpl pattern for ABI stability across all three components
- Adapter version management: atomic deploy/rollback with integrity verification (Target: Q2 2026) — `deployVersionEx()`/`rollbackVersionEx()` in `incremental_lora_trainer.h/.cpp`; `verifyAdapterIntegrity()` calls `LoRACheckpointManager::validate()` when `checkpoint_dir` is set; bypass for unmanaged adapters; `DeployResult{success, active_version, split_applied, error}` result struct; error codes: `"version_not_found"`, `"integrity_failure"`, `"router_unavailable"`, `"invalid_split"`
- Multi-domain support beyond German legal text (medical, financial) (Target: Q2 2026) — `DomainType` enum (LEGAL/MEDICAL/FINANCIAL) added to `auto_labeler.h`; `AutoLabelConfig::domain_type` field; `extractFallbackModalities()` in `auto_labeler.cpp` dispatches domain-specific obligation/recommendation/permission/prohibition patterns for the medical (must/shall/required/should/recommended/may/contraindicated/verboten) and financial (must/shall/required/should/may/prohibited/forbidden/disclose/report/offenlegen/melden) domains; German and English terms are both covered
- Automated hyperparameter search (LoRA rank and learning rate sweep) (Target: Q2 2026)
- Adapter serving integration with the LLM inference layer (Target: Q3 2026) — `ILLMRouter` abstract interface (`adapter_serving.h/.cpp`): `setAdapterWeight(version, weight)`, `isAvailable()`, `activeVersion()`; `IncrementalLoRATrainer::setLLMRouter(ILLMRouter*)` wires the router; `deployVersionEx()`/`rollbackVersionEx()` propagate weight updates to the router atomically after the local registry update; unavailable router → `DeployResult.error = "router_unavailable"`; see the sketch after this list
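An illustrative router stub plus `DeployResult` handling. The interface and method names come from this roadmap; the exact virtual signatures and the `LoggingRouter` class are assumptions made for the example:

```cpp
#include "training/adapter_serving.h"
#include "training/incremental_lora_trainer.h"
#include <cstdio>
#include <string>

class LoggingRouter : public ILLMRouter {
public:
    bool setAdapterWeight(const std::string& version, float weight) override {
        std::printf("router: adapter %s -> weight %.2f\n", version.c_str(), weight);
        active_ = version;
        return true;
    }
    bool isAvailable() const override { return true; }
    std::string activeVersion() const override { return active_; }
private:
    std::string active_;
};

void deployWithRouter(IncrementalLoRATrainer& trainer) {
    LoggingRouter router;
    trainer.setLLMRouter(&router);   // wire the serving layer

    DeployResult res = trainer.deployVersionEx("legal-de-v3", /*weight=*/1.0f);
    if (!res.success) {
        // Error codes documented above: "version_not_found", "integrity_failure",
        // "router_unavailable", "invalid_split".
        std::fprintf(stderr, "deploy failed: %s\n", res.error.c_str());
    }
}
```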
- `ContentModality` enum (TEXT_CLAUSE, TABLE, CITATION, OCR_IMAGE, UNKNOWN) added to `auto_labeler.h`
- `modality` field added to the `TrainingSample` struct for per-modality confidence thresholds
- `LoRACheckpointManager` – SHA-256 integrity validation, atomic rotation, rolling 3-checkpoint window, manifest JSON (`lora_checkpoint_manager.h/.cpp`)
- `ProvenanceTracker` – `ProvenanceRecord`, `write()`, `recordFilteredSample()`, `queryLineage()`, `getRecord()` (`provenance_tracker.h/.cpp`)
- `EnrichmentLRUCache` – thread-safe LRU map inside `KnowledgeGraphEnricher`; `enableCache`/`disableCache`/`getCacheStats` API
- `ConfidenceCalibrator` – isotonic regression (PAV algorithm) for per-category threshold selection in `training_pipeline.h/.cpp`; see the PAV sketch after this list
- Multi-modality full parser (`training/modality_parser.h/.cpp`): `ModalityDetector`, `TextClauseExtractor`, `TableExtractor`, `CitationExtractor`, `OCRExtractor`
- Standalone focused test targets for the training module (`ModalityParserFocusedTests`, `TrainingConvergenceFocusedTests`)
- [?] Active learning loop (auto-select most informative unlabelled samples)
Paper 1 — §5 Training Data Pipeline / §7.4 Golden Dataset Construction
Issue: `docs/issues/lora_loops/IMPL-A1-dataset-construction.md`
- Add `DomainType::DATABASE_OPTIMIZER` to the `DomainType` enum in `include/training/auto_labeler.h`
- Implement the `DatabaseDomainAutoLabeler` class: extends the `LegalAutoLabeler` infrastructure, labels `(query, plan, Δlatency)` triples — `include/training/database_domain_auto_labeler.h` + `src/training/database_domain_auto_labeler.cpp`
- Add a `DATABASE_OPTIMIZER` branch to the `LegalAutoLabeler::categorize()` dispatch table
- Add domain keywords (EXPLAIN, index scan, seq scan, hash join, latency, p99) to `LoRADataSelectionConfig`
- Confidence score: `tanh(|Δlatency_ms| / 50)` — labels with |Δlatency| < 5 ms are auto-rejected
- Validation against `LoRADataSelectionPipeline` quality filters (duplicate-query dedup, min confidence 0.85)
- 8 unit tests in `tests/test_training_database_optimizer.cpp` (DBO-01..08)
  - DBO-01: `categorize()` returns `DATABASE_OPTIMIZER` for an EXPLAIN output sample
  - DBO-02: confidence 0.0 for |Δlatency| = 0 ms
  - DBO-03: confidence ≥ 0.85 for |Δlatency| = 50 ms
  - DBO-04: domain keyword match triggers the correct domain type
  - DBO-05: CLI export produces valid JSONL
  - DBO-06: duplicate query filtered by `LoRADataSelectionPipeline`
  - DBO-07: medical/legal domains unaffected by the DATABASE_OPTIMIZER branch
  - DBO-08: 1 000-sample golden dataset passes all quality filters
- Implement the `DatabaseDomainAutoLabeler` class (`include/training/database_domain_auto_labeler.h`, `src/training/database_domain_auto_labeler.cpp`): labels `(query, plan, Δlatency)` triples
- Add a `DATABASE_OPTIMIZER` branch to the `LegalAutoLabeler::categorize()` dispatch table
- Add domain keywords (EXPLAIN, index scan, seq scan, hash join, latency, p99) to `LoRADataSelectionConfig`
- Implement the optimizer-log export CLI: emits JSONL with `(query, explain_plan, latency_delta_ms)` fields — `DatabaseDomainAutoLabeler::exportToJsonl()` static method
- Confidence score: `tanh(|Δlatency_ms| / 50)` — labels with |Δlatency| < 5 ms are auto-rejected; see the sketch after this list
- Validation against `LoRADataSelectionPipeline` quality filters (duplicate-query dedup, min confidence 0.85) — DBO-06 uses `DataSelectionPipeline::deduplicate()`
- Collect 1 000 labeled pairs from all 4 loops as the minimum viable golden dataset — DBO-08 validates 1 000 synthetic samples, all with confidence ≥ 0.85
- 8 new unit tests: DBO-01…DBO-08 in `tests/test_training_database_optimizer.cpp` (`test_training_database_optimizer_focused` target)
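The confidence rule above is small enough to sketch exactly; the function name and the `optional` return are illustrative assumptions. One arithmetic note: under this exact formula, tanh(1) ≈ 0.76 at |Δlatency| = 50 ms, and a 0.85 threshold is first cleared near |Δlatency| ≈ 63 ms, so DBO-03's expectation at 50 ms is worth re-checking against the shipped scale.

```cpp
#include <cmath>
#include <optional>

// tanh(|delta_latency_ms| / 50), with auto-reject below 5 ms as specified above.
std::optional<double> optimizerLabelConfidence(double delta_latency_ms) {
    double abs_delta = std::fabs(delta_latency_ms);
    if (abs_delta < 5.0) return std::nullopt;  // too small to be a reliable label
    return std::tanh(abs_delta / 50.0);        // 0 at 0 ms, approaches 1 for large deltas
}
```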
Paper 1+3 — §4.5 Adapter Lifecycle / Distributed Knowledge §Layer B
Issue: `docs/issues/lora_loops/IMPL-A3-federation-hooks.md`
- `IncrementalLoRATrainer::exportGradient()` → `EncryptedGradient` (opaque blob, AES-256-GCM) — `include/training/incremental_lora_trainer.h`
- `IncrementalLoRATrainer::applyGlobalDelta(const GlobalAdapterDelta&)` → applies the FedAvg aggregate to local adapter weights
- `EncryptedGradient` and `GlobalAdapterDelta` structs in `training_interfaces.h`
- Privacy invariant: `exportGradient()` output must never contain raw training samples — enforced by a unit test
- 5 unit tests in `tests/test_incremental_lora_trainer.cpp` (FED-01..05); a round-trip sketch follows this list
  - FED-01: `exportGradient()` returns a non-empty blob after training
  - FED-02: `applyGlobalDelta()` verifiably changes adapter weights (weight diff ≠ 0)
  - FED-03: applying a zero delta leaves weights unchanged
  - FED-04: privacy — raw sample text absent from the serialised `EncryptedGradient` bytes
  - FED-05: double-apply is idempotent when delta == 0
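A sketch of one federated round using the hooks above. Only `exportGradient()` and `applyGlobalDelta()` are named in this roadmap; the coordinator methods `submitGradient()` and `aggregate()` are assumptions standing in for the real `LoRAFederationCoordinator` API:

```cpp
#include "training/incremental_lora_trainer.h"
#include <vector>

void federatedRound(std::vector<IncrementalLoRATrainer*>& clients,
                    LoRAFederationCoordinator& coordinator) {
    // 1. Each client trains locally, then exports an encrypted gradient blob
    //    (AES-256-GCM; must never contain raw training samples, cf. FED-04).
    for (auto* client : clients)
        coordinator.submitGradient(client->exportGradient());

    // 2. The coordinator aggregates inside its trust boundary (FedAvg).
    GlobalAdapterDelta delta = coordinator.aggregate();

    // 3. Every client applies the same global delta to its local adapter;
    //    a zero delta is a no-op (FED-03) and double-apply is idempotent (FED-05).
    for (auto* client : clients)
        client->applyGlobalDelta(delta);
}
```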
- Phase 1 — Protocol design: Federated Distillation protocol for the `Client`/`Coordinator`/`Verifier` roles specified and wired via `IncrementalLoRATrainer`, `LoRAFederationCoordinator`, and governance/audit hooks (Target: Q2 2026)
- Phase 1 — Threat model: honest-but-curious + Byzantine client model; membership-inference/model-inversion risk coverage documented in the module security/audit docs (Target: Q2 2026)
- Phase 2 — Baseline + privacy controls: central-vs-federated baseline path, Gaussian DP controls (`dp_epsilon`, `dp_delta`), and secure cross-shard gradient exchange (`EncryptedGradient`) integrated (Target: Q2 2026); a DP-noise sketch follows this list
- Phase 2 — Robust aggregation: non-IID-resilient median/FedAvg aggregation and poisoning/outlier protection paths validated by the distributed-knowledge tests (Target: Q2 2026)
- Phase 3 — Evaluation: non-IID and cross-domain federation scenarios validated in `tests/test_distributed_knowledge_integration.cpp` and the resilience suite `tests/test_distributed_knowledge_or.cpp` (Target: Q2 2026)
- Phase 3 — Trade-off measurement: privacy/utility and failure-mode observability exposed via coordinator stats (`getStats()`) and audit callbacks (Target: Q2 2026)
- Phase 4 — Production rollout: canary-style staged federation enablement, model governance controls, and a rollback path through `deployVersionEx()`/`rollbackVersionEx()` and federation admin integration (Target: Q3 2026)
- Phase 4 — Fallback safety: policy/quality guardrails enforce safe fallback to local adapters if federation or governance checks fail (Target: Q3 2026)
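A sketch of the classic Gaussian mechanism behind the `dp_epsilon`/`dp_delta` controls: clip the gradient to bound its L2 sensitivity, then add noise with σ = √(2 ln(1.25/δ)) · Δf / ε, the standard calibration for (ε, δ)-DP with ε ≤ 1. The function name and clipping step are illustrative assumptions, not the shipped code path:

```cpp
#include <cmath>
#include <random>
#include <vector>

void addGaussianDPNoise(std::vector<float>& grad, double clip_norm,
                        double epsilon, double delta, std::mt19937& rng) {
    // Clip the gradient to bound its L2 sensitivity by clip_norm.
    double norm = 0.0;
    for (float g : grad) norm += static_cast<double>(g) * g;
    norm = std::sqrt(norm);
    if (norm > clip_norm)
        for (float& g : grad) g = static_cast<float>(g * clip_norm / norm);

    // Calibrate sigma and add i.i.d. Gaussian noise per coordinate.
    double sigma = std::sqrt(2.0 * std::log(1.25 / delta)) * clip_norm / epsilon;
    std::normal_distribution<double> noise(0.0, sigma);
    for (float& g : grad) g += static_cast<float>(noise(rng));
}
```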
- ≥ 90% task quality vs. the centralized baseline at the configured privacy budget (Target: Q3 2026)
- Federated round overhead ≤ 15% versus a non-federated update in the focused DK benchmarks (Target: Q3 2026)
- 0 unprotected raw-data exfiltrations along training/federation paths (Target: Continuous)
- Demonstrated robustness under simulated poisoning and timeout scenarios (Target: Q3 2026)
- Technical protocol + threat model documentation (Target: Q2 2026)
- Reproducible evaluation suite for federated rounds and resilience scenarios (Target: Q2 2026)
- Governance/release criteria with model rollback safety checks (Target: Q3 2026)
- Implementation backlog for production integration and hardening (Target: Q3 2026)
- Privacy/utility trade-off measurable and reviewable by stakeholders
- Security mechanisms validated through tests and attack/failure simulations
- Rollout and rollback path documented and testable in training + federation flows
- Unit test coverage > 80% (8 test files, 4,381 lines; ConfidenceCalibrator, ModalityParser, Pipeline E2E, Data Selection, Checkpoint, Provenance all covered)
- Integration tests (label → train → evaluate → deploy lifecycle) – `test_training_pipeline_e2e.cpp`
- Performance benchmarks – `benchmarks/bench_legal_lora_pipeline.cpp`
- [?] Security audit (PII scanning, tenant isolation, checkpoint encryption at rest – see the FUTURE_ENHANCEMENTS.md Security/Reliability section)
- Documentation complete (README.md, ARCHITECTURE.md, ROADMAP.md, FUTURE_ENHANCEMENTS.md)
- API stability guaranteed (Pimpl pattern; `TrainingSample` struct stable from v1.x)
- The NLP modality extractor is provided externally (`analytics::NlpTextAnalyzer`); it is not bundled.
- Multi-GPU training requires `THEMIS_ENABLE_LLM && THEMIS_ENABLE_GPU` at build time; single-GPU fallback is automatic.
- `IncrementalTrainingConfig.quantization` governs the training-module view of quantization; INT8/NF4 use `QLoRALayer` (from `llm/lora_framework/quantized_model.h`), so only the LoRA adapters are updated in full precision while the base weights remain compressed. The LLM inference layer uses a separate `QuantizationType` defined in `llm/lora_framework/quantization.h`. A configuration sketch follows this list.
- LoRA adapter serving (inference) must be handled by the LLM integration layer.
- Real LoRA weight updates use the embedded Tensor framework; base-model tokenization (llama.cpp) is not yet wired — training batches are encoded as float feature vectors derived from sample hashes.
- `LoRAAdapter` (training module) operates independently of the LLM-layer `LoRALayer`; integrating it with `IncrementalLoRATrainer` checkpoints is the caller's responsibility via `exportWeights()`/`importWeights()`.
- The `TrainingSample` struct is stable from v1.x; new fields are optional only. `IncrementalTrainingConfig` may gain new hyperparameter fields in v1.5.0; backward-compatible.
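A quantization configuration sketch using the enum and fields named above; the `QuantizationConfig` field layout (`.type`) is an assumption:

```cpp
#include "training/incremental_lora_trainer.h"

IncrementalTrainingConfig makeQloraConfig() {
    IncrementalTrainingConfig cfg;
    // NONE/FP16 -> standard LoRALayer; INT8/NF4 -> QLoRALayer: base weights stay
    // frozen and compressed, only the LoRA adapters A and B train in full precision.
    cfg.quantization.type = TrainingQuantizationType::NF4;
    cfg.lora_rank  = 16;
    cfg.lora_alpha = 32.0f;
    return cfg;   // checked later by validateHyperparameters()
}
```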
As of 2026-04-20 – source: src/UNUSED_FUNCTIONS_REPORT.md
- `AdaLoRAAdapter` – AdaLoRA adapter for parameter-efficient fine-tuning; tests exist. Action: add a ROADMAP ticket for production integration, or mark as CANDIDATE_FOR_REMOVAL.