-
Notifications
You must be signed in to change notification settings - Fork 32
Open
Labels
enhancementNew feature or requestNew feature or request
Description
🚀 Is your feature request related to a problem?
Current point/range read paths still rely mostly on per-SST (table-level) filters and regular level traversal. For range-heavy and scan-heavy workloads, repeated filter probes and cross-level checks can add avoidable CPU overhead.
We want a cleaner, reusable way to experiment with global-level filtering strategies (inspired by the GRF paper) without tightly coupling the logic to one specific LSM implementation path.
Paper reference:
💡 Proposed Solution
Introduce a plugin-style Global Range Filter (GRF) module in NoKV that can be enabled/disabled and evolved independently.
High-level goals:
- Provide a pluggable GRF interface for run/level pruning decisions.
- Keep compatibility with existing Bloom/table filter path.
- Support A/B benchmarking via runtime config flags.
- Maintain correctness-first semantics with fallback to existing path on uncertainty.
🎨 Alternatives Considered
- Hardwire GRF into current levels/table traversal.
- Pros: fast to prototype.
- Cons: high coupling, harder rollback/testing, difficult to compare algorithms.
- Keep only per-table filters.
- Pros: simpler.
- Cons: misses potential CPU reduction on range/scan-heavy workloads.
- Plugin-style GRF module (proposed).
- Pros: easier experimentation, clean abstraction, safer rollout.
- Cons: additional interfaces and integration complexity.
📝 Technical Details / Implementation Plan
- Define
GlobalFilterinterface (build/load/query/metrics hooks). - Add GRF config options (enable flag, memory budget, build mode, fallback strategy).
- Integrate GRF query path into levels/range pruning decision pipeline.
- Ensure correctness fallback: when GRF result is uncertain, use legacy traversal.
- Add manifest/version metadata for GRF state compatibility.
- Add unit tests for key-shape edge cases (long keys, boundary keys, empty ranges).
- Add integration tests for point/range/scan correctness under compaction and restart.
- Add benchmark matrix (with/without GRF, varying scan length and levels).
- Add observability counters (hit/miss/fallback/pruned-runs/query-latency).
- Document architecture and usage in docs.
🧐 Additional Context
This issue tracks the plugin architecture and engineering integration for GRF.
Suggested milestones:
- Interface + no-op implementation + wiring.
- Baseline GRF implementation and correctness tests.
- Performance validation and tuning.
- Optional advanced variants (learned/shape-aware enhancements).
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
enhancementNew feature or requestNew feature or request
Projects
Status
Todo