Summary
Support remote dynamic filter propagation for distributed queries: dynamic filters produced by joins on the frontend are registered, serialized, and propagated to remote datanode scans so those scans can prune data earlier. This is an optimization-only feature. Any failure must degrade safely to local-only dynamic filtering and must not affect query correctness.
Related PRs
Component Breakdown
| Component | Description | Status |
| --- | --- | --- |
| RFC / design | Defines the high-level remote dyn filter propagation model and phased rollout | ✅ |
| Wire ABI / query identity | Defines `query_id`, `filter_id`, epoch / completion semantics, payload boundaries, and safe downgrade rules | ✅ |
| Region RPC control plane | Adds a unary frontend -> datanode control-plane entry for remote dyn filter `update` / `unregister` messages | 🔄 |
| Frontend producer / bridge | Identifies distributed join dynamic filters, generates stable query-local `filter_id`s, stores minimal query-scoped frontend state, and attaches bounded initial-register metadata to the first remote read | 🔄 |
| Datanode initial registration | Receives initial-register metadata and prepares query-scoped handoff state for later consumer/runtime installation | 🔄 |
| Datanode apply runtime | Maintains `query_id + filter_id` state, applies ordered updates, installs remote dynamic filter wrappers into scan predicates, handles remap, and owns successful-path cleanup | 🔜 |
| Unregister / lifecycle cleanup | Cleans up frontend and datanode state on end-of-use, query finish, cancel, stream drop, or TTL fallback | 🔜 |
| Observability / fallback | Adds metrics, tracing, budgets, error downgrade visibility, and control-plane protection | 🔜 |
| End-to-end validation | Covers distributed join pruning, correctness under downgrade/failure, epoch ordering, cleanup, and performance baselines | 🔜 |
| Large build-side membership | Designs a transportable representation for non-serializable `HashTableLookupExpr`-like membership, likely via Bloom / custom payloads | 🔜 |
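
To make the "Datanode apply runtime" and "Unregister / lifecycle cleanup" rows more concrete, here is a minimal sketch of epoch-ordered updates keyed by `query_id + filter_id`. It is an illustration only, not the actual implementation: every type and method name (`FilterKey`, `FilterUpdate`, `FilterRegistry`, `ApplyOutcome`) is hypothetical, the payload is an opaque byte vector standing in for a serialized predicate, and it assumes epochs start at 1 and increase with each update.

```rust
use std::collections::HashMap;

/// Hypothetical identity of one remote dynamic filter within one query.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct FilterKey {
    query_id: String,
    filter_id: u32,
}

/// Hypothetical update message; `payload` stands in for a serialized predicate.
#[derive(Debug, Clone)]
struct FilterUpdate {
    key: FilterKey,
    epoch: u64,
    payload: Vec<u8>,
    is_complete: bool,
}

/// Every non-applied outcome is a safe downgrade: the scan simply keeps
/// whatever filter it already has, so correctness is unaffected.
#[derive(Debug, PartialEq, Eq)]
enum ApplyOutcome {
    Applied,
    IgnoredStale,
    IgnoredUnknown,
}

/// Per-filter state; epoch 0 means "registered, no update applied yet".
#[derive(Debug, Default)]
struct FilterState {
    epoch: u64,
    payload: Vec<u8>,
    complete: bool,
}

#[derive(Default)]
struct FilterRegistry {
    filters: HashMap<FilterKey, FilterState>,
}

impl FilterRegistry {
    /// Initial registration: creates empty state if none exists yet.
    fn register(&mut self, key: FilterKey) {
        self.filters.entry(key).or_default();
    }

    /// Applies an update only if it is newer than the current state and the
    /// filter has not already been marked complete.
    fn apply(&mut self, update: FilterUpdate) -> ApplyOutcome {
        let state = match self.filters.get_mut(&update.key) {
            Some(state) => state,
            None => return ApplyOutcome::IgnoredUnknown,
        };
        if state.complete || update.epoch <= state.epoch {
            return ApplyOutcome::IgnoredStale;
        }
        state.epoch = update.epoch;
        state.payload = update.payload;
        state.complete = update.is_complete;
        ApplyOutcome::Applied
    }

    /// Returns the latest applied payload, if the filter is registered.
    fn payload(&self, key: &FilterKey) -> Option<&[u8]> {
        self.filters.get(key).map(|s| s.payload.as_slice())
    }

    /// Drops all state for a query (finish, cancel, stream drop, or TTL);
    /// returns how many filters were removed.
    fn unregister_query(&mut self, query_id: &str) -> usize {
        let before = self.filters.len();
        self.filters.retain(|k, _| k.query_id != query_id);
        before - self.filters.len()
    }
}
```

In this sketch a stale, duplicate, or unknown update is dropped rather than treated as an error, which is one way to keep the feature optimization-only: a lost or reordered control-plane message can only make pruning less aggressive, never change results.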