Summary
Track the implementation of sample-based NDV collection for large-table ANALYZE.
Goal
Reduce TiKV-side NDV collection cost for very large analyze jobs while keeping NDV accuracy acceptable.
Scope
- implement sample-based NDV collection for Analyze V2
- define how it is enabled (explicit knob and/or auto-trigger for very large / slow analyze jobs)
- benchmark both NDV accuracy and resource usage on representative datasets
- document behavior and limitations clearly
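
For illustration only, the sketch below shows one well-known way to estimate NDV from a uniform row sample, the GEE estimator (Charikar et al.), together with a simple relative-error metric that could be used when benchmarking accuracy. It is not the Analyze V2 implementation; the function names `estimate_ndv_gee` and `relative_error` are hypothetical, and the actual estimator and sampling method are still to be decided in this work.

```python
import math
from collections import Counter


def estimate_ndv_gee(sample, total_rows):
    """Estimate the number of distinct values (NDV) from a uniform row sample.

    Uses the GEE estimator: sqrt(N / n) * f1 + (values seen more than once),
    where N is the table row count, n is the sample size, and f1 is the
    number of values that appear exactly once in the sample.
    This is an assumed estimator for illustration, not TiDB/TiKV's algorithm.
    """
    n = len(sample)
    if n == 0:
        return 0.0

    value_counts = Counter(sample)
    # Values seen exactly once are the ones most likely to have unseen peers.
    f1 = sum(1 for count in value_counts.values() if count == 1)
    seen_more_than_once = len(value_counts) - f1

    estimate = math.sqrt(total_rows / n) * f1 + seen_more_than_once
    # The true NDV is at least what the sample shows and at most the row count.
    return float(min(max(estimate, len(value_counts)), total_rows))


def relative_error(estimate, truth):
    """Relative NDV error against a ground-truth value from a full scan."""
    return abs(estimate - truth) / truth
```

A benchmark could compare `estimate_ndv_gee(sample, total_rows)` against the exact NDV from a full scan on each dataset and report `relative_error` alongside TiKV CPU usage.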
Out of scope
The first step does not promise to reduce full-table scans or IOPS. The initial implementation mainly targets TiKV CPU cost.