enhance: persist segment summary metadata on SegmentInfo#50410
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: tedxu
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## master #50410 +/- ##
===========================================
- Coverage 78.96% 44.75% -34.22%
===========================================
Files 2239 12 -2227
Lines 396917 2098 -394819
===========================================
- Hits 313441 939 -312502
+ Misses 73900 1107 -72793
+ Partials 9576 52 -9524
Add a Statistics message embedded on SegmentInfo so DataCoord reads aggregate metrics (sizes, counts, timestamp range, null counts, quantiles) from a persisted field instead of iterating FieldBinlog arrays on every scheduling decision.

Wire formats:

SaveBinlogPathsRequest.stats is populated by every flush. The receiver consumes only stats_binlog_size, and only for V3 segments where stats live in the manifest; V2 derives the cumulative footprint from the statslog FieldBinlog array.

CompactionSegment.stats is populated by every compactor at write completion and copied verbatim onto the new SegmentInfo. Insert and delta aggregates come from the FieldBinlog arrays the compactor ships; stats blob footprint comes from a counter the BinlogRecordWriter accumulates inside writeStats / appendV3Stats and exposes via GetStatsBlobSize().

For V3 segments, AlterSegments skips per-FieldBinlog KV writes. The manifest is the authoritative path source and Stats carries the aggregates DataCoord needs. Pre-existing V3 segments with binlog KVs still load correctly via applyBinlogInfo.

Migration is lazy: NewSegmentInfo back-fills Stats from arrays on first load when the persisted field is nil. The back-fill lives in memory until the next mutating operator persists the SegmentInfo proto.

ShouldDoSingleCompaction reports expired_fraction one bucket below the precise lower bound to prevent byte-size over-trigger when binlog sizes vary.

EnsureStats is non-mutating to avoid a data race with concurrent RLock readers; eager init happens at NewSegmentInfo and under the meta.segMu write lock in mutating operators.

V3 L0 compaction outputs now carry TimestampFrom/To on the resulting deltalog, matching the V1 path.

issue: milvus-io#50406
Signed-off-by: Ted Xu <ted.xu@zilliz.com>
See #50406
DataCoord today derives aggregate segment metrics by iterating
SegmentInfo's FieldBinlog arrays on every scheduling decision.
For V3 segments the per-field binlog KV entries in etcd are pure
write amplification because the manifest is already the
authoritative path source.
This PR adds a Statistics message embedded on SegmentInfo so
DataCoord reads aggregates (sizes, counts, timestamp range,
null counts, quantiles) directly from a persisted field with no
branching on storage version at the read sites.
Wire formats
SaveBinlogPathsRequest.stats is populated by every flush. The
receiver consumes only stats_binlog_size, and only for V3
segments where stats live in the manifest. V2 derives the
cumulative footprint from the statslog FieldBinlog array.
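The receiver-side rule above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the types are simplified stand-ins for the proto messages, and `statsFootprint`, `storageV3`, and the field names are hypothetical. The nil-stats fallback for V3 is also an assumption.

```go
package main

// Hypothetical, simplified stand-ins for the proto messages named in the PR.
type Binlog struct{ LogSize int64 }

type FieldBinlog struct{ Binlogs []Binlog }

type Statistics struct{ StatsBinlogSize int64 }

// storageV3 is an assumed constant for the V3 storage version.
const storageV3 int64 = 3

// statsFootprint picks the stats-blob footprint for a flushed segment.
// V3: trust the size reported in the request, since the stats live in the manifest.
// Otherwise: sum the sizes found in the statslog FieldBinlog array.
func statsFootprint(storageVersion int64, reqStats *Statistics, statslogs []FieldBinlog) int64 {
	if storageVersion == storageV3 && reqStats != nil {
		return reqStats.StatsBinlogSize
	}
	var total int64
	for _, fb := range statslogs {
		for _, b := range fb.Binlogs {
			total += b.LogSize
		}
	}
	return total
}
```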
CompactionSegment.stats is populated by every compactor at
write completion and copied verbatim onto the new SegmentInfo.
Insert and delta aggregates come from the FieldBinlog arrays
the compactor ships; stats blob footprint comes from a counter
the BinlogRecordWriter accumulates inside writeStats /
appendV3Stats and exposes via GetStatsBlobSize.
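A counter of this kind might look like the sketch below. Only the method names writeStats, appendV3Stats, and GetStatsBlobSize come from the description; the `recordWriter` type, the method signatures, and the idea of counting raw blob bytes are assumptions made for illustration.

```go
package main

// recordWriter is a hypothetical stand-in for BinlogRecordWriter.
type recordWriter struct {
	statsBlobSize int64 // running total of stats blob bytes written so far
}

// writeStats records a stats blob on the non-V3 path (assumed signature).
func (w *recordWriter) writeStats(blob []byte) {
	// ... the real writer would upload the blob here ...
	w.statsBlobSize += int64(len(blob))
}

// appendV3Stats records a stats blob destined for the V3 manifest (assumed signature).
func (w *recordWriter) appendV3Stats(blob []byte) {
	// ... the real writer would append the blob to the manifest here ...
	w.statsBlobSize += int64(len(blob))
}

// GetStatsBlobSize exposes the accumulated footprint to the compactor.
func (w *recordWriter) GetStatsBlobSize() int64 {
	return w.statsBlobSize
}
```

Keeping the counter inside the writer means the compactor does not need to re-read or re-list the stats blobs to learn their size.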
V3 etcd KV write skip
AlterSegments skips per-FieldBinlog KV writes for V3 segments;
the manifest is authoritative for paths and Stats carries the
aggregates DataCoord needs. Pre-existing V3 segments with binlog
KVs still load correctly via applyBinlogInfo.
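As a rough sketch of the skip, assuming a helper that assembles the KV pairs for one segment: the key layout, the `buildAlterKVs` name, and the `storageV3` constant are all hypothetical and do not reflect the actual etcd schema.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the real messages.
type FieldBinlog struct {
	FieldID int64
	Paths   []string
}

type SegmentInfo struct {
	ID             int64
	StorageVersion int64
	Binlogs        []FieldBinlog
}

// storageV3 is an assumed constant for the V3 storage version.
const storageV3 int64 = 3

// buildAlterKVs returns the KV pairs to persist for one segment.
// The segment-level entry is always written; per-FieldBinlog entries
// are written only for non-V3 segments.
func buildAlterKVs(seg SegmentInfo) map[string]string {
	kvs := map[string]string{
		// Placeholder value: the real entry would be the serialized SegmentInfo.
		fmt.Sprintf("segment/%d", seg.ID): "<serialized SegmentInfo>",
	}
	if seg.StorageVersion == storageV3 {
		return kvs // manifest is authoritative for paths, so skip binlog KVs
	}
	for _, fb := range seg.Binlogs {
		key := fmt.Sprintf("binlog/%d/%d", seg.ID, fb.FieldID)
		kvs[key] = fmt.Sprint(fb.Paths)
	}
	return kvs
}
```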
Migration
Lazy. NewSegmentInfo back-fills Stats from arrays on first load
when the persisted field is nil. The back-fill lives in memory
until the next mutating operator persists the SegmentInfo proto.
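The lazy back-fill could be sketched along these lines. The struct fields, `SegmentInfoProto`, and `buildStatsFromFieldBinlogs` are simplified, hypothetical stand-ins; the real Statistics message carries more aggregates than the two shown here.

```go
package main

// Hypothetical, simplified stand-ins for the proto messages.
type Binlog struct {
	EntriesNum int64
	LogSize    int64
}

type FieldBinlog struct{ Binlogs []Binlog }

type Statistics struct {
	InsertBinlogSize int64
	NumBinlogs       int64
}

type SegmentInfoProto struct {
	Binlogs []FieldBinlog
	Stats   *Statistics // nil for segments persisted before this change
}

// SegmentInfo is the in-memory wrapper around the persisted proto.
type SegmentInfo struct{ proto *SegmentInfoProto }

// buildStatsFromFieldBinlogs aggregates sizes and counts from the arrays.
func buildStatsFromFieldBinlogs(fbs []FieldBinlog) *Statistics {
	s := &Statistics{}
	for _, fb := range fbs {
		for _, b := range fb.Binlogs {
			s.InsertBinlogSize += b.LogSize
			s.NumBinlogs++
		}
	}
	return s
}

// NewSegmentInfo back-fills Stats in memory when the persisted field is nil.
// Nothing is written to storage here; persistence happens on a later mutation.
func NewSegmentInfo(p *SegmentInfoProto) *SegmentInfo {
	if p.Stats == nil {
		p.Stats = buildStatsFromFieldBinlogs(p.Binlogs)
	}
	return &SegmentInfo{proto: p}
}
```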
Other notable changes
ShouldDoSingleCompaction reports expired_fraction one bucket
below the precise lower bound to prevent byte-size over-trigger
when binlog sizes vary.
EnsureStats is non-mutating to avoid a data race with
concurrent RLock readers; eager init happens at NewSegmentInfo
and under the meta.segMu write lock in mutating operators.
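The locking discipline described here can be illustrated with a small sketch. Only EnsureStats and meta.segMu are named in the description; the `meta` layout, `GetStats`, `UpdateSegment`, and `buildStats` are hypothetical helpers introduced for this example.

```go
package main

import "sync"

// Hypothetical, simplified stand-ins for the real types.
type Binlog struct{ LogSize int64 }

type FieldBinlog struct{ Binlogs []Binlog }

type Statistics struct{ TotalSize int64 }

type SegmentInfo struct {
	Binlogs []FieldBinlog
	Stats   *Statistics
}

func buildStats(fbs []FieldBinlog) *Statistics {
	s := &Statistics{}
	for _, fb := range fbs {
		for _, b := range fb.Binlogs {
			s.TotalSize += b.LogSize
		}
	}
	return s
}

// EnsureStats never writes to the receiver, so it is safe under a read lock.
// If Stats is missing it returns a freshly computed value instead of caching it.
func (s *SegmentInfo) EnsureStats() *Statistics {
	if s.Stats != nil {
		return s.Stats
	}
	return buildStats(s.Binlogs)
}

type meta struct {
	segMu    sync.RWMutex
	segments map[int64]*SegmentInfo
}

// GetStats is a read path: many goroutines may hold RLock at the same time.
func (m *meta) GetStats(id int64) *Statistics {
	m.segMu.RLock()
	defer m.segMu.RUnlock()
	seg, ok := m.segments[id]
	if !ok {
		return nil
	}
	return seg.EnsureStats()
}

// UpdateSegment is a mutating path: eager init happens under the write lock.
func (m *meta) UpdateSegment(id int64, mutate func(*SegmentInfo)) {
	m.segMu.Lock()
	defer m.segMu.Unlock()
	seg, ok := m.segments[id]
	if !ok {
		return
	}
	if seg.Stats == nil {
		seg.Stats = buildStats(seg.Binlogs)
	}
	mutate(seg)
}
```

If EnsureStats cached its result on the struct, two readers holding RLock could write the same field concurrently, which is the race the description refers to.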
V3 L0 compaction outputs now carry TimestampFrom/To on the
resulting deltalog, matching the V1 path.
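One way to derive such a range is to take the minimum and maximum delete timestamp written into the deltalog. The sketch below assumes exactly that; the `Deltalog` struct shape and the `newDeltalog` helper are hypothetical, and only the TimestampFrom/To field names come from the description.

```go
package main

// Deltalog is a hypothetical, simplified stand-in for the deltalog metadata.
type Deltalog struct {
	EntriesNum    int64
	TimestampFrom uint64
	TimestampTo   uint64
}

// newDeltalog fills the timestamp range from the delete timestamps it covers.
// An empty input yields a zero range.
func newDeltalog(tss []uint64) Deltalog {
	d := Deltalog{EntriesNum: int64(len(tss))}
	if len(tss) == 0 {
		return d
	}
	d.TimestampFrom, d.TimestampTo = tss[0], tss[0]
	for _, ts := range tss[1:] {
		if ts < d.TimestampFrom {
			d.TimestampFrom = ts
		}
		if ts > d.TimestampTo {
			d.TimestampTo = ts
		}
	}
	return d
}
```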
Test plan
BuildStatsFromFieldBinlogs