[WIP] bucket index header reader #13797
091e0ac to 6f61f93
6515349 to 7b26e7d
💻 Deploy preview available (Set up benchmarks to compare disk and bucket readers for index header V2)
Signed-off-by: Vladimir Varankin <[email protected]>

wip! encoding: stream reader
Signed-off-by: Vladimir Varankin <[email protected]>

wip! encoding: reset reader
Signed-off-by: Vladimir Varankin <[email protected]>

rebase
Signed-off-by: Vladimir Varankin <[email protected]>

wip! reduce net buffer
Signed-off-by: Vladimir Varankin <[email protected]>

comments
Signed-off-by: Vladimir Varankin <[email protected]>

cap skippable bytes
Signed-off-by: Vladimir Varankin <[email protected]>
6f61f93 to 492c9c1
    return NewStreamBinaryReader(ctx, logger, bkt, dir, id, postingOffsetsInMemSampling, p.metrics.streamReader, cfg)
    //return NewStreamBinaryReader(ctx, logger, bkt, dir, id, postingOffsetsInMemSampling, p.metrics.streamReader, cfg)
    return NewBucketBinaryReader(ctx, logger, bkt, dir, id, postingOffsetsInMemSampling, cfg)
    }
Bug: Production reader factory accidentally changed to bucket reader
The readerFactory in NewBinaryReader has been changed from NewStreamBinaryReader to NewBucketBinaryReader in production code. The PR title indicates this is meant to "Set up benchmarks to compare disk and bucket readers" but this change affects all production usage of the reader pool. The commented-out NewStreamBinaryReader line suggests this was intended as a temporary testing change. The PR reviewer comment "Wrong." likely refers to this issue.
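One way to avoid flipping the production default by commenting lines in and out is to make the reader selectable. The following is a minimal sketch, not the code in this PR: `Reader`, `ReaderKind`, `NewReaderFactory`, and the two stub reader types are hypothetical names introduced for illustration, and the real constructors take many more arguments.

```go
package main

import "fmt"

// Reader is a hypothetical stand-in for the index-header reader interface.
type Reader interface {
	Name() string
}

// streamReader and bucketReader are stubs standing in for the disk-backed
// and bucket-backed readers discussed in the review comment.
type streamReader struct{}

func (streamReader) Name() string { return "stream" }

type bucketReader struct{}

func (bucketReader) Name() string { return "bucket" }

// ReaderKind is a hypothetical configuration value selecting the reader.
type ReaderKind string

const (
	ReaderKindStream ReaderKind = "stream"
	ReaderKindBucket ReaderKind = "bucket"
)

// NewReaderFactory returns a constructor for the requested reader kind.
// An empty kind falls back to the stream reader, so production behavior
// stays unchanged unless a caller (for example a benchmark) opts in.
func NewReaderFactory(kind ReaderKind) (func() (Reader, error), error) {
	switch kind {
	case "", ReaderKindStream:
		return func() (Reader, error) { return streamReader{}, nil }, nil
	case ReaderKindBucket:
		return func() (Reader, error) { return bucketReader{}, nil }, nil
	default:
		return nil, fmt.Errorf("unknown reader kind %q", kind)
	}
}

func main() {
	// A benchmark would pass ReaderKindBucket explicitly.
	newReader, err := NewReaderFactory(ReaderKindBucket)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	r, err := newReader()
	if err != nil {
		fmt.Println("reader error:", err)
		return
	}
	fmt.Println("selected reader:", r.Name())
}
```

With this shape, benchmarks can request the bucket reader while the pool keeps its existing default.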
    //})
    //if err := g.Wait(); err != nil {
    //    return nil, err
    //}
Bug: Index header download logic commented out in production
The ensureIndexHeaderOnDisk call and its parallel execution with tryDownloadSparseHeader have been commented out, removing the pre-download of index headers during LazyBinaryReader initialization. The import for errgroup is also removed. This breaks the expected initialization behavior where both the sparse header and index header would be prepared before lazy loading occurs. For a benchmark-only PR, these production code changes appear to be accidentally committed debugging code.
What this PR does
This PR wires the bucket index reader into the existing Label Values Offsets benchmarks.
Nothing else fancy.
Considering we are concerned about moving something from bucket to disk, we would likely want to follow up on this to add some instrumentation to record object storage calls or cache access patterns.
Which issue(s) this PR fixes or relates to
Fixes #
Checklist
- CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]. If changelog entry is not needed, please add the changelog-not-needed label to the PR.
- about-versioning.md updated with experimental features.
Note
Introduces a bucket-backed index-header reader and shared decoding interfaces, and extends benchmarks/tests to compare bucket vs disk readers.
- BucketBinaryReader reading symbols/postings directly from object storage; builds/loads sparse headers; caches symbols.
- BucketBinaryReader (replacing disk StreamBinaryReader).
- encoding.BucketDecbufFactory with buffered streaming over objstore.GetRange.
- Decbuf to a reader interface; updated symbols/postings to accept a DecbufFactory interface.
- writeSparseHeadersToFile to accept proto payload; call sites updated.
- Symbols and PostingOffsetTable (V1/V2) refactored to use factory interface; added constructors for sparse headers.
Written by Cursor Bugbot for commit 492c9c1.