refactor: extractors can return multiple samples #17064
7e9be11
to
eb615eb
Compare
pkg/chunkenc/memchunk.go
Outdated
cur: []logproto.Sample{},
currLabels: []log.LabelsResult{},
nit: for the sake of consistency I would name them `curr` and `currLabels`
pkg/chunkenc/memchunk.go
Outdated
if len(e.cur) == 1 {
	e.cur = e.cur[:0]
	e.currLabels = e.currLabels[:0]
}
This was very confusing for me. Could you add a comment explaining what's happening here? The element at index 0 is the current one and there are no more elements after it, meaning that in order to load the next batch we need to clear the current slice `curr`.
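To make the behavior discussed in this thread concrete, here is a minimal sketch of a batching iterator. It is not the code from `memchunk.go`: `sample`, `batchIterator`, `batches`, and `collect` are hypothetical names, and the pending batches are supplied as a plain slice rather than produced by an extractor. It only illustrates the convention that `curr[0]` is the current sample and that the slice is cleared once its last element has been consumed.

```go
package main

// sample is a hypothetical stand-in for logproto.Sample.
type sample struct {
	Timestamp int64
	Value     float64
}

// batchIterator is a hypothetical, simplified iterator. Each entry in
// batches holds every sample extracted from one log line (possibly none).
type batchIterator struct {
	batches [][]sample // pending per-line batches (assumed input)
	curr    []sample   // curr[0] is the sample returned by At
}

// Next advances to the next sample, loading a new batch when needed.
func (it *batchIterator) Next() bool {
	// More samples remain from the current line: drop the head.
	if len(it.curr) > 1 {
		it.curr = it.curr[1:]
		return true
	}
	// curr[0] was the last sample of its batch. Clear the slice so the
	// next batch is loaded into an empty curr.
	if len(it.curr) == 1 {
		it.curr = it.curr[:0]
	}
	for len(it.batches) > 0 {
		next := it.batches[0]
		it.batches = it.batches[1:]
		if len(next) == 0 {
			continue // this line produced no samples
		}
		it.curr = append(it.curr, next...)
		return true
	}
	return false
}

// At returns the current sample. It must only be called after Next
// has returned true.
func (it *batchIterator) At() sample { return it.curr[0] }

// collect drains the iterator and returns the sample values in order.
func collect(it *batchIterator) []float64 {
	var out []float64
	for it.Next() {
		out = append(out, it.At().Value)
	}
	return out
}
```

Without the `len(it.curr) == 1` reset, the already consumed sample would stay at index 0 and the next batch would be appended behind it, so `At` would return a stale value.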
5fef96f
to
6b61619
Compare
Signed-off-by: Trevor Whitney <[email protected]>
What this PR does / why we need it:
This PR refactors the `SampleExtractor` interface to allow `Process` to return multiple samples for a specific line. This refactor is a prerequisite for #17149, which allows extractors in a multi-variant query to pre-process common stages in an extraction pipeline (such as line/label filters, logfmt, etc.) and then run only the extraction-specific stage over the pre-processed log line, which should significantly reduce the amount of work done when processing a multi-variant query.
Originally, the multi-variant query work refactored sample evaluation to be able to accept multiple extractors. This was the wrong approach, as it limits our ability to pre-process a log line. The extractors do not expose much about their internal pipeline, so it would be difficult to find common stages. Furthermore, the extractor does not know the context of the line being processed until the call to `.Process()`, making it hard for the extractors to share state per log line between them.
Checklist
`CONTRIBUTING.md` guide (required)
`feat` PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
`docs/sources/setup/upgrade/_index.md`
`deprecated-config.yaml` and `deleted-config.yaml` files respectively in the `tools/deprecated-config-checker` directory. Example PR
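The idea in the PR description, one `Process` call that runs the shared stages once and then returns several samples for the same line, can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `SampleExtractor` interface from the Loki codebase: `extractedSample`, `multiExtractor`, `variant`, `variantExtractor`, and `newCountAndBytes` are all hypothetical names, and the shared pre-processing is reduced to a single substring filter.

```go
package main

import "strings"

// extractedSample is a hypothetical result type: one value per variant.
type extractedSample struct {
	Variant string
	Value   float64
}

// multiExtractor is a hypothetical interface whose Process may return
// zero or more samples for a single log line.
type multiExtractor interface {
	Process(ts int64, line string) ([]extractedSample, bool)
}

// variant is a hypothetical extraction-specific stage.
type variant struct {
	name    string
	extract func(line string) float64
}

// variantExtractor runs the shared stage (here a simple line filter)
// once, then applies every variant to the surviving line.
type variantExtractor struct {
	keep     string // shared stage: keep only lines containing this text
	variants []variant
}

// Process filters the line once and returns one sample per variant.
func (e *variantExtractor) Process(ts int64, line string) ([]extractedSample, bool) {
	if !strings.Contains(line, e.keep) {
		return nil, false // filtered out before any variant runs
	}
	out := make([]extractedSample, 0, len(e.variants))
	for _, v := range e.variants {
		out = append(out, extractedSample{Variant: v.name, Value: v.extract(line)})
	}
	return out, len(out) > 0
}

// newCountAndBytes builds an extractor with two illustrative variants:
// a line count and a byte count.
func newCountAndBytes(keep string) multiExtractor {
	return &variantExtractor{
		keep: keep,
		variants: []variant{
			{name: "count", extract: func(string) float64 { return 1 }},
			{name: "bytes", extract: func(l string) float64 { return float64(len(l)) }},
		},
	}
}
```

In this sketch the filter runs once per line regardless of how many variants are configured, which is the saving the description refers to. A caller would invoke `newCountAndBytes("error").Process(ts, line)` and receive both samples for a matching line, or `nil, false` for a line the filter rejects.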