Skip to content

refactor: extractors can return multiple samples #17064

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
May 9, 2025

Conversation

trevorwhitney
Copy link
Collaborator

@trevorwhitney trevorwhitney commented Apr 8, 2025

What this PR does / why we need it:

This PR refactors the SampleExtractor interface to allow Process to return multiple samples for a specific line. This refactor is a prerequisite for #17149, which allows extractors in a Multi-Variant query to pre-process common stages in an extraction pipeline (such as line/label filters, logfmt, etc) and then run only the extraction specific stage over the pre-process log line, which should significantly reduce the amount of work done when process a multi-variant query.

Originally, the multi-variant query work refactored sample evaluation to be able to accept multiple extractors. this was the wrong approach as it limits our ability to pre-process a log line. The extractors do not expose much about their internal pipeline, so it would be difficult to find common stages. Furthermore, the extractor does not know the context of the line being processed until the call to .Process(), making it hard for the extractors to share state per log line between them.

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
    • Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

@trevorwhitney trevorwhitney changed the title Refactor extractors multiple samples refactor: extractors can return multiple samples Apr 8, 2025
@trevorwhitney trevorwhitney marked this pull request as ready for review April 11, 2025 17:32
@trevorwhitney trevorwhitney requested a review from a team as a code owner April 11, 2025 17:32
@trevorwhitney trevorwhitney force-pushed the refactor-extractors-multiple-samples branch from 7e9be11 to eb615eb Compare April 11, 2025 18:00
Comment on lines 1734 to 1735
cur: []logproto.Sample{},
currLabels: []log.LabelsResult{},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for the sake of consistency I would name them curr and currLabels

Comment on lines 1757 to 1760
if len(e.cur) == 1 {
e.cur = e.cur[:0]
e.currLabels = e.currLabels[:0]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was very confusing for me. Could you add a comment what's happening here: element at index 0 is the current one and there are no more next elements, meaning that in order to load next batch, we need to clear the current slice curr

@trevorwhitney trevorwhitney force-pushed the refactor-extractors-multiple-samples branch from 5fef96f to 6b61619 Compare May 2, 2025 20:36
@trevorwhitney trevorwhitney merged commit 577d501 into main May 9, 2025
65 checks passed
@trevorwhitney trevorwhitney deleted the refactor-extractors-multiple-samples branch May 9, 2025 02:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants