-
Notifications
You must be signed in to change notification settings - Fork 2.8k
[pkg/ottl] Add ParseSeverity
function
#37280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
… made to mapGetter Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
@bacherfl is this still draft? |
pkg/ottl/ottlfuncs/README.md
Outdated
|
||
`target` is a Getter that returns a string or an integer. | ||
`severityMapping` is a map containing the log levels, and a list of values they are mapped from. These values can be either | ||
strings, or map items containing a numeric range, defined by a `min` and `max` key, for the given log level. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should make it explicit whether the range has inclusive or exclusive bounds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i updated the readme now to make this explicit
return nil, fmt.Errorf("could not get log level: %w", err) | ||
} | ||
|
||
logLevel, err := evaluateSeverity(value, severityMap.AsRaw()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main concern about this function is performance, building up/looping through the severity mappings for every single execution can be expensive. I wish we could enforce the Mapping
argument to be a map literal value, so we could improve the lookups, but I think that's currently not supported by OTTL.
That said, I'd try to avoid as many loops as possible, and use the pcommon.Map
and pcommon.Slices
values instead of parsing them to .AsRaw()
. I might be wrong, but I think it would save some cycles/allocs that might worth the change. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main concern about this function is performance, building up/looping through the severity mappings for every single execution can be expensive.
I fully agree with this part at least. This primary principle behind the stanza implementation is to build the mapping once and then having constant lookup times while processing each record. If we can't make this happen somehow in OTTL then we will pay quite a cost.
Can we build this mapping inside of parseSeverity
but outside of the function returned by parseSeverity
?
func parseSeverity[K any](target ottl.Getter[K], mapping ottl.PMapGetter[K]) ottl.ExprFunc[K] {
// build static map of severities once
return func(ctx context.Context, tCtx K) (any, error) {
// use static map in every execution
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe there's a better way but if nothing else, I think we could declare the compiled mapping as a var within parseSeverity
and outside the function it returns, and use a sync.Once
to compile it the first time the function executes. This is oversimplified psuedocode, but basically:
func parseSeverity[K any](target ottl.Getter[K], mapping ottl.PMapGetter[K]) ottl.ExprFunc[K] {
var compileOnce sync.Once
var staticMapping map[string]int
return func(ctx context.Context, tCtx K) (any, error) {
compileOnce.Do(func() {
precompiledMapping, getMappingErr = args.Mapping.Get(ctx, tCtx)
})
value, err := target.Get(ctx, tCtx)
if err != nil {
return nil, fmt.Errorf("could not get log level: %w", err)
}
sev, ok := staticMapping[value]
if !ok {
return defaultSeverity
}
return sev
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My favorite solution is finding a way to force the input to be a literal. It will take OTTL changes, but that is really what we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it even theoretically possible for a range to be expressed as a literal? Granted, it's not strictly necessary that we support ranges but there are some use cases where it is quite convenient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it even theoretically possible for a range to be expressed as a literal?
Yes I think it's possible, not exactly the range, but the whole map as literal, which for OTTL essentially means that the function's parameter value can be retrieved at the bootstrap time, and differently from the ottl.Getter
s, it's immutable and does not depend on the transformation context to get accessed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it even theoretically possible for a range to be expressed as a literal?
Yes I think it's possible, not exactly the range, but the whole map as literal, which for OTTL essentially means that the function's parameter value can be retrieved at the bootstrap time, and differently from the
ottl.Getter
s, it's immutable and does not depend on the transformation context to get accessed.
I agree supporting literal inputs is nice but my point is that sometimes it's more user friendly to support something which can be interpreted once when the function is built. I think the notion of a range is one of those situations because e.g. no one wants to have to list all the HTTP status codes in order to assign them to severity levels. This shouldn't be a choice between literal inputs and recomputing a mapping for every context. If we can provide more user friendly inputs AND compute a complete mapping only once, this is better than literal inputs and also better than recomputing the same thing repeatedly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree supporting literal inputs is nice but my point is that sometimes it's more user friendly to support something which can be interpreted once when the function is built
I might be missing the point, but that's what literals should do. The term "literals" is somehow confusing here, for OTTL it means we have a function's parameter that is not a getter, instead, it's a raw immutable value that is available when the function is built.
To support this use case, for example, OTTL needs to be changed so it knowns how to parse inputs like { "error":[ "err", { "min": 3, "max": 4 }]}
into a raw pcommon.Map
(required by the function argument).
The function usage doesn't change, the only thing that wouldn't be supported is using non-literal values (getters), as they're mutable, and their value might change from one statement to another, for example:
log_statements:
- context: log
statements:
- set(severity_number, ParseSeverity(severity_number, { "error":["err", { "min": 3, "max": 4 }]})) # Would work as the argument value is a literal, and cannot be changed by other statements.
- set(cache["mappings"], { "error":["err", { "min": 3, "max": 4 }]})
- set(severity_number, ParseSeverity(severity_number, cache["mappings"])) # Wouldn't work, as the cache["mappings"] path is mutable (getter), so it's not a literal and needs to be evaluate in every execution.
- set(cache["mappings"], { "error":["err", { "min": 5, "max": 6 }]})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with your description of literals and agree we need them. My point though is about what happens after we have parsed the syntax.
Every single line of evaluateSeverityNumberMapping
is unnecessary if we build a mapping (immediate lookup, not function evaluation).
Severity ranges tend to have a very reasonable and finite number of possible values so there's no need to evaluate logic every time we see a log. "Is this a range criteria?" "Is there a min?" "What is the min?", "Is this value greater than the min?", etc are all unnecessary if we can precompile this into a lookup table that is instant access.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, my answer was more focused on the OTTL side and how we could get the map parsed as literal.
I agree with you, we definitely need to build a lookup table for that, ideally with O(1) access.
rangeMin, gotMin := rangeMap[minKey] | ||
rangeMax, gotMax := rangeMap[maxKey] | ||
if !gotMin || !gotMax { | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've mixed feelings here, from one side we accept settings like:
set(attributes["test"], ParseSeverity(severity_number, {"info":[{"min": 200}]}))
But here that setting becomes no-op, and there's no error messages or logs alerting the user about that missing key (other than the no matching message). Should we consider this scenario invalid and raise an error instead? In addition to that, should we also raise an error if an invalid key name is present in the mappings ({"info":[{"invalid": 200}]}
?
For the min
and max
keys, I think another option would be making them "optional", so when one of them is suppressed, it could mean "no min/max bounds".
If the behavior is like this for keeping it the same as stanza, I'd add a note into the docs explaining that both keys are required, otherwise the condition is ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Erroring for unexpected configuration feels right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i updated this section to return an error now
I agree, it's not necessary to have exactly the same structure. My hope was just that we'd keep the lessons learned from the stanza implementation.
I think there's a tradeoff here. I agree it's more clear to use
On the other hand, aside from min/max, I'm not convinced that AND is really all that useful with the current set of criteria. In this example, it would be impossible to find a value that is both equal to "err" and also in a numeric range. I like the idea in principle though and can imagine that we'd introduce more sophisticated criteria later. This raises other design questions for me though, such as whether order of criteria matters, supporting OR and NOT as well as AND - basically becomes an entire DSL. IMO we should just keep it simple for this first version and can look at a more advanced function separately if a need arises. |
Yes, I completely agree with you @djaglowski. My main point was giving conditions a name/type identifier, so it would be easier to extend and validating. The other examples are just possibilities, but I wouldn't suggest implementing them 😄 For example, the range condition has the Examples:
Thanks! |
@djaglowski @edmocosta thanks for reviewing, lets move forward with your suggestions. |
Thanks for the suggestions, will adapt the PR soon |
reverting back to draft while addressing the comments/suggestions |
Signed-off-by: Florian Bacher <[email protected]>
…fl/opentelemetry-collector-contrib into feat/35079/parse-severity
…fl/opentelemetry-collector-contrib into feat/35079/parse-severity
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
Signed-off-by: Florian Bacher <[email protected]>
I updated the structure now to use the suggested structure. For the |
I think I addressed most points now and updated the PR accordingly. However there is still the issue of having the need to go through the conditions and evaluate them each time the function is invoked, as we would need a way to be able to force map arguments to be a literal. Should we put this PR on hold while this is not possible in OTTL yet? |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
reopening, but we will need for #40222 (literal value support) before this one can be completed |
Description
This PR adds the
ParseSeverity
function, as discussed in the linked ticket. I also had to make a minor change to theinternal
mapGetter
, handling the map literals to return a rawmap[string]any
, instead of apcommon.Map
. This is because if there is a map literal within a slice, thepcommon.Slice.FromRaw
cannot handle thepcommon.Map
, as it only works with raw data types.This change is however transparent, and the behavior to the outside of this package does not change.
EDIT: After merging main with the support for value expressions, introduced in #36883, this would affect the type of values returned by
ParseValueExpression
- previously this could potentially returnmap[string]any
/[]any
, but with the changes introduced in this PR, this would return apcommon.Map
/pcommon.Slice
.Please let me know if I should do this change in a separate PR though.
Link to tracking issue
Fixes #35079
Testing
Added unit and e2e tests
Documentation
Describe new function in the readme