Add support for time, byte, and percent units in NumberFieldMapper #104037
Pinging @elastic/es-search (Team:Search) |
This probably also needs adjustments for ES|QL to be able to leverage the unit. |
@felixbarny A smart approach to incorporate
|
Hey Mayya, thanks a lot for your comment.
The
I'm not sure if we need to have a mode for that as what's added here is strictly additional functionality rather than altering existing behavior. You can still do queries without units. So if you specify the unit
No, units can also be used for other types. For example, you might want to map a duration unit, such as |
Hi @felixbarny, I've created a changelog YAML for you. |
@jpountz I'd be curious on your thoughts about using I think using |
@felixbarny just a reminder, in case you had forgotten: https://opentelemetry.io/docs/specs/otel/metrics/data-model/#timeseries-model states that unit is part of an OTel metric's time series identity. Two metrics with the same name but different units should be considered two different time series. So should we also consider alternatives where unit is defined per document, rather than defined in the mappings? Alternatively, this could be another extension of passthrough, where unit is encoded in the field name. |
Oh, you're right, I didn't think of that. That'll make it tricky.
If we don't have the information about the unit in the mappings we also can't use it for field caps which Kibana and ES|QL depend on. Also, the parsing and conversion for term and range queries of a particular field can't take into account field metadata but has to first load actual data or even do a conversion per document. Units may also be used outside of TSDB, for example for logs and traces, so we can't rely on some metadata store for time series, either.
To check my understanding of what you're proposing - if you have a metric Overall, it seems like we do need to have the unit information available in the mapping so there's no contradiction with the OTel spec that this PR introduces. But we'll need to find ways how to dynamically add unit metadata, and how to encode the unit in the object hierarchy, while making that transparent to users. That probably means to manipulate field caps responses in a way that hides the physical layout ( It still feels a little weird to allow the same (logical) field name to contain values in different units. Imagine an extreme case of a |
Right.
Yes, either way. Anyway, I think I'm also convinced that it should be part of the mapping, and we can work that out separately. |
Intuitively I would expect support for units to work a bit more like |
I don't think that would be compatible with how units work in OpenTelemetry and also with how |
Thanks, that makes sense.
This is a flag that is only used to parse queries, in a similar way to how we'd expect Kibana to take advantage of it. And if Kibana can trust the |
I don't really see us ever wanting to do that. |
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
Pinging @elastic/es-search-foundations (Team:Search Foundations) |
Adding units to a `NumberFieldMapper` also adds new query capabilities to the field. For example, when annotating a `long` field with `unit: us`, you can do term and range queries using notations like `1s`, which will automatically convert that into microseconds.

In this PR, I've leveraged the existing `meta` field and the convention for the `unit` metadata. However, this metadata is not opaque to Elasticsearch anymore. So ideally, I think that the `unit` should be a proper parameter for numeric field types. However, this will have all sorts of implications and complications for the existing usages of the unit field that's already adopted in our integrations (although I couldn't figure out if Kibana is actually making use of `meta.unit` yet). As we'd want to expose the unit in field caps, this would also require some more work and boilerplate so I didn't want to implement that before discussing this further.

This PR is also aligned with the concept of units in OpenTelemetry metrics that defines the format of unit to be a UCUM symbol. Using a well-established standard makes a lot of sense to me and increases interoperability. For compatibility reasons, this PR also supports the previously documented unit identifiers.
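To make the query-side conversion concrete, here is a minimal Python sketch of how a value such as `1s` could be rewritten into the unit of the field it targets. It is not the code from this PR (which is Java and lives in `NumberFieldMapper`). The function name `to_field_unit`, the unit table, and the choice to reject values that do not convert to a whole number are all assumptions made for illustration.

```python
import re

# Hypothetical lookup table: nanoseconds per UCUM-style time symbol.
# The set of symbols actually accepted by the PR is not shown in the source.
NANOS_PER_UNIT = {
    "ns": 1,
    "us": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
    "min": 60 * 1_000_000_000,
    "h": 3_600 * 1_000_000_000,
    "d": 86_400 * 1_000_000_000,
}

# A non-negative integer, optionally followed by a unit suffix.
_QUERY_VALUE = re.compile(r"^(\d+)\s*([A-Za-z]+)?$")


def to_field_unit(query_value: str, field_unit: str) -> int:
    """Convert a query value like '1s' into the unit the field is stored in."""
    if field_unit not in NANOS_PER_UNIT:
        raise ValueError(f"unknown field unit: {field_unit!r}")
    match = _QUERY_VALUE.match(query_value.strip())
    if match is None:
        raise ValueError(f"cannot parse query value: {query_value!r}")
    amount = int(match.group(1))
    suffix = match.group(2)
    if suffix is None:
        # Queries without a unit keep working and are taken as-is.
        return amount
    if suffix not in NANOS_PER_UNIT:
        raise ValueError(f"unknown unit suffix: {suffix!r}")
    quotient, remainder = divmod(amount * NANOS_PER_UNIT[suffix], NANOS_PER_UNIT[field_unit])
    if remainder:
        # Assumption: reject values that are not a whole number in the field unit.
        raise ValueError(f"{query_value!r} is not a whole number of {field_unit}")
    return quotient
```

With this sketch, `to_field_unit("1s", "us")` yields `1000000`, while a plain `to_field_unit("42", "us")` is passed through unchanged, mirroring the point made in the thread that queries without units still work.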

I've leveraged existing types used for settings to parse time, byte, and percent values with unit suffixes. This works well for now but these classes impose certain restrictions that may be limiting for this use case. For example, `TimeValue` doesn't support negative or fractional numbers and `RatioValue` doesn't support values above `100%`. These limitations are probably fine for now but we might want to do something about that down the line, at which point the question is whether these should be completely separate classes dedicated to the settings and metric query use cases, respectively, or share common functionality.

Closes #65432 to support querying for `us` in addition to `micros`. As this inconsistency (`micros` vs `ms`) is now exposed to users via queries, I think the importance of consistency is higher now.

Closes #31244 even though this doesn't add dedicated field types but rather relies on the `unit` for numeric field types.
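
One way to picture the `us` / `micros` compatibility described above is a small alias table that maps previously documented identifiers onto UCUM-style symbols before any parsing happens. The Python sketch below is illustrative only. `normalize_unit` and `LEGACY_ALIASES` are hypothetical names, and only the `micros`/`us` pair is taken from the PR text (the `nanos`/`ns` pair is an assumption).

```python
# Legacy identifier -> UCUM-style symbol.
# Only "micros" -> "us" is named in the PR description; "nanos" -> "ns" is assumed.
LEGACY_ALIASES = {
    "micros": "us",
    "nanos": "ns",
}


def normalize_unit(unit: str) -> str:
    """Return the canonical symbol for a unit, leaving unknown units untouched."""
    return LEGACY_ALIASES.get(unit, unit)
```

Normalizing once at the edge means a field mapped with `micros` and a query written with `us` resolve to the same unit, so the rest of the conversion logic only ever sees one spelling.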