Add development maxExportBatchSize configuration to Periodic MetricReader #4895
dashpole wants to merge 7 commits into open-telemetry:main from
Conversation
Force-pushed from 04e847b to 581ba77
@dashpole:
I tried to capture that requirement here:
cc @open-telemetry/specs-metrics-approvers
Got it. I was assuming the ordering within the collect, given the incoming spec on per-series timestamps.
I rephrased the requirement to
@rajkumar-rangaraj Please review to see if it is feasible to implement this efficiently in .NET. This would be useful in the ETW/User-Events metric exporter, which has a hard 64 KB limit; it currently achieves this by writing each metric point in one call. @utpilla @lalitb Same for the Rust etw/user-event exporters.
@cijothomas you are correct, this would help in the ETW/User-Events exporter scenario. This is feasible to implement in .NET. We already have
I support the direction of this PR. There are a couple of places we need to consider before I give my approval:
It seems like the Periodic MetricReader should not Shutdown the exporters until after it has exported all of the batches it needs to. I'm not sure the shutdown method for an exporter would need to change here.
Again, I think this is the Periodic MetricReader's responsibility to properly order calls to the exporter. If the user calls ForceFlush on the MeterProvider, I would expect it to first collect, then batch and export, and then call force flush. Part of what is confusing here is that the spec seems to suggest that push exporters can call collect?
But the Periodic MetricReader spec seems pretty clear that it is what is supposed to do the collecting...
Force-pushed from 3a255a8 to 977309e
Right, and we need to update the spec. Let me give a concrete example: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#forceflush-1 says "ForceFlush SHOULD collect metrics, call Export(batch) and ForceFlush() on the configured Push Metric Exporter." This needs to be updated, since we're going to make multiple calls: Export(batch1), Export(batch2), ..., and finally ForceFlush(). If we hit a timeout, what should we do?
Where do you see that? Can you paste the specific wording here?
Got it. I looked through all the shutdown/forceflush specs, and I think the only one that needs updating is the MetricReader.ForceFlush one you identified: a0403e9.
I was reading the wrong ForceFlush again :)
Since the export calls are serial, the reader should handle it the same way they currently do (probably return a timeout error). I think this is covered by existing language:
If reader ForceFlush is reaching the time limit, it SHOULD not try to call exporter.Export for the remaining batches, and make a final call to exporter.ForceFlush, so the exporter is notified and has a chance to take action. |
I'm not sure... The current spec recommends that the MetricReader abort if it reaches the timeout today -- even if it hasn't yet gotten to call ForceFlush on the exporter. I suppose there is enough flexibility for the MetricReader to try to be smart and reserve some of the timeout for ForceFlush if it wants (i.e. apply an earlier timeout to Export). But that feels like a change from where the spec is today without batch splitting, and orthogonal to this change.
I disagree. "ForceFlush SHOULD collect metrics, call Export(batch) and ForceFlush() on the configured Push Metric Exporter." I think we need to update the semantic by saying "you don't want to call Export() for all the batches if you're reaching the timeout limit, but you SHOULD still call ForceFlush in the end." |
Addressed in 382f8e6
I think the approvals represent good language coverage, so I'm planning to merge this tomorrow unless there are objections.
> batches. The initial batch of metric data MUST be split into as many "full"
> batches of size `maxExportBatchSize` as possible -- even if this splits up data
> points that belong to the same metric into different batches. The reader MUST
> ensure all metric data points from a single `Collect()` are provided to
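A minimal sketch of the splitting rule quoted above, assuming a plain slice stands in for the metric data points (names are illustrative, not SDK API):

```go
package main

import "fmt"

// splitBatches splits points into batches of at most max points each.
// Every batch except possibly the last is "full" (exactly max points),
// even when that separates points of the same metric across batches.
func splitBatches(points []int, max int) [][]int {
	var batches [][]int
	for len(points) > max {
		batches = append(batches, points[:max])
		points = points[max:]
	}
	if len(points) > 0 {
		batches = append(batches, points)
	}
	return batches
}

func main() {
	// 250 points with maxExportBatchSize=100 yield two full batches
	// and one partial batch.
	b := splitBatches(make([]int, 250), 100)
	fmt.Println(len(b), len(b[0]), len(b[2])) // 3 100 50
}
```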
Is this assuming we won't time out before finishing all metric data points from collect1()?
If we had 1000 points and 100 is max_batch_size, and we time out at the 8th batch: during the next collect2(), do we start over rather than continuing from the 8th batch of the previous collect?
The intent is that the timeout is still applied to a single export call, so we would attempt each Export. The problem that could arise is that the SDK generates so many batches that it ends up delaying the next export call. Ideally, language implementations would delay the next Collect, rather than continuously spawning new collections when the previous one hasn't completed.
We could alternatively change the timeout to apply to a collect and export(s) cycle, and adopt behavior similar to what you are suggesting (where once we timeout, we discard all unsent batches).
Fixes #4852
Prior Art
The Trace SDK and Logging SDK both support a `maxExportBatchSize` parameter to limit the number of spans/logs exported in a batch. The collector's exporter helper and batch processor support a `send_batch_max_size` configuration option, which (by default) applies to the number of spans, logs, or metric data points. In all cases, the configured timeout applies to a single request.

Requirements
Non-goals
Proposal
Add `maxExportBatchSize` to the periodic exporting MetricReader. The periodic exporting MetricReader splits the batch of metric data points received from Collect, if necessary, and then serially invokes `Export` on each split batch with the configured timeout.

Alternatives considered
maxExportBatchSize for all MetricReaders
Instead of applying to only periodic readers, the batch size could apply to all readers. This alternative is not chosen because
maxExportBatchSize on OTLP exporters
Instead of being on the periodic exporting MetricReader, we could add this configuration on the OTLP http and grpc exporters. This alternative is not chosen because:
Prototypes:
Go: PoC for batching in PeriodicReader opentelemetry-go#7930
Links to the prototypes (when adding or changing features)
`CHANGELOG.md` file updated for non-trivial changes
Spec compliance matrix updated if necessary