[WIP] Worker visibility design#938
Conversation
| #[derive(Debug)] | ||
| pub(crate) struct WorkerHeartbeatInfo { | ||
| in_mem_thing: Option<InMemoryThing>, | ||
| pub(crate) data: WorkerHeartbeatData, |
There was a problem hiding this comment.
Doesn't seem like this needs to be here - you can just call capture_heartbeat_data right before sending it out
There was a problem hiding this comment.
As in the whole struct? Isn't there state we want to keep in between heartbeats? like elapsed_since_last_heartbeat
There was a problem hiding this comment.
Stupid GH highlight - no just the data. But, yes, good point, we do need ways to make some delta. Maybe just calling this last_data could clear things up
| Otel(Arc<dyn CoreMeter>), | ||
| Prometheus(Arc<dyn CoreMeter>), |
There was a problem hiding this comment.
There's no need for these variants if they're both just the same underlying type and the behavior is undifferentiated.
But, I'm realizing I may have been thinking about this a bit stupidly to start with anyway. I'm not sure we need to really be this clever. Sorry for leading you down that path.
Clearly we still need the "in memory thing" - but I don't think we need CompositeMeter. This comes from two observations:
- There aren't actually all that many metrics that we need to send that we are already recording
- The new ones we do add, we don't want to be sent out along with all the other metrics (or, if we do, we want to control that). For example, the
last_interval_processed_tasksis something that would never make sense to export along with the normal metrics.
So, think we can simplify things quite a bit here. For the metrics which are already recorded, and also need to be used for heartbeats, I propose we simply make wrappers like:
struct MetricAndHeartbeatCounter {
metric: Arc<dyn Counter>
for_heartbeat: HeartbeatCounter
}
And use that from Metrics Context. Here HeartbeatCounter just directly wraps the underlying otel Counter<u64> which is provided by the in-memory meter provider. (Side note: I need to change the CoreMeter to return Boxes instead of Arcs, since they really aren't needed most of the time, looks like, will do that in my PR).
This all means we double-record for the handful of things that both get exported as metrics and sent with the heartbeat, but... not such a bit deal, esp if that's all hidden inside the MetricsContext.
There was a problem hiding this comment.
So instead of registering an in_mem_reader to the existing MeterProviderBuilder, you're suggesting we create a separate in-mem meter provider that we double record with for only the metrics we need, and plumb that through to MetricsContext?
There was a problem hiding this comment.
Precisely. Feel free to DM me and we can zoom etc if you have questions
There was a problem hiding this comment.
Pushed a new update with CompositeMeter removed. Plumbed MetricAndHeartbeatCounter through to total_sticky_cache_hit and can currently see its value as 1, so plumbing should be in a working state.
| // TODO: Use existing MetricsContext or a new meter to record and export these metrics, possibly through the same MetricsCallBuffer | ||
| // ANDREW: metrics is temporal metrics, meter is user metrics | ||
| let (metrics, meter, in_mem_thing) = if let Some(ti) = telem_instance { |
There was a problem hiding this comment.
You definitely want to use the existing Context - we don't want to have to pass around more stuff.
As for the TemporalMeter - yeah, seemingly that needs to bypass all of this, since that can get exposed to users via the custom slot suppliers and the lang-side metrics buffer etc.
MetricsContext, however, is not pub, and can hide the in-mem thing internally, so, stuffing it in there seems appropriate.
| let composite_meter = build_otlp_metric_exporter(opts).unwrap(); | ||
|
|
||
| let in_memory_thing = composite_meter.in_mem_exporter().clone(); | ||
|
|
||
| rt.telemetry_mut() | ||
| .attach_late_init_metrics(Arc::new(composite_meter), Some(in_memory_thing.clone())); |
There was a problem hiding this comment.
Why build the exporter which returns a composite meter, and then pull the in memory thing out of it just to then pass it back in?
composite_meter already has everything we need, so there shouldn't be any reason this is necessary (other than the type erasure where attach_late_init_metrics only takes a CoreMeter which I suspect is why you ended up here)
IMO lang doesn't need to know about any of this at all - even that a CompositeMeter is a thing that exists. If we end up needing to have lang record some of these metrics on its side we can consider what to do then - maybe just provide a new method on Worker to get something specifically for recording these.
There was a problem hiding this comment.
Created a new CoreMeterWithMem wrapper trait that lets me pull out the InMemoryMeter, lmk your thoughts. InMemoryMeter is currently getting plumbed through TelemetryInstance to be passed to the instrument wrapper, MetricAndHeartbeatCounter
| pub versioning_behavior: VersioningBehavior, | ||
| } | ||
|
|
||
| struct WorkerPollerInfo { |
There was a problem hiding this comment.
I don't think we really need these - can just use the proto types directly
…teMeter, based off of spencer feedback
…thMem trait to plumb meter to instruments
|
Going a different direction, closing this PR for now |
What was changed
Why?
Checklist
Closes
How was this tested: