[codex] Add shared proxy telemetry sinks #205

Merged: bglusman merged 1 commit into main from codex-shared-telemetry-adapters on May 13, 2026
@bglusman
Owner

What changed

  • Adds shared [[proxy.observability]] sinks for model-gateway attempt telemetry.
  • Supports log, http-json/webhook, and OTLP JSON sinks via otel/otlp/traceloop.
  • Emits attempt metadata from the routed gateway path after model/provider resolution, including requested model, root selector, concrete model, upstream model, provider id, gateway engine, duration, outcome, and failure kind.
  • Keeps telemetry payloads free of prompts, completions, headers, query strings, and secret values.
  • Documents observability sinks as separate from provider adapters and updates the gateway ADR accordingly.
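
As a rough illustration of the metadata-only payload described above, the sketch below models one attempt event and renders it as a flat log line. The type and field names (`AttemptEvent`, `AttemptOutcome`, `to_log_line`) are assumptions made for this example and are not taken from the PR; the point is only that the event type has no slot for prompts, completions, headers, query strings, or secrets.

```rust
use std::time::Duration;

/// Outcome of a single gateway attempt (hypothetical names).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AttemptOutcome {
    Success,
    Failure { failure_kind: String },
}

/// Metadata-only view of one routed attempt. There is deliberately no field
/// for prompts, completions, headers, query strings, or secret values.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AttemptEvent {
    pub requested_model: String,
    pub root_selector: String,
    pub concrete_model: String,
    pub upstream_model: String,
    pub provider_id: String,
    pub gateway_engine: String,
    pub duration: Duration,
    pub outcome: AttemptOutcome,
}

impl AttemptEvent {
    /// Render a flat key=value line such as a log sink might write.
    pub fn to_log_line(&self) -> String {
        let (outcome, failure_kind) = match &self.outcome {
            AttemptOutcome::Success => ("success", "none"),
            AttemptOutcome::Failure { failure_kind } => ("failure", failure_kind.as_str()),
        };
        format!(
            "requested_model={} root_selector={} concrete_model={} upstream_model={} \
             provider_id={} gateway_engine={} duration_ms={} outcome={} failure_kind={}",
            self.requested_model,
            self.root_selector,
            self.concrete_model,
            self.upstream_model,
            self.provider_id,
            self.gateway_engine,
            self.duration.as_millis(),
            outcome,
            failure_kind
        )
    }
}
```

Because the struct carries only routing metadata, a sink cannot leak request content by accident; anything sensitive would have to be added to the type explicitly.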

Why

Provider engines should own model routing, retry policy, and credentials. Observability tools such as Traceloop should be able to receive events without pretending to be model gateways. This keeps the provider-adapter contract cleaner and gives us one place to fan out gateway attempt metadata.
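
The separation argued for here can be sketched as a small sink trait plus a fanout that delivers each event to every sink and never lets one failing sink block the others. This is a simplified, synchronous stand-in written for illustration: the names `TelemetrySink`, `MemorySink`, `FailingSink`, and `Fanout` are hypothetical, and the PR's own fanout is async.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical sink contract: it receives attempt metadata and never
/// routes requests, retries, or handles provider credentials.
pub trait TelemetrySink {
    fn name(&self) -> &str;
    fn emit(&self, event: &str) -> Result<(), String>;
}

/// Sink that records events in memory (a stand-in for a log sink).
pub struct MemorySink {
    pub events: Mutex<Vec<String>>,
}

impl TelemetrySink for MemorySink {
    fn name(&self) -> &str {
        "memory"
    }

    fn emit(&self, event: &str) -> Result<(), String> {
        self.events
            .lock()
            .map_err(|e| e.to_string())?
            .push(event.to_string());
        Ok(())
    }
}

/// Sink that always fails (a stand-in for an unreachable webhook).
pub struct FailingSink;

impl TelemetrySink for FailingSink {
    fn name(&self) -> &str {
        "failing"
    }

    fn emit(&self, _event: &str) -> Result<(), String> {
        Err("endpoint unreachable".to_string())
    }
}

/// One place to fan out attempt metadata to all configured sinks.
pub struct Fanout {
    sinks: Vec<Arc<dyn TelemetrySink>>,
}

impl Fanout {
    pub fn new(sinks: Vec<Arc<dyn TelemetrySink>>) -> Self {
        Self { sinks }
    }

    /// Deliver the event to every sink. A failure is reported on stderr and
    /// recorded by sink name, but it does not stop delivery to later sinks.
    pub fn emit(&self, event: &str) -> Vec<String> {
        let mut failed = Vec::new();
        for sink in &self.sinks {
            if let Err(err) = sink.emit(event) {
                eprintln!("telemetry sink {} failed: {}", sink.name(), err);
                failed.push(sink.name().to_string());
            }
        }
        failed
    }
}
```

Nothing in the trait mentions models, retries, or credentials, which is the property the provider-adapter contract is meant to keep.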

Validation

  • cargo test -p calciforge
  • cargo clippy -p calciforge -- -D warnings
  • ruby scripts/check-architecture-ratchets.rb
  • ruby scripts/check-docs-site.rb
  • pre-push hook full workspace checks passed

Copilot AI review requested due to automatic review settings May 13, 2026 11:33
@bglusman bglusman force-pushed the codex-shared-telemetry-adapters branch from 9f0fef2 to bc6adb0 Compare May 13, 2026 11:36

Copilot AI left a comment


Pull request overview

Adds a shared “proxy observability” fanout so model-gateway attempt telemetry can be emitted to multiple sinks (log, HTTP JSON/webhook, and OTLP JSON targets such as Traceloop) without treating observability tools as provider adapters.

Changes:

  • Introduces TelemetryFanout + sink implementations and wires emission into the routed provider path.
  • Adds [[proxy.observability]] config schema + validation (including header validation and endpoint requirements).
  • Updates model-gateway docs + ADR to describe observability sinks as a separate surface from provider adapters.
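
To make the validation bullet concrete, here is a minimal sketch of what checking one sink entry could look like. The `SinkConfig` struct, the accepted kind names, and the specific rules (non-log sinks need an http or https endpoint, header names are restricted to ASCII alphanumerics and hyphens) are assumptions for illustration; the actual rules live in crates/calciforge/src/config/validator.rs and may differ.

```rust
/// Hypothetical, simplified form of one `[[proxy.observability]]` entry.
#[derive(Debug, Clone, Default)]
pub struct SinkConfig {
    pub kind: String,
    pub endpoint: Option<String>,
    pub headers: Vec<(String, String)>,
}

/// Validate a single sink entry. Error messages never include header
/// values or endpoint URLs, since either could carry a secret.
pub fn validate_sink(cfg: &SinkConfig) -> Result<(), String> {
    // Kind names assumed from the PR description; the real list may differ.
    const KINDS: [&str; 6] = ["log", "http-json", "webhook", "otel", "otlp", "traceloop"];
    if !KINDS.contains(&cfg.kind.as_str()) {
        return Err(format!("unknown observability sink kind: {}", cfg.kind));
    }

    // Assumed rule: every sink except "log" must name an HTTP(S) endpoint.
    if cfg.kind != "log" {
        match cfg.endpoint.as_deref() {
            Some(url) if url.starts_with("http://") || url.starts_with("https://") => {}
            Some(_) => return Err(format!("sink {} needs an http(s) endpoint", cfg.kind)),
            None => return Err(format!("sink {} requires an endpoint", cfg.kind)),
        }
    }

    // Assumed rule: header names are non-empty ASCII alphanumerics or '-'.
    for (name, _value) in &cfg.headers {
        let ok = !name.is_empty()
            && name.chars().all(|c| c.is_ascii_alphanumeric() || c == '-');
        if !ok {
            return Err(format!("invalid header name: {:?}", name));
        }
    }
    Ok(())
}
```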

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| docs/model-gateway.md | Documents [[proxy.observability]] sink configuration and redaction guarantees. |
| docs/adr/0001-model-gateway-and-agent-boundaries.md | Updates ADR guidance to separate observability sinks from provider gateways. |
| crates/calciforge/src/proxy/telemetry.rs | Implements telemetry events, sink fanout, HTTP/OTLP JSON emitters, and unit tests. |
| crates/calciforge/src/proxy/telemetry_tests.rs | Adds an integration-style test ensuring routing emits redacted attempt telemetry. |
| crates/calciforge/src/proxy/mod.rs | Exposes telemetry module and stores TelemetryFanout in ProxyState. |
| crates/calciforge/src/proxy/handlers.rs | Emits gateway-attempt events after provider/model resolution (success/failure + duration). |
| crates/calciforge/src/config/validator.rs | Validates observability sink kinds, endpoints, timeouts, and headers. |
| crates/calciforge/src/config/validator_tests_3.rs | Adds validator tests covering new observability sink rules. |
| crates/calciforge/src/config/observability.rs | Adds ProxyObservabilityConfig schema for [[proxy.observability]]. |
| crates/calciforge/src/config.rs | Wires observability sink config into ProxyConfig. |

Ok(response) => telemetry_attempt.success(duration, response.choices.len()),
Err(error) => telemetry_attempt.failure(duration, error.failure_kind()),
};
state.telemetry.emit_gateway_attempt(event).await;
Comment on lines +298 to +299
let start_ns = event.timestamp_ms.saturating_mul(1_000_000);
let end_ns = start_ns.saturating_add(event.duration_ms.saturating_mul(1_000_000));
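
The two lines above convert a millisecond timestamp and duration into nanosecond span bounds using saturating arithmetic. A self-contained version is sketched below, assuming both fields are `u64` (the PR snippet does not show their types) and using a hypothetical helper name, `span_bounds_ns`.

```rust
/// Convert a start time and a duration, both in milliseconds, into
/// (start, end) nanosecond bounds. Saturating operations clamp at
/// `u64::MAX` instead of panicking in debug builds or wrapping in release.
pub fn span_bounds_ns(timestamp_ms: u64, duration_ms: u64) -> (u64, u64) {
    let start_ns = timestamp_ms.saturating_mul(1_000_000);
    let end_ns = start_ns.saturating_add(duration_ms.saturating_mul(1_000_000));
    (start_ns, end_ns)
}
```

For example, `span_bounds_ns(1_000, 250)` yields `(1_000_000_000, 1_250_000_000)`, while an out-of-range input such as `span_bounds_ns(u64::MAX, 1)` clamps both bounds to `u64::MAX` rather than overflowing.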
Comment on lines +146 to +151
pub(crate) async fn emit_gateway_attempt(&self, event: GatewayTelemetryEvent) {
    for sink in self.sinks.iter() {
        if let Err(err) = sink.emit_gateway_attempt(&event).await {
            warn!(error = %err, sink = sink.name(), "Gateway telemetry sink failed");
        }
    }
/// Observability sink for model-gateway attempt telemetry.
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, Eq)]
pub struct ProxyObservabilityConfig {
/// Sink type. Supported values: "log", "http-json", "otel", "traceloop".
@bglusman bglusman marked this pull request as ready for review May 13, 2026 11:55
@bglusman bglusman force-pushed the codex-shared-telemetry-adapters branch from bc6adb0 to 25ae18a Compare May 13, 2026 12:04
Copilot AI review requested due to automatic review settings May 13, 2026 12:08
@bglusman bglusman force-pushed the codex-shared-telemetry-adapters branch from 25ae18a to 8decaee Compare May 13, 2026 12:08

Copilot AI left a comment


Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

Comment on lines +146 to +150
pub(crate) async fn emit_gateway_attempt(&self, event: GatewayTelemetryEvent) {
    for sink in self.sinks.iter() {
        let sink = Arc::clone(sink);
        let event = event.clone();
        tokio::task::spawn(async move {
Comment on lines +151 to +152
if let Err(_err) = sink.emit_gateway_attempt(&event).await {
warn!(sink = sink.name(), "Gateway telemetry sink failed");

gateway.chat_completion(gateway_req).await
let upstream_model = gateway_req.model.clone();
let gateway_engine = gateway.engine_info().id;
@bglusman bglusman merged commit 2c84880 into main May 13, 2026
25 checks passed
@bglusman bglusman deleted the codex-shared-telemetry-adapters branch May 13, 2026 12:30