## Description
It would be great to have native OpenTelemetry (OTel) support in vllm-mlx for production observability.
## Motivation
OpenTelemetry has become the industry standard for distributed tracing, metrics, and logs. Many organizations use OTel-compatible backends (Jaeger, Prometheus, Grafana, Datadog, etc.) for monitoring their ML inference services.
## Proposed Features
### 1. Metrics
- Request latency (P50, P95, P99)
- Tokens per second (input/output)
- Queue length / pending requests
- GPU/memory utilization
- Batch size distribution
- Time-to-first-token (TTFT)
### 2. Traces
- Request lifecycle (receive → queue → prefill → decode → response)
- Model inference spans
- Token generation steps
- Tool call execution (for MCP)
### 3. Logs (optional)
- Structured logging with trace correlation
## Configuration
Environment variables following OTel conventions:

```shell
OTEL_ENABLED=true
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
OTEL_SERVICE_NAME=vllm-mlx
OTEL_TRACES_SAMPLER=parentbased_traceidratio
OTEL_TRACES_SAMPLER_ARG=0.1
```
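One note: the exporter endpoint, service name, and sampler variables are read by the OpenTelemetry SDK itself, but `OTEL_ENABLED` is not a standard SDK variable (the spec's kill switch is `OTEL_SDK_DISABLED`), so vllm-mlx would have to read it explicitly. A minimal, stdlib-only gate might look like:

```python
# Sketch: a project-specific on/off switch; name and accepted values are
# a proposal, not an existing vllm-mlx flag.
import os

def otel_enabled() -> bool:
    """Return True when the hypothetical OTEL_ENABLED variable opts in."""
    return os.environ.get("OTEL_ENABLED", "false").strip().lower() in ("1", "true", "yes")

os.environ["OTEL_ENABLED"] = "true"
print(otel_enabled())  # True
```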
## Use Cases
- Production monitoring dashboards (Grafana)
- Debugging slow requests with distributed tracing
- Performance optimization with detailed metrics
- SLA/SLO monitoring
- Cost attribution per model/request
## References
- OpenTelemetry Python SDK
- vLLM's OTel implementation (upstream reference)
- Langfuse integration (alternative approach)
## Additional Context
This would complement existing monitoring approaches and enable seamless integration with modern observability stacks without vendor lock-in.
Happy to contribute or discuss implementation details!