## Problem Statement
AI gateways and routers (OpenRouter, Portkey, LiteLLM, Martian, etc.) are becoming increasingly common in the GenAI ecosystem. These services act as an intermediary layer between applications and model providers, offering features like:
- Model routing and fallback logic
- Cost optimization and load balancing
- Unified API across multiple providers
- Usage tracking and rate limiting
- Data residency and compliance routing
- Bring Your Own Key (BYOK) support
The current GenAI semantic conventions don't have a standardized way to represent this gateway layer in telemetry. Users need to distinguish between issues at the gateway level vs. the underlying provider level for proper observability.
## Proposed Solution

Add a `gen_ai.gateway.*` attribute namespace to identify when requests are routed through an AI gateway/router.
## Attribute Categories

### 1. Core Attributes

| Attribute | Type | Description | Example Values |
|---|---|---|---|
| `gen_ai.gateway.name` | string | The name of the AI gateway/router service | `openrouter`, `portkey`, `litellm`, `martian` |
| `gen_ai.gateway.version` | string | Version of the gateway service | `1.0.0`, `2024.1` |
| `gen_ai.gateway.request.id` | string | Unique identifier for the routed request (distinct from `gen_ai.response.id`, which is the provider's ID) | `gen-abc123xyz` |
| `gen_ai.gateway.app.id` | string | Application identifier registered with the gateway | `app_123456` |
| `gen_ai.gateway.origin` | string | Client origin/referrer URL | `https://myapp.com/` |
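To make the shape concrete, here is a minimal sketch of how an instrumentation might assemble these core attributes before attaching them to a span. The `gateway_core_attributes` helper is hypothetical (not part of any SDK); only the attribute keys come from this proposal.

```python
def gateway_core_attributes(name, request_id, version=None, app_id=None, origin=None):
    """Assemble the proposed core gen_ai.gateway.* attributes into a dict.

    Hypothetical helper: `name` and `request_id` are assumed to always be
    known; the remaining fields are optional and omitted when absent.
    """
    attrs = {
        "gen_ai.gateway.name": name,
        "gen_ai.gateway.request.id": request_id,
    }
    if version is not None:
        attrs["gen_ai.gateway.version"] = version
    if app_id is not None:
        attrs["gen_ai.gateway.app.id"] = app_id
    if origin is not None:
        attrs["gen_ai.gateway.origin"] = origin
    return attrs


attrs = gateway_core_attributes(
    "openrouter", "gen-abc123xyz", origin="https://myapp.com/"
)
```

The resulting dict can then be passed to whatever span-attribute API the instrumentation uses.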
### 2. Model Resolution Attributes

Gateways often resolve model aliases, auto-select models, or map requested models to canonical identifiers.

| Attribute | Type | Description | Example Values |
|---|---|---|---|
| `gen_ai.gateway.request.model` | string | The model identifier as requested through the gateway (may be an alias or auto-routed) | `auto`, `gpt-4-turbo`, `anthropic/claude-3-sonnet` |
| `gen_ai.gateway.response.models` | string[] | The actual model(s) resolved/selected by the gateway (includes fallback models if attempted) | `["anthropic/claude-3-5-sonnet-20241022"]`, `["openai/gpt-4", "anthropic/claude-3-5-sonnet"]` |
| `gen_ai.gateway.model.alias_resolved` | boolean | Whether an alias was resolved to a canonical model | `true`, `false` |
### 3. Routing Strategy Attributes

Gateways make intelligent decisions about where to send requests based on various strategies.

| Attribute | Type | Description | Example Values |
|---|---|---|---|
| `gen_ai.gateway.route.strategy` | string | The routing strategy used | `lowest_cost`, `lowest_latency`, `highest_throughput`, `round_robin`, `weighted_random`, `manual_order`, `fallback` |
| `gen_ai.gateway.load_balance.enabled` | boolean | Whether load balancing was applied | `true`, `false` |
| `gen_ai.gateway.session.hit` | boolean | Whether a sticky-session cache was used (for cache affinity) | `true`, `false` |
| `gen_ai.gateway.session.id` | string | Session identifier for sticky routing (distinct from `gen_ai.conversation.id`, which is for chat context) | `sess_abc123` |
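To illustrate how a strategy value maps to a routing decision, here is a sketch of a selector for two of the proposed strategies. The `select_endpoint` function and the `cost`/`latency_p50` endpoint fields are hypothetical; only the strategy names come from this proposal.

```python
def select_endpoint(endpoints, strategy):
    """Pick an endpoint according to one of the proposed routing strategies.

    Illustrative only: each endpoint is a dict with hypothetical `cost`
    and `latency_p50` fields. Only a few strategies are sketched here;
    a real gateway would also implement weighted_random, round_robin, etc.
    """
    if strategy == "lowest_cost":
        return min(endpoints, key=lambda e: e["cost"])
    if strategy == "lowest_latency":
        return min(endpoints, key=lambda e: e["latency_p50"])
    if strategy == "manual_order":
        return endpoints[0]  # caller-supplied preference order
    raise ValueError(f"unknown strategy: {strategy}")


endpoints = [
    {"provider": "openai", "cost": 0.0030, "latency_p50": 0.5},
    {"provider": "anthropic", "cost": 0.0024, "latency_p50": 0.8},
]
chosen = select_endpoint(endpoints, "lowest_cost")
```

The strategy that was actually applied would then be recorded as `gen_ai.gateway.route.strategy` on the span.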
### 4. Fallback Attributes

Gateways implement fallback mechanisms when providers fail or are unavailable.

| Attribute | Type | Description | Example Values |
|---|---|---|---|
| `gen_ai.gateway.fallback.used` | boolean | Whether a fallback provider/model was used | `true`, `false` |
| `gen_ai.gateway.fallback.reason` | string | Reason fallback was triggered | `rate_limit`, `timeout`, `provider_error`, `capacity`, `model_unavailable`, `endpoint_status`, `moderation_blocked` |
| `gen_ai.gateway.fallback.attempt_number` | int | Which attempt this is (1 = first try, 2 = first fallback, etc.) | `1`, `2`, `3` |
| `gen_ai.gateway.fallback.total_attempts` | int | Total number of attempts made | `3` |
| `gen_ai.gateway.fallback.latency_wasted` | double | Time spent on failed attempts before success (seconds) | `1.5` |
| `gen_ai.gateway.fallback.model_level` | boolean | Whether fallback occurred at the model level (vs. provider level) | `true`, `false` |
| `gen_ai.gateway.providers.attempted` | string[] | List of providers attempted, in order | `["openai", "anthropic"]` |
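As a sketch of how these values relate to each other, here is a hypothetical helper that derives the fallback attributes from an ordered attempt log. The `(provider, duration, succeeded)` record shape is an assumption for illustration; the attribute semantics follow the table above.

```python
def fallback_attributes(attempts):
    """Derive the proposed fallback attributes from an ordered attempt log.

    Each attempt is a hypothetical (provider, duration_seconds, succeeded)
    tuple. The final attempt is assumed to be the successful one, so the
    successful attempt_number equals the total attempt count.
    """
    providers = [provider for provider, _, _ in attempts]
    # latency_wasted counts only time spent on attempts that failed
    wasted = sum(duration for _, duration, ok in attempts if not ok)
    return {
        "gen_ai.gateway.fallback.used": len(attempts) > 1,
        "gen_ai.gateway.fallback.attempt_number": len(attempts),
        "gen_ai.gateway.fallback.total_attempts": len(attempts),
        "gen_ai.gateway.fallback.latency_wasted": round(wasted, 3),
        "gen_ai.gateway.providers.attempted": providers,
    }


# OpenAI attempt fails after 0.8 s, Anthropic fallback succeeds
attrs = fallback_attributes([("openai", 0.8, False), ("anthropic", 1.2, True)])
```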
### 5. Performance & Latency Attributes

Gateways track provider performance metrics and request latencies.

| Attribute | Type | Description | Example Values |
|---|---|---|---|
| `gen_ai.gateway.latency` | double | Total gateway processing latency (seconds) | `0.511` |
| `gen_ai.gateway.latency.moderation` | double | Time spent on content moderation (seconds) | `0.214` |
| `gen_ai.gateway.latency.generation` | double | Time spent on upstream generation (seconds) | `0.719` |
| `gen_ai.gateway.endpoint.latency_p50` | double | 50th percentile latency for the selected endpoint (seconds) | `0.5` |
| `gen_ai.gateway.endpoint.latency_p99` | double | 99th percentile latency for the selected endpoint (seconds) | `2.1` |
| `gen_ai.gateway.endpoint.throughput_p50` | double | 50th percentile throughput (tokens/sec) | `150.0` |
| `gen_ai.gateway.endpoint.uptime_percent` | double | Provider endpoint uptime percentage (0-100) | `99.5` |
| `gen_ai.gateway.endpoint.status` | string | Current status of the selected endpoint | `default`, `degraded`, `down`, `deprioritized` |
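For clarity on how the percentile attributes might be populated, here is a sketch that computes p50/p99 from raw per-request latency samples using a simple nearest-rank percentile. The helper and the sampling approach are illustrative assumptions; real gateways may use streaming sketches or windowed histograms instead.

```python
def latency_percentiles(samples):
    """Compute the proposed p50/p99 endpoint latency attributes from raw
    per-request latency samples (seconds). Uses the nearest-rank method
    for simplicity; illustrative only."""
    ordered = sorted(samples)

    def pct(p):
        # nearest-rank: index of the sample covering the p-th percentile
        idx = max(0, int(round(p / 100 * len(ordered))) - 1)
        return ordered[idx]

    return {
        "gen_ai.gateway.endpoint.latency_p50": pct(50),
        "gen_ai.gateway.endpoint.latency_p99": pct(99),
    }


attrs = latency_percentiles([0.4, 0.5, 0.6, 2.1])
```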
### 6. Streaming & Request State Attributes

Gateways track the state and mode of requests.

| Attribute | Type | Description | Example Values |
|---|---|---|---|
| `gen_ai.gateway.streamed` | boolean | Whether the response was streamed | `true`, `false` |
| `gen_ai.gateway.cancelled` | boolean | Whether the request was cancelled by the client | `true`, `false` |
### 7. Token Detail Attributes (Provider-Specific Breakdowns)

Standard token counts use the existing `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens`. These gateway attributes capture additional token breakdowns that providers report.

| Attribute | Type | Description | Example Values |
|---|---|---|---|
| `gen_ai.gateway.usage.tokens.reasoning` | int | Reasoning/thinking tokens (e.g., o1, Claude thinking) | `150` |
| `gen_ai.gateway.usage.tokens.cached` | int | Tokens served from the provider's prompt cache | `500` |
| `gen_ai.gateway.usage.tokens.output_images` | int | Image output tokens | `1024` |
### 8. Media & Tool Usage Attributes

Gateways track media inputs/outputs and tool usage.

| Attribute | Type | Description | Example Values |
|---|---|---|---|
| `gen_ai.gateway.media.input_count` | int | Number of media items in the input (images, files) | `3` |
| `gen_ai.gateway.media.input_audio_count` | int | Number of audio inputs | `1` |
| `gen_ai.gateway.media.output_count` | int | Number of media items in the output | `2` |
| `gen_ai.gateway.search_results_count` | int | Number of web search results used | `5` |
### 9. Caching Attributes

Gateways can leverage prompt caching and session caching for efficiency.

| Attribute | Type | Description | Example Values |
|---|---|---|---|
| `gen_ai.gateway.cache.hit` | boolean | Whether a gateway-level cache was hit | `true`, `false` |
| `gen_ai.gateway.cache.type` | string | Type of cache hit | `prompt`, `session`, `response` |
| `gen_ai.gateway.cache.tokens_saved` | int | Number of tokens served from cache | `1500` |
| `gen_ai.gateway.cache.cost_saved` | double | Cost savings from caching (USD) | `0.0015` |
| `gen_ai.gateway.cache.discount` | double | Cache discount applied (USD, negative value) | `-0.0005` |
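The relationship between `tokens_saved`, `cost_saved`, and the negative `discount` convention can be shown with simple arithmetic. This helper and its pricing inputs are hypothetical; only the attribute keys and the sign convention come from the table above.

```python
def cache_attributes(cached_tokens, price_per_token, discount_rate=1.0):
    """Derive the proposed cache attributes.

    Assumes (hypothetically) that cached tokens would otherwise be billed
    at `price_per_token`, and that `discount_rate` of that price is waived
    on a cache hit. `discount` is negative by the proposed convention.
    """
    saved = cached_tokens * price_per_token * discount_rate
    return {
        "gen_ai.gateway.cache.hit": cached_tokens > 0,
        "gen_ai.gateway.cache.tokens_saved": cached_tokens,
        "gen_ai.gateway.cache.cost_saved": round(saved, 6),
        "gen_ai.gateway.cache.discount": round(-saved, 6),
    }


# 1500 cached tokens at a hypothetical $0.000001/token
attrs = cache_attributes(1500, 0.000001)
```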
### 10. Rate Limiting Attributes

Gateways enforce rate limits at various levels.

| Attribute | Type | Description | Example Values |
|---|---|---|---|
| `gen_ai.gateway.rate_limit.hit` | boolean | Whether a rate limit was triggered | `true`, `false` |
| `gen_ai.gateway.rate_limit.type` | string | Type of rate limit that was hit | `user`, `endpoint`, `api_key`, `ip`, `free_tier` |
| `gen_ai.gateway.rate_limit.name` | string | Name/identifier of the rate limit | `user_rpm`, `endpoint_rpd` |
| `gen_ai.gateway.rate_limit.remaining` | int | Remaining requests in the current window | `45` |
| `gen_ai.gateway.rate_limit.reset_at` | string | When the rate limit resets (ISO 8601 timestamp) | `2024-01-15T12:00:00Z` |
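Since `reset_at` is a string attribute, instrumentations need to format it consistently. Here is a sketch that derives the ISO 8601 UTC value from a hypothetical window start and window length; the helper and its inputs are assumptions, the timestamp format comes from the table above.

```python
from datetime import datetime, timedelta, timezone


def rate_limit_reset_at(window_start, window_seconds):
    """Format gen_ai.gateway.rate_limit.reset_at as an ISO 8601 UTC
    timestamp, given a hypothetical rate-limit window start and length."""
    reset = window_start + timedelta(seconds=window_seconds)
    return reset.strftime("%Y-%m-%dT%H:%M:%SZ")


# A 60-second window that opened at 11:59:00 UTC resets at 12:00:00 UTC
start = datetime(2024, 1, 15, 11, 59, 0, tzinfo=timezone.utc)
reset_at = rate_limit_reset_at(start, 60)
```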
### 11. Quota & Budget Attributes

Gateways track usage against quotas and budgets.

| Attribute | Type | Description | Example Values |
|---|---|---|---|
| `gen_ai.gateway.quota.credits_remaining` | double | Remaining credits/budget | `50.25` |
| `gen_ai.gateway.quota.limit_type` | string | Type of quota limit | `total`, `daily`, `weekly`, `monthly` |
| `gen_ai.gateway.quota.limit_remaining` | double | Remaining quota value | `100.0` |
| `gen_ai.gateway.quota.guardrail_hit` | boolean | Whether a budget guardrail was triggered | `true`, `false` |
### 12. Cost Attributes

Gateways track costs at both the gateway and provider level.

| Attribute | Type | Description | Example Values |
|---|---|---|---|
| `gen_ai.gateway.usage.cost` | double | Total cost charged by the gateway (may include markup) | `0.001296` |
| `gen_ai.gateway.usage.cost_currency` | string | Currency for cost values | `USD` |
| `gen_ai.gateway.usage.cost_upstream` | double | Cost charged by the underlying provider | `0.001296` |
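The split between gateway and upstream cost is what lets users compute markup in analysis. A minimal sketch, where the helper and the derived `markup` value are illustrative (markup is not a proposed attribute):

```python
def cost_attributes(gateway_cost, upstream_cost, currency="USD"):
    """Derive the proposed cost attributes plus an implied markup.

    The `markup` key is a derived analysis value for illustration,
    not one of the proposed attributes.
    """
    return {
        "gen_ai.gateway.usage.cost": gateway_cost,
        "gen_ai.gateway.usage.cost_currency": currency,
        "gen_ai.gateway.usage.cost_upstream": upstream_cost,
        "markup": round(gateway_cost - upstream_cost, 6),
    }


# Hypothetical request where the gateway charges slightly above upstream
attrs = cost_attributes(0.001425, 0.001296)
```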
### 13. BYOK (Bring Your Own Key) Attributes

Gateways support customers using their own provider API keys.

| Attribute | Type | Description | Example Values |
|---|---|---|---|
| `gen_ai.gateway.byok.enabled` | boolean | Whether BYOK was used for this request | `true`, `false` |
| `gen_ai.gateway.byok.provider` | string | Which provider's customer key was used | `openai`, `anthropic` |
| `gen_ai.gateway.byok.fee` | double | Gateway service fee for BYOK (USD) | `0.0001` |
### 14. Compliance & Data Region Attributes

Gateways handle data residency and compliance requirements.

| Attribute | Type | Description | Example Values |
|---|---|---|---|
| `gen_ai.gateway.data_region` | string | Data region used for routing | `global`, `europe`, `us-east`, `asia-pacific` |
| `gen_ai.gateway.compliance.hipaa` | boolean | Whether HIPAA-compliant routing was used | `true`, `false` |
| `gen_ai.gateway.compliance.soc2` | boolean | Whether SOC 2-compliant routing was used | `true`, `false` |
| `gen_ai.gateway.data_policy` | string | Data policy applied | `no_logging`, `no_training`, `full_logging` |
## Relationship to Existing `gen_ai.*` Attributes

The gateway attributes complement (not replace) existing gen_ai semantic conventions. Use the standard attributes for provider-level data:

| Use This (Standard) | For | Gateway Adds |
|---|---|---|
| `gen_ai.provider.name` | The provider that handled the request | `gen_ai.gateway.providers.attempted` for the fallback chain |
| `gen_ai.request.model` | The resolved model sent to the provider | `gen_ai.gateway.request.model` for the original alias/request |
| `gen_ai.response.model` | The model that responded | `gen_ai.gateway.response.models` for the fallback chain |
| `gen_ai.response.id` | Provider's response ID | `gen_ai.gateway.request.id` for the gateway's tracking ID |
| `gen_ai.operation.name` | Operation type (chat, embeddings) | (use standard attribute) |
| `gen_ai.usage.input_tokens` | Input token count | `gen_ai.gateway.usage.tokens.cached` for cache breakdown |
| `gen_ai.usage.output_tokens` | Output token count | `gen_ai.gateway.usage.tokens.reasoning` for reasoning breakdown |
| `gen_ai.conversation.id` | Chat thread/session ID | `gen_ai.gateway.session.id` for routing affinity (different purpose) |
## Example Usage

### Standard Request

```text
# Standard GenAI attributes (use these for provider data)
gen_ai.provider.name = 'aws.bedrock'
gen_ai.request.model = 'anthropic/claude-4.5-haiku-20251001'
gen_ai.response.model = 'anthropic/claude-4.5-haiku-20251001'
gen_ai.response.id = '98b87d18-b2f5-4dcb-b0b0-f5a8e4f12ca4'
gen_ai.operation.name = 'chat'
gen_ai.usage.input_tokens = 841
gen_ai.usage.output_tokens = 22
gen_ai.response.finish_reasons = ['stop']

# Gateway layer attributes (additional context)
gen_ai.gateway.name = 'openrouter'
gen_ai.gateway.request.id = 'gen-1769536032-2L9dWejMDV7jMP8m7LF5'
gen_ai.gateway.origin = 'https://claude.ai/'
gen_ai.gateway.streamed = true
gen_ai.gateway.cancelled = false
gen_ai.gateway.latency = 0.511
gen_ai.gateway.latency.moderation = 0.214
gen_ai.gateway.latency.generation = 0.719
gen_ai.gateway.usage.tokens.reasoning = 0
gen_ai.gateway.usage.tokens.cached = 0
gen_ai.gateway.usage.cost = 0.001296
gen_ai.gateway.usage.cost_upstream = 0.001296
gen_ai.gateway.byok.enabled = false
gen_ai.gateway.fallback.used = false
gen_ai.gateway.endpoint.status = 'default'
```
### Fallback Scenario

```text
# Standard attributes reflect the final successful provider
gen_ai.provider.name = 'anthropic'
gen_ai.response.model = 'anthropic/claude-3-5-sonnet'

# Gateway attributes show the full story
gen_ai.gateway.name = 'openrouter'
gen_ai.gateway.response.models = ['openai/gpt-4', 'anthropic/claude-3-5-sonnet']
gen_ai.gateway.fallback.used = true
gen_ai.gateway.fallback.reason = 'endpoint_status'
gen_ai.gateway.fallback.attempt_number = 2
gen_ai.gateway.fallback.total_attempts = 2
gen_ai.gateway.fallback.latency_wasted = 0.8
gen_ai.gateway.providers.attempted = ['openai', 'anthropic']
gen_ai.gateway.route.strategy = 'fallback'
gen_ai.gateway.endpoint.status = 'down'
```
### BYOK Request

```text
gen_ai.provider.name = 'openai'
gen_ai.gateway.name = 'openrouter'
gen_ai.gateway.byok.enabled = true
gen_ai.gateway.byok.provider = 'openai'
gen_ai.gateway.byok.fee = 0.0001
gen_ai.gateway.usage.cost = 0.0001
gen_ai.gateway.usage.cost_upstream = 0.0024
```
### Request with Media & Search

```text
gen_ai.provider.name = 'openai'
gen_ai.gateway.name = 'openrouter'
gen_ai.gateway.media.input_count = 2
gen_ai.gateway.media.input_audio_count = 0
gen_ai.gateway.media.output_count = 1
gen_ai.gateway.search_results_count = 5
gen_ai.gateway.usage.tokens.output_images = 1024
```
## Use Cases

- **Debugging latency issues**: Separate gateway latency, moderation latency, and generation latency to pinpoint bottlenecks
- **Cost attribution**: Track costs at both the gateway and provider level, including markups and BYOK fees
- **Failure analysis**: Identify whether failures occurred at the gateway layer or provider layer; understand endpoint status
- **Fallback monitoring**: Track how often fallbacks are triggered, why (including `endpoint_status`), and how many attempts were needed
- **Model resolution tracking**: Understand how aliases/auto-routing resolve to actual models
- **Performance optimization**: Use endpoint performance heuristics to understand routing decisions
- **Rate limit debugging**: Identify which rate limits are being hit and when they reset
- **Compliance auditing**: Track data region routing and compliance requirements
- **BYOK cost analysis**: Separate BYOK service fees from upstream provider costs
- **Cache efficiency**: Track cache hit rates and cost savings from caching
## Prior Art

The spec already acknowledges that proxies and hosting platforms exist:

> "Multiple providers, including Azure OpenAI, Gemini, and AI hosting platforms are accessible using the OpenAI REST API and corresponding client libraries, but may proxy or host models from different providers."

However, there's no standardized attribute to identify these gateway/proxy layers.
## Additional Context

- AI gateways are distinct from providers: they don't host models, they route to them
- This is similar to how HTTP semantic conventions distinguish between proxies and origin servers
- Multiple gateway vendors would benefit from standardization here
- These attributes were derived from real-world gateway implementations

Happy to discuss naming alternatives or prioritization of which attributes are most critical for an initial release.