Commit 05eb438

gen-ai: mark sampling-relevant attributes
1 parent 553948e commit 05eb438

8 files changed, +175 -14 lines changed
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+change_type: enhancement
+component: gen_ai
+note: Added `sampling-relevant` flag to relevant GenAI span attributes to indicate their importance for sampling decisions.
+issues: [2994]

docs/gen-ai/aws-bedrock.md

Lines changed: 9 additions & 0 deletions
@@ -204,6 +204,15 @@ Since this attribute could be large, it's NOT RECOMMENDED to populate
 it by default. Instrumentations MAY provide a way to enable
 populating this attribute.
 
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.provider.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md)
+* [`server.address`](/docs/registry/attributes/server.md)
+* [`server.port`](/docs/registry/attributes/server.md)
+
 ---
 
 `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
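
A head sampler sees only the attributes passed when a span is started, which is why the added text asks for these attributes at creation time. The sketch below illustrates that idea in plain Python with no OpenTelemetry dependency; `start_attributes`, `should_sample`, the sampling rule itself, and the model name `example-model` are hypothetical and not part of this commit.

```python
from typing import Mapping, Union

AttrValue = Union[str, int]

# Attributes this hunk lists as sampling-relevant.
SAMPLING_RELEVANT = (
    "gen_ai.operation.name",
    "gen_ai.provider.name",
    "gen_ai.request.model",
    "server.address",
    "server.port",
)


def should_sample(initial_attributes: Mapping[str, AttrValue]) -> bool:
    """Hypothetical head-sampling rule: keep chat calls to one model only.

    Only attributes supplied at span creation are visible here; anything
    set on the span later cannot influence the decision.
    """
    return (
        initial_attributes.get("gen_ai.operation.name") == "chat"
        and initial_attributes.get("gen_ai.request.model") == "example-model"
    )


def start_attributes(operation: str, provider: str, model: str,
                     address: str, port: int) -> dict:
    """Collect the sampling-relevant attributes before the span is started."""
    return {
        "gen_ai.operation.name": operation,
        "gen_ai.provider.name": provider,
        "gen_ai.request.model": model,
        "server.address": address,
        "server.port": port,
    }


attrs = start_attributes("chat", "aws.bedrock", "example-model", "example.com", 443)
print(should_sample(attrs))                               # all attributes known up front
print(should_sample({"gen_ai.operation.name": "chat"}))   # model not yet known
```

In this sketch a span whose model is only set after creation is never kept by the rule, because the decision has already been made by then.
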

docs/gen-ai/azure-ai-inference.md

Lines changed: 8 additions & 0 deletions
@@ -205,6 +205,14 @@ Since this attribute could be large, it's NOT RECOMMENDED to populate
 it by default. Instrumentations MAY provide a way to enable
 populating this attribute.
 
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md)
+* [`server.address`](/docs/registry/attributes/server.md)
+* [`server.port`](/docs/registry/attributes/server.md)
+
 ---
 
 `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

docs/gen-ai/gen-ai-agent-spans.md

Lines changed: 17 additions & 0 deletions
@@ -114,6 +114,15 @@ Instrumentations SHOULD document the list of errors they report.
 
 **[6] `server.address`:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available.
 
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.provider.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md)
+* [`server.address`](/docs/registry/attributes/server.md)
+* [`server.port`](/docs/registry/attributes/server.md)
+
 ---
 
 `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
@@ -354,6 +363,14 @@ Since this attribute could be large, it's NOT RECOMMENDED to populate
 it by default. Instrumentations MAY provide a way to enable
 populating this attribute.
 
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.provider.name`](/docs/registry/attributes/gen-ai.md)
+* [`server.address`](/docs/registry/attributes/server.md)
+* [`server.port`](/docs/registry/attributes/server.md)
+
 ---
 
 `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

docs/gen-ai/gen-ai-events.md

Lines changed: 4 additions & 4 deletions
@@ -8,8 +8,8 @@ linkTitle: Events
 
 <!-- toc -->
 
-- [Event: `event.gen_ai.client.inference.operation.details`](#event-eventgen_aiclientinferenceoperationdetails)
-- [Event: `event.gen_ai.evaluation.result`](#event-eventgen_aievaluationresult)
+- [Event: `gen_ai.client.inference.operation.details`](#event-gen_aiclientinferenceoperationdetails)
+- [Event: `gen_ai.evaluation.result`](#event-gen_aievaluationresult)
 
 <!-- tocstop -->
 
@@ -39,7 +39,7 @@ GenAI instrumentations MAY capture user inputs sent to the model and responses r
 > Note:
 > Events are in-development and not yet available in some languages. Check [spec-compliance matrix](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.50.0/spec-compliance-matrix.md#logs) to see the implementation status in corresponding language.
 
-## Event: `event.gen_ai.client.inference.operation.details`
+## Event: `gen_ai.client.inference.operation.details`
 
 <!-- semconv event.gen_ai.client.inference.operation.details -->
 <!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
@@ -223,7 +223,7 @@ populating this attribute.
 <!-- END AUTOGENERATED TEXT -->
 <!-- endsemconv -->
 
-## Event: `event.gen_ai.evaluation.result`
+## Event: `gen_ai.evaluation.result`
 
 <!-- semconv event.gen_ai.evaluation.result -->
 <!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
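
Renaming the event headings in this file also changes their TOC anchors. As a rough check, the sketch below reproduces the old and new anchors under a simplified slug rule: lowercase the heading, drop everything except letters, digits, underscores, hyphens, and spaces, then turn spaces into hyphens. `heading_anchor` is a hypothetical helper, and the rule is an assumption that matches the links in this diff rather than a full description of the TOC generator.

```python
import re


def heading_anchor(heading: str) -> str:
    """Approximate a Markdown heading anchor (simplified, assumed rule)."""
    text = heading.lower()
    # Drop backticks, dots, colons and other punctuation; keep [a-z0-9_- ].
    text = re.sub(r"[^a-z0-9_\- ]", "", text)
    return text.replace(" ", "-")


old_heading = "Event: `event.gen_ai.evaluation.result`"
new_heading = "Event: `gen_ai.evaluation.result`"
print(heading_anchor(old_heading))
print(heading_anchor(new_heading))
```

Any external link that still targets the old `#event-event...` anchors would need updating after this rename.
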

docs/gen-ai/gen-ai-spans.md

Lines changed: 81 additions & 10 deletions
@@ -217,6 +217,15 @@ Since this attribute could be large, it's NOT RECOMMENDED to populate
 it by default. Instrumentations MAY provide a way to enable
 populating this attribute.
 
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.provider.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md)
+* [`server.address`](/docs/registry/attributes/server.md)
+* [`server.port`](/docs/registry/attributes/server.md)
+
 ---
 
 `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
@@ -307,27 +316,56 @@ The `gen_ai.operation.name` SHOULD be `embeddings`.
 | Key | Stability | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Value Type | Description | Example Values |
 |---|---|---|---|---|---|
 | [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Required` | string | The name of the operation being performed. [1] | `chat`; `generate_content`; `text_completion` |
-| [`error.type`](/docs/registry/attributes/error.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Conditionally Required` if the operation ended in an error | string | Describes a class of error the operation ended with. [2] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` |
-| [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` If available. | string | The name of the GenAI model a request is being made to. [3] | `gpt-4` |
-| [`server.port`](/docs/registry/attributes/server.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Conditionally Required` If `server.address` is set. | int | GenAI server port. [4] | `80`; `8080`; `443` |
+| [`gen_ai.provider.name`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Required` | string | The Generative AI provider as identified by the client or server instrumentation. [2] | `openai`; `gcp.gen_ai`; `gcp.vertex_ai` |
+| [`error.type`](/docs/registry/attributes/error.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Conditionally Required` if the operation ended in an error | string | Describes a class of error the operation ended with. [3] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` |
+| [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` If available. | string | The name of the GenAI model a request is being made to. [4] | `gpt-4` |
+| [`server.port`](/docs/registry/attributes/server.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Conditionally Required` If `server.address` is set. | int | GenAI server port. [5] | `80`; `8080`; `443` |
 | [`gen_ai.embeddings.dimension.count`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Recommended` | int | The number of dimensions the resulting output embeddings should have. | `512`; `1024` |
-| [`gen_ai.request.encoding_formats`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Recommended` | string[] | The encoding formats requested in an embeddings operation, if specified. [5] | `["base64"]`; `["float", "binary"]` |
+| [`gen_ai.request.encoding_formats`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Recommended` | string[] | The encoding formats requested in an embeddings operation, if specified. [6] | `["base64"]`; `["float", "binary"]` |
 | [`gen_ai.usage.input_tokens`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Recommended` | int | The number of tokens used in the GenAI input (prompt). | `100` |
-| [`server.address`](/docs/registry/attributes/server.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Recommended` | string | GenAI server address. [6] | `example.com`; `10.1.2.80`; `/tmp/my.sock` |
+| [`server.address`](/docs/registry/attributes/server.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Recommended` | string | GenAI server address. [7] | `example.com`; `10.1.2.80`; `/tmp/my.sock` |
 
 **[1] `gen_ai.operation.name`:** If one of the predefined values applies, but specific system uses a different name it's RECOMMENDED to document it in the semantic conventions for specific GenAI system and use system-specific name in the instrumentation. If a different name is not documented, instrumentation libraries SHOULD use applicable predefined value.
 
-**[2] `error.type`:** The `error.type` SHOULD match the error code returned by the Generative AI provider or the client library,
+**[2] `gen_ai.provider.name`:** The attribute SHOULD be set based on the instrumentation's best
+knowledge and may differ from the actual model provider.
+
+Multiple providers, including Azure OpenAI, Gemini, and AI hosting platforms
+are accessible using the OpenAI REST API and corresponding client libraries,
+but may proxy or host models from different providers.
+
+The `gen_ai.request.model`, `gen_ai.response.model`, and `server.address`
+attributes may help identify the actual system in use.
+
+The `gen_ai.provider.name` attribute acts as a discriminator that
+identifies the GenAI telemetry format flavor specific to that provider
+within GenAI semantic conventions.
+It SHOULD be set consistently with provider-specific attributes and signals.
+For example, GenAI spans, metrics, and events related to AWS Bedrock
+should have the `gen_ai.provider.name` set to `aws.bedrock` and include
+applicable `aws.bedrock.*` attributes and are not expected to include
+`openai.*` attributes.
+
+**[3] `error.type`:** The `error.type` SHOULD match the error code returned by the Generative AI provider or the client library,
 the canonical name of exception that occurred, or another low-cardinality error identifier.
 Instrumentations SHOULD document the list of errors they report.
 
-**[3] `gen_ai.request.model`:** The name of the GenAI model a request is being made to. If the model is supplied by a vendor, then the value must be the exact name of the model requested. If the model is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned.
+**[4] `gen_ai.request.model`:** The name of the GenAI model a request is being made to. If the model is supplied by a vendor, then the value must be the exact name of the model requested. If the model is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned.
 
-**[4] `server.port`:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available.
+**[5] `server.port`:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available.
 
-**[5] `gen_ai.request.encoding_formats`:** In some GenAI systems the encoding formats are called embedding types. Also, some GenAI systems only accept a single format per request.
+**[6] `gen_ai.request.encoding_formats`:** In some GenAI systems the encoding formats are called embedding types. Also, some GenAI systems only accept a single format per request.
 
-**[6] `server.address`:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available.
+**[7] `server.address`:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available.
+
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.provider.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md)
+* [`server.address`](/docs/registry/attributes/server.md)
+* [`server.port`](/docs/registry/attributes/server.md)
 
 ---
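
The requirement levels in the embeddings table can be read as a small set of rules for assembling a span's attributes. The following is a minimal sketch, assuming a hypothetical helper `embeddings_span_attributes` that is not part of this commit. Using the exception class name for `error.type` is one of the options the note allows, and emitting `server.port` only alongside `server.address` is a simplification chosen here.

```python
from typing import Optional


def embeddings_span_attributes(
    provider: str,
    model: Optional[str] = None,
    server_address: Optional[str] = None,
    server_port: Optional[int] = None,
    error: Optional[BaseException] = None,
) -> dict:
    """Build attributes for an embeddings span (hypothetical helper)."""
    # Required attributes.
    attrs = {
        "gen_ai.operation.name": "embeddings",
        "gen_ai.provider.name": provider,
    }
    # Conditionally required: only if available.
    if model is not None:
        attrs["gen_ai.request.model"] = model
    # server.address is recommended; server.port is required once it is set.
    if server_address is not None:
        attrs["server.address"] = server_address
        if server_port is not None:
            attrs["server.port"] = server_port
    # Conditionally required: only if the operation ended in an error.
    if error is not None:
        attrs["error.type"] = type(error).__name__
    return attrs


ok = embeddings_span_attributes("openai", "gpt-4", "example.com", 443)
failed = embeddings_span_attributes("openai", "gpt-4", error=TimeoutError())
print(ok)
print(failed["error.type"])
```

The first four keys in `ok` plus `server.port` are exactly the attributes the hunk lists as sampling-relevant, so a helper like this would run before the span is started.
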
 
@@ -351,6 +389,34 @@ Instrumentations SHOULD document the list of errors they report.
 | `invoke_agent` | Invoke GenAI agent | ![Development](https://img.shields.io/badge/-development-blue) |
 | `text_completion` | Text completions operation such as [OpenAI Completions API (Legacy)](https://platform.openai.com/docs/api-reference/completions) | ![Development](https://img.shields.io/badge/-development-blue) |
 
+---
+
+`gen_ai.provider.name` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+|---|---|---|
+| `anthropic` | [Anthropic](https://www.anthropic.com/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `aws.bedrock` | [AWS Bedrock](https://aws.amazon.com/bedrock) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `azure.ai.inference` | Azure AI Inference | ![Development](https://img.shields.io/badge/-development-blue) |
+| `azure.ai.openai` | [Azure OpenAI](https://azure.microsoft.com/products/ai-services/openai-service/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `cohere` | [Cohere](https://cohere.com/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `deepseek` | [DeepSeek](https://www.deepseek.com/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `gcp.gemini` | [Gemini](https://cloud.google.com/products/gemini) [8] | ![Development](https://img.shields.io/badge/-development-blue) |
+| `gcp.gen_ai` | Any Google generative AI endpoint [9] | ![Development](https://img.shields.io/badge/-development-blue) |
+| `gcp.vertex_ai` | [Vertex AI](https://cloud.google.com/vertex-ai) [10] | ![Development](https://img.shields.io/badge/-development-blue) |
+| `groq` | [Groq](https://groq.com/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `ibm.watsonx.ai` | [IBM Watsonx AI](https://www.ibm.com/products/watsonx-ai) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `mistral_ai` | [Mistral AI](https://mistral.ai/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `openai` | [OpenAI](https://openai.com/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `perplexity` | [Perplexity](https://www.perplexity.ai/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `x_ai` | [xAI](https://x.ai/) | ![Development](https://img.shields.io/badge/-development-blue) |
+
+**[8]:** Used when accessing the 'generativelanguage.googleapis.com' endpoint. Also known as the AI Studio API.
+
+**[9]:** May be used when specific backend is unknown.
+
+**[10]:** Used when accessing the 'aiplatform.googleapis.com' endpoint.
+
 <!-- markdownlint-restore -->
 <!-- prettier-ignore-end -->
 <!-- END AUTOGENERATED TEXT -->
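
Footnotes [8] to [10] describe how the three Google values are told apart by endpoint. The sketch below turns that into a lookup, assuming the caller already knows the endpoint is a Google generative AI endpoint. `gcp_provider_name` and the example URLs are hypothetical, and only the two hosts named in the footnotes are matched exactly; any other host falls back to `gcp.gen_ai`.

```python
from urllib.parse import urlparse

# Endpoint hosts named in footnotes [8] and [10].
HOST_TO_PROVIDER = {
    "generativelanguage.googleapis.com": "gcp.gemini",
    "aiplatform.googleapis.com": "gcp.vertex_ai",
}


def gcp_provider_name(endpoint_url: str) -> str:
    """Pick a gen_ai.provider.name value for a Google GenAI endpoint."""
    host = urlparse(endpoint_url).hostname or ""
    # Footnote [9]: gcp.gen_ai may be used when the backend is unknown.
    return HOST_TO_PROVIDER.get(host, "gcp.gen_ai")


print(gcp_provider_name("https://generativelanguage.googleapis.com/v1beta/models"))
print(gcp_provider_name("https://aiplatform.googleapis.com/v1/projects/demo"))
print(gcp_provider_name("https://unknown-google-backend.example/v1"))
```
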
@@ -421,6 +487,11 @@ It's expected to be an object - in case a serialized string is available
 to the instrumentation, the instrumentation SHOULD do the best effort to
 deserialize it to an object. When recorded on spans, it MAY be recorded as a JSON string if structured format is not supported and SHOULD be recorded in structured form otherwise.
 
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+
 ---
 
 `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
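
The context lines at the top of this hunk describe a best-effort deserialization rule for an attribute value that is expected to be an object. A minimal sketch of that rule follows, using only the standard `json` module; `to_structured`, `value_for_span`, and the `structured_supported` flag are hypothetical names, and the sample payload is illustrative.

```python
import json
from typing import Any


def to_structured(value: Any) -> Any:
    """Best effort: parse a serialized JSON string, else return it unchanged."""
    if isinstance(value, str):
        try:
            return json.loads(value)
        except json.JSONDecodeError:
            return value  # not valid JSON; keep the original string
    return value


def value_for_span(value: Any, structured_supported: bool) -> Any:
    """Record structured form when supported, otherwise a JSON string."""
    structured = to_structured(value)
    if structured_supported or isinstance(structured, str):
        return structured
    return json.dumps(structured)


print(to_structured('{"city": "Paris"}'))
print(value_for_span({"city": "Paris"}, structured_supported=False))
```
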

docs/gen-ai/openai.md

Lines changed: 8 additions & 0 deletions
@@ -210,6 +210,14 @@ Since this attribute could be large, it's NOT RECOMMENDED to populate
 it by default. Instrumentations MAY provide a way to enable
 populating this attribute.
 
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md)
+* [`server.address`](/docs/registry/attributes/server.md)
+* [`server.port`](/docs/registry/attributes/server.md)
+
 ---
 
 `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
