Commit 2945b0b

GenAI: mark sampling-relevant attributes (#2994)
1 parent 0b5d27c commit 2945b0b

8 files changed: +178 -14 lines changed

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+change_type: enhancement
+component: gen_ai
+note: Added `sampling-relevant` flag to relevant GenAI span attributes to indicate their importance for sampling decisions.
+issues: [2994]

docs/gen-ai/aws-bedrock.md

Lines changed: 9 additions & 0 deletions
@@ -201,6 +201,15 @@ Since this attribute could be large, it's NOT RECOMMENDED to populate
 it by default. Instrumentations MAY provide a way to enable
 populating this attribute.
 
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.provider.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md)
+* [`server.address`](/docs/registry/attributes/server.md)
+* [`server.port`](/docs/registry/attributes/server.md)
+
 ---
 
 `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

docs/gen-ai/azure-ai-inference.md

Lines changed: 8 additions & 0 deletions
@@ -203,6 +203,14 @@ Since this attribute could be large, it's NOT RECOMMENDED to populate
 it by default. Instrumentations MAY provide a way to enable
 populating this attribute.
 
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md)
+* [`server.address`](/docs/registry/attributes/server.md)
+* [`server.port`](/docs/registry/attributes/server.md)
+
 ---
 
 `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

docs/gen-ai/gen-ai-agent-spans.md

Lines changed: 18 additions & 0 deletions
@@ -112,6 +112,15 @@ Instrumentations SHOULD document the list of errors they report.
 
 **[6] `server.address`:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available.
 
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.provider.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md)
+* [`server.address`](/docs/registry/attributes/server.md)
+* [`server.port`](/docs/registry/attributes/server.md)
+
 ---
 
 `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
@@ -350,6 +359,15 @@ Since this attribute could be large, it's NOT RECOMMENDED to populate
 it by default. Instrumentations MAY provide a way to enable
 populating this attribute.
 
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.provider.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md)
+* [`server.address`](/docs/registry/attributes/server.md)
+* [`server.port`](/docs/registry/attributes/server.md)
+
 ---
 
 `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

docs/gen-ai/gen-ai-events.md

Lines changed: 4 additions & 4 deletions
@@ -8,8 +8,8 @@ linkTitle: Events
 
 <!-- toc -->
 
-- [Event: `event.gen_ai.client.inference.operation.details`](#event-eventgen_aiclientinferenceoperationdetails)
-- [Event: `event.gen_ai.evaluation.result`](#event-eventgen_aievaluationresult)
+- [Event: `gen_ai.client.inference.operation.details`](#event-gen_aiclientinferenceoperationdetails)
+- [Event: `gen_ai.evaluation.result`](#event-gen_aievaluationresult)
 
 <!-- tocstop -->
 
@@ -39,7 +39,7 @@ GenAI instrumentations MAY capture user inputs sent to the model and responses r
 > [!Note]
 > Events are in-development and not yet available in some languages. Check [spec-compliance matrix](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.53.0/spec-compliance-matrix.md#logs) to see the implementation status in corresponding language.
 
-## Event: `event.gen_ai.client.inference.operation.details`
+## Event: `gen_ai.client.inference.operation.details`
 
 <!-- semconv event.gen_ai.client.inference.operation.details -->
 <!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
@@ -219,7 +219,7 @@ populating this attribute.
 <!-- END AUTOGENERATED TEXT -->
 <!-- endsemconv -->
 
-## Event: `event.gen_ai.evaluation.result`
+## Event: `gen_ai.evaluation.result`
 
 <!-- semconv event.gen_ai.evaluation.result -->
 <!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->

docs/gen-ai/gen-ai-spans.md

Lines changed: 81 additions & 10 deletions
@@ -214,6 +214,15 @@ Since this attribute could be large, it's NOT RECOMMENDED to populate
 it by default. Instrumentations MAY provide a way to enable
 populating this attribute.
 
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.provider.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md)
+* [`server.address`](/docs/registry/attributes/server.md)
+* [`server.port`](/docs/registry/attributes/server.md)
+
 ---
 
 `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
@@ -301,27 +310,56 @@ The `gen_ai.operation.name` SHOULD be `embeddings`.
 | Key | Stability | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Value Type | Description | Example Values |
 | --- | --- | --- | --- | --- | --- |
 | [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Required` | string | The name of the operation being performed. [1] | `chat`; `generate_content`; `text_completion` |
-| [`error.type`](/docs/registry/attributes/error.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Conditionally Required` if the operation ended in an error | string | Describes a class of error the operation ended with. [2] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` |
-| [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` If available. | string | The name of the GenAI model a request is being made to. [3] | `gpt-4` |
-| [`server.port`](/docs/registry/attributes/server.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Conditionally Required` If `server.address` is set. | int | GenAI server port. [4] | `80`; `8080`; `443` |
+| [`gen_ai.provider.name`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Required` | string | The Generative AI provider as identified by the client or server instrumentation. [2] | `openai`; `gcp.gen_ai`; `gcp.vertex_ai` |
+| [`error.type`](/docs/registry/attributes/error.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Conditionally Required` if the operation ended in an error | string | Describes a class of error the operation ended with. [3] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` |
+| [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` If available. | string | The name of the GenAI model a request is being made to. [4] | `gpt-4` |
+| [`server.port`](/docs/registry/attributes/server.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Conditionally Required` If `server.address` is set. | int | GenAI server port. [5] | `80`; `8080`; `443` |
 | [`gen_ai.embeddings.dimension.count`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Recommended` | int | The number of dimensions the resulting output embeddings should have. | `512`; `1024` |
-| [`gen_ai.request.encoding_formats`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Recommended` | string[] | The encoding formats requested in an embeddings operation, if specified. [5] | `["base64"]`; `["float", "binary"]` |
+| [`gen_ai.request.encoding_formats`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Recommended` | string[] | The encoding formats requested in an embeddings operation, if specified. [6] | `["base64"]`; `["float", "binary"]` |
 | [`gen_ai.usage.input_tokens`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Recommended` | int | The number of tokens used in the GenAI input (prompt). | `100` |
-| [`server.address`](/docs/registry/attributes/server.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Recommended` | string | GenAI server address. [6] | `example.com`; `10.1.2.80`; `/tmp/my.sock` |
+| [`server.address`](/docs/registry/attributes/server.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Recommended` | string | GenAI server address. [7] | `example.com`; `10.1.2.80`; `/tmp/my.sock` |
 
 **[1] `gen_ai.operation.name`:** If one of the predefined values applies, but specific system uses a different name it's RECOMMENDED to document it in the semantic conventions for specific GenAI system and use system-specific name in the instrumentation. If a different name is not documented, instrumentation libraries SHOULD use applicable predefined value.
 
-**[2] `error.type`:** The `error.type` SHOULD match the error code returned by the Generative AI provider or the client library,
+**[2] `gen_ai.provider.name`:** The attribute SHOULD be set based on the instrumentation's best
+knowledge and may differ from the actual model provider.
+
+Multiple providers, including Azure OpenAI, Gemini, and AI hosting platforms
+are accessible using the OpenAI REST API and corresponding client libraries,
+but may proxy or host models from different providers.
+
+The `gen_ai.request.model`, `gen_ai.response.model`, and `server.address`
+attributes may help identify the actual system in use.
+
+The `gen_ai.provider.name` attribute acts as a discriminator that
+identifies the GenAI telemetry format flavor specific to that provider
+within GenAI semantic conventions.
+It SHOULD be set consistently with provider-specific attributes and signals.
+For example, GenAI spans, metrics, and events related to AWS Bedrock
+should have the `gen_ai.provider.name` set to `aws.bedrock` and include
+applicable `aws.bedrock.*` attributes and are not expected to include
+`openai.*` attributes.
+
+**[3] `error.type`:** The `error.type` SHOULD match the error code returned by the Generative AI provider or the client library,
 the canonical name of exception that occurred, or another low-cardinality error identifier.
 Instrumentations SHOULD document the list of errors they report.
 
-**[3] `gen_ai.request.model`:** The name of the GenAI model a request is being made to. If the model is supplied by a vendor, then the value must be the exact name of the model requested. If the model is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned.
+**[4] `gen_ai.request.model`:** The name of the GenAI model a request is being made to. If the model is supplied by a vendor, then the value must be the exact name of the model requested. If the model is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned.
 
-**[4] `server.port`:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available.
+**[5] `server.port`:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available.
 
-**[5] `gen_ai.request.encoding_formats`:** In some GenAI systems the encoding formats are called embedding types. Also, some GenAI systems only accept a single format per request.
+**[6] `gen_ai.request.encoding_formats`:** In some GenAI systems the encoding formats are called embedding types. Also, some GenAI systems only accept a single format per request.
 
-**[6] `server.address`:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available.
+**[7] `server.address`:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available.
+
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.provider.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md)
+* [`server.address`](/docs/registry/attributes/server.md)
+* [`server.port`](/docs/registry/attributes/server.md)
 
 ---
 
@@ -345,6 +383,34 @@ Instrumentations SHOULD document the list of errors they report.
 | `invoke_agent` | Invoke GenAI agent | ![Development](https://img.shields.io/badge/-development-blue) |
 | `text_completion` | Text completions operation such as [OpenAI Completions API (Legacy)](https://platform.openai.com/docs/api-reference/completions) | ![Development](https://img.shields.io/badge/-development-blue) |
 
+---
+
+`gen_ai.provider.name` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+| --- | --- | --- |
+| `anthropic` | [Anthropic](https://www.anthropic.com/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `aws.bedrock` | [AWS Bedrock](https://aws.amazon.com/bedrock) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `azure.ai.inference` | Azure AI Inference | ![Development](https://img.shields.io/badge/-development-blue) |
+| `azure.ai.openai` | [Azure OpenAI](https://azure.microsoft.com/products/ai-services/openai-service/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `cohere` | [Cohere](https://cohere.com/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `deepseek` | [DeepSeek](https://www.deepseek.com/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `gcp.gemini` | [Gemini](https://cloud.google.com/products/gemini) [8] | ![Development](https://img.shields.io/badge/-development-blue) |
+| `gcp.gen_ai` | Any Google generative AI endpoint [9] | ![Development](https://img.shields.io/badge/-development-blue) |
+| `gcp.vertex_ai` | [Vertex AI](https://cloud.google.com/vertex-ai) [10] | ![Development](https://img.shields.io/badge/-development-blue) |
+| `groq` | [Groq](https://groq.com/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `ibm.watsonx.ai` | [IBM Watsonx AI](https://www.ibm.com/products/watsonx-ai) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `mistral_ai` | [Mistral AI](https://mistral.ai/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `openai` | [OpenAI](https://openai.com/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `perplexity` | [Perplexity](https://www.perplexity.ai/) | ![Development](https://img.shields.io/badge/-development-blue) |
+| `x_ai` | [xAI](https://x.ai/) | ![Development](https://img.shields.io/badge/-development-blue) |
+
+**[8]:** Used when accessing the 'generativelanguage.googleapis.com' endpoint. Also known as the AI Studio API.
+
+**[9]:** May be used when specific backend is unknown.
+
+**[10]:** Used when accessing the 'aiplatform.googleapis.com' endpoint.
+
 <!-- prettier-ignore-end -->
 <!-- END AUTOGENERATED TEXT -->
 <!-- endsemconv -->
@@ -421,6 +487,11 @@ It's expected to be an object - in case a serialized string is available
 to the instrumentation, the instrumentation SHOULD do the best effort to
 deserialize it to an object. When recorded on spans, it MAY be recorded as a JSON string if structured format is not supported and SHOULD be recorded in structured form otherwise.
 
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+
 ---
 
 `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
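
Reviewer note on the `gen_ai.provider.name` footnotes [8] to [10] above: the endpoint-to-value rule for Google backends could be sketched roughly as follows. This is only an illustration, not part of the commit or of any OpenTelemetry library. The helper name `gcp_provider_name` is hypothetical, and matching on a hostname suffix (so that regional hostnames are also recognized) is an assumption that the convention text does not spell out.

```python
from typing import Optional

# Endpoint hosts and provider values are taken from footnotes [8] and [10].
_GCP_ENDPOINT_TO_PROVIDER = {
    "generativelanguage.googleapis.com": "gcp.gemini",
    "aiplatform.googleapis.com": "gcp.vertex_ai",
}


def gcp_provider_name(server_address: Optional[str]) -> str:
    """Pick a `gen_ai.provider.name` value for a Google GenAI endpoint.

    Hypothetical helper. Suffix matching is an assumption of this sketch.
    """
    if server_address:
        host = server_address.lower().rstrip(".")
        for endpoint, provider in _GCP_ENDPOINT_TO_PROVIDER.items():
            if host.endswith(endpoint):
                return provider
    # Footnote [9]: `gcp.gen_ai` may be used when the specific backend is unknown.
    return "gcp.gen_ai"
```

An instrumentation would call something like this once, before starting the span, so the resulting value is available together with the other sampling-relevant attributes.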

docs/gen-ai/openai.md

Lines changed: 8 additions & 0 deletions
@@ -208,6 +208,14 @@ Since this attribute could be large, it's NOT RECOMMENDED to populate
 it by default. Instrumentations MAY provide a way to enable
 populating this attribute.
 
+The following attributes can be important for making sampling decisions
+and SHOULD be provided **at span creation time** (if provided at all):
+
+* [`gen_ai.operation.name`](/docs/registry/attributes/gen-ai.md)
+* [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md)
+* [`server.address`](/docs/registry/attributes/server.md)
+* [`server.port`](/docs/registry/attributes/server.md)
+
 ---
 
 `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
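
Reviewer note on the sampling paragraph this commit adds to each span table: a head sampler only sees the attributes passed when the span is created, so anything set later cannot influence its decision. The sketch below shows that effect in plain Python without depending on an OpenTelemetry SDK. `creation_time_attributes` and `should_sample` are hypothetical names, and the "keep only embeddings spans" rule is an arbitrary example policy, not something the convention prescribes.

```python
from typing import Dict, Mapping, Union

AttributeValue = Union[str, int]

# Attribute names listed in the sampling note added by this commit.
SAMPLING_RELEVANT = (
    "gen_ai.operation.name",
    "gen_ai.provider.name",
    "gen_ai.request.model",
    "server.address",
    "server.port",
)


def creation_time_attributes(
    candidate: Mapping[str, AttributeValue],
) -> Dict[str, AttributeValue]:
    """Keep only the sampling-relevant attributes, to be passed at span creation."""
    return {key: candidate[key] for key in SAMPLING_RELEVANT if key in candidate}


def should_sample(attributes: Mapping[str, AttributeValue]) -> bool:
    """Hypothetical head-sampling rule: keep embeddings spans, drop everything else."""
    return attributes.get("gen_ai.operation.name") == "embeddings"


candidate = {
    "gen_ai.operation.name": "embeddings",
    "gen_ai.request.model": "gpt-4",
    "server.address": "example.com",
    "server.port": 443,
    "gen_ai.usage.input_tokens": 100,  # only known after the response arrives
}

initial = creation_time_attributes(candidate)
sampled_with_initial = should_sample(initial)  # sampler sees the operation name
sampled_without = should_sample({})  # attributes set later are invisible to it
```

With the attributes supplied up front the example policy keeps the span; with an empty initial set the same span is dropped, which is the failure mode the "at span creation time" wording is meant to prevent.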
