@nagkumar91 nagkumar91 commented Dec 22, 2025

Motivation

GenAI semantic conventions cover model, agent, and tool operations, but they don't provide a vendor-neutral way to observe security guardian/guardrail evaluations (allow/deny/modify decisions) and the specific security findings produced during those evaluations. This limits auditability, incident investigation, and cross-provider correlation for systems using guardrails across different vendors and frameworks.

What this PR adds

  • Adds apply_guardrail to the gen_ai.operation.name enum for guardrail/guardian evaluations.
  • Adds new attributes under gen_ai.guardian.* and gen_ai.security.* to describe:
    • Guardian identity (gen_ai.guardian.*)
    • Decision outcomes (gen_ai.security.decision.*)
    • Target being evaluated (gen_ai.security.target.*)
    • Findings and policy context (gen_ai.security.risk.*, gen_ai.security.policy.*)
    • Opt-in content capture (gen_ai.security.content.*)
  • Note: gen_ai.security.risk.category is a free-form string with suggested values aligned with OWASP LLM Top 10 2025.
  • Adds a new span: span.gen_ai.apply_guardrail.internal (guardian evaluation).
  • Adds a new event: gen_ai.security.finding (individual findings under a guardian evaluation).
  • Adds documentation: docs/gen-ai/gen-ai-security.md (linked from docs/gen-ai/README.md).
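A minimal sketch of the attributes one guardian evaluation might carry, written as plain dictionaries so it runs without the OpenTelemetry SDK. The names gen_ai.operation.name, gen_ai.guardian.id, gen_ai.security.policy.id, gen_ai.security.risk.category, and the gen_ai.security.finding event come from this PR; the `.type` suffixes under `decision` and `target`, the helper function names, and all example values are assumptions for illustration only.

```python
from typing import Any


def guardian_span_attributes(guardian_id: str, decision: str,
                             target_type: str, policy_id: str) -> dict[str, Any]:
    """Build the attributes for one apply_guardrail span (hypothetical helper)."""
    return {
        "gen_ai.operation.name": "apply_guardrail",
        "gen_ai.guardian.id": guardian_id,
        # Assumed key names inside the decision.* and target.* namespaces.
        "gen_ai.security.decision.type": decision,
        "gen_ai.security.target.type": target_type,
        "gen_ai.security.policy.id": policy_id,
    }


def finding_event(risk_category: str, policy_id: str) -> dict[str, Any]:
    """Build one gen_ai.security.finding event (hypothetical helper)."""
    return {
        "name": "gen_ai.security.finding",
        "attributes": {
            "gen_ai.security.risk.category": risk_category,
            "gen_ai.security.policy.id": policy_id,
        },
    }


# Example values are made up for illustration.
span_attrs = guardian_span_attributes("content-filter-v2", "deny", "llm_input", "policy-001")
finding = finding_event("prompt_injection", "policy-001")
```

In a real instrumentation the dictionary would be passed as span attributes and each finding recorded as an event under that span.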

References

Prototypes / Instrumentation Links

The prototype demonstrates all proposed conventions with runnable story scenarios:

Quick Start

cd prototype
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements-appinsights.txt
python -m stories.story_runner --all --exporters console

Key Files

| File | Purpose |
| --- | --- |
| prototype/README.md | Full documentation and quickstart |
| prototype/otel_guardian_utils.py | Core span/event emission utilities |
| prototype/stories/ | Runnable story scenarios |
| prototype/frameworks/ | Framework adapters (LangChain, MCP) |

Story Scenarios

| Story | Scenario | Key Conventions |
| --- | --- | --- |
| 4 | Enterprise RAG Access Control | knowledge_query, knowledge_result, memory_* targets |
| 5 | Multi-Tenant SaaS | llm_input, llm_output, modify decision, tenant.id |
| 7 | Multi-Agent Orchestrator | tool_definition, tool_call, gen_ai.agent.id nesting |
| 10 | Progressive Jailbreak | gen_ai.conversation.id correlation, cumulative risk |
| 11 | Guardian Error Handling | error.type attribute, fail-open/fail-closed |
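Story 11's fail-open/fail-closed behaviour can be sketched roughly as follows. This is not the prototype's code: `evaluate_with_fallback`, the `decision` key, and the choice of the exception class name as the `error.type` value are assumptions made for the example.

```python
from typing import Callable


def evaluate_with_fallback(check: Callable[[str], str], content: str,
                           fail_open: bool) -> dict[str, str]:
    """Run a guardian check; if the guardian itself fails, fall back to a policy.

    fail_open=True  -> the request is allowed when the guardian errors.
    fail_open=False -> the request is denied when the guardian errors.
    """
    try:
        return {"decision": check(content)}
    except Exception as exc:
        # Record the failure the way a span would carry it in error.type.
        return {
            "decision": "allow" if fail_open else "deny",
            "error.type": type(exc).__qualname__,
        }


def unavailable_guardian(_content: str) -> str:
    """Stand-in for a guardian call that times out."""
    raise TimeoutError("guardian unavailable")


open_result = evaluate_with_fallback(unavailable_guardian, "hello", fail_open=True)
closed_result = evaluate_with_fallback(unavailable_guardian, "hello", fail_open=False)
```

Recording `error.type` alongside the fallback decision lets a trace distinguish "allowed because the content was safe" from "allowed because the guardian was down".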

Semantic Convention Coverage

  • Decision Types: allow, deny, modify, warn, audit
  • Target Types: llm_input, llm_output, tool_call, tool_definition, message, memory_store, memory_retrieve, knowledge_query, knowledge_result
  • Risk Categories: OWASP LLM Top 10 2025 aligned (prompt_injection, sensitive_info_disclosure, excessive_agency, etc.)
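As a rough illustration, the decision and target values listed above could be checked before a span is emitted. The value sets are copied from this list; the `validate_evaluation` helper is hypothetical. Risk category is deliberately not validated, since the PR describes it as a free-form string.

```python
DECISION_TYPES = frozenset({"allow", "deny", "modify", "warn", "audit"})

TARGET_TYPES = frozenset({
    "llm_input", "llm_output", "tool_call", "tool_definition", "message",
    "memory_store", "memory_retrieve", "knowledge_query", "knowledge_result",
})


def validate_evaluation(decision: str, target_type: str) -> list[str]:
    """Return a list of problems; an empty list means both values are known."""
    problems = []
    if decision not in DECISION_TYPES:
        problems.append(f"unknown decision type: {decision!r}")
    if target_type not in TARGET_TYPES:
        problems.append(f"unknown target type: {target_type!r}")
    return problems
```

For example, `validate_evaluation("deny", "tool_call")` returns an empty list, while an unlisted decision such as `"block"` produces one problem entry.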

Tests

  • make table-generation registry-generation
  • make markdown-toc
  • make SED=sed check-policies (macOS note: the repo defaults to gsed)

Changelog

This is user-facing (new conventions). Add a .chloggen/*.yaml entry with component: gen-ai, or apply the "Skip Changelog" label if maintainers agree it's not required for this proposal stage.

@github-actions github-actions bot added enhancement New feature or request area:gen-ai labels Dec 23, 2025
@adityamehra

@nagkumar91 We have a similar use case: when a security incident happens for a chat span, we create a new chat span and add an attribute called gen_ai.security.event_id. The value can come either from the response body of the inspection call, when it is separate from the actual LLM call, or from the response header, when inspection happens along with the LLM call. Is it possible to add support for this attribute here? Thanks!

Here's the sample of how we add it as of now - https://github.com/signalfx/splunk-otel-python-contrib/tree/main/instrumentation-genai/opentelemetry-instrumentation-aidefense#trace-integration

@adityamehra

Also, it would be great to have another entry in otel-genai-util for this new span type, like the one we have for the chat span - https://github.com/open-telemetry/opentelemetry-python-contrib/blob/main/util/opentelemetry-util-genai/src/opentelemetry/util/genai/types.py#L96. Alternatively, this new type could extend the LLMInvocation type.

`span.gen_ai.inference.client` or `span.gen_ai.execute_tool.internal`).

Multiple guardian spans MAY exist under a single operation span if multiple guardians are chained.
attributes:

@adityamehra adityamehra Jan 8, 2026

Can we add an attribute called gen_ai.security.event_id?

Author

The event being proposed would be a gen_ai.security.finding

The apply_guardrail span will have these ID attributes:

  • gen_ai.guardian.id
  • gen_ai.security.target.id
  • gen_ai.security.policy.id

Would any of those fit your need?

Would this be better as an event (security finding event as proposed in this span)? Wondering why it's a chat span?

The domain team currently uses event_id, per their requirements. In our case the event is also generated elsewhere, not on the instrumentation side. For now we used a chat span so we could reuse the existing types in genai-utils and manage the span life cycle with them.

@nagkumar91
Author

@nagkumar91 We have a similar use case: when a security incident happens for a chat span, we create a new chat span and add an attribute called gen_ai.security.event_id. The value can come either from the response body of the inspection call, when it is separate from the actual LLM call, or from the response header, when inspection happens along with the LLM call. Is it possible to add support for this attribute here? Thanks!

Here's the sample of how we add it as of now - https://github.com/signalfx/splunk-otel-python-contrib/tree/main/instrumentation-genai/opentelemetry-instrumentation-aidefense#trace-integration

Would this be better as an event (security finding event as proposed in this span)? Wondering why it's a chat span?

| <a id="gen-ai-operation-name" href="#gen-ai-operation-name">`gen_ai.operation.name`</a> | ![Development](https://img.shields.io/badge/-development-blue) | string | The name of the operation being performed. [4] | `chat`; `generate_content`; `text_completion` |
| <a id="gen-ai-output-messages" href="#gen-ai-output-messages">`gen_ai.output.messages`</a> | ![Development](https://img.shields.io/badge/-development-blue) | any | Messages returned by the model where each message represents a specific model response (choice, candidate). [5] | [<br>&nbsp;&nbsp;{<br>&nbsp;&nbsp;&nbsp;&nbsp;"role": "assistant",<br>&nbsp;&nbsp;&nbsp;&nbsp;"parts": [<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"type": "text",<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"content": "The weather in Paris is currently rainy with a temperature of 57°F."<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br>&nbsp;&nbsp;&nbsp;&nbsp;],<br>&nbsp;&nbsp;&nbsp;&nbsp;"finish_reason": "stop"<br>&nbsp;&nbsp;}<br>] |
| <a id="gen-ai-output-type" href="#gen-ai-output-type">`gen_ai.output.type`</a> | ![Development](https://img.shields.io/badge/-development-blue) | string | Represents the content type requested by the client. [6] | `text`; `json`; `image` |
| <a id="gen-ai-guardian-id" href="#gen-ai-guardian-id">`gen_ai.guardian.id`</a> | ![Development](https://img.shields.io/badge/-development-blue) | string | The unique identifier of the security guardian or guardrail. [3] | `guard_abc123`; `sgi5gkybzqak`; `content-filter-v2` |

One nit: the draft currently mixes guardian (evaluator/service) and guardrail (policy/config) identity (e.g., mapping provider guardrail IDs into gen_ai.guardian.id). I suggest keeping gen_ai.guardian.* for the evaluating component and mapping guardrail/config identifiers (like aws.bedrock.guardrail.id) to gen_ai.security.policy.id (and/or setting them on gen_ai.security.finding when policy-triggered). Otherwise it becomes hard or impossible to tell a guardian from a guardrail in traces, and we end up with confusing cardinality and semantics. Keeping them separate keeps cross-provider correlation clean.

Author

@nagkumar91 nagkumar91 Jan 26, 2026

Thanks @habibam! I've addressed this in the commit (88c269b):

  • Updated gen_ai.guardian.* attribute documentation to clarify these identify the evaluating service/component, not the policy
  • Added explicit mapping guidance:
    • Guardian = service doing evaluation (e.g., "Azure Content Safety", "Bedrock Guardrails")
    • Policy = configuration being applied (use gen_ai.security.policy.id for ARNs, blocklist IDs, etc.)
  • Updated examples to be clearer

This should make cross-provider correlation cleaner by keeping guardian and guardrail/policy semantics distinct. Let me know if the updated documentation captures your concern!
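A small sketch of the guardian/policy split described in this thread, assuming the evaluating service goes into gen_ai.guardian.id and the provider's guardrail identifier goes into gen_ai.security.policy.id. The function name, the "bedrock-guardrails" identifier, and the ARN are invented for illustration and are not taken from the registry.

```python
def bedrock_guardrail_attributes(guardrail_arn: str) -> dict[str, str]:
    """Map a Bedrock guardrail to guardian vs policy attributes (hypothetical helper)."""
    return {
        # The evaluating service/component (assumed identifier value).
        "gen_ai.guardian.id": "bedrock-guardrails",
        # The configuration being applied, e.g. the guardrail ARN.
        "gen_ai.security.policy.id": guardrail_arn,
    }


# Made-up ARN used only as an example value.
attrs = bedrock_guardrail_attributes(
    "arn:aws:bedrock:us-east-1:123456789012:guardrail/example"
)
```

With this split, two different guardrail configurations evaluated by the same service share one guardian identifier and differ only in the policy identifier.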

# MCP Guardian Adapter
# ============================================================================

class MCPGuardianAdapter(BaseGuardianAdapter[MCPContext]):

We should add an example of generating a span from an elicitation response.

Author

Added in commit 88c269b!

New elicitation guard methods:

  • guard_elicitation_request() - Guards outbound requests to user (prevents info leakage, excessive elicitation)
  • guard_elicitation_response() - Guards user's response (detects PII, injection attempts)

Both map to target_type=message since they're user-facing interactions.

See the updated module docstring and main examples (tests 6-9) for usage patterns.

…ools

- Remove redundant files from git (kept locally): genai_guardrail_instrumentation_prototype.py, demo_chat.py, demo_tools.py
- Reduce framework adapters from 6 to 2 (keep LangChain + MCP, others preserved locally)
- Move trace viewer utilities to tools/ directory
- Consolidate README from 750+ lines to ~165 lines
- Update .gitignore to ignore archived files
…CP elicitation

- Clarify guardian vs guardrail semantics in registry.yaml:
  - gen_ai.guardian.* is for the evaluating service/component
  - gen_ai.security.policy.* is for configuration/policy identifiers
  - Added mapping guidance for AWS Bedrock, Azure Content Safety
- Add gen_ai.security.external_event_id attribute for SIEM correlation
- Add MCP elicitation guard methods:
  - guard_elicitation_request: guard outbound requests to user
  - guard_elicitation_response: guard user's input responses
- Regenerate markdown files from updated YAML

Addresses feedback from @habibam and @adityamehra
@nagkumar91
Author

Thanks for the feedback @adityamehra!

Regarding the otel-genai-util entry - this is tracked separately since it lives in opentelemetry-python-contrib. Once this semconv PR is approved and merged, I'll open a follow-up PR to add the corresponding types for apply_guardrail spans to the util package.

In the meantime, I've addressed your other feedback:

  • Added gen_ai.security.external_event_id attribute for correlation with external security systems (SIEM, incident management). This should help with the use case you described where the event ID comes from an external inspection service response.

Let me know if the new attribute meets your needs or if you'd like any adjustments!
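For the use case described earlier in the thread, where the event ID arrives either in the inspection response body or in a response header, a sketch of populating gen_ai.security.external_event_id might look like this. The `event_id` body field and the `x-security-event-id` header are hypothetical names, not part of any real inspection API.

```python
from typing import Mapping


def external_event_attributes(body: Mapping[str, object],
                              headers: Mapping[str, str]) -> dict[str, str]:
    """Pick the external event ID from the body first, then from the headers."""
    # Both lookup keys are assumptions made for this example.
    event_id = body.get("event_id") or headers.get("x-security-event-id")
    if not event_id:
        # No external event was reported, so no attribute is set.
        return {}
    return {"gen_ai.security.external_event_id": str(event_id)}
```

The returned dictionary can be merged into the guardian span's attributes, and it stays empty when the inspection service reported no event.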

The advanced_agent_security_plan.md is a working document, not part of
the spec. The relevant information is now in prototype/README.md and
docs/gen-ai/gen-ai-security.md.
@nagkumar91 nagkumar91 marked this pull request as ready for review January 26, 2026 16:30
@nagkumar91 nagkumar91 requested review from a team as code owners January 26, 2026 16:30
The trace viewer utilities are Azure-specific and add complexity
without being essential for validating the spec. They remain
available locally for development use.
@nagkumar91
Author

Non-normative implementation guide (PR branch):
https://github.com/nagkumar91/semantic-conventions/blob/gen-ai-security-guardian/docs/gen-ai/non-normative/security_implementation_gen_ai_spec.md

Note on scope:

  • All files under prototype/ are included for reference and to help understand/simulate the spans (stories, adapters, viewer).
  • These prototype/ files are intended to be removed before merge.

Labels

area:gen-ai enhancement New feature or request

3 participants