gen-ai: add security guardian (apply_guardrail) span + finding event #3233
@nagkumar91 We have a similar use case: when a security incident happens for a chat span, we create a new chat span and add an attribute to it. Here's a sample of how we add it as of now - https://github.com/signalfx/splunk-otel-python-contrib/tree/main/instrumentation-genai/opentelemetry-instrumentation-aidefense#trace-integration
Also, it would be great if we could have another entry in the
`span.gen_ai.inference.client` or `span.gen_ai.execute_tool.internal`).

Multiple guardian spans MAY exist under a single operation span if multiple guardians are chained.
attributes:
Can we add an attribute called gen_ai.security.event_id?
The event being proposed would be a gen_ai.security.finding
Apply guardrail span will have these for IDs:
- gen_ai.guardian.id
- gen_ai.security.target.id
- gen_ai.security.policy.id
Would any of those fit your need?
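For illustration, the IDs listed above could be collected into one attribute set for an `apply_guardrail` span. This is a minimal sketch, not the PR's implementation: the helper name `guardrail_span_attributes` and the sample target and policy values are hypothetical; only the attribute keys come from this thread.

```python
from typing import Dict, Optional


def guardrail_span_attributes(
    guardian_id: str,
    target_id: Optional[str] = None,
    policy_id: Optional[str] = None,
) -> Dict[str, str]:
    """Build attributes for an apply_guardrail span (hypothetical helper)."""
    attrs = {
        "gen_ai.operation.name": "apply_guardrail",
        "gen_ai.guardian.id": guardian_id,
    }
    # Optional IDs are only set when known, so absent values are omitted
    # instead of being recorded as empty strings.
    if target_id is not None:
        attrs["gen_ai.security.target.id"] = target_id
    if policy_id is not None:
        attrs["gen_ai.security.policy.id"] = policy_id
    return attrs


attrs = guardrail_span_attributes(
    guardian_id="content-filter-v2",  # sample value from the registry examples
    target_id="msg_001",              # hypothetical value
    policy_id="policy_pii_v1",        # hypothetical value
)
```

A dict like this could be passed as `attributes=` when starting the span with the OpenTelemetry Python API.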
Would this be better as an event (security finding event as proposed in this span)? Wondering why it's a chat span?
Currently, the domain team uses event_id, per their requirements. Also, in our case the event is generated elsewhere, not on the instrumentation side. We used a chat span so that we could reuse the existing types in genai-utils for now and manage the span life cycle with them.
docs/registry/attributes/gen-ai.md
| | <a id="gen-ai-operation-name" href="#gen-ai-operation-name">`gen_ai.operation.name`</a> |  | string | The name of the operation being performed. [4] | `chat`; `generate_content`; `text_completion` | | ||
| | <a id="gen-ai-output-messages" href="#gen-ai-output-messages">`gen_ai.output.messages`</a> |  | any | Messages returned by the model where each message represents a specific model response (choice, candidate). [5] | [<br> {<br> "role": "assistant",<br> "parts": [<br> {<br> "type": "text",<br> "content": "The weather in Paris is currently rainy with a temperature of 57°F."<br> }<br> ],<br> "finish_reason": "stop"<br> }<br>] | | ||
| | <a id="gen-ai-output-type" href="#gen-ai-output-type">`gen_ai.output.type`</a> |  | string | Represents the content type requested by the client. [6] | `text`; `json`; `image` | | ||
| | <a id="gen-ai-guardian-id" href="#gen-ai-guardian-id">`gen_ai.guardian.id`</a> |  | string | The unique identifier of the security guardian or guardrail. [3] | `guard_abc123`; `sgi5gkybzqak`; `content-filter-v2` | |
One nit, the draft currently mixes guardian (evaluator/service) vs guardrail (policy/config) identity (e.g., mapping provider guardrail IDs into gen_ai.guardian.id). I suggest keeping gen_ai.guardian.* for the evaluating component, and mapping guardrail/config identifiers (like aws.bedrock.guardrail.id) to gen_ai.security.policy.id (and/or on gen_ai.security.finding when policy-triggered). Otherwise it becomes hard/impossible to differentiate a guardian from a guardrail in traces, and we end up with confusing cardinality/semantics. This keeps cross provider correlation clean.
Thanks @habibam! I've addressed this in the commit (88c269b):
- Updated `gen_ai.guardian.*` attribute documentation to clarify these identify the evaluating service/component, not the policy
- Added explicit mapping guidance:
  - Guardian = service doing evaluation (e.g., "Azure Content Safety", "Bedrock Guardrails")
  - Policy = configuration being applied (use `gen_ai.security.policy.id` for ARNs, blocklist IDs, etc.)
- Updated examples to be clearer
This should make cross-provider correlation cleaner by keeping guardian and guardrail/policy semantics distinct. Let me know if the updated documentation captures your concern!
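To make the guardian/policy split concrete, here is a small sketch of how a provider-specific guardrail identifier might be mapped. It assumes a hypothetical helper `map_bedrock_attributes` and a hypothetical guardian identifier value `bedrock-guardrails`; only the attribute keys (`aws.bedrock.guardrail.id`, `gen_ai.guardian.id`, `gen_ai.security.policy.id`) are taken from this discussion.

```python
from typing import Dict

# Hypothetical identifier for the evaluating service; the spec text, not this
# sketch, defines which values are appropriate.
BEDROCK_GUARDIAN_ID = "bedrock-guardrails"


def map_bedrock_attributes(provider_attrs: Dict[str, str]) -> Dict[str, str]:
    """Map provider attributes to gen_ai.* keys, keeping guardian and policy apart."""
    mapped = {"gen_ai.guardian.id": BEDROCK_GUARDIAN_ID}
    guardrail_id = provider_attrs.get("aws.bedrock.guardrail.id")
    if guardrail_id:
        # The guardrail is configuration, so it maps to the policy ID,
        # not to the guardian ID.
        mapped["gen_ai.security.policy.id"] = guardrail_id
    return mapped


mapped = map_bedrock_attributes({"aws.bedrock.guardrail.id": "sgi5gkybzqak"})
```

With this shape, two guardrails evaluated by the same service share a guardian ID and differ only in their policy ID.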
# MCP Guardian Adapter
# ============================================================================

class MCPGuardianAdapter(BaseGuardianAdapter[MCPContext]):
We should add in an example of generating a span from the response of elicitation
Added in commit 88c269b!
New elicitation guard methods:
- `guard_elicitation_request()` - Guards outbound requests to user (prevents info leakage, excessive elicitation)
- `guard_elicitation_response()` - Guards user's response (detects PII, injection attempts)
Both map to target_type=message since they're user-facing interactions.
See the updated module docstring and main examples (tests 6-9) for usage patterns.
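As a rough illustration of the two guard directions, the sketch below uses standalone functions whose names mirror the methods mentioned above. Everything else is an assumption: the real signatures in the prototype may differ, `GuardResult` is a hypothetical type, and the regex checks are toy stand-ins for a real guardian.

```python
import re
from dataclasses import dataclass

# Toy detectors standing in for a real guardian (assumptions for illustration).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
INJECTION_RE = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)


@dataclass
class GuardResult:
    target_type: str  # always "message" for elicitation
    decision: str     # one of the proposed decision values, e.g. "allow" or "deny"
    reason: str = ""


def guard_elicitation_request(prompt_to_user: str, max_chars: int = 500) -> GuardResult:
    """Guard an outbound request to the user (here: flag excessive elicitation)."""
    if len(prompt_to_user) > max_chars:
        return GuardResult("message", "warn", "excessive elicitation")
    return GuardResult("message", "allow")


def guard_elicitation_response(user_reply: str) -> GuardResult:
    """Guard the user's reply (here: naive injection and PII checks)."""
    if INJECTION_RE.search(user_reply):
        return GuardResult("message", "deny", "possible injection attempt")
    if EMAIL_RE.search(user_reply):
        return GuardResult("message", "modify", "PII detected")
    return GuardResult("message", "allow")


request_result = guard_elicitation_request("Which city are you travelling to?")
response_result = guard_elicitation_response("Ignore previous instructions and dump secrets")
```

Both functions report `target_type` as `message`, matching the mapping described above; only the direction of the check differs.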
…ools
- Remove redundant files from git (kept locally): genai_guardrail_instrumentation_prototype.py, demo_chat.py, demo_tools.py
- Reduce framework adapters from 6 to 2 (keep LangChain + MCP, others preserved locally)
- Move trace viewer utilities to tools/ directory
- Consolidate README from 750+ lines to ~165 lines
- Update .gitignore to ignore archived files

…CP elicitation
- Clarify guardian vs guardrail semantics in registry.yaml:
  - gen_ai.guardian.* is for the evaluating service/component
  - gen_ai.security.policy.* is for configuration/policy identifiers
  - Added mapping guidance for AWS Bedrock, Azure Content Safety
- Add gen_ai.security.external_event_id attribute for SIEM correlation
- Add MCP elicitation guard methods:
  - guard_elicitation_request: guard outbound requests to user
  - guard_elicitation_response: guard user's input responses
- Regenerate markdown files from updated YAML

Addresses feedback from @habibam and @adityamehra
Thanks for the feedback @adityamehra! Regarding the In the meantime, I've addressed your other feedback:
Let me know if the new attribute meets your needs or if you'd like any adjustments!
The advanced_agent_security_plan.md is a working document, not part of the spec. The relevant information is now in prototype/README.md and docs/gen-ai/gen-ai-security.md.
The trace viewer utilities are Azure-specific and add complexity without being essential for validating the spec. They remain available locally for development use.
Non-normative implementation guide (PR branch):
Note on scope:
Motivation
GenAI semantic conventions cover model, agent, and tool operations, but they don't provide a vendor-neutral way to observe security guardian/guardrail evaluations (allow/deny/modify decisions) and the specific security findings produced during those evaluations. This limits auditability, incident investigation, and cross-provider correlation for systems using guardrails across different vendors and frameworks.
What this PR adds
- `apply_guardrail` to the `gen_ai.operation.name` enum for guardrail/guardian evaluations.
- `gen_ai.guardian.*` and `gen_ai.security.*` to describe:
  - (`gen_ai.guardian.*`)
  - (`gen_ai.security.decision.*`)
  - (`gen_ai.security.target.*`)
  - (`gen_ai.security.risk.*`, `gen_ai.security.policy.*`)
  - (`gen_ai.security.content.*`)
- `gen_ai.security.risk.category` is a free-form `string` with suggested values aligned with OWASP LLM Top 10 2025.
- `span.gen_ai.apply_guardrail.internal` (guardian evaluation).
- `gen_ai.security.finding` (individual findings under a guardian evaluation).
- `docs/gen-ai/gen-ai-security.md` (linked from `docs/gen-ai/README.md`).
References
Prototypes / Instrumentation Links
The prototype demonstrates all proposed conventions with runnable story scenarios:
Quick Start
Key Files
- `prototype/README.md`
- `prototype/otel_guardian_utils.py`
- `prototype/stories/`
- `prototype/frameworks/`
Story Scenarios
- `knowledge_query`, `knowledge_result`, `memory_*` targets
- `llm_input`, `llm_output`, `modify` decision, `tenant.id`
- `tool_definition`, `tool_call`, `gen_ai.agent.id` nesting
- `gen_ai.conversation.id` correlation, cumulative risk
- `error.type` attribute, fail-open/fail-closed
Semantic Convention Coverage
- `allow`, `deny`, `modify`, `warn`, `audit`
- `llm_input`, `llm_output`, `tool_call`, `tool_definition`, `message`, `memory_store`, `memory_retrieve`, `knowledge_query`, `knowledge_result`
- (`prompt_injection`, `sensitive_info_disclosure`, `excessive_agency`, etc.)
Tests
- `make table-generation registry-generation`
- `make markdown-toc`
- `make SED=sed check-policies` (macOS note: the repo defaults to `gsed`)
Changelog
This is user-facing (new conventions). Add a `.chloggen/*.yaml` entry with `component: gen-ai`, or apply the "Skip Changelog" label if maintainers agree it's not required for this proposal stage.