AI Product Analytics Event Schema Specification #6476

gitcommitshow · 2025-10-29T03:07:46Z

gitcommitshow
Oct 29, 2025
Maintainer

Version: 0.1.0-draft
Status: Request for Comments
Authors: Sumanth Puram
Created: October 2025
Comment Period: Oct 29, 2025 - Nov 29, 2025

Abstract

This specification defines a standardized event schema for tracking and analyzing AI-powered features such as chatbots, assistants, and copilots. The schema defines three core events that capture AI interactions, along with patterns for integrating those events into existing analytics infrastructure. The goal is an industry-wide standard for AI product analytics that enables consistent measurement and cross-product comparison.

1. Introduction

1.1 Purpose

This specification standardizes the collection and structure of analytics events for AI-powered features to:

Enable consistent measurement across different AI products and organizations
Facilitate integration with existing data warehouses and customer data platforms
Support cross-product benchmarking and industry insights

1.2 Scope

This specification covers:

Event schemas for AI interaction tracking
Integration with analytics platforms

This specification does NOT cover:

LLM API specifications
Model training data formats
Real-time streaming protocols
Authentication and authorization
Implementation details (covered in separate documentation)

1.3 Terminology

Conversation: A distinct thread of AI interactions identified by a unique conversation_id
Session: A period of user activity in an application, which may contain multiple conversations
Prompt: User input submitted to an AI system
Response: AI system output returned to the user

2. Conformance

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

An implementation is conformant if it:

Implements all REQUIRED events and properties
Uses the specified event names exactly as defined
Maintains conversation_id consistency across related events
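
The first conformance criterion can be checked mechanically. The following is a minimal sketch, not a normative validator: the required property sets are transcribed from Sections 3.1.1, 3.2.1, and 3.3.1, while the `missing_required` helper and the `{"event": ..., "properties": ...}` envelope are assumptions made for illustration, since this specification does not define an envelope.

```python
# Required properties per event, from Sections 3.1.1, 3.2.1, and 3.3.1.
REQUIRED_PROPERTIES = {
    "ai_user_prompt_created": {"conversation_id"},
    "ai_llm_response_received": {"conversation_id", "response_status"},
    "ai_user_action": {"conversation_id", "action_type"},
}


def missing_required(event: dict) -> set[str]:
    """Return the REQUIRED property names absent from an event.

    Assumes a hypothetical envelope {"event": <name>, "properties": {...}}.
    Raises ValueError when the event name is not one the spec defines,
    because event names must be used exactly as written.
    """
    name = event.get("event")
    if name not in REQUIRED_PROPERTIES:
        raise ValueError(f"unknown event name: {name!r}")
    return REQUIRED_PROPERTIES[name] - set(event.get("properties", {}))
```

An empty result means every required property is present; it says nothing about conversation_id consistency across events, which needs a check over the whole conversation.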

3. Core Event Schemas

3.1 Event: `ai_user_prompt_created`

Purpose: Tracks when a user submits a prompt to an AI system.

Trigger: MUST be fired immediately after the user submits their prompt.

3.1.1 Required Properties

| Property | Type | Description |
| --- | --- | --- |
| `conversation_id` | string | Unique identifier for the conversation thread. MUST be consistent across all events in the same conversation. |

3.1.2 Core Properties

| Property | Type | Description | Example Values |
| --- | --- | --- | --- |
| `prompt_text` | string | The actual user input | "What are your return policies?" |
| `input_method` | string | Method used to submit the prompt | `text`, `voice`, `button` |
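
As an illustration, a payload for this event might be assembled as follows. This is a sketch under assumptions: `build_prompt_event` is a hypothetical helper, and the outer `event`/`properties` envelope is borrowed from common analytics SDK conventions rather than defined by this specification. Only the property names come from the tables above.

```python
import json


def build_prompt_event(
    conversation_id: str, prompt_text: str, input_method: str = "text"
) -> dict:
    """Build an ai_user_prompt_created event (hypothetical helper)."""
    return {
        "event": "ai_user_prompt_created",
        "properties": {
            "conversation_id": conversation_id,  # REQUIRED (Section 3.1.1)
            "prompt_text": prompt_text,          # core property
            "input_method": input_method,        # core property
        },
    }


event = build_prompt_event("conv-123", "What are your return policies?")
print(json.dumps(event, indent=2))
```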

3.2 Event: `ai_llm_response_received`

Purpose: Captures AI system responses and performance metrics.

Trigger: MUST be fired after a response is received from the AI system.

3.2.1 Required Properties

| Property | Type | Description |
| --- | --- | --- |
| `conversation_id` | string | MUST match the conversation_id from the corresponding prompt |
| `response_status` | string | Status of the response: `success`, `error`, `timeout` |

3.2.2 Core Properties

| Property | Type | Description | Example |
| --- | --- | --- | --- |
| `response_text` | string | The AI response content | |
| `latency_ms` | integer | Time from request to response in milliseconds | 1250 |
| `model_used` | string | Identifier of the AI model | "gpt-4" |
| `cost` | number | Cost of the API call | 0.021 |

3.2.3 Token Count Properties

| Property | Type | Description |
| --- | --- | --- |
| `token_count` | object | Token usage metrics (optional) |
| `token_count.prompt_tokens` | integer | Number of tokens in prompt |
| `token_count.completion_tokens` | integer | Number of tokens in completion |
| `token_count.total_tokens` | integer | Total tokens used |
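
One way to populate `response_status`, `latency_ms`, and `token_count` together is to time the model call and map its outcome to a status. The sketch below makes several assumptions that are not part of this specification: `call_model` is a hypothetical callable returning `(response_text, prompt_tokens, completion_tokens)`, a Python `TimeoutError` stands in for whatever timeout signal a real client raises, `total_tokens` is taken to be the sum of the other two counts, and the `event`/`properties` envelope is illustrative.

```python
import time
from typing import Callable


def timed_response_event(
    conversation_id: str,
    call_model: Callable[[], tuple[str, int, int]],
) -> dict:
    """Call the model once and describe the outcome as an event."""
    start = time.monotonic()
    result = None
    try:
        result = call_model()
        status = "success"
    except TimeoutError:
        status = "timeout"
    except Exception:
        status = "error"
    # latency_ms is an integer number of milliseconds (Section 5.2).
    latency_ms = int((time.monotonic() - start) * 1000)

    properties = {
        "conversation_id": conversation_id,
        "response_status": status,
        "latency_ms": latency_ms,
    }
    if result is not None:
        response_text, prompt_tokens, completion_tokens = result
        properties["response_text"] = response_text
        properties["token_count"] = {
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            # Assumption: total is the sum; the spec does not define it.
            "total_tokens": prompt_tokens + completion_tokens,
        }
    return {"event": "ai_llm_response_received", "properties": properties}
```

Using a monotonic clock keeps the latency measurement unaffected by wall-clock adjustments during the call.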

3.3 Event: `ai_user_action`

Purpose: Tracks user interactions with AI responses.

Trigger: MUST be fired when the user performs an action on an AI response.

3.3.1 Required Properties

| Property | Type | Description |
| --- | --- | --- |
| `conversation_id` | string | MUST match the conversation_id from related events |
| `action_type` | string | Type of action performed |

3.3.2 Action Types

| Action Type | Description |
| --- | --- |
| `feedback_given` | User provided feedback on response |
| `shared` | User shared the AI response |
| `reported` | User reported an issue with response |
| `copied_response` | User copied the AI response |
| `regenerated` | User requested a new response |

3.3.3 Action Details

| Property | Type | Description |
| --- | --- | --- |
| `action_details` | object | Additional properties specific to the action |

When `action_type` is `feedback_given`:

| Property | Type | Description | Example |
| --- | --- | --- | --- |
| `action_details.feedback_type` | string | Type of feedback mechanism | `rating`, `thumbs` |
| `action_details.feedback_value` | varied | The feedback value | 4, true |
| `action_details.feedback_text` | string | Optional text feedback | "Helpful but could be more specific" |
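
A feedback action might be built as shown below. This is a sketch, not a reference implementation: `build_action_event` and the `event`/`properties` envelope are hypothetical, and rejecting unlisted action types treats the Section 3.3.2 list as closed, which this specification does not state.

```python
from typing import Optional

# Action types listed in Section 3.3.2.
ACTION_TYPES = {
    "feedback_given",
    "shared",
    "reported",
    "copied_response",
    "regenerated",
}


def build_action_event(
    conversation_id: str,
    action_type: str,
    action_details: Optional[dict] = None,
) -> dict:
    """Build an ai_user_action event (hypothetical helper)."""
    if action_type not in ACTION_TYPES:
        # Assumption: the listed action types are the only valid ones.
        raise ValueError(f"unsupported action_type: {action_type!r}")
    properties = {
        "conversation_id": conversation_id,  # REQUIRED
        "action_type": action_type,          # REQUIRED
    }
    if action_details is not None:
        properties["action_details"] = action_details
    return {"event": "ai_user_action", "properties": properties}


feedback = build_action_event(
    "conv-123",
    "feedback_given",
    {
        "feedback_type": "rating",
        "feedback_value": 4,
        "feedback_text": "Helpful but could be more specific",
    },
)
```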

4. Conversation Management

4.1 Conversation ID Generation

conversation_id MUST be unique within the implementing system
conversation_id MUST remain consistent across all events in a conversation
Implementations SHOULD generate a new conversation_id when the conversation context is reset
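
These rules can be sketched with a small tracker that hands out one ID per conversation and replaces it on reset. The `ConversationTracker` class is hypothetical, and the choice of UUID version 4 is an assumption: this specification requires uniqueness within the implementing system but does not mandate an ID format.

```python
import uuid


class ConversationTracker:
    """Holds the conversation_id for the active conversation."""

    def __init__(self) -> None:
        self._conversation_id = self._new_id()

    @staticmethod
    def _new_id() -> str:
        # Assumption: random UUIDs are unique enough for the implementing system.
        return str(uuid.uuid4())

    @property
    def conversation_id(self) -> str:
        """The same ID is returned for every event until reset() is called."""
        return self._conversation_id

    def reset(self) -> str:
        """Start a new conversation, e.g. when the user clears the context."""
        self._conversation_id = self._new_id()
        return self._conversation_id
```

Every event in a conversation reads `tracker.conversation_id`; calling `tracker.reset()` when the context is cleared gives subsequent events a fresh ID.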

4.2 Conversation Lifecycle

[Start] → ai_user_prompt_created → ai_llm_response_received → ai_user_action* → [End]
         ↑                                                                      ↓
         └──────────────── (multiple turns) ───────────────────────────────────┘
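
Read literally, the diagram says each turn starts with a prompt, continues with a response, and may be followed by any number of user actions before the next prompt. The sketch below checks an ordered list of event names against that literal reading. It is an illustration only: `follows_lifecycle` is a hypothetical helper, it accepts conversations that are still in progress, and real pipelines may deliver events late or out of order, which this check does not account for.

```python
PROMPT = "ai_user_prompt_created"
RESPONSE = "ai_llm_response_received"
ACTION = "ai_user_action"


def follows_lifecycle(event_names: list[str]) -> bool:
    """Check event order for one conversation against the diagram.

    Returns True for an empty or in-progress conversation whose events
    so far are in a permitted order.
    """
    allowed_next = {PROMPT}  # a conversation starts with a prompt
    for name in event_names:
        if name not in allowed_next:
            return False
        if name == PROMPT:
            allowed_next = {RESPONSE}
        else:
            # After a response or an action: more actions, or a new turn.
            allowed_next = {ACTION, PROMPT}
    return True
```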

4.3 Session vs Conversation

A session MAY contain multiple conversations
Each conversation MUST have a unique conversation_id
Sessions SHOULD be tracked using existing analytics session tracking
Conversations MAY span multiple sessions in persistent chat interfaces

5. Data Types and Formats

5.1 String Encoding

All string values MUST use UTF-8 encoding.

5.2 Numeric Values

Latency values MUST be in milliseconds (integer)
Cost values SHOULD be in base currency units (e.g., dollars, not cents)
Token counts MUST be non-negative integers
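
The two MUST-level rules above can be checked before an event is sent. The following is a minimal sketch with a hypothetical `numeric_violations` helper that takes an event's properties as a plain dict. It does not check `cost`, because the currency-unit rule is a SHOULD and cannot be verified from the value alone.

```python
def _is_int(value: object) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def numeric_violations(properties: dict) -> list[str]:
    """Return human-readable problems with numeric fields (empty if none)."""
    problems = []
    latency = properties.get("latency_ms")
    if latency is not None and not _is_int(latency):
        problems.append("latency_ms must be an integer number of milliseconds")
    for key, value in properties.get("token_count", {}).items():
        if not _is_int(value) or value < 0:
            problems.append(f"token_count.{key} must be a non-negative integer")
    return problems
```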

6. Privacy Considerations

6.1 Personal Data

Implementations MUST comply with applicable privacy regulations (GDPR, CCPA, etc.)
Raw prompt and response text SHOULD be classified as personal data
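
One possible consequence of treating raw text as personal data is to strip it before events reach destinations that are not cleared to hold it. The sketch below shows that idea only; the `without_raw_text` helper is hypothetical, and this specification does not prescribe any particular redaction mechanism.

```python
# Properties that carry raw prompt or response text (Sections 3.1.2 and 3.2.2).
RAW_TEXT_PROPERTIES = ("prompt_text", "response_text")


def without_raw_text(properties: dict) -> dict:
    """Return a copy of the event properties with raw text removed.

    The input dict is left unchanged so the full event can still be
    sent to a destination that is approved for personal data.
    """
    return {
        key: value
        for key, value in properties.items()
        if key not in RAW_TEXT_PROPERTIES
    }
```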

6.2 Data Retention

Implementations SHOULD document their data retention policies
Implementers are encouraged to apply shorter retention periods to raw prompt and response text than to other event properties

7. Backwards Compatibility

Future versions of this specification will:

Maintain backwards compatibility for required fields
Use versioning in event metadata
Provide migration guides for breaking changes

8. Acknowledgments

This specification was developed with input from Sumanth Puram, Dileep, Pradeep, and several RudderStack customers, whose names will be listed in a later revision.

9. References

RFC 2119: Key words for use in RFCs to Indicate Requirement Levels
ISO 8601: Date and time format

Appendix A: Change Log

Version 0.1.0-draft (Current)

Initial draft specification
Core event schemas defined
Privacy considerations

Appendix B: Frequently Asked Questions

B.1 How does this relate to OpenTelemetry?

This specification focuses on business-level analytics events while OpenTelemetry handles operational telemetry. They are complementary and can be used together.

B.2 Can I extend the schema with custom properties?

Yes, implementations MAY add custom properties beyond the required and core fields. Custom properties SHOULD use a naming convention that avoids conflicts with future specification versions.
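
One naming convention that avoids collisions is a fixed prefix on every custom property. The sketch below uses `x_` purely as an example; the prefix and the `add_custom_properties` helper are assumptions, and this specification does not reserve or recommend any specific prefix.

```python
CUSTOM_PREFIX = "x_"  # hypothetical convention, not defined by this spec


def add_custom_properties(properties: dict, custom: dict) -> dict:
    """Return a copy of `properties` with prefixed custom properties merged in."""
    merged = dict(properties)
    for key, value in custom.items():
        merged[f"{CUSTOM_PREFIX}{key}"] = value
    return merged
```

For example, merging `{"plan_tier": "pro"}` into an event's properties adds the key `x_plan_tier`, which cannot clash with an unprefixed property introduced by a later version of the schema.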

B.3 What about real-time streaming use cases?

The schema is designed to work with both batch and streaming systems. The same event structures can be published to streaming platforms like Kafka or Kinesis.

How to Provide Feedback

We welcome feedback on this specification. Please submit comments via:

GitHub Discussion: https://github.com/rudderlabs/rudder-server/discussions
Email: [email protected]
Slack: Invitation to join the Slack community

When providing feedback, please reference the specific section number and include concrete suggestions for improvement.

Review Checklist

When reviewing this specification, please consider:

Are the event schemas comprehensive enough for your use case?
Are there missing properties that would be essential?
Are the privacy considerations adequate?
Would this integrate with your existing analytics infrastructure?
Are there ambiguities that need clarification?

Disclaimer

This specification is provided "as is" without warranty of any kind.

gitcommitshow · 2025-10-29T03:15:51Z

gitcommitshow
Oct 29, 2025
Maintainer Author

Read this article for more details, including implementation examples with code, a guide to protecting user privacy with LLM-powered intent classification, and SQL queries and a dashboard for an analytics system that follows this spec.

gitcommitshow · 2025-10-29T03:19:03Z

gitcommitshow
Oct 29, 2025
Maintainer Author

Requesting your feedback on this specification. For a structured feedback discussion:

Create a new comment thread under this discussion for each feedback topic
Avoid merging multiple feedback topics into a single comment thread
