- Better handle queries like "What is love?"
- Remove textToSql from the router for now.
- Improve Graph Generation
- From Nuno: we will need to generate a title when we generate a graph
- Make sure it's reliable
- Split the chat into 3 endpoints:
- One to do the agentic stuff and return instructions
- One to execute the instructions and return data
- One to generate the chart (already done)
- Re-enable Text-to-SQL Agent when ready
The Insights Chat system uses a multi-agent architecture: each user query passes through specialized agents, each responsible for one aspect of question analysis or data retrieval. The architecture is instruction-based. Agents generate execution plans (instructions) that are executed in a separate step, which keeps the planning and execution phases clearly separated.
flowchart TD
A[User Query] --> B[Data Copilot<br/>Orchestrator]
B --> C[Router Agent<br/>Analysis & Routing]
C --> D{Routing Decision}
D -->|stop| E[Cannot Answer]
D -->|create_query| F[Text-to-SQL Agent<br/>Currently Disabled]
D -->|pipes| G[Pipe Agent]
F --> H[Instructions Generation]
G --> H
H --> I[Instruction Execution]
I --> J[Data Response]
E --> K[Error Message]
flowchart TD
A[Data from query] --> B[Chart Generation Agent]
B --> C[Chart Configuration]
C --> D[Frontend]
The entire system is orchestrated through frontend/lib/chat/data-copilot.ts, which performs the following steps:
- MCP Client Setup: Creates a Model Context Protocol client for Tinybird integration
- Tool Discovery: Dynamically retrieves all available tools from Tinybird
- Router Execution: Runs the Router Agent to analyze user intent
- Agent Selection: Based on routing decision, executes the appropriate specialized agent
- Instruction Execution: Executes generated instructions to fetch data
- Response Streaming: Streams results back to the client using createDataStreamResponse
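The routing step in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of data-copilot.ts: `runCopilot`, `CopilotDeps`, and `CopilotResult` are hypothetical names, and MCP setup, tool discovery, and streaming are left out.

```typescript
// Hypothetical sketch of the routing dispatch; all names are illustrative.
interface RouterDecision {
  next_action: "stop" | "create_query" | "pipes";
  reasoning: string;
  reformulated_question: string;
  tools: string[];
}

type Row = Record<string, unknown>;

// Hypothetical seams for the agents and the execution layer.
interface CopilotDeps {
  route(question: string): Promise<RouterDecision>;
  planPipes(question: string, tools: string[]): Promise<unknown>;
  executePipes(instructions: unknown): Promise<Row[]>;
}

type CopilotResult =
  | { kind: "data"; rows: Row[] }
  | { kind: "message"; text: string };

async function runCopilot(question: string, deps: CopilotDeps): Promise<CopilotResult> {
  const decision = await deps.route(question);
  if (decision.next_action === "pipes") {
    // Planning and execution are separate calls: the agent only returns instructions.
    const instructions = await deps.planPipes(decision.reformulated_question, decision.tools);
    return { kind: "data", rows: await deps.executePipes(instructions) };
  }
  // "stop", and "create_query" while Text-to-SQL is disabled: return the router's explanation.
  return { kind: "message", text: decision.reasoning };
}
```

Passing the agents in as dependencies is a choice made for this sketch only; it keeps the dispatch logic testable without a model or a Tinybird workspace.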
Purpose: First-line analysis that determines how to answer user questions
Key Features:
- Temperature: 0 (deterministic)
- Max Steps: 3
- Available Tool: list_datasources only (can see all tools but only execute this one)
Decision Process:
- Check if existing pipe tools can answer the question
- If not, examine data sources to see if a custom query is possible
- Return routing decision with tool selection
Output Schema:
{
next_action: "stop" | "create_query" | "pipes",
reasoning: string, // User-friendly explanation
reformulated_question: string, // Enhanced query with context
tools: string[] // Tools for next agent to use
}
Prompt: frontend/lib/chat/prompts/router.ts
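As a rough illustration of the router output schema above, the sketch below checks an unknown value against it. The agents are described as validating output with Zod schemas; this example uses a hand-written type guard instead so that it has no dependencies. `RouterOutput` and `isRouterOutput` are hypothetical names.

```typescript
// Plain-TypeScript mirror of the router output schema.
// Assumption: the real code validates with a Zod schema; this guard is a stand-in.
const NEXT_ACTIONS = ["stop", "create_query", "pipes"] as const;
type NextAction = (typeof NEXT_ACTIONS)[number];

interface RouterOutput {
  next_action: NextAction;
  reasoning: string; // User-friendly explanation
  reformulated_question: string; // Enhanced query with context
  tools: string[]; // Tools for next agent to use
}

function isRouterOutput(value: unknown): value is RouterOutput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.next_action === "string" &&
    (NEXT_ACTIONS as readonly string[]).indexOf(v.next_action) !== -1 &&
    typeof v.reasoning === "string" &&
    typeof v.reformulated_question === "string" &&
    Array.isArray(v.tools) &&
    v.tools.every((t) => typeof t === "string")
  );
}
```

A caller would run `isRouterOutput(JSON.parse(text))` on the model's reply and treat a `false` result as a routing failure.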
Purpose: Generates custom SQL queries for complex questions
Key Features:
- Temperature: 0 (deterministic)
- Max Steps: 10
- Reasoning Budget: 3000 tokens (via Bedrock configuration)
- Tools: list_datasources, execute_query
Output Schema:
{
explanation: string, // Why this query answers the question
instructions: string // The SQL query to execute
}
Current Status: Temporarily disabled (commit 31392b7). Router returns a message explaining that custom queries will be available soon.
Prompt: frontend/lib/chat/prompts/text-to-sql.ts - Comprehensive 213-line prompt with:
- ClickHouse SQL reference
- Query enhancement rules
- Time-based logic defaults
- Validation methodology
Purpose: Executes pre-built Tinybird pipes and combines their outputs
Key Features:
- Temperature: 0 (deterministic)
- Max Steps: 10
- Tools: Dynamic (filtered by router's tool selection)
Output Schema:
{
explanation: string,
instructions: {
pipes: [
{
id: string, // Unique identifier
name: string, // Actual pipe name
inputs: object // Parameters
}
],
output: [
// Direct column mapping
{
type: "direct",
name: string,
pipeId: string,
sourceColumn: string
},
// Formula columns for calculations
{
type: "formula",
name: string,
formula: string, // JavaScript expression
dependencies: [...]
}
]
}
}
Prompt: frontend/lib/chat/prompts/pipe.ts
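The instruction shape above maps naturally onto a discriminated union on `type`. The sketch below is one possible typing, plus a small pre-execution check. The interface names and `findUnknownPipeIds` are hypothetical, and `dependencies` is typed as `unknown[]` because its element shape is not shown in the schema.

```typescript
// Hypothetical typing of the pipe instructions; names are illustrative.
interface PipeCall {
  id: string; // Unique identifier
  name: string; // Actual pipe name
  inputs: Record<string, unknown>; // Parameters
}

interface DirectColumn {
  type: "direct";
  name: string;
  pipeId: string;
  sourceColumn: string;
}

interface FormulaColumn {
  type: "formula";
  name: string;
  formula: string; // JavaScript expression
  dependencies: unknown[]; // element shape is not specified in the schema above
}

type OutputColumn = DirectColumn | FormulaColumn;

interface PipeInstructions {
  pipes: PipeCall[];
  output: OutputColumn[];
}

// Returns pipe ids that direct columns reference but that `pipes` never declares.
function findUnknownPipeIds(instructions: PipeInstructions): string[] {
  const declared = new Set(instructions.pipes.map((p) => p.id));
  const missing: string[] = [];
  for (const column of instructions.output) {
    if (
      column.type === "direct" &&
      !declared.has(column.pipeId) &&
      missing.indexOf(column.pipeId) === -1
    ) {
      missing.push(column.pipeId);
    }
  }
  return missing;
}
```

Because the agent returns a plan rather than executing it, a check like this can run before any pipe is called.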
File: frontend/lib/chat/instructions.ts
The instruction execution layer provides functions to execute the instructions generated by agents:
- Executes each pipe via Tinybird API (executeTinybirdPipe)
- Collects results indexed by pipe ID
- Combines results according to output instructions:
  - Direct mapping: Simple column-to-column transfer
  - Formula evaluation: Safe JavaScript expression evaluation with limited scope
- Returns combined data array
- Executes SQL query via Tinybird's Query API
- Handles authentication and formatting
- Returns query results as data array
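The formula evaluation step is described only as "safe JavaScript expression evaluation with limited scope". One way such an evaluator could work is sketched below. This is an assumption-laden illustration, not the code in instructions.ts: `evaluateFormula` is a hypothetical name, the scope is assumed to hold numeric values keyed by valid identifier names, and only arithmetic is allowed.

```typescript
// Hypothetical sketch of limited-scope formula evaluation; not the project's implementation.
function evaluateFormula(formula: string, scope: Record<string, number>): number {
  // Allow only word characters, whitespace, dots, parentheses, and arithmetic operators.
  if (!/^[\w\s.+\-*\/%()]+$/.test(formula)) {
    throw new Error(`Unsupported characters in formula: ${formula}`);
  }
  // Every identifier must be a declared variable, which rejects globals such as `process`.
  const identifiers = formula.match(/[A-Za-z_]\w*/g) ?? [];
  for (const id of identifiers) {
    if (!Object.prototype.hasOwnProperty.call(scope, id)) {
      throw new Error(`Unknown variable in formula: ${id}`);
    }
  }
  // Variables are passed as function parameters, so the expression sees nothing else by name.
  const names = Object.keys(scope);
  const fn = new Function(...names, `"use strict"; return (${formula});`);
  const result: unknown = fn(...names.map((n) => scope[n]));
  if (typeof result !== "number" || !Number.isFinite(result)) {
    throw new Error(`Formula did not produce a finite number: ${formula}`);
  }
  return result;
}

// Example: a margin percentage computed from two variables.
const margin = evaluateFormula("(revenue - cost) / revenue * 100", { revenue: 200, cost: 150 });
console.log(margin); // 25
```

The whitelist is deliberately conservative: it rejects string literals, function calls on unknown names, and statements, at the cost of not supporting helpers such as `Math.round`.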
All agents extend the BaseAgent class (frontend/lib/chat/agents/base-agent.ts):
Provides:
- Structured output validation using Zod schemas
- Automatic JSON formatting instructions
- Consistent error handling
- Tool management interface
- Temperature control
- Step limits
Key Methods:
- getModel(): Returns AI model instance
- getSystemPrompt(): Builds agent-specific prompt
- getTools(): Filters available tools
- run(): Executes agent with reasoning
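To make the division of responsibilities concrete, here is a dependency-free outline of a class in the same spirit. It is a sketch, not the contents of base-agent.ts: `SketchAgent`, `GenerateFn`, and `parseOutput` are hypothetical, the injected `generate` function stands in for the model returned by getModel(), and `parseOutput` stands in for Zod validation.

```typescript
// Hypothetical outline of a BaseAgent-style class; names are illustrative.
type ToolSet = Record<string, { description: string }>;

interface GenerateArgs {
  system: string;
  prompt: string;
  tools: ToolSet;
  temperature: number;
  maxSteps: number;
}

// Stand-in for the model call; the real class obtains a model via getModel().
type GenerateFn = (args: GenerateArgs) => Promise<string>;

abstract class SketchAgent<TOutput> {
  protected temperature = 0; // deterministic by default
  protected maxSteps = 10;
  private readonly generate: GenerateFn;

  constructor(generate: GenerateFn) {
    this.generate = generate;
  }

  protected abstract getSystemPrompt(): string;
  protected abstract getTools(available: ToolSet): ToolSet;
  // Stand-in for schema validation: throw if `raw` does not match the expected shape.
  protected abstract parseOutput(raw: unknown): TOutput;

  async run(question: string, available: ToolSet): Promise<TOutput> {
    const text = await this.generate({
      system: `${this.getSystemPrompt()}\nRespond with JSON only.`,
      prompt: question,
      tools: this.getTools(available),
      temperature: this.temperature,
      maxSteps: this.maxSteps,
    });
    return this.parseOutput(JSON.parse(text));
  }
}
```

A concrete agent would override the three abstract methods and, where needed, `maxSteps` (for example 3 for the router).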
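// Imports assumed from the AI SDK and the MCP TypeScript SDK.
import { experimental_createMCPClient as createMCPClient } from "ai";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

// `url` is assumed to be a URL object for the MCP endpoint.
const mcpClient = await createMCPClient({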
transport: new StreamableHTTPClientTransport(url, {
sessionId: `session_${Date.now()}`,
}),
});
- Data Source Tools:
  - list_datasources: Lists tables and schemas
  - execute_query: Executes custom SQL (Text-to-SQL only)
- Pipe Tools:
  - Dynamic tools based on Tinybird workspace
  - Each pipe becomes an executable tool
  - Parameters passed as inputs
- Router: Analysis and routing logic only
- Specialized Agents: Domain-specific instruction generation
- Execution Layer: Pure execution without decision logic
- Agents generate plans, not execute them
- Clear audit trail of decisions
- Easier testing and debugging
- Potential for instruction optimization before execution
- Router sees all tools but can't execute them all
- Specialized agents receive filtered tool sets
- Clear tool access boundaries
- Complex calculations via formula columns
- Variable dependency tracking
- Safe JavaScript evaluation environment
- Graceful degradation at each layer
- Detailed error logging
- User-friendly error messages
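The tool access boundaries listed above (the router can see every tool but execute only one, and specialized agents receive a filtered set) amount to restricting a tool map to a list of allowed names. A minimal sketch, assuming tools are held in a plain object keyed by name; `filterTools` is a hypothetical helper:

```typescript
// Hypothetical sketch: restrict a tool map to the names an agent may execute.
function filterTools<T>(
  available: Record<string, T>,
  allowed: readonly string[],
): Record<string, T> {
  const filtered: Record<string, T> = {};
  for (const name of allowed) {
    // Skip names that were requested but do not exist in the workspace.
    if (!Object.prototype.hasOwnProperty.call(available, name)) continue;
    const tool = available[name];
    if (tool !== undefined) filtered[name] = tool;
  }
  return filtered;
}

// Illustrative tool map; the descriptions are placeholders.
const allTools = {
  list_datasources: { description: "Lists tables and schemas" },
  execute_query: { description: "Executes custom SQL" },
  top_users: { description: "Example pipe tool" },
};

// Router: may execute list_datasources only.
const routerTools = filterTools(allTools, ["list_datasources"]);
// Pipe Agent: receives the tools selected by the router.
const pipeAgentTools = filterTools(allTools, ["top_users", "not_a_real_pipe"]);
```

Unknown names are dropped silently here; a stricter variant could throw so that a bad routing decision surfaces early.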
Currently using: us.anthropic.claude-sonnet-4-20250514-v1:0 via AWS Bedrock for Router and Pipe Agent.
Currently using: us.anthropic.claude-opus-4-6-v1 via AWS Bedrock for Auditor and TextToSql Agent.
- User Query: "What were the top users last month?"
- Router Agent:
  {
    "next_action": "pipes",
    "reasoning": "Can answer using top_users pipe",
    "reformulated_question": "Show top users for last month...",
    "tools": ["top_users"]
  }
- Pipe Agent Instructions:
  {
    "pipes": [
      {
        "id": "users",
        "name": "top_users",
        "inputs": { "start_date": "2024-11-01", "end_date": "2024-11-30" }
      }
    ],
    "output": [
      { "type": "direct", "name": "User", "pipeId": "users", "sourceColumn": "user_name" },
      { "type": "direct", "name": "Activity", "pipeId": "users", "sourceColumn": "event_count" }
    ]
  }
- Execution & Response: Combined data table streamed to client
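To show what the final step might produce for this example, the sketch below applies the two direct column mappings to stubbed pipe results. The rows are made-up sample data, `combineDirect` is a hypothetical helper, and aligning rows from different pipes by row index is an assumption of this sketch.

```typescript
// Hypothetical sketch of direct column mapping; not the code in instructions.ts.
type Row = Record<string, unknown>;

interface DirectColumn {
  type: "direct";
  name: string;
  pipeId: string;
  sourceColumn: string;
}

// Assumption: rows from different pipes are aligned by row index.
function combineDirect(resultsByPipeId: Record<string, Row[]>, output: DirectColumn[]): Row[] {
  const rowCount = Math.max(0, ...output.map((c) => (resultsByPipeId[c.pipeId] ?? []).length));
  const combined: Row[] = [];
  for (let i = 0; i < rowCount; i++) {
    const row: Row = {};
    for (const column of output) {
      const source = (resultsByPipeId[column.pipeId] ?? [])[i];
      // Missing rows or pipes yield null rather than failing the whole table.
      row[column.name] = source ? source[column.sourceColumn] : null;
    }
    combined.push(row);
  }
  return combined;
}

// Made-up sample rows standing in for the top_users pipe response.
const pipeResults: Record<string, Row[]> = {
  users: [
    { user_name: "alice", event_count: 42 },
    { user_name: "bob", event_count: 17 },
  ],
};

const outputColumns: DirectColumn[] = [
  { type: "direct", name: "User", pipeId: "users", sourceColumn: "user_name" },
  { type: "direct", name: "Activity", pipeId: "users", sourceColumn: "event_count" },
];

const table = combineDirect(pipeResults, outputColumns);
console.log(table); // [ { User: 'alice', Activity: 42 }, { User: 'bob', Activity: 17 } ]
```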