In this hands-on lab, you'll build a complete AI-powered retail analytics platform entirely within Snowflake — no external infrastructure required. Using Snowflake CoCo as your AI-assisted development environment, you'll work through the full data lifecycle: stream real-time orders via Snowpipe Streaming, MERGE them into production tables with Gen2 Warehouses, transform them through a 3-tier Dynamic Tables pipeline, and serve them with Interactive Tables for low-latency point lookups.
You'll build analytical models with dbt, monitor data quality with Data Metric Functions, and create custom CoCo skills for reusable workflows. Tie it all together with Snowflake CoWork — a conversational AI interface where a Cortex Agent orchestrates Cortex Analyst and Agentic Search to answer "what happened" and "why" from both structured and unstructured data. Finally, evaluate your agent with ground-truth datasets, implement row-level security, and expose your agent as a managed MCP server for external AI clients.
- Accelerate development with Snowflake CoCo (AI-assisted SQL, deployment, and data exploration)
- Stream real-time data with Snowpipe Streaming and transform with Dynamic Tables
- Serve low-latency queries with Interactive Tables and Gen2 Warehouses
- Build analytical models with dbt
- Monitor data quality automatically with Data Metric Functions
- Create and query managed Iceberg V3 tables (deletion vectors, row lineage) (optional)
- Create custom CoCo skills for reusable team workflows
- Build a Cortex Agent with Cortex Analyst (semantic view + verified queries) and Agentic Search (multi-index Cortex Search)
- Evaluate agent quality with ground-truth datasets and LLM judges
- Expose agents as managed MCP servers for external AI clients
- Implement transparent row-level security with Row Access Policies
A production-grade AI-powered retail analytics platform on Snowflake — from raw data to conversational AI insights, entirely within a single platform. You'll create dynamic transformation pipelines, interactive low-latency tables, dbt analytical models, a Cortex Agent that answers questions across structured and unstructured data, row-level security that works transparently through AI, and an MCP server that exposes your agent to external clients.
- Access to a Snowflake account
- Python 3.8+ installed locally
- Git installed locally
- Basic familiarity with SQL and command-line tools
The Snowflake CLI (snow) lets you run SQL, deploy apps, and manage Snowflake objects from your terminal.
macOS (using Homebrew):
If you don't have Homebrew installed, first install it by opening Terminal and running:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"Then install Snowflake CLI:
brew install snowflake-cliWindows:
pip install snowflake-cliLinux:
pip install snowflake-cliVerify the installation:
snow --versionYou should see output like Snowflake CLI version: 3.x.x.
Snowflake CoCo is an AI-powered coding assistant that runs in your terminal. It helps you write SQL, build pipelines, deploy apps, and explore your data using natural language prompts.
macOS (using Homebrew):
brew install cortexWindows / Linux:
pip install cortexVerify the installation:
cortex --versionBefore you can run any commands against Snowflake, you need to configure a connection. This tells the CLI which Snowflake account to connect to and how to authenticate.
Run the interactive connection wizard:
snow connection addYou'll be prompted for the following values (enter them one at a time):
| Prompt | What to enter | Example |
|---|---|---|
| Connection name | A short name for this connection | hol |
| Account identifier | Your Snowflake account URL (without .snowflakecomputing.com) |
myorg-myaccount |
| User | Your Snowflake username | jsmith |
| Password | Your Snowflake password | (hidden) |
| Role | ACCOUNTADMIN |
ACCOUNTADMIN |
| Warehouse | Leave blank (will be created by setup) | (leave empty) |
| Database | Leave blank (will be created by setup) | (leave empty) |
Tip: Your account identifier is the part before
.snowflakecomputing.comin your Snowflake URL. For example, if you log in athttps://myorg-myaccount.snowflakecomputing.com, your account identifier ismyorg-myaccount.
Test that your connection works:
snow connection test -c holYou should see Status: OK.
git clone https://github.com/Snowflake-Labs/automated-intelligence-dev-day-2026-hol.git
cd automated-intelligence-dev-day-2026-holLaunch Snowflake CoCo and verify your connection:
cortexWhat to expect: CoCo will start an interactive session in your terminal. You'll see your active connection, role, and warehouse displayed. You can type natural language prompts and CoCo will translate them into SQL or actions.
Then run the core infrastructure script (this takes ~10-15 minutes):
snow sql -f setup.sql -c holThis creates the database, schemas, warehouses, tables, Dynamic Tables pipeline, Interactive Tables, Cortex Search Services, Semantic View, seed data (10M orders, 25M order items, 2M customers), and Row Access Policy.
Demonstrate Gen2's Optima Indexing — automatic partition pruning without explicit clustering keys:
Prompt CoCo:
"Run a point lookup for customer_id 5000 on the Gen2 warehouse"
Open the query profile in Snowsight to see partition pruning — only a fraction of partitions scanned despite no explicit clustering key. This is Gen2's Optima Indexing in action.
Prompt CoCo:
"Show me the Dynamic Tables pipeline status — names, target lag, last refresh time, and row counts for each tier"
CoCo displays the 3-tier pipeline:
- Tier 1 (1-min lag):
enriched_orders(10M rows),enriched_order_items(25M rows) - Tier 2 (DOWNSTREAM):
fact_orders(25M rows) - Tier 3 (DOWNSTREAM):
daily_business_metrics(118 rows),product_performance_metrics(4 rows)
Note: Row counts shown are for the default
data_scale = '10M'. If you chose'50M'insetup.sql, expect ~50M enriched_orders, ~161M order_items, and 365 daily metrics rows.
Ask CoCo:
"Show me a sample of the daily business metrics — top 5 days by revenue"
Expected: Top-5 days are in September 2025 (back-to-season peak), each with ~$183M revenue and ~117K orders.
Run queries in Snowsight to observe sub-second latency:
USE WAREHOUSE hol_interactive_wh;
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
-- Point lookup by customer ID
SELECT * FROM dash_automated_intelligence_db.interactive.customer_order_analytics
WHERE customer_id = 1;
-- Point lookup by order ID
-- Run: SELECT order_id FROM dash_automated_intelligence_db.interactive.order_lookup LIMIT 5; to get UUIDs to use
SELECT * FROM dash_automated_intelligence_db.interactive.order_lookup
WHERE order_id = '<any-order-uuid>';Prompt CoCo:
"Run the interactive tables load test at interactive/load_test.py"
This fires 200 concurrent sessions (1000 queries total) against both Interactive and Standard warehouses, then compares P50/P90/P99 latencies. You should see notably lower latency and higher throughput on the Interactive warehouse.
Run the load test a second time to observe the effect of warm caches. Results may vary depending on account, region, and data scale.
The setup script injected ~200 NULL values into orders.total_amount and order_items.quantity, plus ~150 NULLs into order_items.product_name. DMFs detect the first two — but there's a gap.
Prompt CoCo:
"Check the data quality monitoring results and show me which columns have NULL violations"
CoCo shows that TOTAL_AMOUNT (200 NULLs) and QUANTITY (200 NULLs) have violations — but product_name NULLs are going undetected.
"Are there any NULL values in order_items.product_name? Is that column being monitored?"
CoCo finds ~150 NULLs and reveals the DMF is mis-attached to product_category instead of product_name.
"Fix the DMF — remove the NULL check from product_category and add it to product_name instead"
ALTER TABLE order_items DROP DATA METRIC FUNCTION SNOWFLAKE.CORE.NULL_COUNT ON (product_category);
ALTER TABLE order_items ADD DATA METRIC FUNCTION SNOWFLAKE.CORE.NULL_COUNT ON (product_name);This demonstrates the real-world workflow: monitor, discover gaps, fix coverage.
Prerequisites:
dbt-coreanddbt-snowflakemust be installed (pip install dbt-snowflake). CoCo will handle this for you if not already installed.
Prompt CoCo:
"Install dbt dependencies and build all models in the dbt-analytics project"
CoCo runs dbt deps then dbt build to create all staging views and mart tables (9+ models). CoCo automatically injects your active Snowflake connection into the dbt profile — no manual configuration needed.
Expected output: 71 tests pass, 1 warning (the
source_not_null_raw_orders_total_amounttest detects the 200 NULLs we injected for the Data Quality exercise — this is working as designed).
"Show me the customer lifetime value segments — how many customers are in each value tier?"
Create a reusable skill that automates table profiling:
Prompt CoCo:
"Create a custom CoCo skill called 'profile-table' that takes a table name, counts rows, checks for NULL columns, shows distinct value counts, and flags potential data quality issues"
CoCo creates .cortex/skills/profile-table/SKILL.md with the skill definition, triggers, and step-by-step instructions.
Note: After creating the skill, restart CoCo (type
/quitthencortex) for the new skill to become active.
"$profile-table DASH_AUTOMATED_INTELLIGENCE_DB.RAW.ORDERS"
This demonstrates how teams package repeatable workflows as shareable CoCo skills.
Prompt CoCo:
"Run snowflake-cowork/create_agent.sql to create the Business Insights Agent"
Open the Snowflake CoWork interface in Snowsight: navigate to AI & ML → CoWork (or search "CoWork" in the global search bar). Select the BUSINESS_INSIGHTS_AGENT agent. Then try each question to demonstrate different tool routing:
| Question | Tools Used |
|---|---|
| "Show me monthly revenue trend from June to September 2025" | Cortex Analyst (text-to-SQL) |
| "Which month had the lowest revenue, and what do customer reviews say about that period?" | Cortex Analyst + Agentic Search |
| "Find reviews mentioning wrong size with a rating below 3" | Agentic Search (filtered) |
| "Why are customers returning ski boots?" | Agentic Search (reviews + tickets) |
| "What is our total revenue and customer count by state?" | Cortex Analyst (text-to-SQL) |
| "What are the top complaint themes in support tickets?" | Agentic Search (filter + AI_AGG) |
| "How many reviews mention sizing issues, and which products are most affected?" | Agentic Search (search + breakdown) |
This is the capstone moment — the agent routes across structured data (text-to-SQL) and unstructured data (Cortex Search) to answer "what happened" and "why."
The Row Access Policy and WEST_COAST_MANAGER role were created by setup.sql. Demonstrate how row-level security transparently filters data based on role:
As ACCOUNTADMIN (full access):
USE ROLE ACCOUNTADMIN;
USE WAREHOUSE HOL_WH;
SELECT c.state, SUM(o.total_amount) AS total_revenue, COUNT(DISTINCT c.customer_id) AS customer_count
FROM dash_automated_intelligence_db.raw.orders o
JOIN dash_automated_intelligence_db.raw.customers c ON o.customer_id = c.customer_id
GROUP BY c.state ORDER BY total_revenue DESC;Result: all 10 states visible.
As WEST_COAST_MANAGER (restricted):
USE ROLE WEST_COAST_MANAGER;
USE WAREHOUSE HOL_WH;
SELECT c.state, SUM(o.total_amount) AS total_revenue, COUNT(DISTINCT c.customer_id) AS customer_count
FROM dash_automated_intelligence_db.raw.orders o
JOIN dash_automated_intelligence_db.raw.customers c ON o.customer_id = c.customer_id
GROUP BY c.state ORDER BY total_revenue DESC;Result: only CA, OR, WA appear — the Row Access Policy transparently filters data.
Key insight: Same query, same tables — different results based on who's asking. Row-level security enforces data boundaries without changing application logic.
Switch to the WEST_COAST_MANAGER role in Snowsight, then open CoWork and ask the agent:
"What is our total revenue and customer count by state?"
The agent returns results for only CA, OR, and WA — the Row Access Policy filters data transparently, even through AI-generated SQL.
Prerequisites: The dbt models must be built first (previous section). The dashboard queries
DBT_ANALYTICSandDBT_STAGINGtables.
Prompt CoCo:
"Deploy the Streamlit dashboard to Snowflake"
CoCo runs snow streamlit deploy from the streamlit-dashboard/ directory. Once deployed, open the app URL in Snowsight to explore:
- Summary — Revenue KPIs, order trends, customer counts
- Customer & Product Analytics — Lifetime value segments, product performance
- Pipeline Health — Dynamic Tables refresh status, data freshness monitoring
The evaluation dataset (7 questions + ground truth) was created by setup.sql. Run the evaluation in Snowsight:
- Switch to the ACCOUNTADMIN role in Snowsight (top-left role selector)
- Navigate to AI and ML > Agents > BUSINESS_INSIGHTS_AGENT > Evaluations tab
- Click New evaluation run, name it (e.g.
hol-eval-run-1), click Next - Select Create new dataset from table
- Under Source table, set Database and schema to
DASH_AUTOMATED_INTELLIGENCE_DB.SEMANTIC, then selectAGENT_EVALUATION_DATA - Under New dataset location, keep
DASH_AUTOMATED_INTELLIGENCE_DB.SEMANTIC - Set Dataset name:
hol_eval_dataset - Click Next
- Under Define metrics, confirm Input query =
INPUT_QUERY - Toggle on Answer Correctness, set Expected answer =
GROUND_TRUTH - Toggle on Logical Consistency
- Click Create — evaluation starts automatically (~3 min)
- Answer Correctness — Did the agent's response match ground truth? Scored 0-1 per question.
- Logical Consistency — Were planning steps, tool calls, and response internally consistent? (Reference-free.)
- Per-question drill-down — Select any row to see the full thread: planning, tool invocations, response generation.
If questions score low on logical consistency:
- Click a low-scoring row and view Thread details
- Look for vague reasoning about tool selection in the Planning step
- Update the agent's instructions to be more explicit
- Recreate the agent and re-run the evaluation
Expose the Business Insights Agent as a managed MCP server:
Prompt CoCo:
"Create a Snowflake-managed MCP server that exposes our Business Insights Agent, semantic view, and customer feedback search as tools"
CoCo creates the MCP server:
Note: CoCo generates tool names (like
revenue-analytics,customer-feedback-search) based on your prompt. Your names may differ — what matters is that thetypeandidentifierpoint to the correct objects.
CREATE MCP SERVER business_insights_mcp
FROM SPECIFICATION $$
tools:
- name: "business-insights-agent"
type: "CORTEX_AGENT_RUN"
identifier: "DASH_AUTOMATED_INTELLIGENCE_DB.SEMANTIC.BUSINESS_INSIGHTS_AGENT"
description: "AI agent that answers business questions using structured data and customer feedback"
title: "Business Insights Agent"
- name: "revenue-analytics"
type: "CORTEX_ANALYST_MESSAGE"
identifier: "DASH_AUTOMATED_INTELLIGENCE_DB.SEMANTIC.BUSINESS_ANALYTICS_SEMANTIC"
description: "Text-to-SQL for revenue, orders, customers, and product metrics"
title: "Revenue Analytics"
- name: "customer-feedback-search"
type: "CORTEX_SEARCH_SERVICE_QUERY"
identifier: "DASH_AUTOMATED_INTELLIGENCE_DB.RAW.CUSTOMER_FEEDBACK_SEARCH"
description: "Search across product reviews and support tickets"
title: "Customer Feedback Search"
$$;cortex mcp add business-insights https://<account_url>/api/v2/databases/DASH_AUTOMATED_INTELLIGENCE_DB/schemas/SEMANTIC/mcp-servers/BUSINESS-INSIGHTS-MCP --type httpNow any MCP-compatible client (CoCo, Claude Desktop, custom apps) can discover and call these tools via the standard MCP protocol.
Note: This section is optional. It demonstrates Iceberg V3 capabilities (deletion vectors, default values) using CoCo-generated SQL. No other sections depend on it.
Note: CoCo may take a few attempts to generate correct SQL for Iceberg V3 features (these are newer APIs). If you see "error executing SQL," let CoCo retry — it will self-correct and the end result will work.
Prompt CoCo:
"Create a managed Iceberg table from RAW.ORDERS with clustering by year and month, then query it to show partition pruning"
CoCo creates the table with CATALOG='SNOWFLAKE' (no external volume needed) and demonstrates partition pruning on filtered queries.
"Create an Iceberg V3 table from RAW.ORDERS (ICEBERG_VERSION=3) with merge-on-read enabled, insert 1000 rows, then update 10 of them to demonstrate deletion vectors"
CoCo creates a V3 table with ENABLE_ICEBERG_MERGE_ON_READ = TRUE, inserts data, then runs an UPDATE that uses deletion vectors instead of full file rewrites.
"Add a new column 'priority' with default value 'STANDARD' to the V3 table and show that existing rows get the default without a backfill"
This demonstrates V3 schema evolution without rewriting data files.
Note: This section is optional. The
setup.sqlscript already loads all 10M orders directly. This section demonstrates how you would stream data in production using the Snowpipe Streaming Python SDK.
Generate keys for Snowpipe Streaming authentication:
# Generate private key (unencrypted PEM)
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt
# Generate public key
openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub
# Upload public key to your Snowflake user (replace <your-username>)
snow sql -q "ALTER USER <your-username> SET RSA_PUBLIC_KEY='$(grep -v -- '-----' rsa_key.pub | tr -d '\n')'" -c hol
# Verify
snow sql -q "DESC USER <your-username>" -c hol | grep RSA_PUBLIC_KEY_FPcd snowpipe-streaming-python
pip install -r requirements.txt
# Copy and configure profile
cp profile.json.template profile.jsonEdit profile.json and set your account, user, private_key (contents of rsa_key.p8), and role.
# Stream 10,000 orders
python src/automated_intelligence_streaming.py 10000SELECT COUNT(*) FROM dash_automated_intelligence_db.staging.orders_staging;
SELECT COUNT(*) FROM dash_automated_intelligence_db.staging.order_items_staging;You should see 10,000 orders and ~50,000 order items in staging.
Use CoCo to merge the streamed data:
"Switch to the Gen2 warehouse, check how many rows are in staging, then merge them into RAW and show me the results"
To remove all objects created during this lab:
snow sql -f cleanup.sql -c holDocumentation:
- Snowpipe Streaming SDK
- Dynamic Tables
- Interactive Tables
- Gen2 Warehouses
- Cortex Agents
- Semantic Views
- Data Metric Functions
- Row Access Policies
- Snowflake CoCo
This project is licensed under the Apache License, Version 2.0.