Commit 2281f45

Cookbooks for catalogs: postgres, mssql, mysql (#394)
* Cookbooks for catalogs: postgres, mssql, mysql
* update catalogs
* use secrets
1 parent 358e216 commit 2281f45

18 files changed

Lines changed: 1616 additions & 0 deletions

catalogs/mssql/README.md

Lines changed: 236 additions & 0 deletions
@@ -0,0 +1,236 @@
# Microsoft SQL Server Catalog Connector

Works with `v1.0+`

The Microsoft SQL Server Catalog Connector enables Spice to automatically discover and query all schemas and tables in an MSSQL database. This recipe demonstrates the connector using the standard TPC-H benchmark dataset (Scale Factor 0.1) with full foreign key constraints defined between tables.

## Prerequisites

- [Docker](https://docs.docker.com/get-docker/) installed
- Spice installed (see the [Getting Started](https://docs.spiceai.org/getting-started) documentation)

## Step 1. Start the SQL Server database

Clone the cookbook repository and start the database using Docker Compose. The `mssql-init` container generates TPC-H data at Scale Factor 0.1 using DuckDB's built-in generator and loads it into SQL Server with primary keys and foreign keys.

```bash
git clone https://github.com/spiceai/cookbook.git
cd cookbook/catalogs/mssql
docker compose up -d
```

Wait for the init container to finish (usually 2–5 minutes):

```bash
docker compose logs -f mssql-init
```

You should see:

```
tpch-mssql-init | SQL Server is ready.
tpch-mssql-init | Database 'tpch' ready.
tpch-mssql-init | Generating TPC-H SF=0.1 with DuckDB ...
tpch-mssql-init | TPC-H data generated.
tpch-mssql-init | Creating schema ...
tpch-mssql-init | Schema created.
tpch-mssql-init |
tpch-mssql-init | Loading region ...
tpch-mssql-init | 5 rows
tpch-mssql-init | Loaded 5 rows into region.
...
tpch-mssql-init | All TPC-H tables loaded successfully!
```

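If you would rather script the wait than watch the logs, the check can be sketched in Python. This is only an illustration: `init_finished`, `wait_for_init`, and the `fetch_logs` callback are hypothetical names, and the only thing taken from the recipe is the success line shown in the expected output above.

```python
import time
from typing import Callable

# Success line copied from the expected init-container output above.
DONE_MARKER = "All TPC-H tables loaded successfully!"


def init_finished(log_text: str) -> bool:
    """Return True once the success line appears anywhere in the logs."""
    return DONE_MARKER in log_text


def wait_for_init(fetch_logs: Callable[[], str],
                  timeout_s: float = 600.0,
                  poll_s: float = 5.0) -> bool:
    """Poll fetch_logs() until the marker shows up or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if init_finished(fetch_logs()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

In practice `fetch_logs` could wrap `subprocess.run(["docker", "compose", "logs", "mssql-init"], capture_output=True, text=True).stdout`, run from the directory that contains `compose.yaml`.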
The TPC-H schema includes the following foreign key relationships:

| Table      | Foreign Key Column(s)          | References                         |
|------------|--------------------------------|------------------------------------|
| `nation`   | `n_regionkey`                  | `region(r_regionkey)`              |
| `supplier` | `s_nationkey`                  | `nation(n_nationkey)`              |
| `customer` | `c_nationkey`                  | `nation(n_nationkey)`              |
| `partsupp` | `ps_partkey`                   | `part(p_partkey)`                  |
| `partsupp` | `ps_suppkey`                   | `supplier(s_suppkey)`              |
| `orders`   | `o_custkey`                    | `customer(c_custkey)`              |
| `lineitem` | `l_orderkey`                   | `orders(o_orderkey)`               |
| `lineitem` | `(l_partkey, l_suppkey)`       | `partsupp(ps_partkey, ps_suppkey)` |

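With enforced foreign keys, a referenced table has to be populated before the tables that point at it. One valid load order can be derived from the relationships above with a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+); it illustrates the dependency structure and is not necessarily how `load-tpch.py` orders its work. `FK_PARENTS` and `load_order` are names introduced here for illustration.

```python
from graphlib import TopologicalSorter

# Each table maps to the set of tables it references,
# transcribed from the foreign key table above.
FK_PARENTS = {
    "region": set(),
    "part": set(),
    "nation": {"region"},
    "supplier": {"nation"},
    "customer": {"nation"},
    "partsupp": {"part", "supplier"},
    "orders": {"customer"},
    "lineitem": {"orders", "partsupp"},
}


def load_order(parents: dict) -> list:
    """Return the tables so that every referenced table comes first."""
    return list(TopologicalSorter(parents).static_order())


order = load_order(FK_PARENTS)
print(order)
```

The exact order is not unique (for example, `region` and `part` have no dependencies and can come in either order), but `lineitem` always lands after both `orders` and `partsupp`.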
## Step 2. Create a new directory and initialize a Spicepod

```bash
mkdir mssql-catalog-recipe
cd mssql-catalog-recipe
spice init
```

## Step 3. Configure credentials

Create a `.env` file with the database credentials. If you created `mssql-catalog-recipe` inside `cookbook/catalogs/mssql`, copy the example file from the parent directory:

```bash
cp ../.env.example .env
```

Or set them directly (single quotes keep the shell from interpreting the `!` in the password):

```bash
echo 'MSSQL_USERNAME=sa' > .env
echo 'MSSQL_PASSWORD=SpiceDemo1!' >> .env
```

## Step 4. Add the Microsoft SQL Server Catalog Connector to `spicepod.yaml`

```yaml
version: v1
kind: Spicepod
name: mssql-catalog-recipe

catalogs:
  - from: mssql
    name: ms
    params:
      mssql_host: localhost
      mssql_port: 1433
      mssql_database: tpch
      mssql_username: ${secrets:MSSQL_USERNAME}
      mssql_password: ${secrets:MSSQL_PASSWORD}
      mssql_encrypt: disable
      mssql_trust_server_certificate: "true"
```

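The `${secrets:...}` placeholders keep the credentials out of `spicepod.yaml` and point at the keys defined in `.env`. Conceptually, the lookup works like the following sketch. It is a simplified illustration only, not Spice's actual implementation; `parse_dotenv` and `resolve` are hypothetical helpers, and the parser handles only plain `KEY=value` lines.

```python
import re

# Matches placeholders of the form ${secrets:NAME}.
PLACEHOLDER = re.compile(r"\$\{secrets:([A-Za-z_][A-Za-z0-9_]*)\}")


def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve(value: str, secrets: dict) -> str:
    """Replace each placeholder with its secret; KeyError if one is missing."""
    return PLACEHOLDER.sub(lambda m: secrets[m.group(1)], value)


secrets = parse_dotenv("MSSQL_USERNAME=sa\nMSSQL_PASSWORD=SpiceDemo1!\n")
print(resolve("${secrets:MSSQL_USERNAME}", secrets))
```

Values without a placeholder, such as `localhost`, pass through unchanged.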
## Step 5. Start the Spice runtime

```bash
spice run
```

Observe that Spice discovers all schemas and tables in the `tpch` database:

```bash
2025-05-19T10:00:00.000000Z INFO runtime::init::catalog: Registering catalog 'ms' for mssql
2025-05-19T10:00:00.500000Z INFO runtime::init::catalog: Registered catalog 'ms' with 1 schema and 8 tables
```

## Step 6. Query the SQL Server catalog

In a new terminal, start the Spice SQL REPL:

```bash
spice sql
```

List all discovered tables:

```sql
SHOW TABLES;
```

```
+---------------+--------------+--------------+------------+
| table_catalog | table_schema | table_name   | table_type |
+---------------+--------------+--------------+------------+
| ms            | dbo          | region       | BASE TABLE |
| ms            | dbo          | nation       | BASE TABLE |
| ms            | dbo          | part         | BASE TABLE |
| ms            | dbo          | supplier     | BASE TABLE |
| ms            | dbo          | customer     | BASE TABLE |
| ms            | dbo          | partsupp     | BASE TABLE |
| ms            | dbo          | orders       | BASE TABLE |
| ms            | dbo          | lineitem     | BASE TABLE |
| spice         | runtime      | task_history | BASE TABLE |
| spice         | runtime      | metrics      | BASE TABLE |
+---------------+--------------+--------------+------------+
```

Query a table using the three-part `catalog.schema.table` name:

```sql
SELECT c_custkey, c_name, c_mktsegment, c_acctbal
FROM ms.dbo.customer
LIMIT 5;
```

```
+-----------+--------------------+--------------+-----------+
| c_custkey | c_name             | c_mktsegment | c_acctbal |
+-----------+--------------------+--------------+-----------+
| 1         | Customer#000000001 | BUILDING     | 711.56    |
| 2         | Customer#000000002 | AUTOMOBILE   | 121.65    |
| 3         | Customer#000000003 | AUTOMOBILE   | 7498.12   |
| 4         | Customer#000000004 | MACHINERY    | 2866.83   |
| 5         | Customer#000000005 | HOUSEHOLD    | 794.47    |
+-----------+--------------------+--------------+-----------+
```

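The same three-part name works outside the REPL. As a rough sketch, the query could be sent to the runtime's HTTP SQL endpoint from Python using only the standard library. The base URL `http://localhost:8090`, the `/v1/sql` path, and the JSON response format are assumptions about a default local runtime; check the Spice API documentation for your version. `build_sql_request` and `run_sql` are hypothetical helper names.

```python
import json
import urllib.request

# Same query as above, addressed by catalog.schema.table.
QUERY = """
SELECT c_custkey, c_name, c_mktsegment, c_acctbal
FROM ms.dbo.customer
LIMIT 5
"""


def build_sql_request(sql: str,
                      base_url: str = "http://localhost:8090") -> urllib.request.Request:
    """Build a POST request carrying the SQL text as the request body."""
    return urllib.request.Request(
        url=f"{base_url}/v1/sql",  # assumed endpoint path
        data=sql.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def run_sql(sql: str, base_url: str = "http://localhost:8090"):
    """Send the query and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(build_sql_request(sql, base_url), timeout=30) as resp:
        return json.loads(resp.read())
```

Calling `run_sql(QUERY)` requires the runtime from Step 5 to be running.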
Run the TPC-H _Pricing Summary Report (Q1)_:

```sql
SELECT
    l_returnflag,
    l_linestatus,
    sum(l_quantity) AS sum_qty,
    sum(l_extendedprice) AS sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
    avg(l_quantity) AS avg_qty,
    avg(l_extendedprice) AS avg_price,
    avg(l_discount) AS avg_disc,
    count(*) AS count_order
FROM ms.dbo.lineitem
WHERE l_shipdate <= date '1998-12-01' - interval '110' day
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus;
```

```
+--------------+--------------+-------------+-----------------+-------------------+---------------------+-----------+--------------+----------+-------------+
| l_returnflag | l_linestatus | sum_qty     | sum_base_price  | sum_disc_price    | sum_charge          | avg_qty   | avg_price    | avg_disc | count_order |
+--------------+--------------+-------------+-----------------+-------------------+---------------------+-----------+--------------+----------+-------------+
| A            | F            | 37734107.00 | 56586554400.73  | 53758257134.87    | 55909065222.83      | 25.522005 | 38273.129734 | 0.049985 | 1478493     |
| N            | F            | 991417.00   | 1487504710.38   | 1413082168.05     | 1469649223.19       | 25.516471 | 38284.467760 | 0.050093 | 38854       |
| N            | O            | 73416597.00 | 110112303006.41 | 104608220776.38   | 108796375788.18     | 25.502437 | 38249.282778 | 0.049996 | 2878807     |
| R            | F            | 37719753.00 | 56568041380.90  | 53741292684.60    | 55889619119.83      | 25.505793 | 38250.854626 | 0.050009 | 1478870     |
+--------------+--------------+-------------+-----------------+-------------------+---------------------+-----------+--------------+----------+-------------+

Time: 0.812 seconds. 4 rows.
```

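The `WHERE` clause in Q1 subtracts a 110-day interval from `1998-12-01`, so only line items shipped on or before the resulting date are aggregated. The cutoff can be checked with a few lines of Python:

```python
from datetime import date, timedelta

# date '1998-12-01' - interval '110' day
cutoff = date(1998, 12, 1) - timedelta(days=110)
print(cutoff.isoformat())  # 1998-08-13
```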
Run a cross-table join using the foreign key relationships:

```sql
SELECT
    r.r_name AS region,
    n.n_name AS nation,
    COUNT(DISTINCT c.c_custkey) AS num_customers,
    ROUND(AVG(c.c_acctbal), 2) AS avg_balance
FROM ms.dbo.customer c
JOIN ms.dbo.nation n ON c.c_nationkey = n.n_nationkey
JOIN ms.dbo.region r ON n.n_regionkey = r.r_regionkey
GROUP BY r.r_name, n.n_name
ORDER BY r.r_name, num_customers DESC;
```

```
+-------------+----------------+---------------+-------------+
| region      | nation         | num_customers | avg_balance |
+-------------+----------------+---------------+-------------+
| AFRICA      | MOZAMBIQUE     | 1102          | 4571.98     |
| AFRICA      | ETHIOPIA       | 1098          | 4627.11     |
...
| MIDDLE EAST | SAUDI ARABIA   | 1067          | 4540.05     |
+-------------+----------------+---------------+-------------+

Time: 0.045 seconds. 25 rows.
```

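## Step 7. Clean up

Stop the containers and remove their volumes and locally built images. Run this from `cookbook/catalogs/mssql`, the directory that contains `compose.yaml`:

```bash
docker compose down --volumes --rmi local
```
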
## References

- [Spice.ai Microsoft SQL Server Catalog Connector documentation](https://docs.spiceai.org/components/catalogs/mssql)
- [TPC-H Benchmark Specification](https://www.tpc.org/tpc_documents_current_versions/pdf/tpc-h_v2.17.1.pdf)
- [Spice SQL CLI reference](https://docs.spiceai.org/cli/reference/sql)

catalogs/mssql/compose.yaml

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
services:
  mssql:
    image: mcr.microsoft.com/mssql/server:2022-latest
    container_name: tpch-mssql
    environment:
      ACCEPT_EULA: "Y"
      SA_PASSWORD: "SpiceDemo1!"
      MSSQL_PID: Express
    ports:
      - "1433:1433"
    healthcheck:
      test: ["CMD-SHELL", "/opt/mssql-tools18/bin/sqlcmd -S localhost -U sa -P SpiceDemo1! -C -Q \"SELECT 1\""]
      interval: 5s
      timeout: 5s
      retries: 15

  mssql-init:
    container_name: tpch-mssql-init
    build: mssql-init/
    depends_on:
      mssql:
        condition: service_healthy
    environment:
      MSSQL_HOST: mssql
      MSSQL_PORT: "1433"
      MSSQL_DB: tpch
      MSSQL_USER: sa
      MSSQL_PASSWORD: "SpiceDemo1!"
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
MSSQL_USERNAME=sa
MSSQL_PASSWORD=SpiceDemo1!
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
version: v1
kind: Spicepod
name: mssql-catalog-recipe

catalogs:
  - from: mssql
    name: ms
    params:
      mssql_host: localhost
      mssql_port: 1433
      mssql_database: tpch
      mssql_username: ${secrets:MSSQL_USERNAME}
      mssql_password: ${secrets:MSSQL_PASSWORD}
      mssql_encrypt: disable
      mssql_trust_server_certificate: "true"
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
FROM python:3.12-slim

RUN apt-get update && apt-get install -y --no-install-recommends curl gpg \
    && curl -fsSL https://packages.microsoft.com/keys/microsoft.asc \
       | gpg --dearmor -o /usr/share/keyrings/microsoft-prod.gpg \
    && curl -fsSL https://packages.microsoft.com/config/debian/12/prod.list \
       -o /etc/apt/sources.list.d/mssql-release.list \
    && apt-get update \
    && ACCEPT_EULA=Y apt-get install -y --no-install-recommends msodbcsql18 unixodbc-dev \
    && rm -rf /var/lib/apt/lists/*

RUN pip install --no-cache-dir \
    duckdb==1.3.0 \
    pyodbc==5.2.0

WORKDIR /app
COPY load-tpch.py .

ENV PYTHONUNBUFFERED=1

CMD ["python", "load-tpch.py"]
