Commit 01e459b

docs: first version of cdstore example
1 parent 4f79238 commit 01e459b

16 files changed

Lines changed: 680 additions & 0 deletions

examples/cdstore/README.md

Lines changed: 264 additions & 0 deletions
@@ -0,0 +1,264 @@
# CD Store Example

Complete working example demonstrating RegreSQL with the Chinook database (a sample music store database). This example is based on the queries from "The Art of PostgreSQL" book.

## Overview

This example demonstrates:

- **Snapshot build pipeline** with schema + SQL fixtures
- **Complex SQL queries** with window functions, lateral joins, and aggregations
- **Test plans** with multiple parameter bindings
- **Real-world schema** (artists, albums, tracks, genres, playlists)

## Database Schema

The Chinook database models a digital media store:

```
artist (artist_id, name)
|
album (album_id, title, artist_id, created_at)
|
track (track_id, name, album_id, genre_id, milliseconds, bytes, unit_price)
|
genre (genre_id, name)
playlist_track (playlist_id, track_id)
```
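
As a rough sketch, the diagram above could correspond to DDL like the following. The column names come from the diagram; every type and constraint, and the use of temporary tables, are assumptions made for illustration. The real definitions live in `db/schema.sql`.

```sql
-- Sketch only: column names follow the diagram; all types and constraints
-- are assumed. Temporary tables disappear when the session ends.
create temp table artist (
  artist_id integer primary key,
  name      text not null
);

create temp table album (
  album_id   integer primary key,
  title      text not null,
  artist_id  integer not null references artist (artist_id),
  created_at date
);

create temp table genre (
  genre_id integer primary key,
  name     text not null
);

create temp table track (
  track_id     integer primary key,
  name         text not null,
  album_id     integer references album (album_id),
  genre_id     integer references genre (genre_id),
  milliseconds integer not null,
  bytes        integer,
  unit_price   numeric(10, 2)
);

-- The diagram shows no playlist table, so playlist_id has no foreign key here.
create temp table playlist_track (
  playlist_id integer not null,
  track_id    integer not null references track (track_id),
  primary key (playlist_id, track_id)
);
```

Run it in a scratch database: temporary tables shadow same-named permanent tables for the rest of the session.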

## Setup

### 1. Create Database

```bash
createdb cdstore
```

### 2. Build Snapshot

The snapshot pipeline applies the schema and loads SQL fixtures in one step:

```bash
cd examples/cdstore
regresql snapshot build
```

This runs `db/schema.sql` then each file in `db/fixtures/` to create `snapshots/default.dump`.

### 3. Restore and Test

```bash
regresql snapshot restore   # restore snapshot into test DB
regresql update             # generate expected output files
regresql test               # run all tests
```

## Queries

### artist.sql - Top Artists by Album Count

```sql
-- name: top-artists-by-album
  select artist.name, count(*) as albums
    from artist left join album using(artist_id)
group by artist.name
order by albums desc
   limit :n;
```
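
One detail worth knowing when reading this query: with a `left join`, `count(*)` also counts the null-extended row of an artist that has no albums. The sketch below uses hypothetical tables and rows (not the fixture data) to compare it with counting a column from the joined table. Whether this matters for the example depends on whether the fixtures contain artists without albums.

```sql
-- Hypothetical rows, not the fixture data: 'Accept' has no albums here.
create temp table demo_artist (artist_id int, name text);
create temp table demo_album  (album_id int, title text, artist_id int);

insert into demo_artist values (1, 'AC/DC'), (2, 'Accept');
insert into demo_album  values (10, 'First', 1), (11, 'Second', 1);

  select demo_artist.name,
         count(*)                   as row_count,   -- counts the null-extended row too
         count(demo_album.album_id) as album_count  -- counts matched albums only
    from demo_artist
         left join demo_album using (artist_id)
group by demo_artist.name
order by demo_artist.name;
-- AC/DC:  row_count 2, album_count 2
-- Accept: row_count 1, album_count 0
```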

### album-by-artist.sql - Albums by Artist with Duration

```sql
-- name: list-albums-by-artist
  select album.title as album,
         created_at,
         sum(milliseconds) * interval '1 ms' as duration
    from album
         join artist using(artist_id)
         left join track using(album_id)
   where artist.name = :name
group by album, created_at
order by album;
```
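
The `duration` column relies on multiplying a number by a one-millisecond interval. A minimal sketch of that arithmetic, using an arbitrary value rather than fixture data, also shows what an album without tracks would yield: the `left join` keeps the album, `sum` over its single null-extended row is null, and so is the duration.

```sql
-- 215000 ms is an arbitrary example value (3 minutes 35 seconds).
select 215000 * interval '1 ms'       as duration,   -- 00:03:35 with the default IntervalStyle
       null::bigint * interval '1 ms' as no_tracks;  -- null
```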

### album-tracks.sql - Track List with Cumulative Duration

```sql
-- name: list-tracks-by-albumid
  select name as title,
         milliseconds * interval '1ms' as duration,
         (sum(milliseconds) over (order by track_id) - milliseconds)
           * interval '1ms' as "begin",
         sum(milliseconds) over (order by track_id)
           * interval '1ms' as "end",
         round(milliseconds / sum(milliseconds) over () * 100, 2) as pct
    from track
   where album_id = :album_id
order by track_id;
```

**Features**: window functions (`sum() over`), interval arithmetic, percentage calculations
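
To make the window arithmetic concrete, here is a small sketch on a hypothetical three-track table (not the fixture data). It also illustrates a type detail: when `milliseconds` is `integer`, `sum()` returns `bigint` and the division truncates to zero, whereas a `bigint` column makes `sum()` return `numeric` and keeps the fraction. Which case applies to the example depends on `db/schema.sql`; casting to `numeric` works either way.

```sql
-- Hypothetical three-track album; durations in milliseconds.
create temp table demo_track (track_id int, name text, milliseconds int);
insert into demo_track values (1, 'Intro', 60000), (2, 'Song', 180000), (3, 'Outro', 60000);

  select name,
         -- running total up to, but excluding, the current track
         sum(milliseconds) over (order by track_id) - milliseconds as begin_ms,
         -- running total including the current track
         sum(milliseconds) over (order by track_id)                as end_ms,
         -- integer / bigint division truncates before the multiplication
         milliseconds / sum(milliseconds) over () * 100            as pct_truncated,
         -- casting to numeric first keeps the fraction
         round(milliseconds::numeric / sum(milliseconds) over () * 100, 2) as pct
    from demo_track
order by track_id;
-- Intro: begin_ms 0,      end_ms 60000,  pct_truncated 0, pct 20.00
-- Song:  begin_ms 60000,  end_ms 240000, pct_truncated 0, pct 60.00
-- Outro: begin_ms 240000, end_ms 300000, pct_truncated 0, pct 20.00
```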

### genre-tracks.sql - Track Count by Genre

```sql
-- name: tracks-by-genre
  select genre.name, count(*) as count
    from genre left join track using(genre_id)
group by genre.name
order by count desc;
```

### genre-topn.sql - Top N Tracks per Genre

Advanced query using LATERAL joins:

```sql
-- name: genre-top-n
  select genre.name as genre,
         case when length(ss.name) > 15
              then substring(ss.name from 1 for 15) || '...'
              else ss.name
          end as track,
         artist.name as artist
    from genre
         left join lateral (...) ss on true
         join album using(album_id)
         join artist using(artist_id)
order by genre.name, ss.count desc;
```

**Features**: LATERAL joins for Top-N per group, playlist-based popularity weighting
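
The general shape of a Top-N-per-group query with `LATERAL` can be sketched as follows. This is not the elided subquery from `genre-topn.sql`; the tables, the `plays` column, and the rows are hypothetical and only illustrate that the subquery runs once per outer row and may reference it.

```sql
-- Hypothetical data: three Rock tracks, one Jazz track.
create temp table demo_genre (genre_id int, name text);
create temp table demo_genre_track (track_id int, name text, genre_id int, plays int);

insert into demo_genre values (1, 'Rock'), (2, 'Jazz');
insert into demo_genre_track values
  (1, 'A', 1, 5), (2, 'B', 1, 9), (3, 'C', 1, 7), (4, 'D', 2, 3);

  select demo_genre.name as genre, ss.name as track, ss.plays
    from demo_genre
         left join lateral (
           -- evaluated once per genre row; keeps the two most-played tracks
           select t.name, t.plays
             from demo_genre_track t
            where t.genre_id = demo_genre.genre_id
            order by t.plays desc
            limit 2
         ) ss on true
order by demo_genre.name, ss.plays desc;
-- Jazz: D (3)
-- Rock: B (9), C (7)
```

`left join ... on true` keeps a genre even when the subquery returns no rows for it.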

## Try It

```bash
regresql snapshot build   # create snapshot from schema + fixtures
regresql update           # run every query, save expected output
regresql test             # re-run and compare; all green
```

With expected output from `update` alone, RegreSQL tests **query output** only: every row and column must match exactly. Try changing `order by albums desc` to `asc` in `artist.sql`, then re-run the tests:

`regresql test`

```
FAILING:
artist_top-artists-by-album.1.json

COMPARISON SUMMARY:
├─ Expected: 5 rows
├─ Actual: 5 rows
├─ Matching: 0 rows
└─ Modified: 5 rows

MODIFIED ROWS (showing 5 of 5):
Row #1:
Expected: {name: "AC/DC", albums: 3}
Actual: {name: "Pearl Jam", albums: 1}
Row #2:
Expected: {name: "Metallica", albums: 3}
Actual: {name: "Jamiroquai", albums: 1}
Row #3:
Expected: {name: "Led Zeppelin", albums: 3}
Actual: {name: "Accept", albums: 1}
Row #4:
Expected: {name: "Pink Floyd", albums: 2}
Actual: {name: "Nirvana", albums: 1}
Row #5:
Expected: {name: "Radiohead", albums: 2}
Actual: {name: "Lenny Kravitz", albums: 1}

To accept changes: regresql update <query-name>
```

You can either fix the query (if the change is a regression) or, if the new output matches a changed business requirement, update the expected result:

```
regresql update artist.sql
```

Subsequent test runs will pass again.

## Baselines

In the previous example, `regresql test` catches only changes in query output. To also catch **query plan regressions** (a dropped index, a seq scan that used to be an index scan), capture baselines:

```bash
regresql baseline
```

This runs `EXPLAIN` for every query and saves the plan signature: scan types, join methods, and indexes used. Now try dropping an index:

```sql
DROP INDEX track_album_id_idx;
```

```bash
regresql test
```

Even though every test passes, RegreSQL still flags sub-optimal plans:

```
WARNINGS:
genre-topn_genre-top-n.top-1.cost (20.82 <= 20.82 * 110%)
⚠️ Multiple sort operations detected (2 sorts)
Suggestion: Consider composite indexes for ORDER BY clauses to avoid sorting
genre-topn_genre-top-n.top-3.cost (21.79 <= 21.79 * 110%)
⚠️ Multiple sort operations detected (2 sorts)
Suggestion: Consider composite indexes for ORDER BY clauses to avoid sorting
```
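
When a warning like this appears, it can help to compare plans by hand before recapturing baselines. The sketch below does that on a throwaway temporary table; the table and index names are hypothetical, and the plans PostgreSQL picks (especially on an empty table) will vary, so no output is shown.

```sql
-- Hypothetical throwaway table, unrelated to the example's schema.
create temp table demo_plan_track (track_id int primary key, album_id int, name text);
create index demo_plan_track_album_id_idx on demo_plan_track (album_id);

-- Plan while the index exists.
explain (format json)
select name from demo_plan_track where album_id = 1;

drop index demo_plan_track_album_id_idx;

-- Plan after the index is gone; compare node types and costs with the first plan.
explain (format json)
select name from demo_plan_track where album_id = 1;
```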

## Running Tests

```bash
regresql test                # run all tests
regresql test --run artist   # run specific query
regresql update              # regenerate expected results
regresql baseline            # recapture plan baselines
```

## SQL Fixtures

Test data lives in `db/fixtures/` as plain SQL files, loaded in order during `snapshot build`:

- `01_base_data.sql` - Artists, genres, media types
- `02_albums.sql` - Albums with tracks
- `03_playlists.sql` - Playlists with track associations

## Directory Structure

```
cdstore/
├── README.md
├── artist.sql                  # SQL query files
├── album-by-artist.sql
├── album-tracks.sql
├── genre-tracks.sql
├── genre-topn.sql
├── db/
│   ├── schema.sql              # Database schema
│   └── fixtures/               # SQL fixture data
│       ├── 01_base_data.sql
│       ├── 02_albums.sql
│       └── 03_playlists.sql
├── snapshots/
│   └── default.dump            # Built snapshot (auto-generated)
└── regresql/
    ├── regress.yaml            # Configuration
    ├── plans/                  # Test plans (parameter bindings)
    ├── expected/               # Expected results (auto-generated)
    └── baselines/              # Query baselines (auto-generated)
```

## Credits

- **Database**: [Chinook Database](https://github.com/lerocha/chinook-database) by Luis Rocha
- **Queries**: Based on examples from [The Art of PostgreSQL](https://theartofpostgresql.com/) by Dimitri Fontaine

## License

Example code released under the BSD-2 License.
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
-- name: list-albums-by-artist
-- List the album titles and duration of a given artist
  select album.title as album,
         created_at,
         sum(milliseconds) * interval '1 ms' as duration
    from album
         join artist using(artist_id)
         left join track using(album_id)
   where artist.name = :name
group by album, created_at
order by album;

examples/cdstore/album-tracks.sql

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
-- name: list-tracks-by-albumid
-- List the tracks of an album, includes duration and position
  select name as title,
         milliseconds * interval '1ms' as duration,
         (sum(milliseconds) over (order by track_id) - milliseconds)
           * interval '1ms' as "begin",
         sum(milliseconds) over (order by track_id)
           * interval '1ms' as "end",
         round(milliseconds / sum(milliseconds) over () * 100, 2) as pct
    from track
   where album_id = :album_id
order by track_id;

examples/cdstore/artist.sql

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
-- name: top-artists-by-album
-- Get the list of the N artists with the most albums
  select artist.name, count(*) as albums
    from artist
         left join album using(artist_id)
group by artist.name
order by albums desc
   limit :n;
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
INSERT INTO artist (artist_id, name) VALUES
    (1, 'AC/DC'),
    (2, 'Accept'),
    (8, 'Audioslave'),
    (15, 'Buddy Guy'),
    (22, 'Led Zeppelin'),
    (50, 'Metallica'),
    (58, 'Deep Purple'),
    (76, 'Creedence Clearwater Revival'),
    (82, 'Faith No More'),
    (84, 'Foo Fighters'),
    (90, 'Iron Maiden'),
    (92, 'Jamiroquai'),
    (100, 'Lenny Kravitz'),
    (110, 'Nirvana'),
    (118, 'Pearl Jam'),
    (120, 'Pink Floyd'),
    (127, 'Red Hot Chili Peppers'),
    (140, 'The Black Keys'),
    (152, 'Radiohead'),
    (200, 'Queens of the Stone Age');

INSERT INTO genre (genre_id, name) VALUES
    (1, 'Rock'),
    (2, 'Jazz'),
    (3, 'Metal'),
    (4, 'Alternative & Punk'),
    (5, 'Blues'),
    (6, 'Classical'),
    (7, 'Latin'),
    (8, 'Reggae'),
    (9, 'Pop'),
    (10, 'Soundtrack');

INSERT INTO media_type (media_type_id, name) VALUES
    (1, 'MPEG audio file'),
    (2, 'Protected AAC audio file'),
    (3, 'Protected MPEG-4 video file'),
    (4, 'Purchased AAC audio file'),
    (5, 'AAC audio file');
