| 1 | +# CD Store Example |
| 2 | + |
| 3 | +A complete working example demonstrating RegreSQL with the Chinook database (a sample music store database). It is based on queries from the book "The Art of PostgreSQL". |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +This example demonstrates: |
| 8 | +- **Snapshot build pipeline** with schema + SQL fixtures |
| 9 | +- **Complex SQL queries** with window functions, lateral joins, and aggregations |
| 10 | +- **Test plans** with multiple parameter bindings |
| 11 | +- **Real-world schema** (artists, albums, tracks, genres, playlists) |
| 12 | + |
| 13 | +## Database Schema |
| 14 | + |
| 15 | +The Chinook database models a digital media store: |
| 16 | + |
| 17 | +``` |
| 18 | +artist (artist_id, name) |
| 19 | + | |
| 20 | +album (album_id, title, artist_id, created_at) |
| 21 | + | |
| 22 | +track (track_id, name, album_id, genre_id, milliseconds, bytes, unit_price) |
| 23 | + | |
| 24 | +genre (genre_id, name) |
| 25 | +playlist_track (playlist_id, track_id) |
| 26 | +``` |
| 27 | + |
| 28 | +## Setup |
| 29 | + |
| 30 | +### 1. Create Database |
| 31 | + |
| 32 | +```bash |
| 33 | +createdb cdstore |
| 34 | +``` |
| 35 | + |
| 36 | +### 2. Build Snapshot |
| 37 | + |
| 38 | +The snapshot pipeline applies the schema and loads SQL fixtures in one step: |
| 39 | + |
| 40 | +```bash |
| 41 | +cd examples/cdstore |
| 42 | +regresql snapshot build |
| 43 | +``` |
| 44 | + |
| 45 | +This runs `db/schema.sql` then each file in `db/fixtures/` to create `snapshots/default.dump`. |
| 46 | + |
| 47 | +### 3. Restore and Test |
| 48 | + |
| 49 | +```bash |
| 50 | +regresql snapshot restore # restore snapshot into test DB |
| 51 | +regresql update # generate expected output files |
| 52 | +regresql test # run all tests |
| 53 | +``` |
| 54 | + |
| 55 | +## Queries |
| 56 | + |
| 57 | +### artist.sql - Top Artists by Album Count |
| 58 | + |
| 59 | +```sql |
| 60 | +-- name: top-artists-by-album |
| 61 | +select artist.name, count(*) as albums |
| 62 | + from artist left join album using(artist_id) |
| 63 | +group by artist.name |
| 64 | +order by albums desc |
| 65 | +limit :n; |
| 66 | +``` |
| 67 | + |
| 68 | +### album-by-artist.sql - Albums by Artist with Duration |
| 69 | + |
| 70 | +```sql |
| 71 | +-- name: list-albums-by-artist |
| 72 | +select album.title as album, |
| 73 | + created_at, |
| 74 | + sum(milliseconds) * interval '1 ms' as duration |
| 75 | + from album |
| 76 | + join artist using(artist_id) |
| 77 | + left join track using(album_id) |
| 78 | + where artist.name = :name |
| 79 | +group by album, created_at |
| 80 | +order by album; |
| 81 | +``` |
| 82 | + |
| 83 | +### album-tracks.sql - Track List with Cumulative Duration |
| 84 | + |
| 85 | +```sql |
| 86 | +-- name: list-tracks-by-albumid |
| 87 | +select name as title, |
| 88 | + milliseconds * interval '1ms' as duration, |
| 89 | + (sum(milliseconds) over (order by track_id) - milliseconds) |
| 90 | + * interval '1ms' as "begin", |
| 91 | + sum(milliseconds) over (order by track_id) |
| 92 | + * interval '1ms' as "end", |
| 93 | + round(milliseconds / sum(milliseconds) over () * 100, 2) as pct |
| 94 | + from track |
| 95 | + where album_id = :album_id |
| 96 | +order by track_id; |
| 97 | +``` |
| 98 | + |
| 99 | +**Features**: window functions (`sum() over`), interval arithmetic, percentage calculations |
| 100 | + |
| 101 | +### genre-tracks.sql - Track Count by Genre |
| 102 | + |
| 103 | +```sql |
| 104 | +-- name: tracks-by-genre |
| 105 | +select genre.name, count(*) as count |
| 106 | + from genre left join track using(genre_id) |
| 107 | +group by genre.name |
| 108 | +order by count desc; |
| 109 | +``` |
| 110 | + |
| 111 | +### genre-topn.sql - Top N Tracks per Genre |
| 112 | + |
| 113 | +Advanced query using LATERAL joins: |
| 114 | + |
| 115 | +```sql |
| 116 | +-- name: genre-top-n |
| 117 | +select genre.name as genre, |
| 118 | + case when length(ss.name) > 15 |
| 119 | + then substring(ss.name from 1 for 15) || '...' |
| 120 | + else ss.name |
| 121 | + end as track, |
| 122 | + artist.name as artist |
| 123 | + from genre |
| 124 | + left join lateral (...) ss on true |
| 125 | + join album using(album_id) |
| 126 | + join artist using(artist_id) |
| 127 | +order by genre.name, ss.count desc; |
| 128 | +``` |
| 129 | + |
| 130 | +**Features**: LATERAL joins for Top-N per group, playlist-based popularity weighting |
| 131 | + |
| 132 | +## Try It |
| 133 | + |
| 134 | +```bash |
| 135 | +regresql snapshot build # create snapshot from schema + fixtures |
| 136 | +regresql update # run every query, save expected output |
| 137 | +regresql test # re-run and compare — all green |
| 138 | +``` |
| 139 | + |
| 140 | +After `update` alone (no baselines captured yet), RegreSQL tests **query output** only: every row and column must match exactly. Try changing `order by albums desc` to `asc` in `artist.sql`, then re-run the tests. |
| 141 | + |
| 142 | +`regresql test` |
| 143 | + |
| 144 | +``` |
| 145 | +FAILING: |
| 146 | + artist_top-artists-by-album.1.json |
| 147 | + |
| 148 | + COMPARISON SUMMARY: |
| 149 | + ├─ Expected: 5 rows |
| 150 | + ├─ Actual: 5 rows |
| 151 | + ├─ Matching: 0 rows |
| 152 | + └─ Modified: 5 rows |
| 153 | + |
| 154 | + MODIFIED ROWS (showing 5 of 5): |
| 155 | + Row #1: |
| 156 | + Expected: {name: "AC/DC", albums: 3} |
| 157 | + Actual: {name: "Pearl Jam", albums: 1} |
| 158 | + Row #2: |
| 159 | + Expected: {name: "Metallica", albums: 3} |
| 160 | + Actual: {name: "Jamiroquai", albums: 1} |
| 161 | + Row #3: |
| 162 | + Expected: {name: "Led Zeppelin", albums: 3} |
| 163 | + Actual: {name: "Accept", albums: 1} |
| 164 | + Row #4: |
| 165 | + Expected: {name: "Pink Floyd", albums: 2} |
| 166 | + Actual: {name: "Nirvana", albums: 1} |
| 167 | + Row #5: |
| 168 | + Expected: {name: "Radiohead", albums: 2} |
| 169 | + Actual: {name: "Lenny Kravitz", albums: 1} |
| 170 | + |
| 171 | +To accept changes: regresql update <query-name> |
| 172 | +``` |
| 173 | + |
| 174 | +You can either fix the query (if the change is a regression) or, if the new output matches a new business requirement, update the expected result: |
| 175 | + |
| 176 | +``` |
| 177 | +regresql update artist.sql |
| 178 | +``` |
| 179 | + |
| 180 | +Follow-up test runs will then pass again. |
| 181 | + |
| 182 | +## Baselines |
| 183 | + |
| 184 | +In the previous example, `regresql test` catches only changes in query output. To also catch **query plan regressions** |
| 185 | +(a dropped index, a seq scan that used to be an index scan), capture baselines: |
| 186 | + |
| 187 | +```bash |
| 188 | +regresql baseline |
| 189 | +``` |
| 190 | + |
| 191 | +This runs `EXPLAIN` for every query and saves the plan signature — scan types, |
| 192 | +join methods, indexes used. Now try dropping an index: |
| 193 | + |
| 194 | +```sql |
| 195 | +DROP INDEX track_album_id_idx; |
| 196 | +``` |
| 197 | + |
| 198 | +```bash |
| 199 | +regresql test |
| 200 | +``` |
| 201 | + |
| 202 | +All tests still pass, but RegreSQL flags the sub-optimal plans as warnings: |
| 203 | + |
| 204 | +``` |
| 205 | +WARNINGS: |
| 206 | + genre-topn_genre-top-n.top-1.cost (20.82 <= 20.82 * 110%) |
| 207 | + ⚠️ Multiple sort operations detected (2 sorts) |
| 208 | + Suggestion: Consider composite indexes for ORDER BY clauses to avoid sorting |
| 209 | + genre-topn_genre-top-n.top-3.cost (21.79 <= 21.79 * 110%) |
| 210 | + ⚠️ Multiple sort operations detected (2 sorts) |
| 211 | + Suggestion: Consider composite indexes for ORDER BY clauses to avoid sorting |
| 212 | +``` |
| 213 | + |
| 214 | +## Running Tests |
| 215 | + |
| 216 | +```bash |
| 217 | +regresql test # run all tests |
| 218 | +regresql test --run artist # run specific query |
| 219 | +regresql update # regenerate expected results |
| 220 | +regresql baseline # recapture plan baselines |
| 221 | +``` |
| 222 | + |
| 223 | +## SQL Fixtures |
| 224 | + |
| 225 | +Test data lives in `db/fixtures/` as plain SQL files, loaded in order during `snapshot build`: |
| 226 | + |
| 227 | +- `01_base_data.sql` - Artists, genres, media types |
| 228 | +- `02_albums.sql` - Albums with tracks |
| 229 | +- `03_playlists.sql` - Playlists with track associations |
| 230 | + |
| 231 | +## Directory Structure |
| 232 | + |
| 233 | +``` |
| 234 | +cdstore/ |
| 235 | +├── README.md |
| 236 | +├── artist.sql # SQL query files |
| 237 | +├── album-by-artist.sql |
| 238 | +├── album-tracks.sql |
| 239 | +├── genre-tracks.sql |
| 240 | +├── genre-topn.sql |
| 241 | +├── db/ |
| 242 | +│ ├── schema.sql # Database schema |
| 243 | +│ └── fixtures/ # SQL fixture data |
| 244 | +│ ├── 01_base_data.sql |
| 245 | +│ ├── 02_albums.sql |
| 246 | +│ └── 03_playlists.sql |
| 247 | +├── snapshots/ |
| 248 | +│ └── default.dump # Built snapshot (auto-generated) |
| 249 | +└── regresql/ |
| 250 | + ├── regress.yaml # Configuration |
| 251 | + ├── plans/ # Test plans (parameter bindings) |
| 252 | + ├── expected/ # Expected results (auto-generated) |
| 253 | + └── baselines/ # Query baselines (auto-generated) |
| 254 | +``` |
| 255 | + |
| 256 | + |
| 257 | +## Credits |
| 258 | + |
| 259 | +- **Database**: [Chinook Database](https://github.com/lerocha/chinook-database) by Luis Rocha |
| 260 | +- **Queries**: Based on examples from [The Art of PostgreSQL](https://theartofpostgresql.com/) by Dimitri Fontaine |
| 261 | + |
| 262 | +## License |
| 263 | + |
| 264 | +Example code released under the BSD-2 License. |