66 changes: 52 additions & 14 deletions AGENTS.md
@@ -34,15 +34,15 @@ Every extension listed must be enabled on the PlanetScale database before starti
Every table must have a primary key or unique index. Bucardo cannot track rows without one.

```sql
SELECT c.relname
SELECT n.nspname || '.' || c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public' AND c.relkind = 'r'
WHERE n.nspname IN ('public') AND c.relkind = 'r'
AND NOT EXISTS (
SELECT 1 FROM pg_index i
WHERE i.indrelid = c.oid AND (i.indisprimary OR i.indisunique)
)
ORDER BY c.relname;
ORDER BY n.nspname, c.relname;
```

If any tables are returned, the user must add a primary key or unique index to each one on their Heroku database before starting. Example fix: `ALTER TABLE table_name ADD PRIMARY KEY (id);`
@@ -100,15 +100,53 @@ Match the PlanetScale database region accordingly (e.g., `us-east` for Heroku `u

PostgreSQL `GENERATED ALWAYS AS ... STORED` columns are handled automatically by the migrator -- it registers a `customcols` override against the PlanetScale side that omits each generated column from `COPY`, and PlanetScale recomputes the value on insert. No user action required. The dashboard's preflight section lists any affected tables for visibility.

### 8. Fresh PlanetScale target
### 8. pg_partman

The migrator supports pg_partman by filtering migration scope by schema while
still using Bucardo's `add all tables` flow. Defaults:

- `MIGRATION_SCHEMAS=public`
- `MIGRATION_EXCLUDE_SCHEMAS=heroku_ext,partman,pg_partman,bucardo,pg_catalog,information_schema`

Example deployment override:

```bash
heroku config:set \
MIGRATION_SCHEMAS=public \
MIGRATION_EXCLUDE_SCHEMAS=heroku_ext,partman,pg_partman \
-a <migration-app>
```

Before migration, tell the operator to pause pg_partman maintenance jobs on
Heroku and install pg_partman on PlanetScale. The dashboard detects
`partman.part_config` and `partman.dump_partitioned_table_definition(parent_table)`.
When detected, it shows managed parent tables and generated SQL from:

```sql
SELECT partman.dump_partitioned_table_definition(parent_table)
FROM partman.part_config
ORDER BY parent_table;
```

The operator must apply this SQL manually on PlanetScale after schema copy and
pg_partman extension installation. Do not auto-apply it. The dump function
supports single-level partition sets only.

Leaf partition tables in included application schemas are replicated. pg_partman
config/internal schemas are excluded. If a user stores pg_partman config tables
inside an included application schema, schema exclusion will not filter those
tables; the user should either move the config tables to the extension schema or
exclude that application schema from the migration.
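
As a rough illustration of how the two settings combine, the sketch below subtracts the exclude list from the requested list. The helper name `resolve_schemas` and the `app` schema are hypothetical; this is not the migrator's own code, only a minimal approximation of the behavior described above.

```shell
# Hypothetical helper: print the requested schemas minus the excluded ones,
# one per line, trimming whitespace and dropping duplicates and empty entries.
resolve_schemas() {
  requested="$1"
  excluded="$2"
  printf '%s\n' "$requested" | tr ',' '\n' |
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
    awk -v excluded="$excluded" '
      BEGIN {
        # Build a lookup set from the comma-separated exclude list.
        n = split(excluded, parts, ",")
        for (i = 1; i <= n; i++) {
          gsub(/^[ \t]+|[ \t]+$/, "", parts[i])
          skip[parts[i]] = 1
        }
      }
      length($0) && !($0 in skip) && !seen[$0]++
    '
}

# "app" is an assumed application schema used only for illustration.
resolve_schemas "public, app, partman" "heroku_ext,partman,pg_partman"
# prints:
# public
# app
```

If the result is empty, every requested schema was excluded and there is nothing left to migrate.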

### 9. Fresh PlanetScale target

Always use a clean PlanetScale database or branch for each migration attempt. Retrying against a target that has leftover tables/data from a failed run will cause errors.

## Common errors and fixes

### "Could not find TABLE inside public schema on database planetscale"

The schema copy (`pg_dump | psql`) failed silently for one or more tables. The table exists on Heroku but wasn't created on PlanetScale. Check the **Setup Log** in the dashboard (or `GET /logs` → `setup` field) for the actual `psql` error. Most common cause: a missing extension on PlanetScale that the table depends on.
The schema copy (`pg_dump | psql`) failed silently for one or more tables. The table exists on Heroku but wasn't created on PlanetScale. Check the **Setup Log** in the dashboard (or `GET /logs` → `setup` field) for the actual `psql` error. Most common cause: a missing extension on PlanetScale that the table depends on. For non-public schemas, check that `MIGRATION_SCHEMAS` includes the app schema and that it is not removed by `MIGRATION_EXCLUDE_SCHEMAS`.

### "Generated columns cannot be used in COPY"

@@ -119,7 +157,7 @@ DETAIL: Generated columns cannot be used in COPY.

Bucardo 5.6 does not filter out PostgreSQL `GENERATED ALWAYS AS ... STORED` columns when building the `COPY` it issues against the target. Any sync containing a table with a generated column will fail with this error during the initial copy and during ongoing replication.

**Current migrator (with auto-fix):** [scripts/mk-bucardo-repl.sh](scripts/mk-bucardo-repl.sh) detects generated columns and registers `bucardo add customcols ... db=planetscale` overrides automatically. The setup log will show `Excluding generated columns on public.<table> via customcols`. The dashboard's preflight section also lists affected tables as an informational note. No user action required.
**Current migrator (with auto-fix):** [scripts/mk-bucardo-repl.sh](scripts/mk-bucardo-repl.sh) detects generated columns in included migration schemas and registers `bucardo add customcols ... db=planetscale` overrides automatically. The setup log will show `Excluding generated columns on <schema>.<table> via customcols`. The dashboard's preflight section also lists affected tables as an informational note. No user action required.

**Older migrator (manual workaround):** If a user is on a version of the migrator without the auto-fix and they hit this error, they can apply the workaround inside the migration dyno (`heroku ps:exec -a <migration-app>`):

@@ -130,7 +168,7 @@ Bucardo 5.6 does not filter out PostgreSQL `GENERATED ALWAYS AS ... STORED` colu
FROM pg_attribute a
JOIN pg_class c ON c.oid = a.attrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public' AND c.relkind = 'r'
WHERE n.nspname IN ('public') AND c.relkind = 'r'
AND a.attnum > 0 AND NOT a.attisdropped
AND a.attgenerated <> ''
ORDER BY n.nspname, c.relname, a.attnum;
@@ -139,7 +177,7 @@ Bucardo 5.6 does not filter out PostgreSQL `GENERATED ALWAYS AS ... STORED` colu
2. For each affected table, register a customcols override that omits the generated column(s) (the PlanetScale target already has the generation expression and will recompute the value on insert):

```bash
bucardo add customcols public.<table> "SELECT id, col_a, col_b, ..." db=planetscale
bucardo add customcols <schema>.<table> "SELECT id, col_a, col_b, ..." db=planetscale
```

3. Abort the migration in the dashboard, recreate the PlanetScale target as a fresh database/branch, and start the migration again.
@@ -233,11 +271,11 @@ Run these against the Heroku source database to help diagnose issues:
SELECT extname, extversion FROM pg_extension WHERE extname != 'plpgsql' ORDER BY extname;

-- Tables without primary key or unique index
SELECT c.relname FROM pg_class c
SELECT n.nspname || '.' || c.relname FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public' AND c.relkind = 'r'
WHERE n.nspname IN ('public') AND c.relkind = 'r'
AND NOT EXISTS (SELECT 1 FROM pg_index i WHERE i.indrelid = c.oid AND (i.indisprimary OR i.indisunique))
ORDER BY c.relname;
ORDER BY n.nspname, c.relname;

-- Table row counts (estimated, fast)
SELECT relname, n_live_tup FROM pg_stat_user_tables ORDER BY n_live_tup DESC;
@@ -281,12 +319,12 @@ DO $$
DECLARE r RECORD;
BEGIN
FOR r IN
SELECT tgname, relname FROM pg_trigger t
SELECT tgname, n.nspname, c.relname FROM pg_trigger t
JOIN pg_class c ON c.oid = t.tgrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE tgname LIKE 'bucardo_%' AND n.nspname = 'public'
WHERE tgname LIKE 'bucardo_%' AND n.nspname <> 'bucardo'
LOOP
EXECUTE format('DROP TRIGGER %I ON %I', r.tgname, r.relname);
EXECUTE format('DROP TRIGGER %I ON %I.%I', r.tgname, r.nspname, r.relname);
END LOOP;
END $$;

38 changes: 36 additions & 2 deletions README.md
@@ -60,6 +60,24 @@ heroku pg:locks -a your-app-name

If you see any `VACUUM` queries with `(to prevent wraparound)` in the output, wait for them to finish before starting the migration.

### 4a. If you use pg_partman

The migrator supports pg_partman-managed tables by migrating only the configured
application schemas. By default it includes `public` and excludes internal
schemas such as `partman`, `pg_partman`, `heroku_ext`, `bucardo`, `pg_catalog`,
and `information_schema`.

Before starting:

1. Pause pg_partman maintenance jobs on Heroku for the migration window.
2. Install the pg_partman extension on PlanetScale.
3. Confirm pg_partman config tables live in `partman` or `pg_partman`, not in an included application schema.
4. Review the dashboard pg_partman preflight callout. It shows parent tables and generated SQL from `partman.dump_partitioned_table_definition(parent_table)`.

After schema copy finishes, apply the generated SQL manually on PlanetScale to
recreate pg_partman partition sets. The migrator does not auto-apply this SQL.
`dump_partitioned_table_definition` supports single-level partition sets only.
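
One way an operator might capture that SQL for review is sketched below. The helper name `partman_dump_query` and the output filename are assumptions made for illustration; the query is the one the dashboard is described as using, and the `psql` invocations are shown as comments because they need live databases.

```shell
# Hypothetical helper: print the query that produces the recreation SQL.
partman_dump_query() {
  cat <<'SQL'
SELECT partman.dump_partitioned_table_definition(parent_table)
FROM partman.part_config
ORDER BY parent_table;
SQL
}

# Example usage (not executed here; HEROKU_URL, PLANETSCALE_URL, and the
# filename partman_recreate.sql are assumptions):
#   psql "$HEROKU_URL" -A -t -c "$(partman_dump_query)" > partman_recreate.sql
# After reviewing the file, apply it by hand on PlanetScale:
#   psql "$PLANETSCALE_URL" --set ON_ERROR_STOP=1 -f partman_recreate.sql
```

Writing the output to a file first keeps the manual review step explicit, in line with the rule that the migrator never applies this SQL itself.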

### 5. Size your PlanetScale database

**Cluster size:** Choose a PlanetScale cluster with similar CPU and RAM to your Heroku Postgres plan. You don't need to get this exactly right. [Resizing in PlanetScale is an online operation](https://planetscale.com/docs/postgres/cluster-configuration) with no downtime, and you are only billed for the time you use.
@@ -165,6 +183,15 @@ Click the button at the top of this page, or deploy manually:
PLANETSCALE_URL="postgresql://..." \
PASSWORD="choose-a-password"
```

Optional schema scope for pg_partman or multi-schema applications:

```bash
heroku config:set \
MIGRATION_SCHEMAS=public \
MIGRATION_EXCLUDE_SCHEMAS=heroku_ext,partman,pg_partman \
-a <migration-app>
```
4. Deploy:
```bash
git push heroku main
@@ -220,6 +247,11 @@ Once you open the dashboard and click **Start Migration**, the process follows t

The migrator copies your database structure (tables, indexes, constraints) from Heroku to PlanetScale and configures Bucardo replication. This is fully automatic and typically takes a minute or two.

Bucardo still uses `add all tables` and `add all sequences`, but the migrator
calls those commands once per included schema using Bucardo's schema filter.
This keeps pg_partman config/internal schemas out of the relgroup while still
replicating leaf partition tables that live in included application schemas.
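
To make the per-schema flow concrete, the dry-run sketch below only prints the commands that would be issued for each included schema. The command shape is taken from `scripts/mk-bucardo-repl.sh` in this change; the helper name `print_bucardo_add_commands` and the `app` schema are hypothetical.

```shell
# Hypothetical dry-run helper: print (do not run) the Bucardo commands for
# each included schema passed as an argument.
print_bucardo_add_commands() {
  for schema in "$@"; do
    printf 'bucardo add all sequences db=heroku -n %s relgroup=planetscale_import\n' "$schema"
    printf 'bucardo add all tables db=heroku -n %s relgroup=planetscale_import\n' "$schema"
  done
}

# "app" is an assumed second application schema.
print_bucardo_add_commands public app
```

Because schemas such as `partman` never appear in the argument list, their tables are never added to the relgroup.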

### Step 2: Data sync

All existing rows are copied from Heroku to PlanetScale (the "initial copy"). Once that finishes, Bucardo enters real-time replication mode. Every new write to your Heroku database is automatically replicated to PlanetScale.
@@ -235,10 +267,10 @@ In either case, Bucardo's triggers stay on your Heroku database while paused, so

### Step 3: Switch traffic

When the dashboard shows your databases are in sync, you're ready to cut over. Click **Switch Traffic** to block writes on your Heroku database. This runs a SQL `REVOKE` command that removes `INSERT`, `UPDATE`, and `DELETE` privileges from your Heroku database user:
When the dashboard shows your databases are in sync, you're ready to cut over. Click **Switch Traffic** to block writes on your Heroku database. This runs a SQL `REVOKE` command that removes `INSERT`, `UPDATE`, and `DELETE` privileges from your Heroku database user across the configured migration schemas:

```sql
REVOKE INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public FROM your_heroku_user;
REVOKE INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public, other_schema FROM your_heroku_user;
```
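
For illustration only, a statement of this shape could be assembled from a list of schemas as sketched below. The helper name `build_revoke_sql` is hypothetical and this is not necessarily how the migrator builds it; the sketch wraps identifiers in double quotes but does not escape double quotes embedded in a name.

```shell
# Hypothetical helper: build one REVOKE statement covering several schemas.
# Usage: build_revoke_sql <role> <schema> [<schema> ...]
build_revoke_sql() {
  role="$1"
  shift
  schemas=""
  for schema in "$@"; do
    # Join the quoted schema names with ", ".
    schemas="${schemas}${schemas:+, }\"${schema}\""
  done
  printf 'REVOKE INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA %s FROM "%s";\n' "$schemas" "$role"
}

build_revoke_sql your_heroku_user public other_schema
# prints:
# REVOKE INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA "public", "other_schema" FROM "your_heroku_user";
```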

For production apps, consider enabling Heroku maintenance mode before clicking
@@ -333,6 +365,8 @@ Plan migration windows around the **initial copy** and your post-copy validation
| `HEROKU_URL` | Yes | Heroku Postgres connection URL |
| `PLANETSCALE_URL` | Yes | PlanetScale Postgres connection URL |
| `PASSWORD` | Yes | Password to access the migration dashboard |
| `MIGRATION_SCHEMAS` | No | Comma-separated schemas to migrate. Defaults to `public`. |
| `MIGRATION_EXCLUDE_SCHEMAS` | No | Comma-separated schemas removed from `MIGRATION_SCHEMAS`. Defaults to `heroku_ext,partman,pg_partman,bucardo,pg_catalog,information_schema`. |
| `DISABLE_NOTIFICATIONS` | No | Set to `true` to disable migration progress notifications to PlanetScale (enabled by default) |

## What is Bucardo?
5 changes: 4 additions & 1 deletion entrypoint.sh
@@ -187,8 +187,11 @@ RCEOF
--db-port "$PGPORT" \
--verbose 2>/dev/null || true

echo "Configuring Bucardo verbose logging..."
echo "Configuring Bucardo runtime settings..."
bucardo set log_level=verbose
bucardo set tcp_keepalives_idle=60
bucardo set tcp_keepalives_interval=10
bucardo set tcp_keepalives_count=6

echo "Starting Bucardo daemon..."
bucardo start || bucardo restart
72 changes: 66 additions & 6 deletions scripts/mk-bucardo-repl.sh
@@ -33,10 +33,64 @@ if [ -z "$PRIMARY" -o -z "$REPLICA" ]
then usage 1
fi

MIGRATION_SCHEMAS="${MIGRATION_SCHEMAS:-public}"
MIGRATION_EXCLUDE_SCHEMAS="${MIGRATION_EXCLUDE_SCHEMAS:-heroku_ext,partman,pg_partman,bucardo,pg_catalog,information_schema}"

schema_list() {
printf "%s" "$1" |
tr "," "\n" |
sed -e "s/^[[:space:]]*//" -e "s/[[:space:]]*$//" |
awk 'length($0) && !seen[$0]++'
}

schema_list_contains() {
needle="$1"
list="$2"
printf "%s\n" "$list" | awk -v needle="$needle" '$0 == needle { found = 1 } END { exit found ? 0 : 1 }'
}

sql_quote() {
printf "'%s'" "$(printf "%s" "$1" | sed "s/'/''/g")"
}

REQUESTED_SCHEMAS="$(schema_list "$MIGRATION_SCHEMAS")"
EXCLUDED_SCHEMAS="$(schema_list "$MIGRATION_EXCLUDE_SCHEMAS")"
INCLUDED_SCHEMAS=""
for schema in $REQUESTED_SCHEMAS
do
if ! schema_list_contains "$schema" "$EXCLUDED_SCHEMAS"
then
INCLUDED_SCHEMAS="${INCLUDED_SCHEMAS}${INCLUDED_SCHEMAS:+
}${schema}"
fi
done

if [ -z "$INCLUDED_SCHEMAS" ]
then
echo "No migration schemas remain after applying MIGRATION_EXCLUDE_SCHEMAS" >&2
exit 1
fi

SCHEMA_SQL_LIST=""
for schema in $INCLUDED_SCHEMAS
do
quoted="$(sql_quote "$schema")"
SCHEMA_SQL_LIST="${SCHEMA_SQL_LIST}${SCHEMA_SQL_LIST:+, }${quoted}"
done

echo "Migrating schemas: $(printf "%s" "$INCLUDED_SCHEMAS" | paste -sd "," -)"
echo "Excluded schemas: $(printf "%s" "$EXCLUDED_SCHEMAS" | paste -sd "," -)"

# Copy the schema from the primary to the (soon to be) replica.
if [ "$SKIP_SCHEMA" -eq 0 ]; then
echo "Copying schema from primary to replica..."
pg_dump --no-owner --no-privileges --no-publications --no-subscriptions --schema-only "$PRIMARY" |
PG_DUMP_SCHEMA_ARGS=""
for schema in $INCLUDED_SCHEMAS
do
PG_DUMP_SCHEMA_ARGS="${PG_DUMP_SCHEMA_ARGS} --schema=${schema}"
done
pg_dump --no-owner --no-privileges --no-publications --no-subscriptions --schema-only $PG_DUMP_SCHEMA_ARGS "$PRIMARY" |
sed -E "s/^CREATE SCHEMA public;$/CREATE SCHEMA IF NOT EXISTS public;/" |
grep -v -E "^COMMENT ON EXTENSION " |
psql "$REPLICA" -a --set ON_ERROR_STOP=1
else
@@ -60,9 +114,12 @@ bucardo add database "planetscale" \
password="$(echo "$REPLICA" | cut -d ":" -f 3 | cut -d "@" -f 1)" \
dbname="$(echo "$REPLICA" | cut -d "/" -f 4 | cut -d "?" -f 1)"

# Add all the sequences and tables to Bucardo.
bucardo add all sequences --relgroup "planetscale_import"
bucardo add all tables --relgroup "planetscale_import"
# Add all the sequences and tables in each included schema to Bucardo.
for schema in $INCLUDED_SCHEMAS
do
bucardo add all sequences db=heroku -n "$schema" relgroup=planetscale_import
bucardo add all tables db=heroku -n "$schema" relgroup=planetscale_import
done

# Bucardo 5.6 does not filter out PostgreSQL generated columns when issuing
# COPY against the target, which fails with "column ... is a generated column /
@@ -76,7 +133,7 @@ GENERATED_TABLES=$(psql "$PRIMARY" -A -t -F"|" -c "
FROM pg_attribute a
JOIN pg_class c ON c.oid = a.attrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public' AND c.relkind = 'r'
WHERE n.nspname IN (${SCHEMA_SQL_LIST}) AND c.relkind = 'r'
AND a.attnum > 0 AND NOT a.attisdropped
AND a.attgenerated <> ''
ORDER BY n.nspname, c.relname;")
@@ -88,7 +145,7 @@ if [ -n "$GENERATED_TABLES" ]; then
cols=$(psql "$PRIMARY" -A -t -c "
SELECT string_agg(quote_ident(attname), ', ' ORDER BY attnum)
FROM pg_attribute
WHERE attrelid = '${schema}.${table}'::regclass
WHERE attrelid = format('%I.%I', '${schema}', '${table}')::regclass
AND attnum > 0 AND NOT attisdropped
AND attgenerated = '';")
if [ -z "$cols" ]; then
@@ -113,6 +170,9 @@ fi
# The default 30s timeout is too short for databases with many tables, since
# each table is inspected on both source and target over remote connections.
bucardo set reload_config_timeout=180 log_level=verbose
bucardo set tcp_keepalives_idle=60
bucardo set tcp_keepalives_interval=10
bucardo set tcp_keepalives_count=6

# Reload Bucardo, which starts the sync we just added.
bucardo reload
26 changes: 26 additions & 0 deletions status-server/dashboard.html
@@ -749,6 +749,7 @@ <h3 id="modalTitle">Confirm</h3>
let preflightCheckRan = false;
let preflightBadTables = [];
let preflightGeneratedTables = [];
let preflightPgPartman = null;

// ---- Phase helpers ----
function phaseBadgeClass(phase) {
@@ -894,6 +895,20 @@ <h3 id="modalTitle">Confirm</h3>
+ '</div>';
}

if (preflightPgPartman && preflightPgPartman.detected) {
const parents = preflightPgPartman.parent_tables || [];
const parentItems = parents.map(t => '<li>' + escapeHtml(t) + '</li>').join('');
const recreationSql = preflightPgPartman.recreation_sql || '';
html += '<div class="guide-callout guide-callout-info">'
+ '<strong>pg_partman detected</strong>'
+ '<p style="margin:8px 0 4px;">Pause pg_partman maintenance jobs before migration. The migrator excludes pg_partman internal schemas by default and will replicate leaf partition tables in included app schemas.</p>'
+ (parents.length > 0 ? '<ul class="preflight-table-list">' + parentItems + '</ul>' : '')
+ '<p style="margin:8px 0 4px;">After schema copy finishes and pg_partman is installed on PlanetScale, apply this generated SQL on PlanetScale to recreate the partition sets. Do not apply it automatically from the migrator.</p>'
+ '<div class="guide-callout" style="margin:8px 0;">' + escapeHtml(preflightPgPartman.warning || 'pg_partman dump_partitioned_table_definition supports single-level partition sets only.') + '</div>'
+ (recreationSql ? '<pre class="guide-code" id="pgPartmanSqlBlock">' + escapeHtml(recreationSql) + '</pre><button class="btn btn-secondary btn-sm" onclick="copyPgPartmanSql()">Copy pg_partman SQL</button>' : '<p>No pg_partman recreation SQL was returned.</p>')
+ '</div>';
}

if (preflightBadTables.length > 0) {
html += '<button class="btn btn-secondary btn-sm" style="margin-top:8px;" onclick="runPreflightChecks()">Re-check Tables</button>';
if (passedSection) passedSection.style.display = 'none';
@@ -914,6 +929,7 @@ <h3 id="modalTitle">Confirm</h3>
preflightCheckRan = true;
preflightBadTables = data.tables_without_pk_or_unique || [];
preflightGeneratedTables = data.tables_with_generated_columns || [];
preflightPgPartman = data.pg_partman || null;
renderCachedPreflightResult();
if (data.all_tables_valid) {
preflightChecks.tablesReady = true;
@@ -1086,6 +1102,16 @@ <h3 id="modalTitle">Confirm</h3>
});
}

function copyPgPartmanSql() {
const block = document.getElementById('pgPartmanSqlBlock');
if (!block) return;
navigator.clipboard.writeText(block.textContent || '').then(() => {
showActionResult('success', 'pg_partman SQL copied', 'The generated pg_partman SQL was copied to your clipboard.');
}).catch(() => {
showActionResult('error', 'Copy failed', 'Could not copy the generated pg_partman SQL.');
});
}

function updateBucardoUI(b) {
if (!b || b.error) {
document.getElementById('bucardoLoading').style.display = '';