fix: Skip COMMENT statements in tenant migration rewriter#2128
fix: Skip COMMENT statements in tenant migration rewriter#2128
Conversation
processMigrationSQL did naive string replacement of schema patterns
("party".X -> "org_tenant".X) across all migration content. This
corrupted COMMENT ON COLUMN "party"."attributes" IS ..., where the
second identifier is a column name rather than a schema qualifier.
The rewrite produced COMMENT ON COLUMN "org_tenant"."attributes"
which Postgres parses as schema.table (missing the column) and fails
with "relation does not exist".
PR #2120 documented the bug with a knownBroken skip-list and a no-op
corrective migration but did not fix the rewriter. Every fresh tenant
now hard-fails at the party.attributes migration, cascading into the
"relation does not exist" errors for data_source / dataset_definition /
instrument_definition seen in E2E and develop deploy - downstream
service schemas are never populated because provisioning aborts at
the first service.
Fix processMigrationSQL to split SQL into statements, detect COMMENT
statements via isCommentStatement, and skip schema-pattern rewriting
for them while still rewriting DDL statements normally.
Also:
- Remove the knownBroken entries in TestProcessMigrationSQL_AllMigrations_Parse
so the three affected migrations are exercised end-to-end.
- Replace TestProcessMigrationSQL_CommentOnColumn_BugDocumentation with
TestProcessMigrationSQL_CommentOnColumn_NotRewritten, asserting the
fix preserves table.column references.
- Add TestProcessMigrationSQL_CommentOnColumn_MixedWithDDL to pin the
mixed-statement behavior (DDL rewritten, COMMENT preserved).
- Bump baselineOversizedFunctions 187 -> 189 to unblock Test shard 0.
- deploy-develop.yml: broaden the tenant_provisioning reset to clear
service_schemas and include all non-deprovisioned rows, so stale
tenants whose service_schemas entries are marked migrated but whose
physical tables are missing get re-provisioned on redeploy.
📝 WalkthroughWalkthroughSplit migration SQL into individual statements, skip rewriting Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~30 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Claude Code ReviewCommit: SummaryClean, well-scoped bug fix for a migration rewriter that corrupted The E2E workflow change from manual schema creation to using the actual provisioner ( Risk Assessment
Findings
Test Coverage
Bot Review NotesNo unresolved bot threads. Previous inline comments resolved. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@services/tenant/provisioner/migration_runner.go`:
- Around line 413-421: The isCommentStatement function fails to detect COMMENT
statements preceded by block comments; update isCommentStatement to skip leading
block comments (/* ... */) as well as line comments and blank lines before
checking the first non-comment token, e.g., in isCommentStatement iterate
through stmt lines (or scan tokens) to strip/skip both lines starting with "--"
and block comment spans that may start and end on the same line or across lines,
then perform strings.HasPrefix(strings.ToUpper(trimmed), "COMMENT ") on the
first non-comment chunk so statements like "/* preface */ COMMENT ON ..." are
correctly recognized.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6041708f-8214-43ee-af56-237156ed25df
📒 Files selected for processing (4)
.github/workflows/deploy-develop.ymlservices/tenant/provisioner/migration_runner.goservices/tenant/provisioner/migration_runner_test.gotests/architecture/size_test.go
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
The E2E workflow created empty org_<tenant> schemas manually via CREATE SCHEMA IF NOT EXISTS and relied on the public-schema search_path fallback to resolve unqualified table references. PR #2125 removed public from search_path, so those empty tenant schemas can no longer satisfy any query and the manifest diff step fails with "relation data_source / dataset_definition / instrument_definition does not exist". Replace the manual schema workaround with real provisioner invocation: - Stage per-service migration files under /tmp/provisioner-migrations so MIGRATIONS_BASE_PATH points at a flat layout matching the Docker image structure (/migrations/<service>/*.sql). - Start meridian with SCHEMA_PROVISIONING_ENABLED=true and MIGRATIONS_BASE_PATH=/tmp/provisioner-migrations so seed-dev's InitiateTenant triggers real provisioning that creates schemas AND applies migrations to them.
Address CodeRabbit review feedback: the initial isCommentStatement implementation only skipped leading whitespace and line comments. A statement of the form "/* audit */ COMMENT ON COLUMN ..." was not recognized as a COMMENT and could still be corrupted by the schema rewriter. Extract stripLeadingNoise helper that iteratively strips whitespace, -- line comments, and /* ... */ block comments, then check the remaining text for the COMMENT keyword. Unterminated block comments return false so no rewriting is attempted on malformed SQL. Add TestIsCommentStatement_SkipsLeadingComments covering plain, leading whitespace, line comment, block comment, mixed leading comments, non-COMMENT DDL, mid-statement block comment, and unterminated block comment cases.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
services/tenant/provisioner/migration_runner.go (1)
385-407: Consider removing the split→join→split round-trip.
processMigrationSQLnow parses statements and serializes them back, thenapplyMigrationListparses again at Line 278. Returning rewritten statements directly would reduce parser drift risk and avoid duplicate work.♻️ Refactor sketch
-processedSQL := p.processMigrationSQL(mig.Content, schemaName) -statements := splitSQLStatements(processedSQL) +statements := p.processMigrationStatements(mig.Content, schemaName)-func (p *PostgresProvisioner) processMigrationSQL(sql, schemaName string) string { +func (p *PostgresProvisioner) processMigrationStatements(sql, schemaName string) []string { ... - if len(rewritten) == 0 { - return "" - } - return strings.Join(rewritten, ";\n") + ";" + return rewritten }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@services/tenant/provisioner/migration_runner.go` around lines 385 - 407, The current processMigrationSQL does splitSQLStatements -> rewrite -> strings.Join and returns a single SQL string which forces applyMigrationList to call splitSQLStatements again; instead change processMigrationSQL to return the rewritten statement slice directly (e.g. []string) to avoid the split→join→split round-trip: keep the existing statement parsing loop and isCommentStatement/schemaPatterns rewriting logic inside processMigrationSQL, change its signature to return []string (or add a new helper like rewriteMigrationStatements that returns []string), update callers such as applyMigrationList to accept and use the returned []string directly (removing the extra splitSQLStatements call and the trailing semicolon join), and preserve existing behavior for comment statements and quoting rules by reusing schemaPatterns and the same replacement logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@services/tenant/provisioner/migration_runner.go`:
- Around line 385-407: The current processMigrationSQL does splitSQLStatements
-> rewrite -> strings.Join and returns a single SQL string which forces
applyMigrationList to call splitSQLStatements again; instead change
processMigrationSQL to return the rewritten statement slice directly (e.g.
[]string) to avoid the split→join→split round-trip: keep the existing statement
parsing loop and isCommentStatement/schemaPatterns rewriting logic inside
processMigrationSQL, change its signature to return []string (or add a new
helper like rewriteMigrationStatements that returns []string), update callers
such as applyMigrationList to accept and use the returned []string directly
(removing the extra splitSQLStatements call and the trailing semicolon join),
and preserve existing behavior for comment statements and quoting rules by
reusing schemaPatterns and the same replacement logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ca9d2538-87b1-42bc-a264-a1efc1a365a6
📒 Files selected for processing (2)
services/tenant/provisioner/migration_runner.goservices/tenant/provisioner/migration_runner_test.go
PR #2128 added a tenant_provisioning reset to force re-provisioning on every deploy, but the reset targeted the wrong table. The provisioning worker polls for pending tenants via ListByStatus(StatusProvisioningPending) which reads tenant.status, not tenant_provisioning.state. Resetting only tenant_provisioning left tenant.status='active', so the worker never claimed the tenant, provisioning never re-ran, and seed-dev's manifest apply still failed against the empty tenant schemas. Also: seed-dev's waitForTenantReady calls GetTenantProvisioningStatus whose OverallStatus is derived from tenant.status. With tenant.status stuck at 'active', seed-dev returned immediately from its wait loop and fired the manifest apply before the worker had a chance to act. Reset both tables in the same psql call: - tenant.status = 'provisioning_pending' (triggers worker pickup and makes seed-dev wait for real completion). - tenant_provisioning.service_schemas = '[]' (forces provisionAllServices to re-apply migrations instead of short-circuiting on "already provisioned" entries).
…2131) PR #2128 added a tenant_provisioning reset to force re-provisioning on every deploy, but the reset targeted the wrong table. The provisioning worker polls for pending tenants via ListByStatus(StatusProvisioningPending) which reads tenant.status, not tenant_provisioning.state. Resetting only tenant_provisioning left tenant.status='active', so the worker never claimed the tenant, provisioning never re-ran, and seed-dev's manifest apply still failed against the empty tenant schemas. Also: seed-dev's waitForTenantReady calls GetTenantProvisioningStatus whose OverallStatus is derived from tenant.status. With tenant.status stuck at 'active', seed-dev returned immediately from its wait loop and fired the manifest apply before the worker had a chance to act. Reset both tables in the same psql call: - tenant.status = 'provisioning_pending' (triggers worker pickup and makes seed-dev wait for real completion). - tenant_provisioning.service_schemas = '[]' (forces provisionAllServices to re-apply migrations instead of short-circuiting on "already provisioned" entries). Co-authored-by: Ben Coombs <bjcoombs@users.noreply.github.com>
Summary
processMigrationSQLnaively rewroteCOMMENT ON COLUMN \"party\".\"attributes\"into\"org_tenant\".\"attributes\", producing schema.column instead of table.column and failing withrelation does not exist.data_source/dataset_definition/instrument_definitionerrors seen in E2E shards and the develop deploy.knownBrokenmigration tests, bump the architecture size baseline to unblock Test shard 0, and broaden the deploy-develop tenant_provisioning reset so stale tenants get re-provisioned.Root cause
PR #2120 documented this bug (adding a
knownBrokenskip-list and a no-op corrective migration) but did not fixprocessMigrationSQL. The rewriter treats the second identifier in a COMMENT ON COLUMN as a table name and replaces it with the tenant schema.There are zero legitimate uses of schema-qualified references (
\"party\".\"table\",\"payment_order\".\"table\", etc.) in any migration file - all matches are COMMENT ON COLUMN statements. The rewriter was solving a phantom problem while corrupting real statements.Changes
services/tenant/provisioner/migration_runner.go
processMigrationSQL: split SQL into statements, skip schema-pattern rewriting for COMMENT statements, rewrite DDL as before.isCommentStatementhelper that tolerates leading whitespace and line comments.services/tenant/provisioner/migration_runner_test.go
knownBrokenentries soTestProcessMigrationSQL_AllMigrations_Parsenow exercises the previously-broken migrations.TestProcessMigrationSQL_CommentOnColumn_BugDocumentationwithTestProcessMigrationSQL_CommentOnColumn_NotRewritten, asserting the fix preservestable.column.TestProcessMigrationSQL_CommentOnColumn_MixedWithDDLto pin mixed-statement behavior (DDL rewritten, COMMENT preserved).tests/architecture/size_test.go
baselineOversizedFunctions187 -> 189 (unrelated ratchet failure on Test shard 0)..github/workflows/deploy-develop.yml
service_schemasand include all non-deprovisioned rows. Without this, the existingvolterra_energyrow on develop (state=pending, all services markedmigrated, no physical tables) short-circuits provisioning via the "service already provisioned, skipping" check inprovisionAllServicesand the code fix alone cannot recover the droplet.Evidence
gh run view 23980915882(Deploy Develop),23980915891(E2E all 4 shards),23980915885(Test shard 0),23980915881(Build & Test).meridian_mastertenant_provisioning on develop:party migrations failed: execute migration 20260221000001_add_party_attributes.sql: ERROR: relation \"org_meridian_master\" does not exist (SQLSTATE 42P01).go test ./services/tenant/provisioner/ -count=1passes (340s);go test ./tests/architecture/ -run TestFunctionSizepasses.Test plan