Skip to content

Cannot upgrade to 1.30.4 without downtime #10358

@smcgivern

Description

@smcgivern

Expected Behavior

We want to upgrade from Temporal 1.29.3 to 1.30.4. We expect this to be possible without downtime.

We are using Temporal self-hosted with Postgres.

Actual Behavior

The release contains this migration: https://github.com/temporalio/temporal/blob/main/schema/postgresql/v12/visibility/versioned/v1.10/add_temporal_reported_columns.sql

Our executions_visibility table has 35 million rows. When I run this migration on a test instance (sized to match our production instance; 16 cores, 128GB memory), with no other traffic, it takes almost an hour:

ALTER TABLE executions_visibility ADD COLUMN TemporalReportedProblems JSONB GENERATED ALWAYS AS (search_attributes->'TemporalReportedProblems') STORED;
CREATE INDEX by_temporal_reported_problems ON executions_visibility USING GIN (namespace_id, TemporalReportedProblems jsonb_path_ops);
ALTER TABLE
Time: 3430898.163 ms (57:10.898)
CREATE INDEX
Time: 34220.298 ms (00:34.220)

While it's doing this, it also uses 50% of the CPU available to the DB.

In addition to the length this takes, this also obtains an exclusive lock on the table, meaning we'd have to take downtime to run this.

Steps to Reproduce the Problem

  1. Have a large (?) visibility table.
  2. Run the linked migration.

Other migrations

These all do the same thing (add a column GENERATED ALWAYS ... STORED) so would have the same problem for us - locking and rewriting the entire table.

Could we consider using triggers for these instead, so we don't need to rewrite at migration time? The search attributes don't appear to exist prior to the migration. We can do this ourselves manually if needed.

Metadata

Metadata

Assignees

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions