Skip to content

Migration tool messes up database schema : "Identifier 's0.source' cannot be resolved from table with name s0" #257

@qlereboursFR

Description

@qlereboursFR

What was I trying to achieve?

I'm migrating from an old server to a new one, and I take benefit of this to upgrade my old Plausible (1.4.x) to the new one (3.0.1 or 2.1.5)

What's my issue?

After:

  • running the plausible instance on the new server
  • importing psql data
  • importing ClickHouse data in events and sessions tables
    I ran the script that migrates old data to the new "_v2" tables: docker compose exec plausible bin/plausible rpc Plausible.DataMigration.NumericIDs.run

But this script messed up the sessions_v2 and events_v2 tables.
Before running the script, I had the following columns:

  • In sessions_v2: country and source
  • In events_v2: country
    and after using the tool, they were replaced with country_code and (probably) referrer_source

According to me, the script isn't versioned at all and execute the same wether we're on the v2.1.5, v3.0.0, v3.0.1. I tried with v2.1.5 and v.3.0.1 and it did exactly the same.

It results in errors like :

(Ch.Error) Code: 47. DB::Exception: **Identifier 's0.source' cannot be resolved from table with name s0**. In scope SELECT toUInt64(round(uniq(s0.user_id) * any(_sample_factor))) AS visitors, if(empty(s0.source), 'Direct / None', s0.source) AS dim0 FROM sessions_v2 AS s0 WHERE (s0.site_id = _CAST(1, 'Int64')) AND (s0.start >= _CAST(1755986400, 'DateTime')) AND (s0.timestamp >= _CAST(1756591200, 'DateTime')) AND (s0.start <= _CAST(1759010399, 'DateTime')) GROUP BY dim0 ORDER BY visitors DESC, dim0 ASC LIMIT _CAST(0, 'Int64'), _CAST(9, 'Int64'). (UNKNOWN_IDENTIFIER) (version 24.12.6.70 (official build))

or the same with country Identifier 's0.country' cannot be resolved from table with name s0

For the record, my migration process is the following one:

Export database data (old instance)
docker exec -t hosting_plausible_db_1 pg_dumpall -U postgres > dump.sql

Export Clickhouse data (old instance)

docker exec -i hosting_plausible_events_db_1 \
  clickhouse-client --query="SELECT * FROM plausible_events_db.sessions FORMAT JSONEachRow" \
  | gzip > sessions.json.gz
docker exec -i hosting_plausible_events_db_1 \
  clickhouse-client --query="SELECT * FROM plausible_events_db.events FORMAT JSONEachRow" \
  | gzip > events.json.gz

After creating the new instance and copying the dump.sql and both json.gz on the new instance:

Prepare database by importing data :

docker compose up plausible_db
docker exec -i plausible-ce-plausible_db-1 psql -U postgres -c "DROP DATABASE IF EXISTS plausible_db;"
docker exec -i plausible-ce-plausible_db-1 psql -U postgres -c "CREATE DATABASE plausible_db;"
docker exec -i plausible-ce-plausible_db-1 psql -U postgres plausible_db < dump.sql
docker exec -i plausible-ce-plausible_db-1 psql -U postgres -c "ALTER USER postgres WITH PASSWORD 'postgres';"

Run the complete stack :

docker stop plausible-ce-plausible_db-1
docker compose up

Import ClickHouse data

gunzip -c sessions.json.gz | docker exec -i plausible-ce-plausible_events_db-1 \
  clickhouse-client --query="INSERT INTO plausible_events_db.sessions FORMAT JSONEachRow"
gunzip -c events.json.gz | docker exec -i plausible-ce-plausible_events_db-1 \
  clickhouse-client --query="INSERT INTO plausible_events_db.events FORMAT JSONEachRow"

Run migration that will break schemas

docker compose exec plausible bin/plausible rpc Plausible.DataMigration.NumericIDs.run

Fix schemas

docker exec -it plausible-ce-plausible_events_db-1 clickhouse-client

Run the following commands in the container to fix schemas:

USE plausible_events_db;

ALTER TABLE plausible_events_db.sessions_v2
ADD COLUMN IF NOT EXISTS country LowCardinality(FixedString(2)) DEFAULT '';

ALTER TABLE plausible_events_db.sessions_v2
UPDATE country = country_code
WHERE country = '';

ALTER TABLE plausible_events_db.sessions_v2
ADD COLUMN IF NOT EXISTS source String DEFAULT '';

ALTER TABLE plausible_events_db.sessions_v2
UPDATE source = referrer_source
WHERE source = '';

ALTER TABLE plausible_events_db.events_v2
ADD COLUMN IF NOT EXISTS country LowCardinality(FixedString(2)) DEFAULT '';

ALTER TABLE plausible_events_db.events_v2
UPDATE country = country_code
WHERE country = '';

Did I do something wrong, or should the migration tool be versioned to correctly manage the new schema?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions