-
Notifications
You must be signed in to change notification settings - Fork 377
Description
What was I trying to achieve?
I'm migrating from an old server to a new one, and I take benefit of this to upgrade my old Plausible (1.4.x) to the new one (3.0.1 or 2.1.5)
What's my issue?
After:
- running the plausible instance on the new server
- importing psql data
- importing ClickHouse data in events and sessions tables
I ran the script that migrates old data to the new "_v2" tables:docker compose exec plausible bin/plausible rpc Plausible.DataMigration.NumericIDs.run
But this script messed up the sessions_v2 and events_v2 tables.
Before running the script, I had the following columns:
- In
sessions_v2:countryandsource - In
events_v2:country
and after using the tool, they were replaced withcountry_codeand (probably)referrer_source
According to me, the script isn't versioned at all and execute the same wether we're on the v2.1.5, v3.0.0, v3.0.1. I tried with v2.1.5 and v.3.0.1 and it did exactly the same.
It results in errors like :
(Ch.Error) Code: 47. DB::Exception: **Identifier 's0.source' cannot be resolved from table with name s0**. In scope SELECT toUInt64(round(uniq(s0.user_id) * any(_sample_factor))) AS visitors, if(empty(s0.source), 'Direct / None', s0.source) AS dim0 FROM sessions_v2 AS s0 WHERE (s0.site_id = _CAST(1, 'Int64')) AND (s0.start >= _CAST(1755986400, 'DateTime')) AND (s0.timestamp >= _CAST(1756591200, 'DateTime')) AND (s0.start <= _CAST(1759010399, 'DateTime')) GROUP BY dim0 ORDER BY visitors DESC, dim0 ASC LIMIT _CAST(0, 'Int64'), _CAST(9, 'Int64'). (UNKNOWN_IDENTIFIER) (version 24.12.6.70 (official build))
or the same with country Identifier 's0.country' cannot be resolved from table with name s0
For the record, my migration process is the following one:
Export database data (old instance)
docker exec -t hosting_plausible_db_1 pg_dumpall -U postgres > dump.sql
Export Clickhouse data (old instance)
docker exec -i hosting_plausible_events_db_1 \
clickhouse-client --query="SELECT * FROM plausible_events_db.sessions FORMAT JSONEachRow" \
| gzip > sessions.json.gz
docker exec -i hosting_plausible_events_db_1 \
clickhouse-client --query="SELECT * FROM plausible_events_db.events FORMAT JSONEachRow" \
| gzip > events.json.gz
After creating the new instance and copying the dump.sql and both json.gz on the new instance:
Prepare database by importing data :
docker compose up plausible_db
docker exec -i plausible-ce-plausible_db-1 psql -U postgres -c "DROP DATABASE IF EXISTS plausible_db;"
docker exec -i plausible-ce-plausible_db-1 psql -U postgres -c "CREATE DATABASE plausible_db;"
docker exec -i plausible-ce-plausible_db-1 psql -U postgres plausible_db < dump.sql
docker exec -i plausible-ce-plausible_db-1 psql -U postgres -c "ALTER USER postgres WITH PASSWORD 'postgres';"
Run the complete stack :
docker stop plausible-ce-plausible_db-1
docker compose up
Import ClickHouse data
gunzip -c sessions.json.gz | docker exec -i plausible-ce-plausible_events_db-1 \
clickhouse-client --query="INSERT INTO plausible_events_db.sessions FORMAT JSONEachRow"
gunzip -c events.json.gz | docker exec -i plausible-ce-plausible_events_db-1 \
clickhouse-client --query="INSERT INTO plausible_events_db.events FORMAT JSONEachRow"
Run migration that will break schemas
docker compose exec plausible bin/plausible rpc Plausible.DataMigration.NumericIDs.run
Fix schemas
docker exec -it plausible-ce-plausible_events_db-1 clickhouse-client
Run the following commands in the container to fix schemas:
USE plausible_events_db;
ALTER TABLE plausible_events_db.sessions_v2
ADD COLUMN IF NOT EXISTS country LowCardinality(FixedString(2)) DEFAULT '';
ALTER TABLE plausible_events_db.sessions_v2
UPDATE country = country_code
WHERE country = '';
ALTER TABLE plausible_events_db.sessions_v2
ADD COLUMN IF NOT EXISTS source String DEFAULT '';
ALTER TABLE plausible_events_db.sessions_v2
UPDATE source = referrer_source
WHERE source = '';
ALTER TABLE plausible_events_db.events_v2
ADD COLUMN IF NOT EXISTS country LowCardinality(FixedString(2)) DEFAULT '';
ALTER TABLE plausible_events_db.events_v2
UPDATE country = country_code
WHERE country = '';
Did I do something wrong, or should the migration tool be versioned to correctly manage the new schema?