Commit 8e40573

Merge pull request #11 from encryption4all/feat/migrate-legacy-api-keys
feat: migration script for legacy pkg api_keys → business PG- format
2 parents f4676e0 + 2ebe532 commit 8e40573

4 files changed

Lines changed: 688 additions & 0 deletions

File tree

docs/migrate-legacy-api-keys.md

Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
# Legacy pkg api_keys → postguard-business migration

Implements the migration half of encryption4all/postguard#141. The
`pg-pkg` side (validator changes) is tracked separately in
encryption4all/postguard#140.

## Schemas

### Source (legacy `pg-pkg`, Postgres)

`api_keys`: see `pg-pkg/migrations/20260316000000_create_api_keys.up.sql`

| column                   | type         | notes                     |
| ------------------------ | ------------ | ------------------------- |
| id                       | uuid         | pk                        |
| api_key                  | varchar(128) | **plaintext**, unique     |
| email                    | varchar(256) | not null                  |
| organisation_name        | varchar(256) | nullable                  |
| phone_number             | varchar(32)  | nullable                  |
| kvk_number               | varchar(32)  | nullable                  |
| organisation_name_public | bool         | is this attribute signed? |
| phone_number_public      | bool         | is this attribute signed? |
| kvk_number_public        | bool         | is this attribute signed? |
| expires_at               | timestamp    | not null                  |

### Target (`postguard-business`, Postgres)

`organizations` + `business_api_keys`: see
`src/lib/server/db/schema.ts`. The relevant columns:

```
organizations(id, name, domain UNIQUE, email, contact_name, phone,
              kvk_number, status)
business_api_keys(id, key_hash UNIQUE, key_prefix, name, org_id FK,
                  signing_attrs JSONB, expires_at, revoked_at, created_by)
```

Key hashes are SHA-256 of the raw key (see `scripts/seed.ts:85`).

## Mapping

| legacy column                              | new location                                   |
| ------------------------------------------ | ---------------------------------------------- |
| `api_key` (plaintext)                      | `business_api_keys.key_hash` = sha256(api_key) |
| `api_key[0..10]`                           | `business_api_keys.key_prefix`                 |
| `email`, `organisation_name`, `kvk_number` | `organizations.*` (grouped; see below)         |
| `organisation_name_public` etc.            | `business_api_keys.signing_attrs` booleans     |
| `expires_at`                               | `business_api_keys.expires_at`                 |
| `phone_number`                             | `organizations.phone`                          |
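
The first two rows of the mapping can be sketched as below. This is a minimal illustration, not the code from the PR: `hashApiKey` and `keyPrefix` are hypothetical names, hex encoding of the digest is an assumption (check `scripts/seed.ts` for the real encoding), and `[0..10]` is read as "the first 10 characters".

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper: SHA-256 of the raw key.
// ASSUMPTION: the digest is stored hex-encoded.
function hashApiKey(rawKey: string): string {
  return createHash('sha256').update(rawKey, 'utf8').digest('hex');
}

// Hypothetical helper: first 10 characters of the raw key,
// reading `api_key[0..10]` as a half-open range.
function keyPrefix(rawKey: string): string {
  return rawKey.slice(0, 10);
}

const example = 'PG-API-0123456789abcdef'; // made-up key, not a real credential
console.log(hashApiKey(example).length); // 64 hex characters
console.log(keyPrefix(example)); // "PG-API-012"
```

Only the hash and the short prefix reach the target table, so the plaintext key cannot be recovered from `business_api_keys`.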

### Grouping legacy rows into organisations

The legacy schema is per-key; the new schema is per-organisation with a
`UNIQUE` domain. Multiple legacy rows can belong to the same real-world
org, so the script groups rows using the first available of:

1. `kvk_number` (Dutch Chamber of Commerce number, which is unambiguous)
2. case-insensitive `organisation_name`
3. email domain
4. full email (last resort: one synthetic org per user)
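
The fallback chain above could look roughly like the following sketch. The names `LegacyRow`, `groupingKey`, and `groupRows` are hypothetical, and the sketch assumes any non-empty email domain is usable; the real `planMigration` may well reject some domains (for example shared mail providers) and fall through to the full email instead.

```typescript
// Hypothetical, trimmed-down view of a legacy row.
interface LegacyRow {
  email: string;
  organisationName: string | null;
  kvkNumber: string | null;
}

// Returns the first available grouping key, in priority order.
function groupingKey(row: LegacyRow): string {
  const kvk = row.kvkNumber?.trim();
  if (kvk) return `kvk:${kvk}`;

  // Case-insensitive match on the organisation name.
  const name = row.organisationName?.trim().toLowerCase();
  if (name) return `name:${name}`;

  const email = row.email.trim().toLowerCase();
  const at = email.lastIndexOf('@');
  const domain = at >= 0 ? email.slice(at + 1) : '';
  // ASSUMPTION: every non-empty domain counts as usable here.
  if (domain) return `domain:${domain}`;

  // Last resort: one group per email address.
  return `email:${email}`;
}

// Buckets rows by their grouping key.
function groupRows(rows: LegacyRow[]): Map<string, LegacyRow[]> {
  const groups = new Map<string, LegacyRow[]>();
  for (const row of rows) {
    const key = groupingKey(row);
    const bucket = groups.get(key) ?? [];
    bucket.push(row);
    groups.set(key, bucket);
  }
  return groups;
}
```

Prefixing each key with its kind (`kvk:`, `name:`, and so on) keeps a KvK number from ever colliding with an organisation name that happens to contain the same digits.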

`organizations.domain` is derived deterministically:

- if the grouped rows share a plausible email domain, use it verbatim;
- otherwise synthesise `<slug>.legacy.postguard.local` where `<slug>` is
  a kebab-case slug of the kvk number / org name / email.

If two groups happen to collide on a synthesised domain, an 8-hex-char
disambiguating suffix is appended (derived from the grouping key's hash,
so the result is still deterministic across runs).
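
As a rough sketch of the synthetic-domain rule, assuming SHA-256 as the hash and assuming the suffix is attached to the slug with a hyphen (neither detail is stated in this document), the derivation might look like this. `slugify` and `syntheticDomain` are hypothetical names.

```typescript
import { createHash } from 'node:crypto';

// Lower-case, replace runs of non-alphanumerics with "-", trim stray hyphens.
function slugify(input: string): string {
  return input
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

// Builds `<slug>.legacy.postguard.local`, adding an 8-hex-char suffix
// derived from the grouping key only when the plain domain is taken.
// ASSUMPTIONS: SHA-256 as the hash, suffix placement `<slug>-<suffix>`.
function syntheticDomain(groupKey: string, label: string, taken: Set<string>): string {
  const base = slugify(label);
  let domain = `${base}.legacy.postguard.local`;
  if (taken.has(domain)) {
    const suffix = createHash('sha256').update(groupKey).digest('hex').slice(0, 8);
    domain = `${base}-${suffix}.legacy.postguard.local`;
  }
  taken.add(domain);
  return domain;
}

const taken = new Set<string>();
console.log(syntheticDomain('name:acme b.v.', 'Acme B.V.', taken));
// acme-b-v.legacy.postguard.local
```

Because the suffix depends only on the grouping key, re-running the migration over the same legacy rows in the same order yields the same domains, which is what lets a second run find and reuse the organisations created by the first.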

Migrated orgs are created with `status = 'active'`: these are existing
users and do not need to re-verify.

### Signing attributes

`email` is always signed (it is hardcoded as a public attribute in
`pg-pkg/src/middleware/auth.rs:212-215`). Each of `orgName`, `phone`,
`kvkNumber` is signed iff **both**:

- the legacy row had the corresponding `*_public` flag set to `true`, and
- the corresponding source column was non-null.
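
The rule above amounts to a logical AND per attribute. A minimal sketch follows, assuming `signing_attrs` is a flat object of booleans that also carries an `email` key; the exact JSON shape is not shown in this document, and `LegacyFlags`, `SigningAttrs`, and `deriveSigningAttrs` are hypothetical names.

```typescript
// Hypothetical camel-cased view of the relevant legacy columns.
interface LegacyFlags {
  organisationName: string | null;
  phoneNumber: string | null;
  kvkNumber: string | null;
  organisationNamePublic: boolean;
  phoneNumberPublic: boolean;
  kvkNumberPublic: boolean;
}

// ASSUMPTION: the JSONB value is a flat map of attribute name to boolean.
interface SigningAttrs {
  email: boolean;
  orgName: boolean;
  phone: boolean;
  kvkNumber: boolean;
}

function deriveSigningAttrs(row: LegacyFlags): SigningAttrs {
  return {
    email: true, // always signed
    // Signed only if the flag was set AND there is a value to sign.
    orgName: row.organisationNamePublic && row.organisationName !== null,
    phone: row.phoneNumberPublic && row.phoneNumber !== null,
    kvkNumber: row.kvkNumberPublic && row.kvkNumber !== null
  };
}
```

Requiring a non-null source value guards against a legacy row that has a `*_public` flag set for an attribute it never stored.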

## Operator runbook

```bash
# 1. Point the script at BOTH databases.
export DATABASE_URL='postgres://...business-db'
export LEGACY_DATABASE_URL='postgres://...pkg-db'

# 2. Dry run. Reads the legacy DB only and writes nothing
#    (both variables must still be set).
tsx scripts/migrate-legacy-api-keys.ts --dry-run

# 3. Review the printed plan. Pay attention to:
#    - the number of organisation groups and keys (whether an org is
#      created or reused is only reported by the --live run),
#    - any rows in the "Skipped" section,
#    - any synthetic .legacy.postguard.local domains (these flag groups
#      that the migration could not tie to a real domain).

# 4. When the plan looks sane, apply it.
tsx scripts/migrate-legacy-api-keys.ts --live
```

The live run is wrapped in a single transaction: if any insert fails,
the whole migration rolls back.

Running `--live` twice is safe: the script looks up each `key_hash`
before inserting and skips keys that are already present, and each
organisation is looked up by `domain` and reused if a row already exists
(the existing row is not modified).
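
The idempotency argument can be illustrated without a database. In the sketch below a `Map` keyed by `key_hash` stands in for `business_api_keys`; `PlannedKey` and `applyKeys` are hypothetical names used only for this illustration.

```typescript
// Hypothetical, minimal shape of a planned key insert.
interface PlannedKey {
  keyHash: string;
  orgId: string;
}

// In-memory stand-in for the "look up key_hash, then insert" loop.
function applyKeys(
  store: Map<string, PlannedKey>,
  planned: PlannedKey[]
): { inserted: number; alreadyPresent: number } {
  let inserted = 0;
  let alreadyPresent = 0;
  for (const key of planned) {
    if (store.has(key.keyHash)) {
      alreadyPresent++; // already migrated by an earlier run
      continue;
    }
    store.set(key.keyHash, key);
    inserted++;
  }
  return { inserted, alreadyPresent };
}

const store = new Map<string, PlannedKey>();
const plan: PlannedKey[] = [
  { keyHash: 'hash-a', orgId: 'org-1' },
  { keyHash: 'hash-b', orgId: 'org-1' }
];
console.log(applyKeys(store, plan)); // first run inserts both keys
console.log(applyKeys(store, plan)); // second run inserts nothing
```

The second call reports zero inserts because every `keyHash` is already in the store, mirroring what a repeated `--live` run logs as existing key rows.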

## Open product questions

All resolved, as confirmed by @rubenhensen.

1. **Re-keying.** **Resolved:** preserve existing keys. All legacy keys
   have the `PG-API-` prefix, which passes the new `PG-` prefix check
   in postguard#142. No re-keying needed.

2. **Grouping heuristic.** **Resolved:** the `kvk → orgname →
   email-domain → email` fallback chain is fine.

3. **Synthetic org status.** **Resolved:** migrated orgs are set to
   `active`, since these are existing users and should be grandfathered in.

4. **`created_by`.** **Resolved:** leaving it NULL is fine for now.

## What this PR does NOT do

- It does **not** touch the `pg-pkg` validator (`postguard#140`).
- It does **not** drop the legacy `api_keys` table. That is a separate
  cleanup PR that should only run after `#140` is merged AND the
  transition window has elapsed.
- It does **not** generate or mail new keys.

scripts/migrate-legacy-api-keys.ts

Lines changed: 183 additions & 0 deletions
@@ -0,0 +1,183 @@
/**
 * Migrate legacy pg-pkg `api_keys` rows into the postguard-business
 * `organizations` + `business_api_keys` tables.
 *
 * See `docs/migrate-legacy-api-keys.md` for the full design discussion,
 * open product questions, and operator runbook.
 *
 * Usage:
 *   DATABASE_URL=postgres://...business-db \
 *   LEGACY_DATABASE_URL=postgres://...pkg-db \
 *   tsx scripts/migrate-legacy-api-keys.ts --dry-run
 *
 *   # ...once the dry-run output has been reviewed:
 *   tsx scripts/migrate-legacy-api-keys.ts --live
 */

import { drizzle } from 'drizzle-orm/postgres-js';
import postgres from 'postgres';
import { eq } from 'drizzle-orm';
import { organizations, apiKeys } from '../src/lib/server/db/schema.ts';
import {
  planMigration,
  type LegacyApiKeyRow
} from '../src/lib/server/migrations/legacy-api-keys.ts';

type Mode = 'dry-run' | 'live';

// Defaults to a dry run; writes happen only when --live is passed.
// Note that --live takes precedence if both flags are supplied.
function parseMode(argv: string[]): Mode {
  if (argv.includes('--live')) return 'live';
  return 'dry-run';
}

// Reads the legacy keys to migrate. Rows whose expires_at is already in
// the past are filtered out here and are therefore never migrated.
async function readLegacyRows(legacyUrl: string): Promise<LegacyApiKeyRow[]> {
  const client = postgres(legacyUrl, { max: 1 });
  try {
    const rows = await client<LegacyApiKeyRow[]>`
      SELECT api_key, email, organisation_name, phone_number, kvk_number,
             organisation_name_public, phone_number_public, kvk_number_public,
             expires_at
      FROM api_keys
      WHERE expires_at > NOW()
    `;
    return rows;
  } finally {
    // Always release the connection, even if the query throws.
    await client.end();
  }
}

async function main() {
  const mode = parseMode(process.argv.slice(2));

  const DATABASE_URL = process.env.DATABASE_URL;
  const LEGACY_DATABASE_URL = process.env.LEGACY_DATABASE_URL;

  // Both URLs are required in every mode, including --dry-run.
  if (!DATABASE_URL) {
    console.error('DATABASE_URL is not set (target: postguard-business DB)');
    process.exit(1);
  }
  if (!LEGACY_DATABASE_URL) {
    console.error('LEGACY_DATABASE_URL is not set (source: legacy pg-pkg DB)');
    process.exit(1);
  }

  console.log(`Mode: ${mode}`);
  console.log('Reading legacy api_keys...');
  const legacyRows = await readLegacyRows(LEGACY_DATABASE_URL);
  console.log(` Found ${legacyRows.length} active legacy key(s).`);

  // Planning is pure: it groups rows into orgs and derives key hashes
  // without touching the target database.
  const plan = planMigration(legacyRows);

  console.log(`\nPlanned actions:`);
  console.log(` ${plan.orgs.length} organisation(s) will be created or reused.`);
  console.log(` ${plan.keys.length} api key(s) will be migrated.`);
  console.log(` ${plan.skipped.length} row(s) skipped (see reasons below).`);

  if (plan.skipped.length > 0) {
    console.log(`\nSkipped rows:`);
    for (const s of plan.skipped) {
      console.log(` - email=${s.row.email} reason=${s.reason}`);
    }
  }

  console.log(`\nOrganisation groups:`);
  for (const g of plan.orgs) {
    console.log(
      ` - name="${g.name}" domain=${g.domain} kvk=${g.kvkNumber ?? '-'} ` +
        `keys=${g.memberKeyHashes.length}`
    );
  }

  // A dry run stops here, before any connection to the target DB is opened.
  if (mode === 'dry-run') {
    console.log('\nDry-run complete. No writes were performed.');
    console.log('Re-run with --live to apply the migration.');
    return;
  }

  const client = postgres(DATABASE_URL, { max: 1 });
  const db = drizzle(client);

  try {
    // One transaction for the whole migration: any failure rolls back everything.
    await db.transaction(async (tx) => {
      // Maps each key hash to the id of the organisation that owns it.
      const orgIdByHash = new Map<string, string>();

      for (const g of plan.orgs) {
        // Reuse an organisation that already exists for this domain.
        const existing = await tx
          .select({ id: organizations.id })
          .from(organizations)
          .where(eq(organizations.domain, g.domain))
          .limit(1);

        let orgId: string;
        if (existing.length > 0) {
          orgId = existing[0].id;
          console.log(`org: reused "${g.domain}" id=${orgId}`);
        } else {
          const [created] = await tx
            .insert(organizations)
            .values({
              name: g.name,
              domain: g.domain,
              email: g.email,
              contactName: g.contactName,
              phone: g.phone ?? null,
              kvkNumber: g.kvkNumber ?? null,
              status: 'active'
            })
            .returning({ id: organizations.id });
          orgId = created.id;
          console.log(`org: created "${g.domain}" id=${orgId}`);
        }

        for (const h of g.memberKeyHashes) {
          orgIdByHash.set(h, orgId);
        }
      }

      let inserted = 0;
      let alreadyPresent = 0;

      for (const k of plan.keys) {
        // Skip keys migrated by an earlier run so that --live is idempotent.
        const existing = await tx
          .select({ id: apiKeys.id })
          .from(apiKeys)
          .where(eq(apiKeys.keyHash, k.keyHash))
          .limit(1);

        if (existing.length > 0) {
          alreadyPresent++;
          continue;
        }

        const orgId = orgIdByHash.get(k.keyHash);
        if (!orgId) {
          throw new Error(
            `internal: key ${k.keyPrefix}... has no mapped org (this is a bug in planMigration)`
          );
        }

        await tx.insert(apiKeys).values({
          keyHash: k.keyHash,
          keyPrefix: k.keyPrefix,
          name: k.name,
          orgId,
          signingAttrs: k.signingAttrs,
          expiresAt: k.expiresAt
        });
        inserted++;
      }

      console.log(`\nInserted ${inserted} new api key row(s).`);
      console.log(`Found ${alreadyPresent} existing key row(s) (idempotent skip).`);
    });
  } finally {
    await client.end();
  }

  console.log('\nMigration complete.');
}

main().catch((err) => {
  console.error('Migration failed:', err);
  process.exit(1);
});
