fix(geo-data): correct country_id mismatch in states.csv and cities.csv#1682

Open
YogeshK34 wants to merge 2 commits into
credebl:mainfrom
YogeshK34:fix/geo-data-clean

Conversation

@YogeshK34

@YogeshK34 YogeshK34 commented Jun 18, 2026

Problem

states.csv contains a systematic country_id corruption affecting all countries from Germany onward (~148 countries). The country_code column is correct, but country_id values don't match the IDs in countries.csv — causing states of one country to be served under another.

Example:

| File | name | country_id (before) | country_code | Actual country |
| --- | --- | --- | --- | --- |
| states.csv | Andhra Pradesh | 99 | IN | Hong Kong (99) ≠ India (102) |

This means selecting India in the UI returned Iran's states. Selecting Hong Kong returned India's states. No error is thrown — the data is silently wrong.

cities.csv had the same downstream corruption.

Root Cause

The country_id column was not derived from countries.csv IDs. The country_code column was always correct, but country_id diverged for ~148 countries starting alphabetically around Germany.

Fix

Regenerated country_id in both CSVs by mapping country_code (ISO) → correct id from countries.csv.

  • states.csv — 4,581 rows corrected
  • cities.csv — 105,446 rows corrected
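
The remapping described above can be sketched roughly as follows. This is an illustration, not the script used for this PR: the row shapes, the `iso2` column name for countries.csv, and the `remapCountryIds` helper are assumptions, and CSV parsing is left out so the logic stays visible.

```typescript
// Hypothetical row shapes: the real CSV headers may differ.
interface CountryRow {
  id: number;
  iso2: string; // assumed name of the ISO code column in countries.csv
}

interface GeoRow {
  name: string;
  country_id: number;
  country_code: string;
}

interface RemapResult {
  rows: GeoRow[];
  corrected: number;
}

// Rebuild country_id from country_code, treating countries.csv as the source of truth.
function remapCountryIds(countries: CountryRow[], rows: GeoRow[]): RemapResult {
  const idByCode = new Map<string, number>();
  for (const c of countries) {
    idByCode.set(c.iso2, c.id);
  }

  let corrected = 0;
  const fixed = rows.map((row) => {
    const id = idByCode.get(row.country_code);
    if (id === undefined) {
      // Refuse to guess: an unknown code means the inputs are inconsistent.
      throw new Error(`No country found for code ${row.country_code}`);
    }
    if (id !== row.country_id) {
      corrected += 1;
    }
    return { ...row, country_id: id };
  });

  return { rows: fixed, corrected };
}

// Usage with the example row from the description.
const countriesSample: CountryRow[] = [
  { id: 99, iso2: "HK" },
  { id: 102, iso2: "IN" },
];
const statesSample: GeoRow[] = [
  { name: "Andhra Pradesh", country_id: 99, country_code: "IN" },
];
const remapped = remapCountryIds(countriesSample, statesSample);
console.log(remapped.rows.map((r) => r.country_id), remapped.corrected);
```

Throwing on an unknown code, rather than keeping the old id, is a deliberate choice in this sketch: a silent fallback would reintroduce the kind of quiet mismatch this PR removes.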

Impact

| Scenario | Impact |
| --- | --- |
| Fresh local setup / CI | ✅ Fixed: correct geo data on first seed |
| Existing seeded deployment | ✅ No impact: import script skips non-empty tables |

Testing

-- Verify India's states return correctly
SELECT * FROM states WHERE country_id = 102 LIMIT 5;
-- All rows should have country_code = 'IN'
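
The query above samples only five rows for one country. A broader check could compare every row against countries.csv, as in the sketch below. The row shapes, the `iso2` column name, and the `findCountryIdMismatches` helper are hypothetical, and loading the CSVs is omitted.

```typescript
// Hypothetical row shapes: the real CSV headers may differ.
interface CountryRef {
  id: number;
  iso2: string; // assumed name of the ISO code column in countries.csv
}

interface StateRow {
  name: string;
  country_id: number;
  country_code: string;
}

// Return every row whose country_id does not resolve to its own country_code.
function findCountryIdMismatches(countries: CountryRef[], rows: StateRow[]): StateRow[] {
  const codeById = new Map<number, string>();
  for (const c of countries) {
    codeById.set(c.id, c.iso2);
  }
  // An id that is missing from countries.csv also counts as a mismatch.
  return rows.filter((row) => codeById.get(row.country_id) !== row.country_code);
}

// Usage: one corrupted row (the example from the description) and one corrected row.
const countryRefs: CountryRef[] = [
  { id: 99, iso2: "HK" },
  { id: 102, iso2: "IN" },
];
const sampleRows: StateRow[] = [
  { name: "Andhra Pradesh", country_id: 99, country_code: "IN" },
  { name: "Andhra Pradesh", country_id: 102, country_code: "IN" },
];
const mismatches = findCountryIdMismatches(countryRefs, sampleRows);
console.log(`${mismatches.length} mismatched row(s)`);
```

An empty result for both states.csv and cities.csv would indicate that every `country_id` agrees with its `country_code`.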

Note

Supersedes #1681 — this PR isolates only the geo-data fix.

Summary by CodeRabbit

  • Bug Fixes
    • Improved the initial setup flow so an existing platform admin account is correctly reused during seeding.
    • Prevented missing user references from affecting organization records created afterward, helping keep audit fields accurate.

@coderabbitai

coderabbitai Bot commented Jun 18, 2026

Review Change Stack

📝 Walkthrough

The seed script now fills platformUserId from an existing platform admin user’s id when that user already exists, so later seed steps receive a populated identifier.

Changes

Platform user seeding

| Layer / File(s) | Summary |
| --- | --- |
| Existing-user identifier assignment: libs/prisma-service/prisma/seed.ts | The createPlatformUser branch for an already-present admin user now sets platformUserId from that user’s id. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

🐇 I hopped through seed and found a clue,
An admin ID to carry through.
Now audits wink with tidy grace,
And rabbit paws keep perfect pace.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ⚠️ Warning | The title describes a geo-data CSV fix, but the summarized code change updates Prisma seed logic for platformUserId handling. | Rename the PR to reflect the actual seed script change, e.g. fix platform user seeding when the admin user already exists. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Signed-off-by: yogeshk34 <khutwadyogesh34@gmail.com>
@YogeshK34 YogeshK34 force-pushed the fix/geo-data-clean branch from 95f2de6 to 3fcc25b Compare June 26, 2026 19:29

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@libs/prisma-service/prisma/seed.ts`:
- Around line 190-192: `seed.ts` currently reuses the first result from
`prisma.user.findMany` without validating `PLATFORM_ADMIN_EMAIL`, which can
accidentally match any user when the env var is missing. Update the
`platformUserId` lookup logic to fail fast if `PLATFORM_ADMIN_EMAIL` is unset,
and switch the user lookup to `findUnique` in the platform admin branch so the
unique `email` field is used safely before assigning
`existPlatformAdminUser[0].id`.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9b31560c-6a42-4670-b115-7f6252048603

📥 Commits

Reviewing files that changed from the base of the PR and between e7163f4 and 3fcc25b.

⛔ Files ignored due to path filters (1)
  • libs/prisma-service/prisma/data/geo-location-master-data/cities.csv is excluded by !**/*.csv
📒 Files selected for processing (1)
  • libs/prisma-service/prisma/seed.ts

Comment on lines +190 to +192
// User already exists — still need to set platformUserId so downstream
// functions (createPlatformOrganization) have a valid UUID for createdBy/lastChangedBy
platformUserId = existPlatformAdminUser[0].id;

🗄️ Data Integrity & Integration | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether PLATFORM_ADMIN_EMAIL is validated anywhere before the Prisma lookup.
fd seed.ts . | xargs -I{} sh -c '
  echo "== {} =="
  sed -n "150,220p" "{}"
  echo
  rg -n "PLATFORM_ADMIN_EMAIL|findMany\\(|findUnique\\(" "{}"
'

Repository: credebl/platform

Length of output: 3849


🌐 Web query:

Does Prisma Client ignore `undefined` values in `where` filters, and is `findUnique({ where: { email } })` the recommended API for a unique email column?

💡 Result:

Prisma Client behavior regarding undefined values in where filters is as follows: When you pass undefined as a value in a where filter, Prisma Client effectively ignores that filter criteria, as if it had not been provided at all [1][2]. This can lead to unexpected behavior, such as returning all records when you intended to filter by none [3][4]. To mitigate this, you can enable the strictUndefinedChecks preview feature (available in version 5.20.0 and later), which causes explicit undefined values to throw a runtime error and requires the use of Prisma.skip to intentionally omit a field [1][5].

Regarding the recommended API for a unique email column, findUnique({ where: { email } }) is indeed the correct and recommended API, provided that the email field is marked as @unique in your Prisma schema [6][7]. If the field is not marked as @unique, Prisma Client will not allow you to use it in a findUnique query; in such cases, you would receive an error and would need to either mark the field as unique or use findFirst instead [8][9][10].


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the Prisma user model and nearby seed logic to confirm whether email is unique
# and whether PLATFORM_ADMIN_EMAIL is handled elsewhere.
echo "== seed.ts excerpt =="
sed -n '160,215p' libs/prisma-service/prisma/seed.ts

echo
echo "== user model references =="
rg -n "model User|model user|@unique|email" libs/prisma-service/prisma -g '*.prisma'

Repository: credebl/platform

Length of output: 4876


Validate PLATFORM_ADMIN_EMAIL before reusing the first user record.

If this env var is missing, prisma.user.findMany({ where: { email: undefined } }) drops the filter and existPlatformAdminUser[0].id can come from an arbitrary user. Fail fast here and switch to findUnique since email is unique.
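
A minimal sketch of the suggested guard, under stated assumptions: `requireAdminEmail`, `resolvePlatformUserId`, and the `SeedUser` shape are hypothetical names, and the database call is injected as a function so the example runs without a database. In seed.ts that function would presumably wrap `prisma.user.findUnique({ where: { email } })`.

```typescript
// Hypothetical minimal user shape; the real Prisma model has more fields.
interface SeedUser {
  id: string;
  email: string;
}

// Stand-in for the database lookup, injected so the sketch needs no database.
type FindUserByEmail = (email: string) => Promise<SeedUser | null>;

// Fail fast: an unset or blank value must never reach the query,
// because Prisma drops `undefined` filters and would match arbitrary users.
function requireAdminEmail(value: string | undefined): string {
  const trimmed = (value ?? "").trim();
  if (trimmed === "") {
    throw new Error("PLATFORM_ADMIN_EMAIL must be set before seeding the platform admin");
  }
  return trimmed;
}

// Resolve the existing admin's id, or null when no such user exists yet.
async function resolvePlatformUserId(
  rawEmail: string | undefined,
  findUserByEmail: FindUserByEmail
): Promise<string | null> {
  const email = requireAdminEmail(rawEmail);
  const existing = await findUserByEmail(email);
  return existing === null ? null : existing.id;
}

// Usage with an in-memory lookup standing in for the users table.
const seededUsers: SeedUser[] = [{ id: "hypothetical-uuid-1", email: "admin@example.com" }];
const lookup: FindUserByEmail = async (email) =>
  seededUsers.find((u) => u.email === email) ?? null;

resolvePlatformUserId("admin@example.com", lookup).then((id) => console.log(id));
```

Because the lookup returns a single record or `null`, there is no `[0]` index to dereference, which removes the arbitrary-first-row risk described above.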

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@libs/prisma-service/prisma/seed.ts` around lines 190 - 192, `seed.ts`
currently reuses the first result from `prisma.user.findMany` without validating
`PLATFORM_ADMIN_EMAIL`, which can accidentally match any user when the env var
is missing. Update the `platformUserId` lookup logic to fail fast if
`PLATFORM_ADMIN_EMAIL` is unset, and switch the user lookup to `findUnique` in
the platform admin branch so the unique `email` field is used safely before
assigning `existPlatformAdminUser[0].id`.

@YogeshK34

Author

Hey @tipusinghaw & @ankita-p17,

There was a merge conflict on cities.csv because my branch hadn't been rebased onto upstream. I've resolved it and force-pushed the updated branch.

I'd appreciate your feedback.

…pstream

Signed-off-by: yogeshk34 <khutwadyogesh34@gmail.com>
@sonarqubecloud
