feat(postcodes/FR): bulk-import 6,051 metropolitan codes via La Poste (#1039) #1435

Merged
dr5hn merged 1 commit into master from feat/postcodes-france-bulk
Apr 27, 2026

@dr5hn
Owner

@dr5hn dr5hn commented Apr 27, 2026

Summary

Adds metropolitan French postcodes from La Poste (Licence Ouverte v2.0 / etalab-2.0). This is the same Datanova endpoint that #1427 anticipated, now located via the data.gouv.fr API.

  1. bin/scripts/sync/import_laposte_postcodes.py: pipeline reading the semicolon-delimited ISO-8859-1 CSV.
  2. contributions/postcodes/FR.json: 6,051 codes covering all 96 metropolitan départements (including Corsica) with 100% state_id resolution.
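As a rough illustration of the first item, the sketch below decodes a semicolon-delimited ISO-8859-1 CSV and keeps one locality per postcode, using the alphabetically first label as the commit message describes. It is not the actual importer: the column names `Code_postal` and `Libellé_d_acheminement`, the function name, and the sample rows are assumptions made for this example.

```python
import csv
import io

# Column names are assumptions for illustration; check the real CSV header.
POSTCODE_COL = "Code_postal"
LABEL_COL = "Libellé_d_acheminement"


def read_canonical_localities(raw: bytes) -> dict[str, str]:
    """Map each postcode to one locality label (the alphabetically first)."""
    text = raw.decode("iso-8859-1")
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    best: dict[str, str] = {}
    for row in reader:
        code = row[POSTCODE_COL].strip()
        label = row[LABEL_COL].strip()
        # Keep the alphabetically smallest label seen for this postcode.
        if code not in best or label < best[code]:
            best[code] = label
    return best


# Illustrative sample, encoded the same way as the source file.
sample = (
    "Code_postal;Libellé_d_acheminement\n"
    "01400;CHATILLON SUR CHALARONNE\n"
    "01400;L ABERGEMENT CLEMENCIAT\n"
    "20137;PORTO VECCHIO\n"
).encode("iso-8859-1")

localities = read_canonical_localities(sample)
```

Decoding the bytes explicitly before handing them to `csv.DictReader` keeps the encoding choice in one visible place, which matters because the header itself contains a non-ASCII character.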

State resolution strategy

| Pattern | Example | Result |
| --- | --- | --- |
| Standard 2-digit département | 01400 → 01 | Ain |
| Corsica 200xx-201xx | 20137 → 2A | Corse-du-Sud |
| Corsica 202xx-209xx | 20200 → 2B | Haute-Corse |
| Paris (override) | 75001 → 75C (not 75) | Paris |

states.json uses 75C instead of 75 for Paris and 69M for Lyon Metropolis. The override map handles the 75 → 75C case. Lyon defaults to département 69 (Rhône), since most data still uses that code and the city/metropolis split is not clear-cut.
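The resolution rules above can be sketched as a small function. This is a minimal illustration, not the importer's code: the names `PREFIX_OVERRIDES` and `resolve_departement` are hypothetical, and only the 75 → 75C override mentioned in this PR is included.

```python
# Postcode prefix -> iso2 as stored in states.json (only the override this PR names).
PREFIX_OVERRIDES = {"75": "75C"}


def resolve_departement(postcode: str) -> str:
    """Return the département iso2 for a 5-digit metropolitan postcode."""
    if len(postcode) != 5 or not (postcode.isascii() and postcode.isdigit()):
        raise ValueError(f"not a 5-digit postcode: {postcode!r}")
    prefix = postcode[:2]
    if prefix == "20":
        # Corsica: 200xx-201xx -> 2A, 202xx-209xx -> 2B.
        return "2A" if postcode[2] in "01" else "2B"
    # Everything else uses the 2-digit prefix, unless an override applies.
    return PREFIX_OVERRIDES.get(prefix, prefix)
```

Under these assumptions, Lyon postcodes such as 69001 resolve to 69 because no override is registered for that prefix.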

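Out of scope (deferred)

  • Overseas territories (GP/MQ/GF/RE/YT/PM/WF/PF/NC/BL/MF): already covered by curated postcode files from earlier PRs (#1402, #1417-#1426). The 475 La Poste rows for them are skipped here.
  • Cedex codes: La Poste publishes these business-routing codes in a separate file. They do not correspond to geographic places and belong in a separate pipeline if added.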

Validation (zero errors across 6,051 records)

| Check | Result |
| --- | --- |
| Records | 6,051 |
| state_id resolved | 100% |
| Codes matching ^(\d{5})$ | ✅ |
| FK resolution | ✅ |
| state_code vs state.iso2 agreement | ✅ |
| No auto-managed fields | ✅ |
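For readers who want to reproduce checks of this kind locally, here is a minimal sketch. It is not the repository's validator: the record key `code`, the `AUTO_MANAGED` field names, and the `states_by_id` lookup shape are assumptions for illustration. Only `state_id`, `state_code`, and `iso2` come from the description above.

```python
import re

# re.ASCII restricts \d to 0-9 so non-Latin digits are rejected.
POSTCODE_RE = re.compile(r"^(\d{5})$", re.ASCII)

# Hypothetical names for fields the database manages automatically.
AUTO_MANAGED = {"id", "created_at", "updated_at"}


def validate(records: list[dict], states_by_id: dict) -> list[str]:
    """Return a list of human-readable errors (empty when all checks pass)."""
    errors = []
    for i, rec in enumerate(records):
        if not POSTCODE_RE.match(rec["code"]):
            errors.append(f"record {i}: bad postcode {rec['code']!r}")
        state = states_by_id.get(rec["state_id"])
        if state is None:
            errors.append(f"record {i}: unknown state_id {rec['state_id']!r}")
        elif state["iso2"] != rec["state_code"]:
            errors.append(f"record {i}: state_code does not match state.iso2")
        present = AUTO_MANAGED.intersection(rec)
        if present:
            errors.append(f"record {i}: auto-managed fields {sorted(present)}")
    return errors
```

Collecting every error instead of stopping at the first one makes a single run over a 6,051-row file more informative.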

Locality naming

Uses Libellé d'acheminement (La Poste's mail-label form) rather than raw INSEE commune names. Both are ALL CAPS in the source; a follow-up Title Case pass is trivial if mixed-case is preferred.
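A follow-up Title Case pass might look like the sketch below. It is only an illustration under stated assumptions: the `LOWER_WORDS` list is a hand-picked, non-exhaustive set of French particles, the function name is hypothetical, and the pass cannot restore accents or hyphens that are absent from the ALL CAPS source.

```python
# Small words usually lowercased inside French place names (not exhaustive).
LOWER_WORDS = {
    "de", "du", "des", "la", "le", "les", "en",
    "sur", "sous", "et", "au", "aux", "lès", "lez",
}


def title_case_label(label: str) -> str:
    """Convert an ALL CAPS locality label to mixed case."""
    words = label.lower().split()
    out = []
    for i, word in enumerate(words):
        if i > 0 and word in LOWER_WORDS:
            # Particles stay lowercase unless they start the name.
            out.append(word)
        else:
            out.append(word[:1].upper() + word[1:])
    return " ".join(out)
```

For example, `title_case_label("CHATILLON SUR CHALARONNE")` yields `"Chatillon sur Chalaronne"`, while a leading article as in `"LA ROCHELLE"` is still capitalized.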

License

  • Source: La Poste / data.gouv.fr (Licence Ouverte v2.0 / etalab-2.0)
  • Each row: source: "laposte"

Cumulative postcode coverage after this lands

~218,000 postcode rows across 22 countries.

Refs: #1039

…#1039)

Adds the importer + first run for metropolitan France. Uses La Poste's
official base-officielle-des-codes-postaux dataset from data.gouv.fr
(Licence Ouverte v2.0 / etalab-2.0).

1. bin/scripts/sync/import_laposte_postcodes.py — pipeline reading the
   ISO-8859-1 / semicolon-delimited CSV. Filters to metropolitan France
   (skips 971-988 overseas + 980 Monaco). Picks one canonical commune
   per postcode (first alphabetical). Resolves state via postcode-prefix
   to département iso2 (75=Paris, 13=Bouches-du-Rhône, etc.) with
   Corsica's special split (200xx-201xx -> 2A, 202xx+ -> 2B) and a
   75 -> 75C override (states.json suffixes Paris's iso2).

2. contributions/postcodes/FR.json — 6,051 codes covering all 96
   metropolitan départements + Corsica with 100% state_id resolution.

Out of scope (deferred)
- Overseas territories (GP/MQ/GF/RE/YT/PM/WF/PF/NC/BL/MF) already have
  curated postcode files from earlier PRs (#1402, #1417-#1426). La
  Poste's CSV does include their rows (475 skipped); folding the full
  La Poste data into those territory files is a follow-up scope decision.
- Cedex codes — La Poste publishes a separate "Cedex" file with ~10k
  business-routing codes that don't correspond to geographic places.
  Those belong in a separate pipeline if added.

Validation (zero errors across 6,051 records)
- All codes match countries.postal_code_regex (^(\d{5})$)
- All FKs resolve, all state_codes agree with state.iso2
- No auto-managed fields present

Locality names use Libellé d'acheminement (the form La Poste actually
prints on mail) rather than raw INSEE commune names — cleaner casing
and accents (e.g. "Sainte-Foy-lès-Lyon" rather than "STE FOY LES LYON").
Note: the source CSV is ALL CAPS for Libellé too; if mixed-case is
preferred, a follow-up Title Case pass is straightforward.

License & attribution
- Source: La Poste / data.gouv.fr (Licence Ouverte v2.0, etalab-2.0)
- Each row: source: "laposte"

Refs: #1039

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 27, 2026 09:32

Copilot AI left a comment

Copilot wasn't able to review any files in this pull request.

@dosubot (bot) added the size:XS and enhancement labels Apr 27, 2026
@github-actions
Contributor

CSC Validation Report

PR Format

  • ✅ Description provided
  • ✅ Data source linked
  • ✅ Issue linked (recommended for data changes)
  • ✅ Justification / context provided

Labels applied: data:postcodes, large-contribution

⚠️ Large Contribution

This PR contains 6051 records. Large contributions require manual review.

Schema Validation (6051 records)

✅ All records passed validation

Cross-Reference Validation

✅ 12102 reference(s) verified


All checks passed | Status: Ready for review

@dr5hn dr5hn merged commit b62bcc5 into master Apr 27, 2026
1 check passed
@dr5hn dr5hn deleted the feat/postcodes-france-bulk branch April 27, 2026 10:51
Labels

data:postcodes, enhancement, large-contribution, ready-for-review, size:XS


2 participants