feat(postcodes/FR): bulk-import 6,051 metropolitan codes via La Poste (#1039) #1435
Merged
…#1039)

Adds the importer + first run for metropolitan France. Uses La Poste's official base-officielle-des-codes-postaux dataset from data.gouv.fr (Licence Ouverte v2.0 / etalab-2.0).

1. bin/scripts/sync/import_laposte_postcodes.py: pipeline reading the ISO-8859-1 / semicolon-delimited CSV. Filters to metropolitan France (skips 971-988 overseas + 980 Monaco). Picks one canonical commune per postcode (first alphabetical). Resolves state via postcode-prefix to département iso2 (75=Paris, 13=Bouches-du-Rhône, etc.) with Corsica's special split (200xx-201xx -> 2A, 202xx+ -> 2B) and a 75 -> 75C override (states.json suffixes Paris's iso2).
2. contributions/postcodes/FR.json: 6,051 codes covering all 96 metropolitan départements + Corsica with 100% state_id resolution.

Out of scope (deferred)
- Overseas territories (GP/MQ/GF/RE/YT/PM/WF/PF/NC/BL/MF) already have curated postcode files from earlier PRs (#1402, #1417-#1426). La Poste's CSV does include their rows (475 skipped); folding the full La Poste data into those territory files is a follow-up scope decision.
- Cedex codes: La Poste publishes a separate "Cedex" file with ~10k business-routing codes that don't correspond to geographic places. Those belong in a separate pipeline if added.

Validation (zero errors across 6,051 records)
- All codes match countries.postal_code_regex (^(\d{5})$)
- All FKs resolve, all state_codes agree with state.iso2
- No auto-managed fields present

Locality names use Libellé d'acheminement (the form La Poste actually prints on mail) rather than raw INSEE commune names, for cleaner casing and accents (e.g. "Sainte-Foy-lès-Lyon" rather than "STE FOY LES LYON"). Note: the source CSV is ALL CAPS for Libellé too; if mixed-case is preferred, a follow-up Title Case pass is straightforward.

License & attribution
- Source: La Poste / data.gouv.fr (Licence Ouverte v2.0, etalab-2.0)
- Each row: source: "laposte"

Refs: #1039
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
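The filtering, canonical-commune selection, and state resolution described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in import_laposte_postcodes.py: the column names `code_postal` and `libelle`, the `STATE_OVERRIDES` map, and all function names are assumptions made for the example.

```python
import csv
import io
import re

POSTCODE_RE = re.compile(r"^(\d{5})$")

# Assumed override map: states.json suffixes Paris's iso2 (75 -> 75C).
STATE_OVERRIDES = {"75": "75C"}


def is_metropolitan(postcode: str) -> bool:
    """Return False for the skipped 971-988 prefix range (which includes 980)."""
    return not ("971" <= postcode[:3] <= "988")


def resolve_state_iso2(postcode: str) -> str:
    """Map a 5-digit postcode to a département iso2 code by prefix."""
    prefix = postcode[:2]
    if prefix == "20":
        # Corsica split: 200xx-201xx -> 2A, 202xx and above -> 2B.
        return "2A" if postcode[:3] in ("200", "201") else "2B"
    return STATE_OVERRIDES.get(prefix, prefix)


def canonical_localities(raw: bytes) -> dict[str, str]:
    """Pick one locality per metropolitan postcode: the first alphabetically."""
    text = raw.decode("iso-8859-1")
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    best: dict[str, str] = {}
    for row in reader:
        # Hypothetical column names; the real CSV headers may differ.
        code, name = row["code_postal"], row["libelle"]
        if not POSTCODE_RE.match(code) or not is_metropolitan(code):
            continue
        if code not in best or name < best[code]:
            best[code] = name
    return best


# Hand-written sample rows for illustration only.
sample = (
    "code_postal;libelle\n"
    "75001;PARIS\n"
    "20137;PORTO VECCHIO\n"
    "20137;LECCI\n"
    "97110;POINTE A PITRE\n"
).encode("iso-8859-1")

for code, name in sorted(canonical_localities(sample).items()):
    print(code, name, resolve_state_iso2(code))
```

Keeping the override map separate from the prefix logic means a future exception (for example, if Lyon were ever moved to 69M) would be a one-line change in this sketch.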
Summary
Adds metropolitan French postcodes via La Poste (Licence Ouverte v2.0 / etalab-2.0). This is the same Datanova endpoint that #1427 anticipated, now discovered via the data.gouv.fr API.
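- bin/scripts/sync/import_laposte_postcodes.py: pipeline reading the semicolon-delimited ISO-8859-1 CSV.
- contributions/postcodes/FR.json: 6,051 codes covering all 96 metropolitan départements + Corsica with 100% state_id resolution.

State resolution strategy
- 01400 → 01
- 20137 → 2A
- 20200 → 2B
- 75001 → 75C (not 75)

states.json uses 75C instead of 75 for Paris and 69M for Lyon Metropolis; the override map handles 75 cleanly. Lyon defaults to département 69 (Rhône), since most data still uses that code and the city/metropolis split is policy-fuzzy.

Out of scope (deferred)
- Overseas territories (GP/MQ/GF/RE/YT/PM/WF/PF/NC/BL/MF), which already have curated postcode files from earlier PRs (#1402, #1417-#1426).
- Cedex codes, which La Poste publishes in a separate file and which do not correspond to geographic places.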
Validation (zero errors across 6,051 records)
- state_id resolved
- All codes match ^(\d{5})$
- state_code ↔ state.iso2 agreement

Locality naming
Uses Libellé d'acheminement (La Poste's mail-label form) rather than raw INSEE commune names. Both are ALL CAPS in the source; a follow-up Title Case pass is trivial if mixed-case is preferred.
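A follow-up Title Case pass of the kind mentioned above might look like the sketch below. It is only an illustration: the `LOWERCASE_PARTICLES` set and the `title_case_locality` name are assumptions, and re-casing alone cannot restore accents, hyphens, or abbreviations that are missing from the source (for example "STE" stays "Ste").

```python
import re

# Assumed list of French particles kept lowercase when not the first word.
LOWERCASE_PARTICLES = {
    "de", "du", "des", "d", "la", "le", "les", "l",
    "en", "et", "sur", "sous", "au", "aux", "lès", "lez",
}


def title_case_locality(name: str) -> str:
    """Re-case an ALL CAPS locality label, keeping separators untouched."""
    seen_first = False

    def fix(match: re.Match) -> str:
        nonlocal seen_first
        word = match.group(0).lower()
        if seen_first and word in LOWERCASE_PARTICLES:
            return word
        seen_first = True
        return word.capitalize()

    # Tokens are runs of characters other than whitespace, hyphens, apostrophes.
    return re.sub(r"[^\s\-']+", fix, name)


for label in ("STE FOY LES LYON", "LE HAVRE", "SAINT-DENIS"):
    print(label, "->", title_case_locality(label))
```

The first word is always capitalized, so "LE HAVRE" becomes "Le Havre" while an interior "LES" is lowered.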
License
- Each row carries source: "laposte"

Cumulative postcode coverage after this lands
~218,000 postcode rows across 22 countries.
Refs: #1039