
Bulk add uruguay cities #1497

Merged
dr5hn merged 3 commits into dr5hn:master from mrodal:bulk
May 5, 2026

@mrodal
Contributor

@mrodal mrodal commented Apr 29, 2026

Description

Adds missing cities from Uruguay

Type of Change

  • Adding new data (cities/states/countries)
  • Correcting existing data
  • Deleting data (requires justification)
  • Infrastructure/Scripts
  • Documentation
  • Other (describe below)

Data Source

Source: https://catalogodatos.gub.uy/dataset/ide-localidades-del-uruguay
Date Verified: 29/04/2026

Affected Entities

  • Country: Uruguay
  • State/Province: All
  • Number of records: 1839

Checklist

  • I have edited files only in the contributions/ directory
  • I have NOT included id, created_at, updated_at, or flag fields
  • All required fields are present (name, coordinates, codes)
  • Data source is from a Tier 1 or Tier 2 source
  • Coordinates are verified and accurate
  • Verified data against reliable sources

Related Issue

Closes #

Notes

To find the missing cities, I matched the cities in this db against the government source by exact state, close name, and close coordinates (<2 km apart), then checked the matches manually to ensure they were correct. I then subtracted the matches from the government source and did some cleaning to remove duplicate entries.
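
The matching rule described above can be sketched roughly as follows. This is an illustration, not the script used for this PR: the record layout (`name`, `state`, `lat`, `lon`), the name-similarity measure (`difflib.SequenceMatcher`), the `min_name_ratio` threshold, and the sample rows are all assumptions. Only the exact-state rule and the 2 km limit come from the description.

```python
import math
from difflib import SequenceMatcher


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0088  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


def is_match(a, b, max_km=2.0, min_name_ratio=0.8):
    """Exact state, similar name (assumed threshold), and under max_km apart."""
    if a["state"] != b["state"]:
        return False
    ratio = SequenceMatcher(
        None, a["name"].casefold(), b["name"].casefold()
    ).ratio()
    if ratio < min_name_ratio:
        return False
    return haversine_km(a["lat"], a["lon"], b["lat"], b["lon"]) < max_km


def missing_from_db(source, db):
    """Source rows that match no existing db row (the 'subtraction' step)."""
    return [s for s in source if not any(is_match(s, d) for d in db)]


# Hypothetical sample rows, for illustration only.
db = [{"name": "Punta del Este", "state": "MA", "lat": -34.9667, "lon": -54.9500}]
source = [
    {"name": "PUNTA DEL ESTE", "state": "MA", "lat": -34.9650, "lon": -54.9480},
    {"name": "Localidad Ejemplo", "state": "MA", "lat": -34.5000, "lon": -55.0000},
]
missing = missing_from_db(source, db)
```

Here `missing` keeps only the second source row. A fuzzy rule like this can produce false matches, which is why the manual check of the matches still matters.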

The government source provides the city polygons in EPSG:32721 (WGS 84 / UTM zone 21S). To get latitude/longitude coordinates, I calculated the center of each polygon and converted it from the projected coordinates.
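
As a rough sketch of that step, the snippet below computes an area-weighted polygon centroid and converts it from UTM zone 21S to latitude/longitude. It is not the author's code: "center" is assumed here to mean the area centroid, the conversion uses a pure-Python Krüger series so the example has no dependencies (a projection library such as pyproj would be the usual choice), and the sample polygon is hypothetical.

```python
import math

# WGS 84 ellipsoid and UTM zone 21S parameters (EPSG:32721).
A_AXIS = 6378137.0
FLATTENING = 1 / 298.257223563
K0 = 0.9996
FALSE_EASTING = 500000.0
FALSE_NORTHING = 10000000.0  # southern-hemisphere offset
CENTRAL_MERIDIAN_DEG = -57.0


def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    x0, y0 = ring[0]  # work relative to the first vertex to limit rounding error
    area2 = cx = cy = 0.0
    for (xa, ya), (xb, yb) in zip(ring, ring[1:] + ring[:1]):
        xa, ya, xb, yb = xa - x0, ya - y0, xb - x0, yb - y0
        cross = xa * yb - xb * ya
        area2 += cross
        cx += (xa + xb) * cross
        cy += (ya + yb) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon")
    return x0 + cx / (3 * area2), y0 + cy / (3 * area2)


def utm21s_to_latlon(easting, northing):
    """Inverse transverse Mercator (Krueger series, third order), in degrees."""
    n = FLATTENING / (2 - FLATTENING)
    big_a = A_AXIS / (1 + n) * (1 + n**2 / 4 + n**4 / 64)
    beta = (n / 2 - 2 * n**2 / 3 + 37 * n**3 / 96,
            n**2 / 48 + n**3 / 15,
            17 * n**3 / 480)
    delta = (2 * n - 2 * n**2 / 3 - 2 * n**3,
             7 * n**2 / 3 - 8 * n**3 / 5,
             56 * n**3 / 15)
    xi = (northing - FALSE_NORTHING) / (K0 * big_a)
    eta = (easting - FALSE_EASTING) / (K0 * big_a)
    xi_p = xi - sum(b * math.sin(2 * j * xi) * math.cosh(2 * j * eta)
                    for j, b in enumerate(beta, 1))
    eta_p = eta - sum(b * math.cos(2 * j * xi) * math.sinh(2 * j * eta)
                      for j, b in enumerate(beta, 1))
    chi = math.asin(math.sin(xi_p) / math.cosh(eta_p))
    lat = chi + sum(d * math.sin(2 * j * chi) for j, d in enumerate(delta, 1))
    lon = (math.radians(CENTRAL_MERIDIAN_DEG)
           + math.atan2(math.sinh(eta_p), math.cos(xi_p)))
    return math.degrees(lat), math.degrees(lon)


# Hypothetical 1 km square in projected coordinates (metres).
ring = [(572500.0, 6139500.0), (573500.0, 6139500.0),
        (573500.0, 6140500.0), (572500.0, 6140500.0)]
centroid = polygon_centroid(ring)
lat, lon = utm21s_to_latlon(*centroid)
```

Because every locality that shares a source polygon receives the same centroid, this approach naturally yields records with identical coordinates.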

dosubot (bot) added the size:XS (this PR changes 0-9 lines, ignoring generated files) and enhancement (new feature or request) labels Apr 29, 2026
Owner

dr5hn commented May 4, 2026

Weekly data-quality review (2026-05-04)

Verdict: needs-discussion

Checks

  • Schema: ✅ No forbidden fields (id, flag, created_at, updated_at absent from all 1,839 new records). All required fields present (name, state_id, state_code, country_id, country_code, latitude, longitude, timezone).
  • FK integrity: ✅ All records use country_id: 235 (UY). All 1,839 records carry a state_id and state_code.
  • Coordinates: ✅ All 1,839 lat/lon pairs fall within Uruguay's bounding box (lat −34.95 → −30.09, lon −58.44 → −53.07).
  • Wikidata: N/A (no wikiDataId in new records).
  • Naming convention: ✅ Names are Spanish-language, consistent with existing UY entries.
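
The schema and coordinate checks listed above could be approximated with a small validator like the one below. It is a sketch, not the review tooling itself: the field lists and bounding-box limits are copied from this comment, while the function name, the issue messages, and the sample record are assumptions.

```python
FORBIDDEN_FIELDS = {"id", "flag", "created_at", "updated_at"}
REQUIRED_FIELDS = {
    "name", "state_id", "state_code", "country_id",
    "country_code", "latitude", "longitude", "timezone",
}
# Bounding box as quoted in the review above.
LAT_RANGE = (-34.95, -30.09)
LON_RANGE = (-58.44, -53.07)


def check_record(rec):
    """Return a list of human-readable issues; an empty list means the record passes."""
    issues = []
    forbidden = sorted(FORBIDDEN_FIELDS.intersection(rec))
    if forbidden:
        issues.append("forbidden fields: " + ", ".join(forbidden))
    missing = sorted(f for f in REQUIRED_FIELDS if rec.get(f) in (None, ""))
    if missing:
        issues.append("missing fields: " + ", ".join(missing))
    for field, (low, high) in (("latitude", LAT_RANGE), ("longitude", LON_RANGE)):
        if field in missing:
            continue  # already reported above
        try:
            value = float(rec[field])
        except (TypeError, ValueError):
            issues.append(f"{field} is not numeric")
            continue
        if not low <= value <= high:
            issues.append(f"{field} {value} outside [{low}, {high}]")
    return issues


# Hypothetical record using values quoted elsewhere in this review.
good = {
    "name": "Cerro Chato", "state_id": 3209, "state_code": "DU",
    "country_id": 235, "country_code": "UY",
    "latitude": "-33.103", "longitude": "-55.134",
    "timezone": "America/Montevideo",
}
bad = dict(good, id=1, latitude="-20.0")
good_issues = check_record(good)
bad_issues = check_record(bad)
```

Returning a list of issues rather than raising on the first failure makes it easier to summarise a whole batch of records in one pass.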

Discussion items

22 records share exact coordinates with at least one other new record (20 pairs, 1 triple). These are a direct consequence of the polygon-centroid method described in the PR (multiple localities inside the same source polygon get the same centroid). Most look intentional; a few deserve a second look:

  1. Balneario Zagarzau / Balneario Zagarzazu / Zagarzazu — all three in state CO (Colonia) at (-33.963, -58.332). These appear to be spelling variants of the same locality. Recommend keeping only one canonical form.
  2. Cerro Chato appears three times in departments DU (state 3209), FD (state 3217), and TT (state 3214) with identical coordinates (-33.103, -55.134). If the town sits on a tri-departmental boundary this is defensible, but the single-point geometry suggests the source polygon wasn't split — worth confirming whether three entries are intended.
  3. Same-state same-coord pairs (e.g. 18 De Julio + Pueblo Nuevo in SJ, City Park + Parque Carrasco in CA, Kiyu - Ordeig + Ordeig in SJ) could be legitimate distinct localities sharing a polygon; the manual check described in the PR description is reassuring, but a brief note in the PR on each would help reviewers.
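
Records that share exact coordinates can be surfaced with a simple grouping pass. The sketch below assumes each record is a dict with `name`, `state_code`, `latitude`, and `longitude` keys and compares coordinates as stored, without rounding; the sample rows reuse the Cerro Chato values quoted above plus one hypothetical locality.

```python
from collections import defaultdict


def shared_coordinates(records):
    """Map each (latitude, longitude) used by 2+ records to those records."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["latitude"], rec["longitude"])].append(rec)
    return {coord: recs for coord, recs in groups.items() if len(recs) > 1}


records = [
    {"name": "Cerro Chato", "state_code": "DU", "latitude": "-33.103", "longitude": "-55.134"},
    {"name": "Cerro Chato", "state_code": "FD", "latitude": "-33.103", "longitude": "-55.134"},
    {"name": "Cerro Chato", "state_code": "TT", "latitude": "-33.103", "longitude": "-55.134"},
    # Hypothetical locality with unique coordinates.
    {"name": "Localidad Ejemplo", "state_code": "CO", "latitude": "-34.100", "longitude": "-57.900"},
]
duplicates = shared_coordinates(records)
for coord, recs in duplicates.items():
    print(coord, [(r["name"], r["state_code"]) for r in recs])
```

Listing the state codes next to each group helps separate cross-department repeats from same-state pairs that may be alternate names.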

No blocker is raised — the methodology is sound and the data source (official government geodata) is Tier 1. Once the Zagarzazu duplicates and Cerro Chato tri-entry are confirmed, this is ready to merge.

🤖 Automated weekly review — Claude (sonnet-4-6).


Generated by Claude Code

@mrodal
Contributor Author

mrodal commented May 4, 2026

I reviewed the changes and found that I had uploaded an outdated version. The latest version, however, still included alternate names for the same locality as separate records. I have now merged those into single records, with the alternate names separated by a hyphen.
Some localities sit on the boundary between departments, so they are repeated once for each of those departments. These are:

ILLESCAS,FLORIDA
ILLESCAS,LAVALLEJA

VALENTINES,FLORIDA
VALENTINES,TREINTA Y TRES

CERRO CHATO,DURAZNO
CERRO CHATO,FLORIDA
CERRO CHATO,TREINTA Y TRES

MERINOS,PAYSANDU
MERINOS,RIO NEGRO

PIEDRA SOLA,PAYSANDU
PIEDRA SOLA,TACUAREMBO

TAMBORES,PAYSANDU
TAMBORES,TACUAREMBO
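
One way to picture the merge described in this comment: records in the same department with identical coordinates collapse into one record whose name joins the alternates, while repeats across departments stay separate. This is only a sketch under assumptions: the record keys, the `" - "` separator, and the rule that same-department same-coordinate rows are always alternate names are illustrative, and the actual merge may have relied on manual judgment instead.

```python
def merge_alternate_names(records, sep=" - "):
    """Collapse same-state, same-coordinate rows into one row with joined names."""
    merged = {}  # key -> merged record
    names = {}   # key -> ordered list of distinct names
    for rec in records:
        key = (rec["state_code"], rec["latitude"], rec["longitude"])
        if key not in merged:
            merged[key] = dict(rec)
            names[key] = [rec["name"]]
        elif rec["name"] not in names[key]:
            names[key].append(rec["name"])
    for key, rec in merged.items():
        rec["name"] = sep.join(names[key])
    return list(merged.values())


# Sample rows based on names and coordinates quoted in the review above.
rows = [
    {"name": "Zagarzazu", "state_code": "CO", "latitude": "-33.963", "longitude": "-58.332"},
    {"name": "Balneario Zagarzazu", "state_code": "CO", "latitude": "-33.963", "longitude": "-58.332"},
    {"name": "Cerro Chato", "state_code": "DU", "latitude": "-33.103", "longitude": "-55.134"},
    {"name": "Cerro Chato", "state_code": "FD", "latitude": "-33.103", "longitude": "-55.134"},
    {"name": "Cerro Chato", "state_code": "TT", "latitude": "-33.103", "longitude": "-55.134"},
]
result = merge_alternate_names(rows)
```

With these rows, the two Colonia entries become a single record and the three Cerro Chato entries remain one per department.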

Owner

@dr5hn dr5hn left a comment

Reviewed +1,823 net new cities from catalogodatos.gub.uy. Validation on master:

  • All 19 UY state codes resolve, no FK mismatches.
  • All coordinates within Uruguay bbox (~lat -35..-30, lon -58..-53).
  • Required fields complete on every new row.
  • No id/created_at/updated_at/flag on the additions.
  • No intra-PR duplicates.
  • Methodology (match by exact state + close name + <2km coords, then subtract) is sound.

One minor convention note for a follow-up PR (not blocking): 633 of the new rows use Title Case for Spanish prepositions ("18 De Julio", "Cerros De La Calera", "Termas Del Dayman") whereas the existing 33 multi-word UY rows use lowercase ("25 de Agosto", "Colonia del Sacramento", "Punta del Este"). I'll push a small normalisation pass after merge so the convention stays consistent across the file.
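
A normalisation pass of the kind mentioned here might look like the sketch below. The word list is an assumption (the comment only names "Spanish prepositions" and shows `de`, `del`, and `la` in its examples), and the first word of a name is left untouched so that names such as "La Paloma" keep their capital.

```python
# Assumed list of connecting words to lowercase; extend as needed.
LOWERCASE_WORDS = {"de", "del", "la", "las", "los", "y"}


def normalise_name(name):
    """Lowercase Spanish connecting words, except when they start the name."""
    words = name.split(" ")
    return " ".join(
        w.lower() if i > 0 and w.lower() in LOWERCASE_WORDS else w
        for i, w in enumerate(words)
    )


examples = ["18 De Julio", "Cerros De La Calera", "Termas Del Dayman", "La Paloma"]
normalised = [normalise_name(n) for n in examples]
```

Names that already follow the lowercase convention, such as "Punta del Este", pass through unchanged, so the function can be applied to the whole file safely.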

Thanks for the careful sourcing — this is a great addition.

@dr5hn dr5hn merged commit 6a695da into dr5hn:master May 5, 2026
1 check failed
dosubot (bot) added the lgtm label (this PR has been approved by a maintainer) May 5, 2026
