Tracks daily changes to the City of Toronto's Address Points dataset — over 525,000 addresses across the city.
Every day, the City publishes a fresh snapshot of all address points. This tool downloads each snapshot, stores it, and produces a diff report showing which addresses were added, removed, or modified since the last run.
The City of Toronto doesn't publish historical versions of this dataset — each daily update replaces the previous one. Without tracking changes over time, there's no way to know when an address appeared, disappeared, or was corrected.
This project fills that gap.
Browse the latest change report on the project page.
This tool uses a Slowly Changing Dimension (SCD) Type 2 approach to store address history efficiently.
Instead of storing full snapshots for every day, we track the validity period (min_snapshot_id to max_snapshot_id) for each address record.
This allows us to:
- Store only the changes (deltas), saving significant space.
- Query the state of the database at any point in history.
- Generate accurate diff reports even for periods with no changes.
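The validity-period idea above can be sketched in a few lines. This is a minimal illustration, not the project's actual schema or code: the table name `address_history`, the `payload` column, and the helpers `open_db`, `apply_snapshot`, `state_at`, and `diff` are all hypothetical, and snapshot ids are assumed to be consecutive integers. Only `min_snapshot_id` and `max_snapshot_id` come from the description above.

```python
import sqlite3


def open_db():
    # In-memory database; the table layout is an assumption for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE address_history (
            address_id      INTEGER NOT NULL,
            payload         TEXT    NOT NULL,
            min_snapshot_id INTEGER NOT NULL,
            max_snapshot_id INTEGER NOT NULL
        )
        """
    )
    return conn


def apply_snapshot(conn, snapshot_id, records):
    """Merge one snapshot (dict of address_id -> payload) into the history."""
    previous = snapshot_id - 1  # assumes consecutive integer snapshot ids
    for address_id, payload in records.items():
        # Unchanged record: extend the validity period of the existing row.
        cur = conn.execute(
            "UPDATE address_history SET max_snapshot_id = ? "
            "WHERE address_id = ? AND payload = ? AND max_snapshot_id = ?",
            (snapshot_id, address_id, payload, previous),
        )
        if cur.rowcount == 0:
            # New or modified record: open a new validity period.
            conn.execute(
                "INSERT INTO address_history VALUES (?, ?, ?, ?)",
                (address_id, payload, snapshot_id, snapshot_id),
            )
    # Removed records need no write: their max_snapshot_id is not extended.
    conn.commit()


def state_at(conn, snapshot_id):
    """Reconstruct the full dataset as it looked at a given snapshot."""
    rows = conn.execute(
        "SELECT address_id, payload FROM address_history "
        "WHERE min_snapshot_id <= ? AND max_snapshot_id >= ?",
        (snapshot_id, snapshot_id),
    )
    return dict(rows.fetchall())


def diff(conn, old_id, new_id):
    """Return (added, removed, modified) address ids between two snapshots."""
    old, new = state_at(conn, old_id), state_at(conn, new_id)
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return added, removed, modified


conn = open_db()
apply_snapshot(conn, 1, {1: "1 Main St", 2: "2 Main St"})
apply_snapshot(conn, 2, {1: "1 Main St", 2: "2 Main Street", 3: "3 Main St"})
apply_snapshot(conn, 3, {1: "1 Main St", 3: "3 Main St"})

print(diff(conn, 1, 2))  # ([3], [], [2])
print(diff(conn, 2, 3))  # ([], [2], [])
```

In this toy run, three snapshots containing seven records in total are stored as four history rows, because unchanged records only have their `max_snapshot_id` extended.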
Fetch the latest address points from Toronto Open Data:
python run.py download
Import a specific GeoJSON file. This will automatically detect changes against the previous snapshot:
python run.py import --file data/address-points-YYYY-MM-DD.geojson
Without --file, it picks the alphabetically last .geojson in data/, which may not be the most recent date if non-date-named files (e.g. test-*.geojson) are present. Always pass the file explicitly to be safe.
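To see why the alphabetical fallback can pick the wrong file, and one way to choose the file to pass to --file, here is a small sketch. It is not part of run.py: the helpers `snapshot_date` and `latest_snapshot` are hypothetical, and the sketch assumes snapshot files follow the address-points-YYYY-MM-DD.geojson naming shown above.

```python
import re
from datetime import date

# Matches only date-named snapshot files, e.g. address-points-2024-02-01.geojson
DATED = re.compile(r"^address-points-(\d{4})-(\d{2})-(\d{2})\.geojson$")


def snapshot_date(name):
    """Return the date encoded in a snapshot file name, or None if absent."""
    match = DATED.match(name)
    if not match:
        return None
    try:
        return date(*map(int, match.groups()))
    except ValueError:
        # Well-formed pattern but impossible date, e.g. 2024-13-40.
        return None


def latest_snapshot(names):
    """Pick the file name with the most recent date, ignoring other files."""
    dated = [(snapshot_date(n), n) for n in names]
    dated = [(d, n) for d, n in dated if d is not None]
    return max(dated)[1] if dated else None


names = [
    "address-points-2024-01-31.geojson",
    "address-points-2024-02-01.geojson",
    "test-sample.geojson",
]
print(sorted(names)[-1])       # test-sample.geojson (alphabetically last)
print(latest_snapshot(names))  # address-points-2024-02-01.geojson
```

The file names could be gathered with `pathlib.Path("data").glob("*.geojson")` and the result passed explicitly to the import command.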
If you need to re-process all data (e.g., after a schema change or to backfill history), use the rebuild command.
Warning: This deletes the existing database and re-imports all files in data/ sequentially.
python run.py rebuild
Generate HTML reports for all historical snapshots and update the index:
python run.py report-all
Download, import, diff, and generate a report in a single command:
python run.py update
Two PowerShell scripts manage the Windows Task Scheduler entry. Run them as Administrator.
Add: registers a daily task that runs update at noon and appends output to logs\scheduler.log.
.\schedule-add.ps1
Remove: unregisters the task.
.\schedule-remove.ps1
The task is named TorontoAddressImport and can also be managed via the Task Scheduler GUI (taskschd.msc).
If the task fails with a "python not found" error, replace python in schedule-add.ps1 with the full path (e.g. C:\Python312\python.exe). Run where python in a terminal to find it.