For instructions on how to generate these folders, see QUICKSTART.md. For how this data is hosted and served to the dashboard, see HOSTING.md.
Where this data lives: the pipeline writes these folders to your local `data/` directory, then `upload_to_azure.py` pushes them to the `walksheds` Azure Blob Storage container. The HTML dashboard reads them from Azure at runtime (via the `DATA_BASE` URL), mirroring the exact folder layout described below. `data/` is not committed to Git.
Each city processed by this pipeline generates a folder under data/ named after its TDEI OSW dataset ID.
Example for Seattle: data/05776f25-f0f3-461c-ac34-4fa88a00936c/
All generated files live inside the data/ subfolder of that directory:
```
data/<dataset_id>/
└── data/
    ├── stops/
    ├── walkshed_geojson/
    ├── walkshed_edges_by_stop/
    ├── walkshed_edges_by_stop_wheelchair/
    ├── metrics/
    ├── csv_pois/
    └── overpass_tile_cache/
```
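As a minimal sketch of the layout above, the helper below builds the path of each subfolder for a given dataset ID. The function name `dataset_folders` and the `root` parameter are illustrative and not part of the pipeline; only the folder names come from this document.

```python
from pathlib import Path

# Subfolder names copied from the layout described in this document.
SUBFOLDERS = (
    "stops",
    "walkshed_geojson",
    "walkshed_edges_by_stop",
    "walkshed_edges_by_stop_wheelchair",
    "metrics",
    "csv_pois",
    "overpass_tile_cache",
)


def dataset_folders(dataset_id: str, root: str = "data") -> dict[str, Path]:
    """Map each subfolder name to its path under <root>/<dataset_id>/data/.

    Hypothetical helper for illustration; the pipeline scripts may build
    these paths differently.
    """
    base = Path(root) / dataset_id / "data"
    return {name: base / name for name in SUBFOLDERS}


# The Seattle dataset ID used as the example in this document.
folders = dataset_folders("05776f25-f0f3-461c-ac34-4fa88a00936c")
print(folders["metrics"].as_posix())
```

Note the two `data` levels: the outer one is the local data directory, the inner one sits inside the per-city folder.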
`stops/`

Created by: `run_city_pipeline.py` (step 2, which calls the internal `generate_bus_stops_geojson` logic)
Contains one file:
| File | Description |
|---|---|
| `{city}_bus_stops.geojson` | GeoJSON FeatureCollection of all unique bus stop locations for the city. Each Feature is a Point with `stop_id`, `agency`, `name`, `lat`, `lon` properties. This is the primary input to the walkshed pipeline: one walkshed is generated per stop. |
Why this exists: The route CSV lists the same stop many times across many routes and paths. This step deduplicates them into a clean point file, one feature per physical stop location.
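The deduplication idea can be sketched as follows. This is not the pipeline's actual code: the input column names are assumed to match the output properties listed above, and treating `(agency, stop_id)` as the identity of a physical stop is an assumption. The function name `dedupe_stops` and the sample rows are made up for illustration.

```python
import json


def dedupe_stops(rows):
    """Collapse repeated stop rows into one GeoJSON Point feature per stop.

    Assumption: (agency, stop_id) identifies a physical stop. The real
    pipeline may use a different key.
    """
    seen = set()
    features = []
    for row in rows:
        key = (row["agency"], row["stop_id"])
        if key in seen:
            continue  # same stop already emitted for another route or path
        seen.add(key)
        lat, lon = float(row["lat"]), float(row["lon"])
        features.append({
            "type": "Feature",
            # GeoJSON coordinate order is [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {
                "stop_id": row["stop_id"],
                "agency": row["agency"],
                "name": row["name"],
                "lat": lat,
                "lon": lon,
            },
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up rows: stop 100 appears twice because two routes serve it.
rows = [
    {"stop_id": "100", "agency": "Agency A", "name": "Main St", "lat": "47.60", "lon": "-122.33"},
    {"stop_id": "100", "agency": "Agency A", "name": "Main St", "lat": "47.60", "lon": "-122.33"},
    {"stop_id": "200", "agency": "Agency A", "name": "Pine St", "lat": "47.61", "lon": "-122.34"},
]
stops = dedupe_stops(rows)
print(json.dumps(stops["features"][0]["geometry"]))
```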
`walkshed_geojson/`

Created by: `run_walksheds_from_geojson.py`
Contains the city-wide walkshed edge networks for each accessibility profile, both as intermediate batch files and final merged files.
| File pattern | Description |
|---|---|
| `{city}_{Profile}_combined_edges.geojson` | Final merged file. All walkshed edges for the entire city under a given profile, concatenated into one GeoJSON FeatureCollection. This is the primary file used by `query_osm_pois.py` as the bounding box source. |
| `{city}_{Profile}_combined_edges_batch{NNNN}_{MMMM}.geojson` | Intermediate batch file. Covers stops NNNN through MMMM. Generated when running the walkshed script in batches (e.g. 500 stops at a time for large cities). Safe to delete after the final merged file is confirmed. |
Profiles present:
- `Unconstrained_Pedestrian_(Sidewalks_Only)`: unrestricted walking, uses sidewalks
- `Manual_Wheelchair`: restricted by steep grades, missing curb cuts, and obstructions
Why this exists: The TDEI Walkshed API returns the reachable street network from each stop. The combined file for the whole city is used to derive the bounding box for the OSM amenity query, so that OSM data is fetched only for the area the walksheds actually cover.
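A rough sketch of deriving a bounding box from a combined edges file is shown below. The function name `edges_bbox` is hypothetical. The `(south, west, north, east)` return order is the order the Overpass API expects; whether `query_osm_pois.py` uses the same order internally is an assumption.

```python
def edges_bbox(feature_collection):
    """Return (south, west, north, east) covering every line vertex.

    Illustrative only; the real script may compute the box differently.
    """
    lats, lons = [], []
    for feature in feature_collection["features"]:
        geometry = feature.get("geometry") or {}
        kind = geometry.get("type")
        if kind == "LineString":
            lines = [geometry["coordinates"]]
        elif kind == "MultiLineString":
            lines = geometry["coordinates"]
        else:
            continue  # ignore anything that is not a line
        for line in lines:
            # Positions are [lon, lat] with an optional elevation value.
            for lon, lat, *_ in line:
                lons.append(lon)
                lats.append(lat)
    if not lats:
        raise ValueError("no line coordinates found")
    return (min(lats), min(lons), max(lats), max(lons))


# Made-up edges standing in for a combined edges file.
edges = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {},
         "geometry": {"type": "LineString",
                      "coordinates": [[-122.34, 47.60], [-122.33, 47.61]]}},
        {"type": "Feature", "properties": {},
         "geometry": {"type": "LineString",
                      "coordinates": [[-122.35, 47.62], [-122.33, 47.60]]}},
    ],
}
bbox = edges_bbox(edges)
print(bbox)
```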
`walkshed_edges_by_stop/`

Created by: `export_walkshed_edges_per_stop.py` (called by `run_city_pipeline.py` step 4)
Contains one GeoJSON file per bus stop for the pedestrian profile.
| File pattern | Description |
|---|---|
| `{Agency}_{stop_id}.geojson` | The reachable walking network (edges/paths) from that single stop, as returned by the TDEI Walkshed API. Contains LineString features representing walkable segments within the travel time budget (~10 minutes). |
Example: Metro_Transit_22510.geojson, City_of_Seattle_1-26645.geojson
Why this exists: The HTML map loads the individual stop file when a user clicks a stop, drawing only that stop's walkshed on the map rather than loading the entire city-wide file. This keeps the map fast and interactive. There will be roughly one file per unique stop (~2,600+ for Seattle).
`walkshed_edges_by_stop_wheelchair/`

Created by: `export_walkshed_edges_per_stop.py` (called by `run_city_pipeline.py` step 6, wheelchair profile)
Identical structure to walkshed_edges_by_stop/, but for the Manual Wheelchair profile.
| File pattern | Description |
|---|---|
| `{Agency}_{stop_id}.geojson` | The reachable wheelchair-accessible network from that single stop. Edges reflect routes that meet grade, curb cut, and surface requirements for a manual wheelchair user. |
Why this exists: The HTML map's "Wheelchair" toggle loads files from this folder instead of `walkshed_edges_by_stop/`, allowing side-by-side comparison of pedestrian vs. wheelchair accessibility from the same stop.
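The relationship between the two per-stop folders can be sketched as a small path helper. The function name `stop_edge_path` is hypothetical, and the rule that spaces in the agency name become underscores is an assumption inferred from the example file names in this document; the agency names passed in below are guesses reconstructed from those file names.

```python
from pathlib import PurePosixPath


def stop_edge_path(agency: str, stop_id: str, wheelchair: bool = False) -> str:
    """Build the relative path of one stop's walkshed edge file."""
    folder = (
        "walkshed_edges_by_stop_wheelchair" if wheelchair
        else "walkshed_edges_by_stop"
    )
    # Assumption: spaces in the agency name are replaced with underscores.
    filename = f"{agency.replace(' ', '_')}_{stop_id}.geojson"
    return str(PurePosixPath(folder) / filename)


print(stop_edge_path("Metro Transit", "22510"))
print(stop_edge_path("City of Seattle", "1-26645", wheelchair=True))
```

Because both folders share the same file names, switching profiles only changes the folder part of the path.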
`metrics/`

Created by: `count_amenities_in_walksheds.py` (amenity counts, steps 8–9) and `run_walksheds_from_geojson.py` (infrastructure metrics); amenity location detail files are a by-product of `count_amenities_in_walksheds.py`.
Contains summary statistics used directly by the HTML map.
| File | Columns | Description |
|---|---|---|
| `{city}_ped_amenity_counts.csv` | `stop_id`, `agency`, `amenity_count`, `clinic`, `college`, `community_centre`, `doctors`, `food_bank`, `healthcare`, `hospital`, `library`, `nursing_home`, `place_of_worship`, `polling_station`, `school`, `social_facility`, `supermarket` | Pedestrian profile. One row per stop. Total count of reachable OSM amenities, plus a breakdown by category. Loaded by the HTML map to color-code stops and show bar charts. |
| `{city}_wc_amenity_counts.csv` | same columns | Wheelchair profile. Same structure as above but counts only amenities reachable under wheelchair constraints. |
| `{city}_ped_amenity_counts_amenity_locations.csv` | `stop_id`, `agency`, `lat`, `lon`, `name`, `amenity`, `osm_type` | Pedestrian profile. One row per (stop, amenity) pair. Lists every individual OSM amenity reachable from each stop. Used by the HTML map to place amenity markers when a stop is selected. |
| `{city}_wc_amenity_counts_amenity_locations.csv` | same columns | Wheelchair profile. Same as above for wheelchair-reachable amenities. |
| `{city}_metrics.csv` | `profile`, `uphill`, `downhill`, `avoidCurbs`, `streetAvoidance`, `max_cost`, `reverse`, `total_length`, `path_count`, `crossing_count`, `curb_count`, `marked_curbs`, `lowered_curbs` | Summary infrastructure metrics for the city's walkshed network by profile. Captures total path/edge lengths, crossing counts, curb statistics, and the profile parameters used. Useful for high-level accessibility reporting. |
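Since the pedestrian and wheelchair count files share the same columns, they can be compared stop by stop. The sketch below is one possible way to do that and is not part of the pipeline: the function name `amenity_gap` is hypothetical, the sample data is made up, and it assumes `amenity_count` holds plain numeric values.

```python
import csv
import io


def amenity_gap(ped_csv, wc_csv):
    """Return {(agency, stop_id): ped_count - wc_count} for stops in both files."""
    def load(handle):
        return {
            # int(float(...)) tolerates values written as "12" or "12.0".
            (row["agency"], row["stop_id"]): int(float(row["amenity_count"]))
            for row in csv.DictReader(handle)
        }

    ped, wc = load(ped_csv), load(wc_csv)
    return {key: ped[key] - wc[key] for key in ped if key in wc}


# Made-up data using only the three columns this sketch needs.
ped_text = "stop_id,agency,amenity_count\n100,Agency A,12\n200,Agency A,5\n"
wc_text = "stop_id,agency,amenity_count\n100,Agency A,7\n200,Agency A,5\n"
gaps = amenity_gap(io.StringIO(ped_text), io.StringIO(wc_text))
print(gaps)
```

A large positive gap marks a stop where wheelchair users reach far fewer amenities than pedestrians do.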
`csv_pois/`

Created by: `query_osm_pois.py`
| File | Columns | Description |
|---|---|---|
| `{dataset_id}_filtered_amenities.csv` | `type`, `lat`, `lon`, `node_id`, `name`, `amenity` | All OSM amenity nodes/ways found within the city's walkshed bounding box, filtered to the relevant categories (hospitals, schools, supermarkets, parks, etc.). This is the raw OSM query result; it is then spatially joined against each stop's walkshed polygon to produce the per-stop counts in `metrics/`. |
Why the name uses the dataset ID: `query_osm_pois.py` uses the parent folder name (the dataset ID) as the file prefix, since this folder can be reused across cities without naming collisions.
`overpass_tile_cache/`

Created by: `query_osm_pois.py`
Contains cached tile responses from the OSM Overpass API. The bounding box of the city is divided into a grid of tiles and each tile's response is saved here as a JSON file so re-running the script doesn't re-fetch data already downloaded.
Safe to delete if you want to force a fresh OSM query. The script will re-download and re-populate this folder automatically.
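The tile-and-cache behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's code: the function names `split_bbox` and `cached_tile`, the grid size, and the cache file naming are all hypothetical, and `fake_fetch` stands in for the real Overpass request.

```python
import json
import tempfile
from pathlib import Path


def split_bbox(south, west, north, east, rows, cols):
    """Divide a bounding box into rows x cols tiles of (south, west, north, east)."""
    lat_step = (north - south) / rows
    lon_step = (east - west) / cols
    return [
        (south + r * lat_step, west + c * lon_step,
         south + (r + 1) * lat_step, west + (c + 1) * lon_step)
        for r in range(rows)
        for c in range(cols)
    ]


def cached_tile(tile, cache_dir, fetch):
    """Return the tile's response, calling fetch(tile) only on a cache miss."""
    # Hypothetical naming scheme; the real script's file names may differ.
    name = "_".join(f"{value:.5f}" for value in tile) + ".json"
    path = Path(cache_dir) / name
    if path.exists():
        return json.loads(path.read_text())
    data = fetch(tile)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data))
    return data


calls = []


def fake_fetch(tile):
    """Stand-in for the Overpass request; records each call."""
    calls.append(tile)
    return {"elements": []}


with tempfile.TemporaryDirectory() as cache_dir:
    tiles = split_bbox(47.5, -122.4, 47.7, -122.2, rows=2, cols=2)
    for _ in range(2):  # the second pass is served entirely from the cache
        for tile in tiles:
            cached_tile(tile, cache_dir, fake_fetch)

print(len(tiles), len(calls))
```

Deleting the cache directory simply turns every lookup back into a miss, which matches the "safe to delete" note above.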
| File | Used for |
|---|---|
| `stops/seattle_bus_stops.geojson` | Not loaded directly; used as input to pipeline scripts only |
| `walkshed_edges_by_stop/{Agency}_{id}.geojson` | Drawn on map when a stop is clicked (pedestrian walkshed) |
| `walkshed_edges_by_stop_wheelchair/{Agency}_{id}.geojson` | Drawn on map when a stop is clicked (wheelchair walkshed) |
| `metrics/seattle_ped_amenity_counts.csv` | Colors stops by amenity count; populates route-level totals |
| `metrics/seattle_wc_amenity_counts.csv` | Same, for wheelchair profile |
| `metrics/seattle_ped_amenity_counts_amenity_locations.csv` | Places individual amenity markers when stop is selected |
| `metrics/seattle_wc_amenity_counts_amenity_locations.csv` | Same, for wheelchair profile |
| `walkshed_geojson/*_combined_edges.geojson` | Not loaded by the map; used only during the pipeline (OSM bbox) |
| `csv_pois/*_filtered_amenities.csv` | Not loaded by the map; intermediate pipeline data |
| `overpass_tile_cache/` | Not loaded by the map; pipeline cache only |