Spatial Quality Control for Building & Road Datasets
Detect overlaps, conflicts, and topological issues — locally, offline, and fast
OVC is a Python-based spatial quality control tool for detecting geometric and topological issues in building and road datasets. It validates your local geospatial data — Shapefiles, GeoJSON, GeoPackage — detecting overlapping buildings, boundary violations, road conflicts, and road network problems.
- Detect building overlaps (duplicate and partial) with vectorized spatial joins
- Validate boundary compliance and building-road conflicts
- Analyze road networks — disconnected segments, self-intersections, dangles
- Check geometry quality — invalid geometries, area reasonableness, compactness
- Visualize results on interactive web maps with issue highlighting
- Export GeoPackage, CSV reports, and HTML web maps
- Pre-check data quality with GeoQA integration
No internet connection required. No API rate limits. Just point OVC at your data files and get results.
| Feature | Description |
|---|---|
| Building Overlap Detection | Identify duplicate and partial overlaps via vectorized spatial joins |
| Boundary Compliance | Validate building footprints against administrative boundaries |
| Road Conflict Analysis | Detect buildings conflicting with road geometries |
| Road Network QC | Disconnected segments, self-intersections, and dangle detection |
| Geometry Quality | Invalid geometry, area reasonableness, compactness score, setback violations |
| GeoQA Pre-Check | Automated data-readiness gate before running QC pipeline |
| Interactive Web Maps | Folium-based maps with legends and issue highlighting |
| Multi-Format I/O | Shapefile, GeoJSON, GeoPackage input → GeoPackage, CSV, HTML output |
| Offline & Fast | No internet required, vectorized spatial joins (10–50× faster than loops) |
git clone https://github.com/AmmarYasser455/ovc.git
cd ovc
pip install -e ".[dev]"

Or install dependencies only:

pip install -r requirements.txt

Requirements: Python 3.10+. Depends on geopandas, shapely, pyproj, pandas, folium, fiona, and rtree.
# Buildings only (minimum required)
python scripts/run_qc.py --buildings buildings.shp --out outputs
# Buildings + Roads
python scripts/run_qc.py --buildings buildings.shp --roads roads.shp --out outputs
# Buildings + Roads + Boundary
python scripts/run_qc.py --buildings buildings.shp --roads roads.shp --boundary boundary.shp --out outputs
# Enable Road QC
python scripts/run_qc.py --buildings buildings.shp --roads roads.shp --road-qc --out outputs
# Run GeoQA pre-check before QC
python scripts/run_qc.py --buildings buildings.shp --roads roads.shp --precheck --out outputs
# Run ONLY the pre-check (skip QC)
python scripts/run_qc.py --buildings buildings.shp --precheck-only --out outputs

from ovc.export.pipeline import run_pipeline
from pathlib import Path
outputs = run_pipeline(
    buildings_path=Path("data/buildings.shp"),
    roads_path=Path("data/roads.shp"),
    boundary_path=Path("data/boundary.shp"),
    out_dir=Path("outputs"),
)
print(f"GeoPackage: {outputs.gpkg_path}")
print(f"Web map: {outputs.webmap_html}")

from ovc.checks.geometry_quality import (
    find_duplicate_geometries,
    find_invalid_geometries,
    find_unreasonable_areas,
    compute_compactness,
    find_min_road_distance_violations,
)
# buildings_metric and roads_metric are assumed to be GeoDataFrames
# already loaded and reprojected to a projected (metric) CRS.
dupes = find_duplicate_geometries(buildings_metric)
invalid = find_invalid_geometries(buildings_metric)
area_issues = find_unreasonable_areas(buildings_metric, min_area_m2=4.0)
compact = compute_compactness(buildings_metric, min_compactness=0.2)
setback = find_min_road_distance_violations(buildings_metric, roads_metric, min_distance_m=3.0)

- `--buildings` is required: this is the data OVC checks
- `--boundary` is optional: enables boundary overlap and outside-boundary checks
- `--roads` is optional: enables building-on-road conflict checks
- `--road-qc` requires `--roads`: runs road network quality checks
- All input files can be Shapefile, GeoJSON, GeoPackage, or any format supported by Fiona
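The flag rules above can be sketched with `argparse`. This is a minimal illustration, not the actual contents of `scripts/run_qc.py`: the function names `build_parser` and `parse_qc_args`, the help strings, and the default for `--out` are assumptions; only the flag names come from the usage shown earlier.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(prog="run_qc.py")
    parser.add_argument("--buildings", required=True, help="building dataset (required)")
    parser.add_argument("--roads", help="road dataset (optional)")
    parser.add_argument("--boundary", help="boundary dataset (optional)")
    parser.add_argument("--road-qc", action="store_true", help="run road network checks")
    parser.add_argument("--out", default="outputs", help="output directory (assumed default)")
    return parser


def parse_qc_args(argv: list[str]) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # --road-qc only makes sense when a road dataset is supplied.
    if args.road_qc and args.roads is None:
        parser.error("--road-qc requires --roads")
    return args


args = parse_qc_args(["--buildings", "buildings.shp", "--roads", "roads.shp", "--road-qc"])
print(args.buildings, args.roads, args.road_qc)
```

Calling `parse_qc_args(["--buildings", "buildings.shp", "--road-qc"])` exits with a usage error, which is one way to surface the dependency between the two flags early.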
OVC integrates with GeoQA — a Python package for geospatial data quality assessment — as a data-readiness gate before running the QC pipeline.
When you pass --precheck, OVC uses GeoQA to:
- Profile each input dataset (buildings, roads, boundary)
- Compute a quality score (0–100) based on geometry validity, attribute completeness, and CRS
- Detect invalid, empty, duplicate, and null geometries
- Run topology checks (slivers, bad rings, overlaps)
- Classify issues as warnings (proceed with caution) or blockers (stop — fix data first)
- Generate HTML quality reports for each dataset
Only when all datasets pass the pre-check does OVC proceed with the full QC pipeline. This catches fundamental data issues early — missing CRS, invalid geometries, empty features — saving compute time and giving clear diagnostics.
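The gating behaviour described above can be pictured roughly as follows. This sketch does not use GeoQA's real API; `PrecheckResult`, `gate`, and the sample issues are hypothetical stand-ins that only illustrate the warning/blocker distinction.

```python
from dataclasses import dataclass, field


@dataclass
class PrecheckResult:
    """Hypothetical per-dataset summary; not GeoQA's actual return type."""
    name: str
    score: float  # quality score on the 0-100 scale
    warnings: list[str] = field(default_factory=list)
    blockers: list[str] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        # Warnings allow the run to continue; any blocker stops it.
        return not self.blockers


def gate(results: list[PrecheckResult]) -> bool:
    """Return True only when every dataset is free of blockers."""
    return all(r.passed for r in results)


# Illustrative values only, not real output.
results = [
    PrecheckResult("buildings", 92.0, warnings=["3 duplicate geometries"]),
    PrecheckResult("roads", 40.0, blockers=["missing CRS"]),
]
for r in results:
    status = "ok" if r.passed else "blocked: " + "; ".join(r.blockers)
    print(f"{r.name}: score={r.score:.0f} {status}")
print("run QC pipeline" if gate(results) else "fix data first")
```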
# Install GeoQA for pre-check support
pip install geoqa

| Workflow Step | Tool | What It Does |
|---|---|---|
| Data Readiness | GeoQA | Profile, validate, and score input datasets |
| Building QC | OVC | Overlap, boundary, and road conflict detection |
| Road QC | OVC | Disconnected segments, self-intersections, dangles |
| Check | Description |
|---|---|
| Overlap Detection | Duplicate and partial overlaps via spatial join |
| Boundary Compliance | Buildings touching or outside administrative boundary |
| Road Conflict | Buildings conflicting with road geometries |
| Duplicate Geometry | Identical building footprints (WKB comparison) |
| Invalid Geometry | Self-intersections, topology errors |
| Area Reasonableness | Unreasonably small or large buildings |
| Compactness Score | Polsby-Popper shape regularity check |
| Road Setback | Buildings too close to roads |
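For reference, the Polsby-Popper score named in the table is 4πA / P², where A is the polygon area and P its perimeter; a circle scores 1 and elongated or ragged shapes score closer to 0. The pure-Python sketch below is an illustration under simplifying assumptions (a single exterior ring, no holes) and is not OVC's `compute_compactness` implementation; `polsby_popper` is a hypothetical helper.

```python
import math


def polsby_popper(ring: list[tuple[float, float]]) -> float:
    """Polsby-Popper score 4*pi*A / P**2 for a simple polygon ring.

    `ring` lists the exterior vertices in order, without repeating the
    first vertex at the end. Holes are ignored in this sketch.
    """
    area2 = 0.0      # twice the signed area (shoelace formula)
    perimeter = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        area2 += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    if perimeter == 0.0:
        return 0.0
    return 4.0 * math.pi * (abs(area2) / 2.0) / perimeter ** 2


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
sliver = [(0, 0), (100, 0), (100, 1), (0, 1)]
print(round(polsby_popper(square), 3))  # 0.785
print(round(polsby_popper(sliver), 3))  # 0.031
```

With a threshold such as `min_compactness=0.2`, the square would pass and the 100 m × 1 m sliver would be flagged.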
| Check | Description |
|---|---|
| Disconnected Segments | Roads not connected to the network |
| Self-Intersections | Roads that cross themselves |
| Dangles | Dead-end endpoints (incomplete digitization) |
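One common way to find dangles is to count how many segments touch each endpoint and report endpoints with a count of one. The sketch below shows that idea in plain Python; it is not OVC's road QC code, `find_dangles` is a hypothetical name, and it matches coordinates exactly, whereas a real tool would likely snap endpoints within a tolerance first.

```python
from collections import Counter

Point = tuple[float, float]


def find_dangles(segments: list[list[Point]]) -> list[Point]:
    """Return endpoints touched by exactly one segment.

    Each segment is a list of (x, y) vertices; only the first and last
    vertex count as endpoints.
    """
    degree: Counter = Counter()
    for seg in segments:
        degree[seg[0]] += 1
        degree[seg[-1]] += 1
    return sorted(pt for pt, count in degree.items() if count == 1)


roads = [
    [(0, 0), (5, 0)],
    [(5, 0), (10, 0)],
    [(5, 0), (5, 5)],  # side street ending at (5, 5)
]
print(find_dangles(roads))  # [(0, 0), (5, 5), (10, 0)]
```

Note that this flags every dead end, including legitimate ones at the edge of the dataset, so the results still need review.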
outputs/
├── precheck/ # GeoQA quality reports (when --precheck)
│ ├── buildings_quality_report.html
│ ├── roads_quality_report.html
│ └── boundary_quality_report.html
├── building_qc/
│ ├── building_qc.gpkg # GeoPackage with issue layers
│ ├── building_qc_map.html # Interactive web map
│ └── building_qc_metrics.csv # Summary metrics
└── road_qc/ # When --road-qc is enabled
├── road_qc.gpkg
├── road_qc_map.html
└── road_qc_metrics.csv
| Output Type | Description |
|---|---|
| GeoPackage | Spatial layers containing detected issues |
| CSV reports | Summary statistics and metrics |
| HTML web map | Interactive map for visual inspection |
| HTML quality report | GeoQA pre-check assessment (when enabled) |
Runtime thresholds can be customized in ovc/core/config.py:
| Parameter | Default | Description |
|---|---|---|
| `duplicate_ratio_min` | 0.98 | Minimum overlap ratio for duplicate classification |
| `partial_ratio_min` | 0.30 | Minimum overlap ratio for partial classification |
| `min_intersection_area_m2` | 0.5 | Minimum overlap area threshold (m²) |
| `road_buffer_m` | 1.0 | Buffer distance around roads (m) |
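To show how the three overlap thresholds might interact, here is a small sketch using axis-aligned boxes in place of real footprints. How OVC actually defines the overlap ratio is not stated here; the sketch assumes intersection area divided by the smaller footprint's area. `Thresholds` and `classify_overlap` are hypothetical names, and only the parameter names and defaults come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    duplicate_ratio_min: float = 0.98
    partial_ratio_min: float = 0.30
    min_intersection_area_m2: float = 0.5


Box = tuple[float, float, float, float]  # (minx, miny, maxx, maxy) in metres


def box_area(b: Box) -> float:
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def classify_overlap(a: Box, b: Box, t: Thresholds = Thresholds()) -> str:
    """Classify a pair as 'duplicate', 'partial', or 'none'."""
    inter = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    inter_area = box_area(inter)
    smaller = min(box_area(a), box_area(b))
    # Tiny intersections (or degenerate boxes) are ignored.
    if smaller == 0.0 or inter_area < t.min_intersection_area_m2:
        return "none"
    # Assumed ratio: intersection area over the smaller footprint's area.
    ratio = inter_area / smaller
    if ratio >= t.duplicate_ratio_min:
        return "duplicate"
    if ratio >= t.partial_ratio_min:
        return "partial"
    return "none"


a = (0.0, 0.0, 10.0, 10.0)
print(classify_overlap(a, a))                       # duplicate
print(classify_overlap(a, (5.0, 0.0, 15.0, 10.0)))  # partial
print(classify_overlap(a, (9.0, 0.0, 19.0, 10.0)))  # none
```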
| Operation | Typical Time |
|---|---|
| Building overlap detection (10k buildings) | ~20 sec |
| Road conflict detection (10k buildings) | ~30 sec |
| Full pipeline (10k buildings + roads + boundary) | ~1 min |
Key optimizations:
- Vectorized spatial joins (no Python-level loops)
- Spatial pre-filtering reduces geometry comparison count
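The pre-filtering idea can be illustrated without any GIS library: bucket bounding boxes into a coarse grid and only compare boxes that share a cell. OVC itself relies on vectorized spatial joins; the pure-Python grid below is a stand-in to show why pre-filtering cuts the number of comparisons, and `candidate_pairs` is a hypothetical helper.

```python
from collections import defaultdict
from itertools import combinations

Box = tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def boxes_intersect(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidate_pairs(boxes: list[Box], cell: float) -> set[tuple[int, int]]:
    """Return index pairs whose bounding boxes intersect, using a uniform grid."""
    grid: dict[tuple[int, int], list[int]] = defaultdict(list)
    for i, (minx, miny, maxx, maxy) in enumerate(boxes):
        # Register the box in every grid cell its extent touches.
        for gx in range(int(minx // cell), int(maxx // cell) + 1):
            for gy in range(int(miny // cell), int(maxy // cell) + 1):
                grid[(gx, gy)].append(i)
    pairs: set[tuple[int, int]] = set()
    for members in grid.values():
        # Only boxes sharing a cell are ever compared.
        for i, j in combinations(members, 2):
            if boxes_intersect(boxes[i], boxes[j]):
                pairs.add((i, j))
    return pairs


boxes = [
    (0, 0, 10, 10),
    (8, 8, 18, 18),
    (100, 100, 110, 110),
    (105, 95, 115, 105),
]
print(sorted(candidate_pairs(boxes, cell=50)))  # [(0, 1), (2, 3)]
```

Only the surviving candidate pairs would then need the more expensive exact geometry intersection.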
ovc/
├── core/ # Shared utilities, CRS handling, config, spatial indexing
├── loaders/ # Data loading and preprocessing (multi-format)
├── checks/ # Building quality checks and validation logic
├── metrics/ # Statistics and summary computation
├── export/ # Output generation (GeoPackage, CSV, web maps)
├── precheck/ # GeoQA-powered data quality assessment
└── road_qc/ # Road network quality control
├── checks/ # Disconnected, self-intersection, dangle detection
├── config.py
├── metrics.py
├── pipeline.py
└── webmap.py
For detailed design decisions, see ARCHITECTURE.md.
All vector formats readable by GeoPandas/Fiona: Shapefile, GeoJSON, GeoPackage, KML, GML, File Geodatabase, and more via GDAL/OGR drivers.
| Issue | Cause | Fix |
|---|---|---|
| CRS warnings | Local CRS without `.prj` file | Ensure data has a valid `.prj` file |
| Out of memory | Very large areas (>500 km²) | Process in smaller boundary chunks |
| Slow processing | Large datasets with O(n²) comparisons | Vectorized joins used automatically; filter to area of interest |
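For the out-of-memory case, one possible way to "process in smaller boundary chunks" is to split the study area's bounding box into square tiles and run the QC once per tile. The helper below is a hypothetical sketch (OVC does not necessarily ship such a function) and assumes coordinates in a metric CRS.

```python
import math

Bounds = tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def tile_bounds(bounds: Bounds, tile_size: float) -> list[Bounds]:
    """Split a bounding box into row-major square tiles, clipped to the box."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    minx, miny, maxx, maxy = bounds
    nx = max(1, math.ceil((maxx - minx) / tile_size))
    ny = max(1, math.ceil((maxy - miny) / tile_size))
    return [
        (
            minx + ix * tile_size,
            miny + iy * tile_size,
            min(minx + (ix + 1) * tile_size, maxx),
            min(miny + (iy + 1) * tile_size, maxy),
        )
        for iy in range(ny)
        for ix in range(nx)
    ]


# A 25 km x 10 km area split into tiles of at most 10 km x 10 km.
for tile in tile_bounds((0, 0, 25_000, 10_000), 10_000):
    print(tile)
```

Buildings that straddle a tile edge can be missed or double-counted, so in practice tiles are usually given a small overlap and the results deduplicated afterwards.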
pytest # Run full test suite
pytest --cov=ovc --cov-report=html # Run with coverage

Contributions are welcome. See CONTRIBUTING.md for guidelines.
git clone https://github.com/AmmarYasser455/ovc.git
cd ovc
pip install -e ".[dev]"
pytest

Ammar Yasser
- GitHub: @AmmarYasser455
- LinkedIn: Ammar Yasser
OVC is part of a geospatial quality control ecosystem alongside GeoQA. The project uses vectorized spatial operations powered by GeoPandas, Shapely, and Folium for interactive visualization.
@software{ovc2026,
title = {OVC: Overlap Violation Checker for Geospatial Data},
author = {Ammar Yasser Abdalazim},
year = {2026},
url = {https://github.com/AmmarYasser455/ovc},
license = {MIT}
}

**If you find this project useful, please consider giving it a star!**


