Commit f232c8b

chore: consolidate history
0 parents  commit f232c8b

42 files changed: +8713 -0 lines changed

.github/workflows/ci.yml

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-latest]
        python-version: ["3.9", "3.10", "3.11", "3.12"]

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -e ".[dev]"

      - name: Lint with ruff
        run: |
          ruff check geoqa/

      - name: Format check with black
        run: |
          black --check geoqa/ tests/

      - name: Run tests
        run: |
          pytest --cov=geoqa --cov-report=xml

      - name: Upload coverage
        if: matrix.os == 'ubuntu-latest' && matrix.python-version == '3.11'
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage.xml

.github/workflows/pages.yml

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
name: Deploy docs to GitHub Pages

on:
  push:
    branches: [main]
    paths: [docs/**]
  workflow_dispatch:

permissions:
  contents: read
  pages: write
  id-token: write

concurrency:
  group: pages
  cancel-in-progress: false

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/configure-pages@v5
      - uses: actions/upload-pages-artifact@v3
        with:
          path: docs
      - id: deployment
        uses: actions/deploy-pages@v4

.gitignore

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# Distribution / packaging
dist/
build/
*.egg-info/
*.egg

# Virtual environments
.venv/
venv/
env/

# IDE
.vscode/
.idea/
*.swp

# Testing
.pytest_cache/
.coverage
htmlcov/

# Documentation
site/

# OS
.DS_Store
Thumbs.db

# Jupyter
.ipynb_checkpoints/

# Output
outputs/
*.html
!docs/**/*.html

CHANGELOG.md

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
# Changelog

All notable changes to GeoQA will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.1.1] - 2026-02-12

### Fixed
- **Security**: Enable Jinja2 autoescape in HTML report generation to prevent XSS when dataset names or attribute values contain HTML/script content. (GeoQA-002 from audit)
- **UX**: Log a warning when loading files larger than 500 MB or datasets with more than 100,000 features, so users know to expect longer runtimes.
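
The large-input warning described in the UX entry could look roughly like the sketch below. It is an illustration only, not GeoQA's actual code: the names `size_warnings`, `warn_if_large`, the logger name, and the constants are hypothetical, and treating "MB" as 1024 * 1024 bytes is an assumption. Only the 500 MB and 100,000-feature thresholds come from the entry itself.

```python
import logging
import os

logger = logging.getLogger("geoqa.example")  # hypothetical logger name

# Thresholds from the changelog entry; "MB" is assumed to mean 1024 * 1024 bytes.
MAX_FILE_BYTES = 500 * 1024 * 1024
MAX_FEATURES = 100_000


def size_warnings(file_bytes: int, feature_count: int) -> list[str]:
    """Return one message per threshold that the input exceeds."""
    messages = []
    if file_bytes > MAX_FILE_BYTES:
        messages.append(
            f"File is {file_bytes / (1024 * 1024):.0f} MB; expect longer runtimes."
        )
    if feature_count > MAX_FEATURES:
        messages.append(
            f"Dataset has {feature_count:,} features; expect longer runtimes."
        )
    return messages


def warn_if_large(path: str, feature_count: int) -> list[str]:
    """Log a warning for each exceeded threshold and return the messages."""
    messages = size_warnings(os.path.getsize(path), feature_count)
    for message in messages:
        logger.warning(message)
    return messages
```

Keeping the threshold comparison in a pure function (`size_warnings`) separate from the file-system call makes the behavior easy to test without creating a 500 MB file.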

### Added
- 3 new tests: XSS prevention, special characters in dataset name, and large-file warning validation.

## [0.1.0] - 2026-02-11

### Added

- **Core profiling**: `geoqa.profile()` one-liner for dataset analysis
- **GeoProfile class**: Complete dataset profiling with quality scoring
- **Geometry checks**: Validity, empty, null, duplicate, mixed-type detection
- **Attribute profiling**: Data types, null analysis, statistics, top values
- **Spatial analysis**: CRS info, bounds, area/length/perimeter statistics
- **Interactive maps**: Folium-based visualization with quality highlighting
- **HTML reports**: Self-contained quality reports with Jinja2 templates
- **CLI interface**: `geoqa profile`, `geoqa report`, `geoqa check`, `geoqa show`
- **Rich output**: Beautiful terminal output with tables and colors
- **Quality scoring**: Weighted 0-100 score based on multiple criteria
- **Format support**: All Fiona/GDAL-supported vector formats
- **Auto-fix**: `GeometryChecker.fix_invalid()` for geometry repair
- **Comprehensive docs**: README, CONTRIBUTING, API docs, example notebooks
- **Test suite**: pytest-based tests for all modules
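
To make the "weighted 0-100 score" entry concrete, here is a minimal sketch of one way such a score can be computed. The criteria names and weights below are hypothetical and chosen only for illustration; the changelog does not state which criteria or weights GeoQA actually uses.

```python
def quality_score(ratios: dict[str, float], weights: dict[str, float]) -> float:
    """Combine per-criterion ratios (each in [0, 1]) into a 0-100 score."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = 0.0
    for name, weight in weights.items():
        # Missing criteria count as 0; out-of-range ratios are clamped to [0, 1].
        ratio = min(max(ratios.get(name, 0.0), 0.0), 1.0)
        weighted += weight * ratio
    return round(100.0 * weighted / total_weight, 1)


# Hypothetical criteria and weights, for illustration only.
example_weights = {"valid_geometry": 50, "attribute_completeness": 30, "has_crs": 20}
example_ratios = {"valid_geometry": 0.9, "attribute_completeness": 0.8, "has_crs": 1.0}
print(quality_score(example_ratios, example_weights))
```

Normalizing by the total weight means the weights do not have to sum to 100, so a criterion can be added or re-weighted without rescaling the others.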

CONTRIBUTING.md

Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
# Contributing to GeoQA

Thank you for your interest in contributing to GeoQA! This document provides guidelines and instructions for contributing.

## 🚀 Getting Started

### Prerequisites

- Python 3.9+
- Git
- A code editor (VS Code recommended)

### Development Setup

1. **Fork and clone** the repository:
   ```bash
   git clone https://github.com/YOUR_USERNAME/geoqa.git
   cd geoqa
   ```

2. **Create a virtual environment**:
   ```bash
   python -m venv .venv
   source .venv/bin/activate  # Linux/Mac
   .venv\Scripts\activate     # Windows
   ```

3. **Install in development mode**:
   ```bash
   pip install -e ".[dev]"
   ```

4. **Run tests** to verify setup:
   ```bash
   pytest
   ```

## 📋 Development Workflow

1. **Create a branch** for your work:
   ```bash
   git checkout -b feature/your-feature-name
   ```

2. **Make your changes** following the code style guidelines below.

3. **Write tests** for new functionality.

4. **Run the test suite**:
   ```bash
   pytest
   ```

5. **Format your code**:
   ```bash
   black geoqa/ tests/
   isort geoqa/ tests/
   ```

6. **Commit your changes** with a clear message:
   ```bash
   git commit -m "feat: add new quality check for overlapping features"
   ```

7. **Push and create a Pull Request**.

## 🎨 Code Style

- **Formatter**: Black (line length = 100)
- **Import sorting**: isort (black profile)
- **Linter**: Ruff
- **Type hints**: Required for all public functions
- **Docstrings**: Google-style docstrings for all public modules, classes, and functions

### Example

```python
from typing import Any

import geopandas as gpd


def check_validity(gdf: gpd.GeoDataFrame) -> dict[str, Any]:
    """Check geometry validity for all features.

    Args:
        gdf: The GeoDataFrame to validate.

    Returns:
        Dictionary with valid_count, invalid_count, and invalid_indices.

    Raises:
        ValueError: If the GeoDataFrame has no geometry column.
    """
    ...
```

## 📝 Commit Messages

Follow [Conventional Commits](https://www.conventionalcommits.org/):

- `feat:` New feature
- `fix:` Bug fix
- `docs:` Documentation changes
- `test:` Test additions/modifications
- `refactor:` Code refactoring
- `style:` Formatting changes
- `chore:` Build/CI changes

## 🧪 Testing

- Write tests in the `tests/` directory using pytest.
- Aim for comprehensive coverage of new functionality.
- Use fixtures for common test data.
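
As an illustration of the fixture guideline, the sketch below shares a tiny dataset between tests. The names `build_sample_features` and `sample_features` are hypothetical and not taken from the GeoQA test suite; the data is a plain GeoJSON-style dictionary so the example needs nothing beyond pytest.

```python
import pytest


def build_sample_features() -> dict:
    """Build a tiny GeoJSON FeatureCollection: one square polygon, one null geometry."""
    square = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 0.0]]
    return {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "properties": {"name": "square"},
                "geometry": {"type": "Polygon", "coordinates": [square]},
            },
            {"type": "Feature", "properties": {"name": "missing"}, "geometry": None},
        ],
    }


@pytest.fixture
def sample_features() -> dict:
    """Shared test data; pytest passes it to any test that names it as an argument."""
    return build_sample_features()


def test_sample_contains_one_null_geometry(sample_features):
    null_count = sum(1 for f in sample_features["features"] if f["geometry"] is None)
    assert null_count == 1
```

Placing a fixture like this in `tests/conftest.py` makes it available to every test module without an explicit import.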

## 📜 Code of Conduct

Be respectful and constructive. We follow the [Contributor Covenant](https://www.contributor-covenant.org/) Code of Conduct.

## ❓ Questions?

Open an issue or start a discussion on GitHub. We're happy to help!

LICENSE

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Ammar Yasser Abdalazim

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

LINKEDIN_POST.md

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
# LinkedIn Post: GeoQA Launch

---

## Post Text

I'm excited to share **GeoQA**, an open-source Python package I built for **geospatial data quality assessment and interactive profiling**.

As a GIS developer, I kept running into the same frustrating question: *"Is this geodata good enough to use?"* There was no quick, automated way to answer it.

So I built GeoQA.

**What it does:**

- Profiles any vector dataset (Shapefile, GeoJSON, GeoPackage) with a single function call
- Computes an overall **quality score (0–100)** based on geometry validity, attribute completeness, and CRS
- Detects **invalid, empty, and duplicate geometries** automatically
- Generates **interactive web maps** with quality-issue highlighting
- Produces **self-contained HTML quality reports** with charts, tables, and spatial statistics

**What makes it unique:**

GeoQA is like **ydata-profiling**, but purpose-built for geodata. It understands geometry types, coordinate systems, spatial topology, and the real-world data problems that GIS professionals deal with daily.

**It also powers the pre-check module in OVC (Overlap Violation Checker)**, my other open-source tool for detecting overlapping buildings, road conflicts, and topological errors. Before OVC runs its spatial QC pipeline, GeoQA profiles every input dataset to catch fundamental issues early (missing CRS, invalid geometries, empty features), saving compute time and giving clear diagnostics upfront.

Together, GeoQA + OVC form a **complete geospatial quality control workflow**: GeoQA assesses data readiness, and OVC performs deep spatial validation.

**Who is this for?**

- GIS Analysts validating shapefiles before analysis
- Urban Planners checking building and road datasets
- Survey Engineers ensuring geometry integrity
- Data Engineers building geospatial ETL pipelines
- Government agencies auditing cadastral and infrastructure data
- Academic researchers profiling geodata for publications
- Anyone working with vector geospatial data who wants automated quality checks

Both tools are **free, open-source, and MIT-licensed**.

**Try it:**
- GeoQA on PyPI: https://pypi.org/project/geoqa/
- GeoQA on GitHub: https://github.com/AmmarYasser455/geoqa
- OVC on GitHub: https://github.com/AmmarYasser455/ovc

I'd love to hear your feedback. Try it on your own datasets and let me know what you think!

---

## Recommended Media (attach to post)

1. **Screenshot of `profile.summary()`**: the rich-formatted terminal output showing dataset overview, quality score, and geometry checks
2. **Screenshot of the interactive Folium web map**: quality-highlighted features (valid in blue, issues in red)
3. **Screenshot of the HTML quality report**: the gradient header with quality score badge, overview cards, and charts
4. **Screenshot of OVC + GeoQA working together**: the pre-check output showing data readiness assessment before QC
5. **A short GIF or video** (60–90 seconds) showing the full workflow: profile → map → report in Jupyter

> Tip: LinkedIn favors **carousel posts (PDF)** and **native video**. Consider combining 3–4 screenshots into a carousel PDF for higher engagement.

---

## Hashtags

#GIS #Geospatial #Python #OpenSource #DataQuality #GeoQA #OVC
#GeoPandas #Mapping #SpatialData #GISDev #UrbanPlanning
#OpenData #DataScience #QualityAssurance #QualityControl
#SurveyEngineering #DataEngineering #WebMapping #Cartography
#RemoteSensing #SpatialAnalysis #BuildingData #RoadNetwork
#GISProfessionals #PythonDev #FOSS4G #Leaflet #Folium

---
