r3nanp/crime-analytics

Crime Analytics

A modern, open-source toolkit for extracting, cleaning, and analyzing crime data from the state of Ceará, Brazil. Crime Analytics converts official PDF crime reports into structured CSV/Excel datasets and provides tools for exploratory analysis, visualization, and reporting, whether you work locally, in the cloud, or on Google Colab.


🚦 Features

  • Automated PDF Extraction: Batch-download government crime report PDFs and convert them into clean, analysis-ready CSV or Excel files.

  • Data Cleaning & Standardization: Remove duplicates, handle missing values, and harmonize column names and formats.

  • Exploratory & Statistical Analysis: Generate summary statistics, spot trends, and assess data quality.

  • Temporal & Spatial Insights: Analyze crime patterns over time and across locations (AIS, municipalities, neighborhoods).

  • Interactive Visualizations: Create charts, heatmaps, and dashboards for deeper insights.

  • Google Colab & Jupyter Support: Run the full workflow interactively in notebooks.

  • Automated Reporting: Export findings as text summaries, charts, and data files.
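
To make the cleaning and standardization feature concrete, here is a minimal sketch using only the Python standard library. It is not the project's actual implementation: the function names `normalize_column` and `clean_rows` are hypothetical, and it assumes rows arrive as dictionaries, for example from `csv.DictReader`.

```python
import re
import unicodedata


def normalize_column(name: str) -> str:
    """Lowercase a header, strip accents, and convert it to snake_case."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Collapse every run of non-alphanumeric characters into one underscore.
    return re.sub(r"[^0-9a-z]+", "_", ascii_only.lower()).strip("_")


def clean_rows(rows: list[dict[str, str]]) -> list[dict[str, str]]:
    """Harmonize headers, trim values, and drop exact duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # Missing values (None) become empty strings so they compare consistently.
        normalized = {
            normalize_column(key): (value or "").strip()
            for key, value in row.items()
        }
        fingerprint = tuple(sorted(normalized.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(normalized)
    return cleaned
```

With this sketch, a header such as "Município" becomes `municipio`, and two rows that differ only in surrounding whitespace collapse into one.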


🗂️ Project Structure

data/
├── ais_ce.geojson
├── ais.json
├── ceara.geojson
└── fortaleza-neighborhood.geojson

lib/
├── crawler.py
└── pdf_converter.py

notebooks/
└── project.ipynb

src/
├── geojson_ais.py
├── main.py
├── maps.py
└── plot.py

⚡ Getting Started

1. Install Dependencies

This project uses uv for dependency management. Install everything with:

uv sync

Alternatively, you can use plain pip:

pip install -r requirements.txt

2. Run the Crawler

The crawler downloads PDF crime reports into the data/pdfs/ directory.

uv run lib/crawler.py
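
For readers curious about what a download step like this involves, the following is a rough sketch and not the contents of lib/crawler.py. The function names and the example URL are assumptions made for illustration; only the data/pdfs/ directory comes from this README. It maps each report URL to a local file and skips files that were already downloaded.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen

PDF_DIR = Path("data/pdfs")  # output directory named in this README


def target_path(url: str, directory: Path = PDF_DIR) -> Path:
    """Map a report URL to a local file path inside the PDF directory."""
    filename = Path(urlparse(url).path).name
    return directory / filename


def download_reports(urls: list[str], directory: Path = PDF_DIR) -> list[Path]:
    """Download each PDF once, skipping files that already exist locally."""
    directory.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        destination = target_path(url, directory)
        if not destination.exists():
            with urlopen(url, timeout=30) as response:
                destination.write_bytes(response.read())
        saved.append(destination)
    return saved


# Hypothetical usage; replace the placeholder with real report URLs:
# download_reports(["https://example.org/reports/cvli-2024.pdf"])
```

Skipping existing files keeps repeated runs cheap and avoids re-fetching reports that have not changed.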

3. Convert PDFs to CSV

Use the PDF converter to extract structured data from the downloaded reports.

uv run lib/pdf_converter.py

This generates data/cvli.csv.


4. Run the Main Pipeline

Process the dataset, generate visualizations, and build outputs (maps, stats, charts).

uv run src/main.py
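
As an illustration of the kind of aggregation such a pipeline might perform, the sketch below counts records per area from CSV text. It is an assumption-laden example, not the code in src/main.py: the column name `AIS` and the function names are hypothetical, so check the header of data/cvli.csv before adapting it.

```python
import csv
from collections import Counter
from typing import Iterable


def count_by_column(lines: Iterable[str], column: str) -> Counter:
    """Count CSV records per distinct value of one column."""
    reader = csv.DictReader(lines)
    # Rows with an empty or missing value in the column are ignored.
    return Counter(row[column] for row in reader if row.get(column))


def top_n(counts: Counter, n: int = 10) -> list[tuple[str, int]]:
    """Return the n most frequent values with their counts."""
    return counts.most_common(n)


# Hypothetical usage against the generated dataset:
# with open("data/cvli.csv", newline="", encoding="utf-8") as handle:
#     print(top_n(count_by_column(handle, "AIS")))
```

Because `csv.DictReader` accepts any iterable of lines, the same function works on an open file or on a list of strings in a notebook cell.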

📊 Outputs

  • CSV/Excel datasets (data/cvli.csv, data/ais_analysis.xlsx, etc.)
  • GeoJSON files (data/ais_ce.geojson, data/ceara.geojson, etc.)
  • Visualizations (data/top_ais_cvli.png, data/cvli_heatmap_ais.html)
  • Interactive maps for spatial exploration of crime data.
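
Before plotting, it can help to check what the GeoJSON outputs contain. The sketch below relies only on the standard GeoJSON layout (a FeatureCollection with a `features` list whose items carry `properties`); the function names are hypothetical and the property keys in the project's files are not documented here.

```python
import json
from pathlib import Path


def summarize_geojson(collection: dict) -> dict:
    """Return the feature count and the property keys seen in a FeatureCollection."""
    features = collection.get("features", [])
    keys = set()
    for feature in features:
        # A feature's "properties" member may be null in valid GeoJSON.
        keys.update((feature.get("properties") or {}).keys())
    return {"features": len(features), "property_keys": sorted(keys)}


def load_geojson(path: Path) -> dict:
    """Read a GeoJSON file from disk into a dictionary."""
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


# Hypothetical usage with one of the files listed above:
# print(summarize_geojson(load_geojson(Path("data/ais_ce.geojson"))))
```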

🤝 Contributing

Contributions are welcome! Feel free to open issues, suggest improvements, or submit PRs to extend the toolkit.

