This repository contains a comprehensive data processing pipeline for YMCA volunteer statistics and reporting, designed to generate data for the Y Monthly Statistics Report 8.31.2025 PowerPoint presentation.
- File: src/extractors/volunteer_history_extractor.py
- Output: data/raw/VolunteerHistory_YYYYMMDD_HHMMSS.xlsx
- Purpose: Downloads volunteer data from the VolunteerMatters API (Jan 1, 2025 to current date)
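The YYYYMMDD_HHMMSS suffix used in the output names can be produced with a standard `strftime` pattern. The following is a minimal sketch of that naming convention; the helper name `timestamped_name` is hypothetical and not taken from the repository.

```python
from datetime import datetime


def timestamped_name(prefix: str, when: datetime, ext: str = "xlsx") -> str:
    """Build a file name such as VolunteerHistory_20250831_140509.xlsx.

    `timestamped_name` is an illustrative helper, not the extractor's real API.
    """
    # %Y%m%d_%H%M%S renders as YYYYMMDD_HHMMSS, zero-padded.
    return f"{prefix}_{when:%Y%m%d_%H%M%S}.{ext}"
```

Calling `timestamped_name("VolunteerHistory", datetime.now())` would yield a name matching the Output pattern above.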
- File: src/processors/data_preparation.py
- Output: data/processed/Raw_Data_YYYYMMDD_HHMMSS.xlsx
- Purpose: Removes 0-hour entries and saves the clean "Raw Data" used by all subsequent analysis
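The 0-hour filter can be illustrated with plain Python rows. This is a sketch only: the column name `HOURS` is an assumption, blank or missing values are treated as zero (also an assumption), and the real processor presumably works on a pandas DataFrame rather than a list of dicts.

```python
def drop_zero_hour_entries(rows, hours_key="HOURS"):
    """Return the rows whose hours value is not zero.

    Assumptions: the hours column is called HOURS, and a blank or
    missing value counts as zero and is therefore dropped.
    """
    return [row for row in rows if float(row.get(hours_key) or 0) != 0]
```

For example, a row with `HOURS` equal to `0` or `None` is removed, while `2.5` or `"1"` is kept.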
- File: src/processors/project_statistics.py
- Output: data/processed/Y_Volunteer_2025_Statistics_YYYYMMDD_HHMMSS.xlsx
- Purpose:
  - Hours by PROJECT TAG (no deduplication)
  - Volunteers by PROJECT CATALOG (deduplicated by ASSIGNEE, PROJECT CATALOG, BRANCH)
  - Projects by PROJECT TAG vs PROJECT (with manual adjustments for Competitive Swim/Gymnastics)
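The difference between the non-deduplicated hours total and the deduplicated volunteer count can be sketched as follows. The column names PROJECT TAG, PROJECT CATALOG, ASSIGNEE, and BRANCH come from the description above; `HOURS` and both function names are assumptions, and the manual Competitive Swim/Gymnastics adjustments are not modeled here.

```python
from collections import defaultdict


def hours_by_tag(rows):
    """Sum hours per PROJECT TAG; every row counts (no deduplication)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["PROJECT TAG"]] += row["HOURS"]  # HOURS is an assumed column name
    return dict(totals)


def volunteers_by_catalog(rows):
    """Count volunteers per PROJECT CATALOG after deduplicating on
    (ASSIGNEE, PROJECT CATALOG, BRANCH)."""
    unique = {(r["ASSIGNEE"], r["PROJECT CATALOG"], r["BRANCH"]) for r in rows}
    counts = defaultdict(int)
    for _assignee, catalog, _branch in unique:
        counts[catalog] += 1
    return dict(counts)
```

Because BRANCH is part of the deduplication key, a volunteer who serves the same catalog at two branches is counted once per branch.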
- File: src/processors/branch_breakdown.py
- Output: data/processed/Y_Volunteer_2025_Branch_Breakdown_YYYYMMDD_HHMMSS.xlsx
- Purpose:
  - Branch Hours (no deduplication)
  - Active Volunteers (deduplicated by ASSIGNEE, BRANCH)
  - Member Volunteers (filtered by "Yes" for YMCA membership)
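A rough sketch of the active and member volunteer counts per branch is shown below. The membership column name `YMCA MEMBER` is hypothetical, as is the choice to deduplicate member volunteers the same way as active volunteers; the description above only states that members are filtered by "Yes".

```python
def active_volunteers_by_branch(rows):
    """Count volunteers per branch after deduplicating on (ASSIGNEE, BRANCH)."""
    pairs = {(r["ASSIGNEE"], r["BRANCH"]) for r in rows}
    counts = {}
    for _assignee, branch in pairs:
        counts[branch] = counts.get(branch, 0) + 1
    return counts


def member_volunteers_by_branch(rows, member_key="YMCA MEMBER"):
    """Keep rows marked "Yes" for membership, then count per branch.

    `member_key` is an assumed column name; the comparison ignores
    case and surrounding whitespace.
    """
    members = [
        r for r in rows
        if str(r.get(member_key, "")).strip().lower() == "yes"
    ]
    return active_volunteers_by_branch(members)
```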
- File: src/processors/yde_breakdown.py
- Output: data/processed/Y_Volunteer_2025_YDE_Breakdown_YYYYMMDD_HHMMSS.xlsx
- Purpose:
  - YDE - Community Services (includes Music Resource Center)
  - YDE - Early Learning Centers
  - YDE - Out of School Time
  - Hours, Volunteers, and Project numbers for each category
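One way to fold Music Resource Center into YDE - Community Services and summarize each category is sketched below. It assumes the category names appear in the BRANCH column and that columns named `HOURS` and `PROJECT` exist; none of this is confirmed by the repository, so treat the names as placeholders.

```python
YDE_CATEGORIES = (
    "YDE - Community Services",
    "YDE - Early Learning Centers",
    "YDE - Out of School Time",
)

# Branches reported under another category (from the description above).
FOLDED_INTO = {"Music Resource Center": "YDE - Community Services"}


def yde_summary(rows):
    """Return hours, unique volunteers, and unique projects per YDE category.

    Assumes BRANCH holds the category name; rows from other branches are ignored.
    """
    buckets = {
        c: {"hours": 0.0, "volunteers": set(), "projects": set()}
        for c in YDE_CATEGORIES
    }
    for row in rows:
        category = FOLDED_INTO.get(row["BRANCH"], row["BRANCH"])
        if category not in buckets:
            continue
        bucket = buckets[category]
        bucket["hours"] += row["HOURS"]
        bucket["volunteers"].add(row["ASSIGNEE"])
        bucket["projects"].add(row["PROJECT"])
    return {
        c: {
            "hours": b["hours"],
            "volunteers": len(b["volunteers"]),
            "projects": len(b["projects"]),
        }
        for c, b in buckets.items()
    }
```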
- File: src/processors/senior_centers_breakdown.py
- Output: data/processed/Y_Volunteer_2025_Senior_Centers_YYYYMMDD_HHMMSS.xlsx
- Purpose:
  - Clippard YMCA + Clippard Senior Center
  - R.C. Durr YMCA + Kentucky Senior Center
  - Combined data for ease of reading
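The pairing of each YMCA branch with its senior center can be expressed as a lookup table. This sketch sums hours per combined label; the `HOURS` column name and the assumption that the four names appear verbatim in BRANCH are illustrative, not confirmed.

```python
CLIPPARD = "Clippard YMCA + Clippard Senior Center"
DURR = "R.C. Durr YMCA + Kentucky Senior Center"

# Map each individual location to its combined reporting label.
COMBINED_LABEL = {
    "Clippard YMCA": CLIPPARD,
    "Clippard Senior Center": CLIPPARD,
    "R.C. Durr YMCA": DURR,
    "Kentucky Senior Center": DURR,
}


def combined_hours(rows):
    """Sum hours per combined label; rows from other branches are skipped."""
    totals = {}
    for row in rows:
        label = COMBINED_LABEL.get(row["BRANCH"])
        if label is None:
            continue
        totals[label] = totals.get(label, 0.0) + row["HOURS"]
    return totals
```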
- File: data/processed/YMCA_Volunteer_Summary_Report.txt
- Purpose: Single comprehensive report containing summaries from all processing steps
├── src/
│   ├── extractors/                   # Data extraction from external APIs
│   │   └── volunteer_history_extractor.py
│   ├── processors/                   # Data cleaning and statistical analysis
│   │   ├── data_preparation.py
│   │   ├── project_statistics.py
│   │   ├── branch_breakdown.py
│   │   ├── yde_breakdown.py
│   │   └── senior_centers_breakdown.py
│   └── utils/                        # Shared utilities and configurations
│       ├── logging_config.py
│       └── file_utils.py
├── data/
│   ├── raw/                          # Raw extracted data files
│   └── processed/                    # Cleaned and analyzed data
├── docs/                             # Documentation files
│   ├── README.md                     # Main documentation
│   ├── BAR_CHART_USAGE.md            # Bar chart usage guide
│   ├── LINE_GRAPH_README.md          # Line graph documentation
│   ├── PIE_CHARTS_README.md          # Pie chart documentation
│   ├── data_quality_dashboard.html   # Data quality dashboard
│   └── frontend/                     # Frontend guidelines
├── visualization_tools/              # Data visualization tools
│   ├── README.md                     # Visualization tools documentation
│   ├── generate_*.py                 # Chart generation scripts
│   ├── create_*.py                   # Chart creation scripts
│   ├── visualizations/               # Visualization modules
│   ├── visualizers/                  # Visualizer modules
│   ├── charts/                       # Generated chart outputs
│   └── final_charts/                 # Final processed charts
├── tools/                            # Additional tools and utilities
│   ├── README.md                     # Tools overview documentation
│   ├── reporting/                    # Advanced reporting tools
│   ├── scheduling/                   # Automated scheduling tools
│   ├── comparison/                   # Data comparison tools
│   └── web_dashboard/                # Web-based dashboard
├── logs/                             # Application logs
├── main.py                           # Main entry point
└── requirements.txt                  # Python dependencies
- Run Data Extraction: python src/extractors/volunteer_history_extractor.py
- Run Data Preparation: python src/processors/data_preparation.py
- Run Project Statistics: python src/processors/project_statistics.py
- Run Branch Breakdown: python src/processors/branch_breakdown.py
- Run YDE Breakdown: python src/processors/yde_breakdown.py
- Run Senior Centers: python src/processors/senior_centers_breakdown.py
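The six commands above can also be chained from a small script. The sketch below is one possible approach, not a description of how main.py works; `run_pipeline` and `PIPELINE_STEPS` are hypothetical names, and the script paths are the ones listed above.

```python
import subprocess
import sys

# Script paths in the order listed above.
PIPELINE_STEPS = [
    "src/extractors/volunteer_history_extractor.py",
    "src/processors/data_preparation.py",
    "src/processors/project_statistics.py",
    "src/processors/branch_breakdown.py",
    "src/processors/yde_breakdown.py",
    "src/processors/senior_centers_breakdown.py",
]


def run_pipeline(steps=PIPELINE_STEPS, runner=subprocess.run):
    """Run each script in order with the current Python interpreter.

    check=True makes subprocess.run raise CalledProcessError on a
    non-zero exit code, so later steps do not run on stale input.
    `runner` is injectable so the sequence can be tested without
    executing anything.
    """
    for script in steps:
        runner([sys.executable, script], check=True)
```

Calling `run_pipeline()` from the repository root would execute the steps sequentially and stop at the first failure.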
The system includes comprehensive visualization tools located in the visualization_tools/ directory:
- Bar Charts: python visualization_tools/generate_bar_charts.py
- Line Graphs: python visualization_tools/generate_line_graphs.py
- Pie Charts: python visualization_tools/create_pie_charts.py
- Scatter Plots: python visualization_tools/create_scatter_plots.py
- Histograms: python visualization_tools/generate_histograms.py
- Automatic data detection from Excel files
- Professional styling with seaborn
- Multiple output formats (PNG, HTML, JSON)
- Flexible time period aggregation
- Trend line overlays
- visualization_tools/README.md - Complete visualization tools guide
- docs/BAR_CHART_USAGE.md - Bar chart usage instructions
- docs/LINE_GRAPH_README.md - Line graph documentation
- docs/PIE_CHARTS_README.md - Pie chart documentation
Additional tools are located in the tools/ directory:
- Custom Date Range Reports: python tools/reporting/generate_custom_date_range_reports.py
- Quick Metrics Summary: python tools/reporting/quick_metrics_summary.py
- Flexible Report Generator: python tools/reporting/flexible_report_generator.py
- Schedule Manager: python tools/scheduling/schedule_manager.py
- Pipeline Runner: python tools/scheduling/run_scheduled_pipeline.py
- Automated Scheduling: Cross-platform scheduling support
- Monthly Comparison: python tools/comparison/monthly_volunteer_comparison.py
- Data Analysis: Compare volunteer data across time periods
- Trend Analysis: Identify patterns and growth trends
- Web Interface: ./tools/web_dashboard/run_web_dashboard.sh
- Browser Access: View data and reports in a web browser
- Interactive Dashboard: Real-time data visualization
- tools/README.md - Complete tools overview
- docs/CSV_EXPORT_USAGE.md - CSV export functionality
- docs/FLEXIBLE_DATE_RANGE_REPORT_GENERATOR.md - Custom date range reports
- docs/QUICK_SUMMARY_CLI.md - Quick metrics summary
- docs/SCHEDULER_README.md - Scheduling and automation
- docs/WEB_DASHBOARD_README.md - Web dashboard interface
All Excel files are formatted for direct import into your Y Monthly Statistics Report 8.31.2025 presentation, with each page having its own dedicated Excel file containing the specific pivot tables and summaries you need.
The system automatically handles:
- ✅ Date filtering (Jan 1, 2025 to current)
- ✅ 0-hour entry removal
- ✅ Proper deduplication strategies
- ✅ Manual adjustments for Competitive Swim/Gymnastics
- ✅ Music Resource Center inclusion in YDE - Community Services
- ✅ Senior Center combinations
- ✅ Monthly data validation preparation
- Install dependencies: pip install -r requirements.txt
- Run the main entry point: python main.py
- Or run individual steps:
  - Step 1, extract data: python src/extractors/volunteer_history_extractor.py
  - Step 2, prepare data: python src/processors/data_preparation.py
  - Step 3, generate statistics: python src/processors/project_statistics.py
- API credentials are configured in src/extractors/volunteer_history_extractor.py
- Logging configuration is centralized in src/utils/logging_config.py
- File handling utilities are in src/utils/file_utils.py
- pandas>=1.5.0 - Data manipulation and analysis
- requests>=2.28.0 - HTTP requests for API calls
- openpyxl>=3.0.0 - Excel file handling
- Each processing step can be run independently
- Log files are automatically created in the logs/ directory
- Data files follow consistent naming conventions with timestamps
- The system handles various deduplication methods based on reporting needs
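Because each step can run independently and every data file carries a timestamp, a later step needs some way to pick the newest output of an earlier one. The sketch below shows one way to do that from file names alone; `latest_file` is a hypothetical helper and may differ from whatever src/utils/file_utils.py actually provides.

```python
import re
from datetime import datetime


def latest_file(names, prefix):
    """Return the name with the newest PREFIX_YYYYMMDD_HHMMSS.xlsx stamp.

    Names that do not match the pattern exactly, or whose stamp is not
    a valid date and time, are ignored. Returns None if nothing matches.
    """
    pattern = re.compile(re.escape(prefix) + r"_(\d{8}_\d{6})\.xlsx")
    best_name, best_time = None, None
    for name in names:
        match = pattern.fullmatch(name)
        if match is None:
            continue
        try:
            stamp = datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
        except ValueError:
            continue  # digits present but not a real date/time
        if best_time is None or stamp > best_time:
            best_name, best_time = name, stamp
    return best_name
```

For example, passing the names found in data/processed/ with the prefix `Raw_Data` would select the most recent prepared data file.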