A Python pipeline to extract contact information from school websites in North Holland and send standardized emails.
- Filters CSV data for active schools in Noord-Holland
- Scrapes school websites for email addresses and phone numbers
- Composes personalized emails with school-specific information
- Sends emails via SMTP
- Saves all data for review and tracking
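
As a rough illustration of the scraping step listed above, the sketch below pulls email addresses and Dutch-style phone numbers out of page text with regular expressions. The patterns and the `extract_contacts` name are assumptions made for this example; the actual rules in scraper.py may differ.

```python
import re

# Assumed patterns: deliberately simple, not necessarily what scraper.py uses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Dutch numbers: "+31" or a leading "0", then nine digits,
# each optionally preceded by a space or dash.
PHONE_RE = re.compile(r"(?:\+31|0)(?:[\s-]?\d){9}")


def extract_contacts(text):
    """Return de-duplicated emails and phone numbers found in page text."""
    emails = sorted({m.lower() for m in EMAIL_RE.findall(text)})
    # Strip separators so the same number written two ways compares equal.
    phones = sorted({re.sub(r"[\s-]", "", m) for m in PHONE_RE.findall(text)})
    return {"emails": emails, "phones": phones}
```

Simple patterns like these will miss obfuscated addresses (for example "info [at] school.nl"), which is one reason the composed emails are worth reviewing before sending.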
- Install dependencies:
    pip install -r requirements.txt
- Configure email settings:
  - Copy .env.example to .env
  - Fill in your SMTP credentials
Run the main pipeline:
    python main.py
The script will:
- Filter schools from the CSV file (Noord-Holland)
- Scrape their websites for contact information
- Compose standardized emails using email_template.txt
- Save all email data to emails_to_send.json
Note: The script requires email_template.txt to exist. If it doesn't exist, email composition will be skipped.
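
The composition step, including the skip when the template is missing, might look roughly like the sketch below. It assumes the template uses `$name`-style placeholders (Python's `string.Template`) and that each school is a dictionary with an `email` key; both are assumptions for illustration, as is the `compose_email` name.

```python
from pathlib import Path
from string import Template


def compose_email(school, template_path="email_template.txt"):
    """Fill the template for one school, or return None if the template is missing."""
    path = Path(template_path)
    if not path.exists():
        # Mirrors the documented behaviour: composition is skipped.
        return None
    template = Template(path.read_text(encoding="utf-8"))
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    body = template.safe_substitute(school)
    return {"to": school.get("email"), "body": body}
```

Using `safe_substitute` means a school with a missing field still produces an email, with the raw placeholder visible in the body, so it can be caught during review.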
After running the main script, send emails separately:
    python send-email.py
This script will:
- Load emails from emails_to_send.json
- Skip emails that have already been sent
- Send remaining emails via SMTP
- Track which emails have been sent in sent_emails_tracking.json
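
The load, skip, send, track loop described above can be sketched as follows. The file layouts are assumptions: here emails_to_send.json is taken to be a list of objects with a `to` field, and sent_emails_tracking.json a list of recipient addresses. The `send` argument is a hypothetical callable that stands in for the real SMTP call.

```python
import json
from pathlib import Path


def load_json(path, default):
    """Read a JSON file, returning `default` if it does not exist yet."""
    p = Path(path)
    return json.loads(p.read_text(encoding="utf-8")) if p.exists() else default


def send_pending(emails_path, tracking_path, send):
    """Send each email not yet recorded in the tracking file; return how many were sent."""
    emails = load_json(emails_path, [])
    sent = set(load_json(tracking_path, []))
    count = 0
    for email in emails:
        if email["to"] in sent:
            continue  # already sent in an earlier run
        send(email)
        sent.add(email["to"])
        # Persist after every send so an interrupted run does not resend.
        Path(tracking_path).write_text(json.dumps(sorted(sent)), encoding="utf-8")
        count += 1
    return count
```

Writing the tracking file after each message, rather than once at the end, keeps the record accurate even if the script stops partway through.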
- get-schools-region.py - Filters CSV for schools in a specific province
- scraper.py - Scrapes websites for email and phone numbers
- email-composer.py - Composes standardized emails with variables
- send-email.py - Sends emails via SMTP
- main.py - Runs the entire pipeline
- schools_data.json - All scraped school data
- emails_to_send.json - All composed emails ready to send (includes sent status)
- sent_emails_tracking.json - Tracking of which schools have received emails
- composed_emails/ - Directory with individual composed email files (for review)
Set environment variables in .env:
- SMTP_SERVER - SMTP server address
- SMTP_PORT - SMTP server port
- SMTP_USERNAME - Your email username
- SMTP_PASSWORD - Your email password (or app password)
- FROM_EMAIL - Sender email address
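
A minimal sketch of how these variables could be read and used with the standard library follows. It reads from the process environment, so it assumes the values in .env have already been loaded into it (for example by a tool such as python-dotenv). The function names are hypothetical, and the use of STARTTLS is an assumption about the mail server rather than something this project specifies.

```python
import os
import smtplib
from email.message import EmailMessage

REQUIRED = ("SMTP_SERVER", "SMTP_PORT", "SMTP_USERNAME", "SMTP_PASSWORD", "FROM_EMAIL")


def load_smtp_settings(env=os.environ):
    """Collect the SMTP settings, failing early if any are missing."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    settings = {key: env[key] for key in REQUIRED}
    settings["SMTP_PORT"] = int(settings["SMTP_PORT"])
    return settings


def build_message(settings, to, subject, body):
    """Build a plain-text email from the configured sender."""
    msg = EmailMessage()
    msg["From"] = settings["FROM_EMAIL"]
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_message(settings, msg):
    """Send one message. Assumes the server supports STARTTLS (commonly port 587)."""
    with smtplib.SMTP(settings["SMTP_SERVER"], settings["SMTP_PORT"]) as server:
        server.starttls()
        server.login(settings["SMTP_USERNAME"], settings["SMTP_PASSWORD"])
        server.send_message(msg)
```

Checking for missing variables up front gives a clear error before any connection is attempted, instead of a failure partway through a batch.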
- The scraper includes delays between requests to be polite to servers
- Scraped data is saved to avoid re-scraping
- Email template (email_template.txt) is required for email composition
- Email sending is a separate step that tracks which emails have been sent
- The system remembers which emails have been sent and won't send duplicates
- All composed emails are saved for review before sending
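
The first two notes (polite delays and avoiding re-scraping) could be combined along the lines of the sketch below. It assumes, purely for illustration, that the cache file maps each URL to its scraped result; the real layout of schools_data.json may differ. `fetch` is a hypothetical callable that downloads and parses one site.

```python
import json
import time
from pathlib import Path


def scrape_all(urls, fetch, cache_path="schools_data.json", delay=1.0):
    """Fetch each URL at most once, pausing between requests and caching to disk."""
    cache_file = Path(cache_path)
    cache = json.loads(cache_file.read_text(encoding="utf-8")) if cache_file.exists() else {}
    for url in urls:
        if url in cache:
            continue  # scraped in a previous run, no request needed
        cache[url] = fetch(url)
        # Save after each site so progress survives an interrupted run.
        cache_file.write_text(json.dumps(cache, indent=2), encoding="utf-8")
        time.sleep(delay)  # pause before the next request
    return cache
```

Because cached URLs are skipped before the sleep, re-running the pipeline over already-scraped schools finishes almost immediately.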