
A repo that consolidates disposable, free, paid-personal email providers to allow filtering those domains out.


Rohithzr/email-provider-filter


Email Provider Filter

A comprehensive, automatically updated dataset of email domains categorized by type. Perfect for implementing email validation and filtering in your applications.

📊 Dataset Overview

141,580+ domains across 3 categories, updated daily:

  • 🗑️ Disposable: 71,627 temporary/throwaway email domains
  • 📧 Free: 69,931 free personal email providers
  • 💳 Paid Personal: 22 paid personal email services

🚀 Quick Start

Download the Data

Latest Release: Get the latest dataset

Direct Links:

Usage Example

# Load and use the dataset
from examples.example_usage import EmailDomainFilter

# Initialize the filter (loads the dataset from GitHub automatically)
# Named domain_filter to avoid shadowing Python's built-in filter()
domain_filter = EmailDomainFilter("remote")

# Check whether an email comes from a business domain
is_business = domain_filter.is_business_email("[email protected]")  # True
is_business = domain_filter.is_business_email("[email protected]")    # False

# Apply filtering rules
should_block, reason = domain_filter.should_block_email(
    "[email protected]",
    block_disposable=True
)
# Returns: (True, "Disposable email domain: 10minutemail.com")

👉 See full example
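
To make the classification idea concrete, here is a minimal sketch of how a category lookup could work if each category is held as a set of domains. It is not the repository's `EmailDomainFilter` implementation: the `CATEGORIES` mapping, `extract_domain`, and `categorize` are hypothetical names, and the three sample domains are illustrative stand-ins for the full dataset files.

```python
from typing import Dict, Set

# Hypothetical in-memory category sets (assumption); the real dataset is far larger.
CATEGORIES: Dict[str, Set[str]] = {
    "disposable": {"10minutemail.com"},
    "free": {"gmail.com"},
    "paid_personal": {"fastmail.com"},
}


def extract_domain(email: str) -> str:
    """Return the lowercased domain part of an email address."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {email!r}")
    return domain.lower()


def categorize(email: str) -> str:
    """Return the matching category name, or 'business' if no set contains the domain."""
    domain = extract_domain(email)
    for name, domains in CATEGORIES.items():
        if domain in domains:
            return name
    return "business"
```

Under this sketch, "business" simply means "not found in any list", so an unlisted disposable domain would also be reported as business. That is one reason the freshness of the lists matters.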

🎯 Use Cases

  • User Registration: Block disposable emails during signup
  • Lead Validation: Identify business vs personal emails
  • Email Marketing: Improve deliverability by filtering bad domains
  • Fraud Prevention: Detect throwaway email usage patterns
  • Data Quality: Clean and categorize existing email lists
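
For the list-cleaning use case, one possible approach is to build a domain-to-category index once and then bucket every address in a single pass. The sketch below assumes such an index already exists. `DOMAIN_INDEX` and `bucket_emails` are hypothetical names, and the two sample entries only stand in for the real dataset.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Hypothetical domain -> category index (assumption), built from the dataset files.
DOMAIN_INDEX: Dict[str, str] = {
    "10minutemail.com": "disposable",
    "gmail.com": "free",
}


def bucket_emails(emails: Iterable[str], index: Dict[str, str]) -> Dict[str, List[str]]:
    """Group addresses by category; domains missing from the index count as 'business'."""
    buckets: Dict[str, List[str]] = defaultdict(list)
    for email in emails:
        _, sep, domain = email.strip().rpartition("@")
        if not sep or not domain:
            # Keep malformed entries visible instead of silently dropping them.
            buckets["invalid"].append(email)
            continue
        buckets[index.get(domain.lower(), "business")].append(email)
    return dict(buckets)
```

Malformed entries go into their own `invalid` bucket so that a cleanup run can report them rather than lose them.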

📁 Data Sources

This dataset combines multiple reliable sources:

🤖 Automation

  • Daily Updates: GitHub Actions automatically refreshes data every day at 6 AM UTC
  • Smart Releases: New versions are created only when the data actually changes
  • Deduplication: Automatic removal of duplicates across all sources
  • Quality Control: Allowlist prevents false positives
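
The deduplication, allowlist, and change-detection steps listed above could be sketched roughly as follows. This is an illustration under assumptions, not the logic in `scripts/aggregate.py`: the function names are hypothetical, and using a SHA-256 digest of the sorted list to decide whether a new release is needed is one plausible technique rather than the project's confirmed method.

```python
import hashlib
from typing import Iterable, List, Set


def normalize(domain: str) -> str:
    """Lowercase a domain and strip surrounding whitespace and any trailing dot."""
    return domain.strip().lower().rstrip(".")


def merge_sources(sources: Iterable[Iterable[str]], allowlist: Iterable[str]) -> List[str]:
    """Union all source lists, dropping duplicates and allowlisted domains."""
    allowed: Set[str] = {normalize(d) for d in allowlist}
    merged: Set[str] = set()
    for source in sources:
        for raw in source:
            domain = normalize(raw)
            if domain and domain not in allowed:
                merged.add(domain)
    return sorted(merged)


def dataset_digest(domains: Iterable[str]) -> str:
    """Order-independent SHA-256 digest of a domain list, usable for change detection."""
    payload = "\n".join(sorted(domains)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Comparing the digest of a fresh merge with the digest of the previous release would let a scheduled job skip publishing when nothing has changed.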

🧪 Testing

# Run tests to verify data quality
python3 tests/test_categorization.py

# Expected output: 
# 🎉 ALL TESTS PASSED! Domain categorization is working correctly.

🛠️ Development

Setup

git clone https://github.com/Rohithzr/email-provider-filter.git
cd email-provider-filter
python3 scripts/aggregate.py  # Generate fresh dataset

Project Structure

├── sources/           # Source configurations and custom lists
├── scripts/           # Data processing scripts  
├── output/            # Generated dataset files
├── tests/             # Quality assurance tests
├── examples/          # Usage examples and demos
└── .github/           # Automation workflows

🤝 Contributing

We welcome contributions! Help us improve the dataset:

  • 🐛 Report Issues: Found incorrect categorizations? Open an issue
  • 📊 Add Data: Know of new disposable services? Submit domains
  • 💻 Improve Code: Enhance scripts or add features

📖 Contributing Guidelines | 📋 Code of Conduct

📈 Why This Project?

Most existing solutions are either:

  • Incomplete: Missing thousands of domains
  • Outdated: Not regularly maintained
  • Narrow: Focus on only one category
  • Fragmented: Scattered across multiple sources

This project addresses these gaps by providing a single, comprehensive, automatically updated source covering all three categories.

📄 License

MIT License - Free for commercial and personal use.

🙏 Acknowledgments

Thanks to all the maintainers of the source repositories and contributors who help keep this dataset accurate and comprehensive.


⭐ Star this repo if it's useful for your project!
