Integrate Polars Plugin for High-Volume Email Validation #5
Open
Description:
I propose adding a Polars plugin to the emval library to enable high-performance email validation directly within Polars DataFrames. This integration would allow users to efficiently validate large datasets of email addresses, leveraging emval's speed and Polars' data manipulation strengths.
Benefits:
- Performance: Validate entire columns of email addresses in compiled Rust code, without a Python-level loop over rows.
- Integration: Seamlessly incorporate email validation into existing Polars workflows.
- Scalability: Handle large datasets with minimal overhead.
Proposed Usage:
The plugin would enable email validation with the following syntax:
import polars as pl
from emval.polars import validate_email

df = pl.DataFrame({
    'email': [
        '[email protected]',
        'invalid-email',
        '[email protected]',
        'user@[192.168.1.1]',
        ''
    ]
})

# Apply the email validation plugin
df = df.with_columns(
    validated=validate_email(
        pl.col('email'),
        allow_smtputf8=True,
        allow_empty_local=False,
        allow_quoted_local=False,
        allow_domain_literal=False,
        deliverable_address=True,
    )
)

# Access the fields from the Struct column
df = df.with_columns(
    original=pl.col('validated').struct.field('original'),
    normalized=pl.col('validated').struct.field('normalized'),
    local_part=pl.col('validated').struct.field('local_part'),
    domain_name=pl.col('validated').struct.field('domain_name'),
    domain_address=pl.col('validated').struct.field('domain_address'),
    is_deliverable=pl.col('validated').struct.field('is_deliverable'),
).drop('validated')

print(df)
Proposed Project Structure:
emval/
├── __init__.py
├── validator.py
├── model.py
└── polars/
    ├── __init__.py
    └── plugin.py
src/
├── lib.rs              # Main module for emval
├── validators/         # Additional validation logic
└── polars_plugin.rs    # Polars plugin module
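As a rough sketch of what `emval/polars/plugin.py` could contain: the Python side of a Polars expression plugin is usually a thin wrapper around `polars.plugins.register_plugin_function`. Several details here are assumptions rather than decisions: the Rust function exported from `polars_plugin.rs` is taken to be named `validate_email`, the compiled library is taken to live in the `emval` package directory, the helper `plugin_kwargs` is hypothetical, and the default option values are simply copied from the proposed usage example.

```python
from pathlib import Path
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only needed for type hints, so polars stays optional
    import polars as pl


def plugin_kwargs(
    allow_smtputf8: bool = True,
    allow_empty_local: bool = False,
    allow_quoted_local: bool = False,
    allow_domain_literal: bool = False,
    deliverable_address: bool = True,
) -> dict:
    """Collect the validation options forwarded to the Rust side.

    Hypothetical helper; defaults mirror the proposed usage example.
    """
    return {
        "allow_smtputf8": allow_smtputf8,
        "allow_empty_local": allow_empty_local,
        "allow_quoted_local": allow_quoted_local,
        "allow_domain_literal": allow_domain_literal,
        "deliverable_address": deliverable_address,
    }


def validate_email(expr: "pl.Expr", **options: bool) -> "pl.Expr":
    """Return a Polars expression that validates each email in `expr`."""
    # Imported lazily so that `import emval` works without polars installed.
    from polars.plugins import register_plugin_function

    return register_plugin_function(
        # Assumption: the compiled library is shipped in the emval/ directory,
        # two levels up from emval/polars/plugin.py.
        plugin_path=Path(__file__).parent.parent,
        # Assumption: name of the function exported from polars_plugin.rs.
        function_name="validate_email",
        args=expr,
        kwargs=plugin_kwargs(**options),
        is_elementwise=True,  # each row is validated independently
    )
```

Keeping the option handling in a plain function means unknown keyword arguments fail early in Python with a `TypeError`, before anything is handed to the Rust plugin.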
Optional Installation:
The Polars plugin should be an optional dependency, installable via:
pip install "emval[polars]"
This ensures the base emval library remains lightweight for users who don’t require the plugin.
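One way the optional dependency could surface to users is an import guard that fails with an actionable message when Polars is missing. This is a minimal sketch: `require_extra` is a hypothetical helper name, and placing the call in `emval/polars/__init__.py` is an assumption about where such a check would go.

```python
import importlib.util


def require_extra(module_name: str, extra: str) -> None:
    """Raise a helpful ImportError if an optional dependency is missing."""
    # find_spec returns None when a top-level module cannot be found.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f'Install it with: pip install "emval[{extra}]"'
        )


# Hypothetical use at the top of emval/polars/__init__.py:
# require_extra("polars", "polars")
```

With a guard like this, `import emval` keeps working in a base install, and only `import emval.polars` reports the missing extra.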
Reference Documentation:
- Polars Plugin Guide: Your First Polars Plugin