Pandas validation results in doubled log messages #42

@fredsonnenwald

If I modify the README.md example to add a logger that logs the validation error, the error is printed twice, e.g.,

$ python teset.py 
2025-06-25 14:06:09,963 - simple_example - ERROR - Validation failed!
ERROR:simple_example:Validation failed!

That output was generated by:

import logging
import pandas as pd
from pydantic import BaseModel
from pydantic.types import StrictInt
from pandantic import Pandantic

# create logger - https://docs.python.org/3/howto/logging.html#logging-advanced-tutorial
logger = logging.getLogger('simple_example')
logger.setLevel(logging.DEBUG)
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
ch.setFormatter(formatter)
logger.addHandler(ch)

# Define your schema using Pydantic BaseModel
class DataFrameSchema(BaseModel):
    """Example schema for testing."""
    example_str: str
    example_int: StrictInt

# Create a validator instance
validator = Pandantic(schema=DataFrameSchema)

# Example DataFrame with some invalid data
df_invalid = pd.DataFrame(
    data={
        "example_str": ["foo", "bar", 1],  # Last value is invalid (int instead of str)
        "example_int": ["1", 2, 3.0],      # First and last values are invalid (str and float)
    }
)

# Validate with error raising
try:
    validator.validate(dataframe=df_invalid, errors="raise")
except ValueError:
    logger.error("Validation failed!")

I think the doubled error message comes from the Pandas validator, which calls the module-level logging function in

logging.debug("Amount of available cores: %s", os.cpu_count())
This could be mitigated by using logging the same way the Pandas plugin does, with
logger = logging.getLogger(__name__)
and calling logger.debug instead of logging.debug, etc.
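
A minimal sketch of the suspected mechanism, using only the standard library and not pandantic itself. The module-level `logging.debug()` implicitly calls `logging.basicConfig()` when the root logger has no handlers. That attaches a `StreamHandler` to the root logger, and records from `simple_example` then propagate to it and are printed a second time in the default `LEVEL:name:message` format. The helper names and the `hypothetical_library` logger name are made up for illustration.

```python
import logging
import os


def library_call_root() -> None:
    # Module-level function: logs via the root logger and, if the root logger
    # has no handlers yet, implicitly calls logging.basicConfig().
    logging.debug("Amount of available cores: %s", os.cpu_count())


def library_call_named() -> None:
    # Named logger (hypothetical name): no implicit handler configuration.
    logging.getLogger("hypothetical_library").debug(
        "Amount of available cores: %s", os.cpu_count()
    )


def root_handler_count_after(call) -> int:
    """Run `call` with a handler-free root logger; return how many handlers it gained."""
    root = logging.getLogger()
    saved = root.handlers[:]
    root.handlers.clear()
    try:
        call()
        return len(root.handlers)
    finally:
        # Restore whatever handlers were installed before the experiment.
        root.handlers[:] = saved


print(root_handler_count_after(library_call_root))   # 1: basicConfig() added a StreamHandler
print(root_handler_count_after(library_call_named))  # 0: root logger left untouched
```

Until the library changes, setting `logger.propagate = False` on the `simple_example` logger should also stop the second line on the user side, assuming the extra output really does come from the root handler.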
