Description
Context
Current logging strategy
Currently PyAirbyte forces the different loggers to write to disk (get_global_file_logger, new_passthrough_file_logger for connectors and get_global_stats_logger for global stats).
Those loggers can be seen as being hardcoded in the codebase. get_global_file_logger
and get_global_stats_logger
being singletons and new_passthrough_file_logger
being an hardcoded logger implementation that each connector relies on by default.
The only configuration available is using an environment variable called AIRBYTE_LOGGING_ROOT to determine where should those logs be written to.
First rework approach
A first approach has been worked on to allow overriding the global logger (PR), but it would still be missing the loggers used by connectors.
Special behavior
One of the important feature of the existing implementation is that new_passthrough_file_logger
chooses a random filename to store logs on disk.
It allows running PyAirbyte multiple times in a row while keeping separate files.
Design suggestion
Proposal 1: Choose logger behavior using environment variables
We could propose three modes:
- FILE_ONLY: Current behavior, the default
- CONSOLE_ONLY: Writing on stdout/stderr
- FILE_AND_CONSOLE: Keep the current behavior but also write all logs on stdout/stderr
Proposal 2: Add customization methods in logs.py (introduces breaking changes)
We can split get_global_file_logger
and new_passthrough_file_logger
into the following methods:
# Global logger
def get_global_logger() -> logging.Logger:
pass
def set_global_logger_handlers(*handlers: logging.Handler) -> None:
pass
def global_logger_file_handler() -> logging.FileHandler:
pass
# Global stats logger
def get_global_stats_logger() -> logging.Logger:
pass
def set_global_stats_logger_handlers(*handlers: logging.Handler) -> None:
pass
def global_stats_logger_file_handler() -> logging.FileHandler:
pass
# Passthrough logger
def new_passthrough_logger(connector_name: str) -> logging.Logger:
pass
def set_passthrough_logger_factory_handlers(*handlers: Callable[[str, ULID], logging.Handler]) -> None:
pass
def passthrough_logger_file_factory_handler(connector_name: str, ulid: ULID) -> logging.FileHandler:
pass
Proposal 2's breaking change
While the default behavior will be the same, functions will not have the same name.
We could keep the current function names and only add new ones and add the set_...
and ..._handler
methods, but the code quality would drop.
Rejected design ideas
We could rely on a dictConfig or fileConfig.
However it doesn't fit well with the need to have a random file name on disk to avoid mixing logs from a connector being used in different situations.