Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Allow finer logging filtering (follow-up of #10219) #11325

Open
picnixz opened this issue Apr 14, 2023 · 17 comments
Open

[FEATURE] Allow finer logging filtering (follow-up of #10219) #11325

picnixz opened this issue Apr 14, 2023 · 17 comments

Comments

@picnixz
Copy link
Member

picnixz commented Apr 14, 2023

This is a follow-up of #10219. The purpose of this proposal is to formally define a mechanism for suppressing logging messages from third-party extensions assuming they use the Sphinx logging mechanism. In particular, we work under the following assumptions:

  • Extensions declare a top-level logger using logging.getLogger(__name__) where logging is the sphinx.util.logging module.
  • Extensions that do not use this logging mechanism but create loggers using logging.getLogger(__name__) with logging being the standard module will not be supported by the related PR.

I personally developed some extension to suppress messages from third-party extensions (or even built-in extensions), e.g., sphinx.ext.intersphinx which warnss users when an HTTP connection fails (I only want to suppress specific messages and not all warnings from sphinx.ext.intersphinx). The reason why I did this is because I work half of the time offline, meaning I don't have any Internet access. So, obviously, I get some warnings and that bothers me (and I don't want to update my conf.py everytime by swapping which inventory to query in priority). Anyway, that's for the rationale behind this issue, so here is the plan I suggest.

  • Use the logging.Filter interface to intercept records that need to be filtered.

  • Filters can be created by the user (and directly added to the workflow) or they can be configured using the following syntax in conf.py.

  • The idea is to allow users to suppress messages:

    • by their emitting logger,
    • by their logging level,
    • by matching the logger's name with a regular expression,
    • by matching their content with a regular expression.

    The above filters should be composable, namely, one can suppress a specific message emitted with a specific level by a specific logger, or any message matching some regular expression, etc. That way, users can completely customize how to interact with third-party extensions. Note that using type and subtype is not always sufficient since some messages may be to generic (and in the end, the user has not control over the third-party extensions implementations unless they request them to always use type and subtype when logging messages).

  • The loggers are then modified in place by adding the created filters. When loading an extension, Sphinx imports the corresponding (hence executing it). The idea is to apply the logging filtering mechanism as the "last" extension by simply inspecting the modules of each loaded extensions, searching for instances of SphinxLoggerAdapter and modify them accordingly.

Related

@maxpumperla
Copy link

@picnixz this sounds amazing. any chance you can share your plugin and give a minimal example for this? Would be much appreciated.

@picnixz
Copy link
Member Author

picnixz commented Apr 21, 2023

Here's the plugin. Not all operations as explained are specified (an example is provided in the docstring). Feel free to change the name of the configuration values as you deem fit (I called my extension pyzeta because it sounds nice).

Technically, you can suppress logging messages by emitting logger, by emitting logger and logging level and by messages matching and by emitting logger and messages matching. I still need to implement the feature for suppressing by logging level and message matching and allowing regular expressions for logger names.

DISCLAIMER: I only checked my extension for the needs I wanted so there might have bugs that I'm not aware of.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
#
# author: picnixz
# license: MIT

r"""
A filter suppressing logging records issued by a Sphinx logger.

The records are filtered according to their context (logger name and record
level) and their formatted message.

Typical usage::

    zeta_suppress_loggers = {
        'sphinx.ext.autodoc': True                  # suppress logger
        'sphinx.ext.intersphinx': ['INFO', 'ERROR'] # specific levels
    }

    zeta_suppress_records = [
        'error: .+',
        ['sphinx.ext.intersphinx', '.*Name or service not known$']
    ]

Note that Sphinx automatically adds ``sphinx.`` to the logger names when 
instantiating a logger adapter through :func:`sphinx.util.logging.getLogger`.

In particular, *all* Sphinx-related modules and third-party extensions are assumed 
to do the same. Use ``EXTENSION`` and not ``sphinx.EXTENSION`` to suppress the
logger associated with the named extension (e.g., ``sphinx.ext.intersphinx`` to
suppress the logger declared in the :mod:`sphinx.ext.intersphinx` module).

.. confval:: zeta_suppress_loggers = {}

    A dictionary describing which logger to suppress, possibly partially.

    .. code-block::

        # suppress messages from 'sphinx.ext.autodoc'
        zeta_suppress_loggers = {'sphinx.ext.autodoc': True}

        # suppress INFO and ERROR messages from 'sphinx.ext.autodoc'
        zeta_suppress_loggers = {'sphinx.ext.autodoc': ['INFO', 'ERROR']}

.. confval:: zeta_suppress_protect = []

    A list of module names that are known to contain a Sphinx logger but
    that will never be suppressed automatically. This is typically useful
    when an extension contains submodules declaring loggers which, when 
    imported, result in undesirable side-effects.

.. confval:: zeta_suppress_records = []

    A list of message patterns to suppress, possibly filtered by logger.

    .. code-block::

        # suppress messages matching 'error: .*' and 'warning: .*'
        zeta_suppress_records = ['error: .*', 'warning: .*']

        # suppress messages issued by 'sphinx.ext.intersphinx'
        zeta_suppress_records = [('sphinx.ext.intersphinx', '.*')]
"""

__all__ = ()

import abc
import importlib
import inspect
import itertools
import logging
import pkgutil
import re
import warnings
from typing import TYPE_CHECKING

from sphinx.util.logging import NAMESPACE, SphinxLoggerAdapter

if TYPE_CHECKING:
    from sphinx.application import Sphinx
    from sphinx.config import Config
    from sphinx.extension import Extension

def partition(predicate, iterable):
    """
    Return a pair `(no, yes)`, where *yes* and *no* are subsets of *iterable*
    over which *predicate* evaluates to |False| and |True| respectively.
    """

    no, yes = itertools.tee(iterable)
    no, yes = itertools.filterfalse(predicate, no), filter(predicate, yes)
    return no, yes

def notnone(value):
    return value is not None

ALL = object()

def _normalize_level(level):
    if isinstance(level, int):
        return level

    # pylint: disable-next=W0212
    result = logging._nameToLevel.get(level)
    if result is not None:
        return result

    return None  # unknown level

def _parse_levels(levels):
    if not isinstance(levels, (list, tuple)):
        if isinstance(levels, (int, str)):
            levels = [levels]
    return list(filter(notnone, map(_normalize_level, levels)))

class SphinxSuppressFilter(logging.Filter, metaclass=abc.ABCMeta):
    def filter(self, record):
        # type: (logging.LogRecord) -> bool
        return not self.suppressed(record)

    @abc.abstractmethod
    def suppressed(self, record):
        # type: (logging.LogRecord) -> bool
        pass

class SphinxSuppressLogger(SphinxSuppressFilter):
    r"""A filter suppressing logging records issued by a Sphinx logger."""

    def __init__(self, name, levels=()):
        """
        Construct a :class:`SphinxSuppressLogger`.

        :param name: The (real) logger name to suppress.
        :type name: str
        :param levels: Optional logging levels to suppress.
        :type levels: bool | list[int]
        """

        super().__init__(name)
        if isinstance(levels, bool):
            self.levels = ALL if levels else []
        else:
            self.levels = _parse_levels(levels)

    def suppressed(self, record):
        return (
            logging.Filter.filter(self, record)
            and (self.levels is ALL or record.levelno in self.levels)
        )

class SphinxSuppressPatterns(SphinxSuppressFilter):
    r"""A filter suppressing matching messages."""

    def __init__(self, patterns=()):
        """
        Construct a :class:`SphinxSuppressPatterns`.

        :param patterns: Optional logging messages (regex) to suppress.
        :type patterns: list[str | re.Pattern]
        """

        super().__init__('')  # all loggers
        self.patterns = set(map(re.compile, patterns))

    def suppressed(self, record):
        m = record.getMessage()
        return self.patterns and any(p.search(m) for p in self.patterns)

class SphinxSuppressRecord(SphinxSuppressLogger, SphinxSuppressPatterns):
    r"""A filter suppressing matching messages by logger's name pattern."""

    def __init__(self, name, levels=(), patterns=()):
        """
        Construct a :class:`SphinxSuppressRecord` filter.

        :param name: A logger's name to suppress.
        :type name: str
        :param levels: Optional logging levels to suppress.
        :type levels: bool | list[int]
        :param patterns: Optional logging messages (regex) to suppress.
        :type patterns: list[str | re.Pattern]
        """

        SphinxSuppressLogger.__init__(self, name, levels)
        SphinxSuppressPatterns.__init__(self, patterns)

    def suppressed(self, record):
        return (
            SphinxSuppressLogger.suppressed(self, record)
            and SphinxSuppressPatterns.suppressed(self, record)
        )

### event: config-inited

def _get_filters(config):
    format_name = lambda name: f'{NAMESPACE}.{name}'

    filters_by_prefix = {}
    for name, levels in config.zeta_suppress_loggers.items():
        prefix = format_name(name)
        suppressor = SphinxSuppressLogger(prefix, levels)
        filters_by_prefix.setdefault(prefix, []).append(suppressor)

    is_pattern = lambda s: isinstance(s, (str, re.Pattern))
    groups, patterns = partition(is_pattern, config.zeta_suppress_records)
    for group in groups:  # type: tuple[str, ...]
        prefix = format_name(group[0])
        suppressor = SphinxSuppressRecord(prefix, True, group[1:])
        filters_by_prefix.setdefault(prefix, []).append(suppressor)
    # default filter
    default_filter = SphinxSuppressPatterns(patterns)
    return default_filter, filters_by_prefix

def _is_sphinx_logger_adapter(obj):
    return isinstance(obj, SphinxLoggerAdapter)

def _update_logger_in(module, default_filter, filters_by_prefix, _cache):
    if module.__name__ in _cache:
        return

    _cache.add(module.__name__)
    members = inspect.getmembers(module, _is_sphinx_logger_adapter)
    for _, adapter in members:
        for prefix, filters in filters_by_prefix.items():
            if adapter.logger.name.startswith(prefix):
                for f in filters:
                    # a logger might be imported from a module
                    # that was not yet marked, so we only add 
                    # the filter once
                    if f not in adapter.logger.filters:
                        adapter.logger.addFilter(f)
        if default_filter not in adapter.logger.filters:
            adapter.logger.addFilter(default_filter)
   
def install_supress_handlers(app, config):
    # type: (Sphinx, Config) -> None

    default_filter, filters_by_prefix = _get_filters(config)
    seen = set()

    for extension in app.extensions.values():  # type: Extension
        if extension.name in config.zeta_suppress_protect:
            # skip the extension
            continue

        mod = extension.module
        _update_logger_in(mod, default_filter, filters_by_prefix, seen)
        if not hasattr(mod, '__path__'):
            continue

        # find the loggers declared in a submodule
        mod_path, mod_prefix = mod.__path__, mod.__name__ + '.'
        with warnings.catch_warnings():
            warnings.simplefilter('ignore', DeprecationWarning)
            warnings.simplefilter('ignore', PendingDeprecationWarning)
            for mod_info in pkgutil.iter_modules(mod_path, mod_prefix):
                if mod_info.name in config.zeta_suppress_protect:
                    # skip the module
                    continue

                try:
                    mod = importlib.import_module(mod_info.name)
                except ImportError:
                    continue 
                _update_logger_in(mod, default_filter, filters_by_prefix, seen)

def setup(app):
    # type: (Sphinx) -> dict
    app.add_config_value('zeta_suppress_loggers', {}, True)
    app.add_config_value('zeta_suppress_protect', [], True)
    app.add_config_value('zeta_suppress_records', [], True)
    # @contract: no extension is loaded after config-inited is fired
    app.connect('config-inited', install_supress_handlers, priority=1000)
    return {'parallel_read_safe': True, 'parallel_write_safe': True}

@maxpumperla
Copy link

@picnixz this is great, thanks so much! Will try this later.

@picnixz
Copy link
Member Author

picnixz commented Apr 22, 2023

Quick edit:

it appears that in debug mode, loggers will try to format the priority as an integer (using %d). Here, I used an object that is not an integer to represent infinity, so instead of using priority=LAST, use priority=VERY_LARGE_INTEGER (technically, docutils priorities range from 1 to 999 so this should be the "largest". I wanted to avoid being initialized after another extension which would have had the same idea for instance. Anyway, using priority=1000 should be sufficient).

@maxpumperla
Copy link

@picnixz I'm testing this with just

suppress_loggers = {
    'sphinx.ext.autodoc': True,
}

and get this error message:

Exception occurred:
  File "/Users/maxpumperla/code/anyscale/ray/doc/source/_ext/suppress.py", line 146, in suppressed
    or record.levelno in self.levels
TypeError: argument of type 'object' is not iterable

I can look deeper into this, but thought you might have an idea what the issue might be.

@picnixz
Copy link
Member Author

picnixz commented Apr 27, 2023

@maxpumperla Oh, sorry, the line should be and (self.levels is ALL or record.levelno in self.levels) and not and self.levels is ALL or record.levelno in self.levels.

By the way, I just observed that some modules do have loggers declared but are not necessarily present in the extension registry (like the logger declared in sphinx.ext.autodoc.directive). So the extension won't be able to catch it. Let me just make it work for that.

EDIT: It should now work. A logger can still be protected by simply specifying the module name in zeta_suppress_protect (by default, sphinx.ext.autodoc suppresses all loggers declared in sphinx.ext.autodoc, including sphinx.ext.autodoc.directive although the module is not an extension. To enable sphinx.ext.autodoc.directive explicitly, just use zeta_suppress_protect = ['sphinx.ext.autodoc.directive']).

@maxpumperla
Copy link

maxpumperla commented Apr 27, 2023

thanks @picnixz, I think I'm slowly getting there. If I simply patch the line you just mentioned (in your unedited code snippet), the plugin builds. In the edited code above I think there's a bug in:

    is_pattern = p_isinstance((str, re.Pattern))

this p_isinstance is not defined anywhere. Also, this line

    connect('config-inited', install_supress_handlers, priority=1000)

should be app.connect(...), right?

In any case, in the version that works for me (your unedited snippet + patch) I can build my docs, but suppressing e.g. autosummary warnings still doesn't work for some reason. Autosummary is the "big one" for us, since we have so many auto-generated API references that clutter the sphinx build output with warnings.

many, many thanks for your help on this by the way, this is all immensely helpful.

@picnixz
Copy link
Member Author

picnixz commented Apr 27, 2023

this p_isinstance is not defined anywhere.

Oops, sorry, it comes from my own internal util. I'll edit this.

should be app.connect(...), right?

Yes.

but suppressing e.g. autosummary warnings still doesn't work for some reason

I'll have a look.

@maxpumperla Can you give me one of the warnings that is emitted ? in order to trace it back (also, are you suppressing all autosummary messages using suppress_loggers = {'sphinx.ext.autosummary': True} or are you using another constraint ? (maybe that's the one that is buggy).

@maxpumperla
Copy link

Nice!!! it works... thank you! I'll open a PR against ray with this soon.

hey, while I have you @picnixz, would you know how to suppress this classic with your plugin?

WARNING: document isn't included in any toctree

this is collateral of many autodoc/summary additions that aren't easily "orphaned".

@maxpumperla
Copy link

btw, I think you should just wrap this up and open source it as a regular plugin for people to install. much faster turnaround than if you were to sponsor this to core sphinx directly. wdyt?

@picnixz
Copy link
Member Author

picnixz commented Apr 27, 2023

WARNING: document isn't included in any toctree

Oh this one is a 'toc' warning I think, not an autosummary one. You should be able to suppress it natively via suppress_warnings = ['toc'] (see https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-suppress_warnings).

btw, I think you should just wrap this up and open source it as a regular plugin for people to install. much faster turnaround than if you were to sponsor this to core sphinx directly. wdyt?

Well, that would be a possibility, but I still think that it's good to have the possibility to do it at the core level. I'll probably create a small repo myself then (actually, I customized Sphinx a lot to have the minimal amount of work when writing the doc and the prettiest output I could. For instance, I can suppress specific warnings directly by specifying some magic comment inside the docstring so it's good when coupled with autodoc tools or I can customize a directive option at the docstring level instead of the reST document level, or I can use PyCharm syntax for callables instead of the readable Callable[...]). Anyway, when I've got time I'll probably push all my extensions in one go.

@maxpumperla
Copy link

yeah, suppress_warnings = ['toc'] doesn't work (for whatever reason). the obscurity of these confval options is somewhat beyond my understanding. This is a good thread on the topic:

https://stackoverflow.com/questions/15249340/warning-document-isnt-included-in-any-toctree-for-included-file

but the point is that we want to avoid touching the individual documents or curate a list of nested paths to our APIs, so I was hoping your plugin can capture this on a more fundamental level and just filter all those warnings directly.

@picnixz
Copy link
Member Author

picnixz commented Apr 27, 2023

Ok I see where this warning is emitted. It is emitted by the logger in sphinx.environment which is, obviously, not an extension. So you can, again, suppress it using {'sphinx.environment': ['WARNING']} or use suppress_records = ['.*document isn't included in any toctree.*'] configuration value (that would also serve as a test for it !). [Now, I think I should rather use re.search instead of re.match when matching logging messages (and users would simply add ^ if they want a more precise match). (it was already the case)]

@picnixz
Copy link
Member Author

picnixz commented Apr 29, 2023

@maxpumperla Hi, I released the extension: https://github.com/picnixz/sphinx-zeta-suppress.

There was a subtle bug in some function where the suppression is not entirely correct so this version should be the correct and precise one).

Please feel free to submit any issue directly on that repo.

EDIT: I think my extension still does not work for suppressing logging messages if they are issued from a logger declared in a nom-extension. When I wake up, I will write tests for that instead of relying on mental testing.

EDIT 2: @maxpumperla it should (now) be working for everything. I've updated also some parts of the documentation.

@AA-Turner AA-Turner added this to the some future version milestone Apr 29, 2023
@DrShushen
Copy link

Just chiming in to say that I believe more fine-grained warning suppression, like achieved in this extension, would be a very useful feature.

@picnixz
Copy link
Member Author

picnixz commented Oct 22, 2023

If people are willing, I would be glad to incorporate this feature at the level of sphinx directly without relying on an external extension (though I need to find the time for that). Otherwise I would say that you could use my extension.

@DrShushen
Copy link

If people are willing, I would be glad to incorporate this feature at the level of sphinx directly without relying on an external extension (though I need to find the time for that). Otherwise I would say that you could use my extension.

The extension approach is a great staring point in my opinion, but the extension doesn't currently seem to be pip installable - I have created an issue in the extension repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants