Skip to content

[analyze] proposal for extending include-dependencies #24216

@markjm

Description

@markjm

Feature suggestion


Please let me know if I’m being too annoying with all my ruff analyze nonsense. I am convinced this tool is being slept on for all sorts of cases in large projects like smart test selection, refactor helpers, etc!

Its such a fast and easy way to map a repository without needing the extra "metadata layer" that Bazel/Pants BUILD files require (though admittedly does not handle some cases those tools do like 3p dependences (maybe next proposal 😄 )).

Proposal

Extend include-dependencies to support a new in-code comment annotation to guide ruff analyze in finding non-statically-analyzable imports.

That might look like:

from importlib import import_module
import pkgutil

__path__ = pkgutil.extend_path(__path__, __name__)
for importer, modname, ispkg in pkgutil.walk_packages(path=__path__, prefix=__name__ + '.'):
    # analyze include: registry/features/** <- (open to name suggestions)
    import_module(modname)

This allows for a more complete annotation of our codebase which is colocated with the dynamic-import usage instead of being hidden away in the pyproject.toml. Co-locating like this could also lend itself to in-code (like below) or lint rules that ensure such cases are all properly annotated to maintain as complete a map as possible.

Alternatives Considered

In AST-walk phase of import-finder, we can directly find import_module calls and (1) look for similar comments specifically associated with those function calls or (2) attempt a partially-naive fstring-to-glob resolution for specific cases which look like:

import_module(f'assistants.{self.assistant_name}.functions')

Note: in my codebase, I am also exploring requiring a use of a shared internal implementation of import_module with the call signature

def import_module(
    name: str,
    package: str | None = None,
    *,
    # Note: LiteralString ensures the string passed in is developer-provided
    # and not dynamic, so we can more effectively let codemods reason about it.
    include_glob: LiteralString | None = None,
    strict: bool = True,
) -> ModuleType:
    """
    A decorated version of import_module that allows for passing along additional metadata for static analysis.

    include_glob dictates which modules could possibly be imported.

    Args:
      name: The name of the module to import.
      package: The package of the module to import. (optional)
      include_glob: A glob to match the module name against. (optional)
      strict: Whether to raise an error if the module does not match the include_regex. (optional)
    Returns:
      The imported module.
    """

    if include_glob and not fnmatch.fnmatch(name, include_glob) and strict:
        raise ValueError(f"Import metadata does not allow importing {name}")

    return real_import_module(name, package)

To runtime-verify that whatever annotation is “verified” at runtime - but I acknowledge that’s maybe not extendable into the tool 😆

The reason I am suggesting the “floating comment” (not necessarily tied to an import statement) is because it is a little more generalized to cases which we know about but couldn’t statically analyze. As an example, we might want to include analyze include: tests/** in a conftest.py because we know it is ambiently included.

I have a change implementing this (and have a 80% working example of the alternative approach) if y'all think this could be valuable.

cc @MichaReiser @ntBre

Version

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    analyzeRelated to Ruff analyze functionalityneeds-designNeeds further design before implementation

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions