-
Notifications
You must be signed in to change notification settings - Fork 16
Description
Details
The ongoing work on rapids-build-backend relies on using rapids-dependency-file-generator as an importable module, not a CLI, like this
from rapids_dependency_file_generator.cli import generate_matrix
from rapids_dependency_file_generator.constants import default_pyproject_dir
from rapids_dependency_file_generator.rapids_dependency_file_generator import (
get_requested_output_types,
make_dependency_files,
)ref: https://github.com/rapidsai/rapids-build-backend/pull/17/files#r1542068869
Given that, I think some work should be done to make the boundary between public and private parts of the API here clearer.
Approach
- use
__all__entries in every module of the library- to limit which things should be imported with
*imports
- to limit which things should be imported with
- use an
__all__entry in the top-level__init__.py- to define what is importable directly from the
rapids_dependency_file_generatormodule without needing to know about submodules - to limit what is imported by
from rapids_dependency_file_generator import *
- to define what is importable directly from the
- prefix all internal-only functions, classes, methods, and other objects with
_
Benefits of this work
Clarifies the boundary between public and private, reducing the need for tight coordination between rapids-build-backend and rapids-dependency-file-generator for some types of changes.
Makes it easier for tools to help enforce those boundaries:
sphinxautodoc will by default only auto-generate documentation for entries in__all__(docs)- some linters will complain about uses of imports prefixed with
_(e.g.ruff's pylintimport-private-namecheck))
Other notes
Consider the following.
only '__version__' is importable from the top-level namespace today (click me)
pip install .
python -c "from rapids_dependency_file_generator import *; print(dir())"['__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', '__version__']
Only __version__ is exported from the top-level module
| __all__ = [ | |
| "__version__", | |
| ] |
That means that things like rapids-build-backend that want to import other things have imported coupled to the exact file layout in this project, like this.
from rapids_dependency_file_generator.rapids_dependency_file_generator import (
get_requested_output_types,
make_dependency_files,
)In my opinion, it'd be better to re-export that stuff from the main module, so that rapids-build-backend could do this
from rapids_dependency_file_generator import (
get_request_output_types,
make_dependency_files
)And so we could freely re-arrange the modules inside this library without breaking rapids-build-backend.
Exposing submodule import paths is really helpful in projects with large APIs like sklearn or scipy, but for this small library I think it'd be better to push downstream consumers to just import from the top-level namespace.
submodules are re-exporting all their imports (click me)
Consider the following
pip install .
python -c "from rapids_dependency_file_generator.cli import *; print(dir())"['Output', '__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'argparse', 'default_dependency_file_path', 'delete_existing_files', 'generate_matrix', 'load_config_from_file', 'main', 'make_dependency_files', 'os', 'validate_args', 'version', 'warnings']
It's technically possible right now to do something like this:
from rapids_dependency_file_generator.cli import argparseThat should be discouraged, via an __all__ entry in src/rapids_dependency_file_generator/cli.py (and all the other submodules).