Package Discovery and Namespace Packages

Note

A full specification of the keywords supplied to setup.cfg or setup.py can be found in the :doc:`keywords reference </references/keywords>`.

Important

The examples provided here only demonstrate the functionality being introduced. You will need to supply additional metadata and options if you want to replicate them on your system. If you are completely new to setuptools, the :doc:`quickstart` section is a good place to start.

Setuptools provides powerful tools to handle package discovery, including support for namespace packages.

Normally, you would specify the packages to be included manually in the following manner:

.. tab:: setup.cfg

    .. code-block:: ini

        [options]
        #...
        packages =
            mypkg
            mypkg.subpkg1
            mypkg.subpkg2

.. tab:: setup.py

    .. code-block:: python

        from setuptools import setup

        setup(
            # ...
            packages=['mypkg', 'mypkg.subpkg1', 'mypkg.subpkg2']
        )

.. tab:: pyproject.toml

    .. code-block:: toml

        # ...
        [tool.setuptools]
        packages = ["mypkg", "mypkg.subpkg1", "mypkg.subpkg2"]
        # ...


If your packages are not in the root of the repository or do not correspond exactly to the directory structure, you also need to configure package_dir:

.. tab:: setup.cfg

    .. code-block:: ini

        [options]
        # ...
        package_dir =
            = src
            # directory containing all the packages (e.g.  src/mypkg, src/mypkg/subpkg1, ...)
        # OR
        package_dir =
            mypkg = lib
            # mypkg.module corresponds to lib/module.py
            mypkg.subpkg1 = lib1
            # mypkg.subpkg1.module1 corresponds to lib1/module1.py
            mypkg.subpkg2 = lib2
            # mypkg.subpkg2.module2 corresponds to lib2/module2.py
        # ...

.. tab:: setup.py

    .. code-block:: python

        from setuptools import setup

        setup(
            # ...
            package_dir = {"": "src"}
            # directory containing all the packages (e.g.  src/mypkg, src/mypkg/subpkg1, ...)
        )

        # OR

        setup(
            # ...
            package_dir = {
                "mypkg": "lib",  # mypkg.module corresponds to lib/module.py
                "mypkg.subpkg1": "lib1",  # mypkg.subpkg1.module1 corresponds to lib1/module1.py
                "mypkg.subpkg2": "lib2",  # mypkg.subpkg2.module2 corresponds to lib2/module2.py
                # ...
            }
        )

.. tab:: pyproject.toml

    .. code-block:: toml

        [tool.setuptools]
        # ...
        package-dir = {"" = "src"}
        # directory containing all the packages (e.g.  src/mypkg, src/mypkg/subpkg1, ...)

        # OR

        [tool.setuptools.package-dir]
        mypkg = "lib"
        # mypkg.module corresponds to lib/module.py
        "mypkg.subpkg1" = "lib1"
        # mypkg.subpkg1.module1 corresponds to lib1/module1.py
        "mypkg.subpkg2" = "lib2"
        # mypkg.subpkg2.module2 corresponds to lib2/module2.py
        # ...

This can get tiresome really quickly. To speed things up, you can rely on setuptools' automatic discovery or use the provided tools, as explained in the following sections.
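
To make the ``package_dir`` mapping above more concrete, the following is a minimal sketch of how a dotted package name could be resolved to a directory: the longest matching prefix in the mapping wins, and the empty key acts as a fallback root. The function name ``resolve_package_dir`` is hypothetical and the code only approximates the lookup performed during the build; it is not the setuptools API.

.. code-block:: python

    import posixpath


    def resolve_package_dir(package, package_dir):
        """Return the directory for a dotted package name (illustrative only)."""
        parts = package.split(".")
        tail = []
        while parts:
            prefix = ".".join(parts)
            if prefix in package_dir:
                # The longest matching prefix decides the base directory.
                return posixpath.join(package_dir[prefix], *tail)
            # No entry for this prefix: move its last component to the tail.
            tail.insert(0, parts.pop())
        root = package_dir.get("")
        if root:
            # The empty key is the root directory for all other packages.
            return posixpath.join(root, *tail)
        return posixpath.join(*tail)


    print(resolve_package_dir("mypkg.subpkg1", {"": "src"}))
    print(resolve_package_dir("mypkg.subpkg1", {"mypkg": "lib", "mypkg.subpkg1": "lib1"}))
    print(resolve_package_dir("mypkg.other", {"mypkg": "lib"}))

With the first mapping every package lives under ``src``; with the second, ``mypkg.subpkg1`` has its own entry, while ``mypkg.other`` falls back to a subdirectory of ``lib``.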

Important

Although setuptools allows developers to create a very complex mapping between directory names and package names, it is better to keep it simple and reflect the desired package hierarchy in the directory structure, preserving the same names.

Automatic discovery

By default, setuptools will consider two popular project layouts, each with its own set of advantages and disadvantages [1] [2], as discussed in the following sections.

Setuptools will automatically scan your project directory looking for these layouts and try to guess the correct values for the :ref:`packages <declarative config>` and :doc:`py_modules </references/keywords>` configuration.

Important

Automatic discovery will only be enabled if you don't provide any configuration for packages and py_modules. If at least one of them is explicitly set, automatic discovery will not take place.

Note: specifying ext_modules might also prevent auto-discovery from taking place, unless you opt into :doc:`pyproject_config` (which will disable the backward compatible behaviour).

src-layout

The project should contain a src directory under the project root and all modules and packages meant for distribution are placed inside this directory:

project_root_directory
├── pyproject.toml  # AND/OR setup.cfg, setup.py
├── ...
└── src/
    └── mypkg/
        ├── __init__.py
        ├── ...
        ├── module.py
        ├── subpkg1/
        │   ├── __init__.py
        │   ├── ...
        │   └── module1.py
        └── subpkg2/
            ├── __init__.py
            ├── ...
            └── module2.py

This layout is very handy when you wish to use automatic discovery, since you don't have to worry about other Python files or folders in your project root being distributed by mistake. In some circumstances it can also be less error-prone for testing or when using PEP 420-style packages. On the other hand, you cannot rely on the implicit PYTHONPATH=. to fire up the Python REPL and play with your package (you will need an editable install to be able to do that).

flat-layout

(also known as "adhoc")

The package folder(s) are placed directly under the project root:

project_root_directory
├── pyproject.toml  # AND/OR setup.cfg, setup.py
├── ...
└── mypkg/
    ├── __init__.py
    ├── ...
    ├── module.py
    ├── subpkg1/
    │   ├── __init__.py
    │   ├── ...
    │   └── module1.py
    └── subpkg2/
        ├── __init__.py
        ├── ...
        └── module2.py

This layout is very practical for using the REPL, but in some situations it can be more error-prone (e.g. during tests or if you have a bunch of folders or Python files hanging around your project root).

To avoid confusion, file and folder names that are used by popular tools (or that correspond to well-known conventions, such as distributing documentation alongside the project code) are automatically filtered out in the case of flat-layout:

.. autoattribute:: setuptools.discovery.FlatLayoutPackageFinder.DEFAULT_EXCLUDE

.. autoattribute:: setuptools.discovery.FlatLayoutModuleFinder.DEFAULT_EXCLUDE

Warning

If you are using auto-discovery with flat-layout, setuptools will refuse to create :term:`distribution archives <Distribution Package>` with multiple top-level packages or modules.

This is done to prevent common errors such as accidentally publishing code not meant for distribution (e.g. maintenance-related scripts).

Users who intentionally want to create multi-package distributions are advised to use :ref:`custom-discovery` or the src-layout.

There is also a handy variation of the flat-layout for utilities/libraries that can be implemented with a single Python file:

single-module distribution

A standalone module is placed directly under the project root, instead of inside a package folder:

project_root_directory
├── pyproject.toml  # AND/OR setup.cfg, setup.py
├── ...
└── single_file_lib.py

Custom discovery

If automatic discovery does not work for you (e.g., you want to include in the distribution top-level packages with reserved names such as tasks, example or docs, or you want to exclude nested packages that would otherwise be included), you can use the provided tools for package discovery:

.. tab:: setup.cfg

    .. code-block:: ini

        [options]
        packages = find:
        # OR
        packages = find_namespace:

.. tab:: setup.py

    .. code-block:: python

        from setuptools import find_packages
        # or
        from setuptools import find_namespace_packages

.. tab:: pyproject.toml

    .. code-block:: toml

        # ...
        [tool.setuptools.packages]
        find = {}  # Scanning implicit namespaces is active by default
        # OR
        find = {namespaces = false}  # Disable implicit namespaces


Finding simple packages

Let's start with the first tool. find: (find_packages()) takes a source directory and two lists of package name patterns (one to include and one to exclude) and returns a list of strings naming the packages it finds. To see it in action, consider the following directory:

mypkg
├── pyproject.toml  # AND/OR setup.cfg, setup.py
└── src
    ├── pkg1
    │   └── __init__.py
    ├── pkg2
    │   └── __init__.py
    ├── additional
    │   └── __init__.py
    └── pkg
        └── namespace
            └── __init__.py

To have setuptools automatically include the packages found in src whose names start with pkg, but not additional:

.. tab:: setup.cfg

    .. code-block:: ini

        [options]
        packages = find:
        package_dir =
            =src

        [options.packages.find]
        where = src
        include = pkg*
        # alternatively: `exclude = additional*`

    .. note::
        ``pkg`` does not contain an ``__init__.py`` file, therefore
        ``pkg.namespace`` is ignored by ``find:`` (see ``find_namespace:`` below).

.. tab:: setup.py

    .. code-block:: python

        from setuptools import find_packages, setup

        setup(
            # ...
            packages=find_packages(
                where='src',
                include=['pkg*'],  # alternatively: `exclude=['additional*']`
            ),
            package_dir={"": "src"}
            # ...
        )


    .. note::
        ``pkg`` does not contain an ``__init__.py`` file, therefore
        ``pkg.namespace`` is ignored by ``find_packages()``
        (see ``find_namespace_packages()`` below).

.. tab:: pyproject.toml

    .. code-block:: toml

        [tool.setuptools.packages.find]
        where = ["src"]
        include = ["pkg*"]  # alternatively: `exclude = ["additional*"]`
        namespaces = false

    .. note::
        When using ``tool.setuptools.packages.find`` in ``pyproject.toml``,
        setuptools will consider :pep:`implicit namespaces <420>` by default when
        scanning your project directory.
        To prevent ``pkg.namespace`` from being added to your package list
        you can set ``namespaces = false``. This will prevent any folder
        without an ``__init__.py`` file from being scanned.

Important

include and exclude accept strings representing :mod:`glob` patterns. These patterns should match the full name of the Python module (as if it were written in an import statement).

For example, the pattern util will match util/__init__.py but not util/files/__init__.py.

The fact that a parent package is matched by the pattern does not determine whether its submodules will be included in or excluded from the distribution. You will need to explicitly add a wildcard (e.g. util*) if you want the pattern to also match submodules.
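
The matching rules described above can be tried out with the standard library alone. The sketch below assumes that each pattern is compared against the dotted package name using :mod:`fnmatch`-style rules; the helper ``matches`` and the sample package names are hypothetical and serve only as an illustration.

.. code-block:: python

    from fnmatch import fnmatchcase


    def matches(package, patterns):
        """True if the dotted package name matches at least one pattern."""
        return any(fnmatchcase(package, pattern) for pattern in patterns)


    packages = ["util", "util.files", "utilities"]

    # A bare name only matches the package itself, not its subpackages.
    exact = [p for p in packages if matches(p, ["util"])]

    # A trailing wildcard also matches subpackages (and similarly named siblings).
    wildcard = [p for p in packages if matches(p, ["util*"])]

    # Listing "util" and "util.*" restricts the match to util and its subpackages.
    dotted = [p for p in packages if matches(p, ["util", "util.*"])]

    print(exact)
    print(wildcard)
    print(dotted)

Note that a pattern such as ``util*`` is broader than it may look, because the wildcard is not limited to names below ``util``.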

Finding namespace packages

setuptools provides find_namespace: (find_namespace_packages()) which behaves similarly to find: but works with namespace packages.

Before diving in, it is important to have a good understanding of what :pep:`namespace packages <420>` are. Here is a quick recap.

Suppose you have two packages organized as follows:

/Users/Desktop/timmins/foo/__init__.py
/Library/timmins/bar/__init__.py

If both Desktop and Library are on your PYTHONPATH, then a namespace package called timmins will be created automatically for you when you invoke the import mechanism, allowing you to accomplish the following:

>>> import timmins.foo
>>> import timmins.bar

as if there is only one timmins on your system. The two packages can then be distributed separately and installed individually without affecting the other one.
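
The behaviour described above can be reproduced in a throwaway environment. The following sketch uses two temporary directories as stand-ins for the Desktop and Library locations, and extends ``sys.path`` instead of setting PYTHONPATH; the helper ``make_portion`` is hypothetical.

.. code-block:: python

    import importlib
    import sys
    import tempfile
    from pathlib import Path


    def make_portion(root, subpackage):
        """Create <root>/timmins/<subpackage>/__init__.py (no timmins/__init__.py)."""
        package_dir = Path(root) / "timmins" / subpackage
        package_dir.mkdir(parents=True)
        (package_dir / "__init__.py").write_text(f"NAME = {subpackage!r}\n")


    with tempfile.TemporaryDirectory() as desktop, tempfile.TemporaryDirectory() as library:
        make_portion(desktop, "foo")
        make_portion(library, "bar")
        sys.path[:0] = [desktop, library]
        try:
            importlib.invalidate_caches()
            foo = importlib.import_module("timmins.foo")
            bar = importlib.import_module("timmins.bar")
            # The namespace package spans both directories.
            portions = list(sys.modules["timmins"].__path__)
        finally:
            sys.path.remove(desktop)
            sys.path.remove(library)

    print(foo.NAME, bar.NAME)
    print(len(portions))

Neither directory contains a ``timmins/__init__.py`` file, which is what allows the import system to merge the two locations into a single namespace.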

Now, suppose you decide to package the foo part for distribution and start by creating a project directory organized as follows:

foo
├── pyproject.toml  # AND/OR setup.cfg, setup.py
└── src
    └── timmins
        └── foo
            └── __init__.py

If you want timmins.foo to be automatically included in the distribution, then you will need to specify:

.. tab:: setup.cfg

    .. code-block:: ini

        [options]
        package_dir =
            =src
        packages = find_namespace:

        [options.packages.find]
        where = src

    ``find:`` won't work because ``timmins`` doesn't directly contain an
    ``__init__.py`` file. Instead, you have to use ``find_namespace:``.

    You can think of ``find_namespace:`` as identical to ``find:`` except that
    it counts a directory as a package even if it doesn't directly contain an
    ``__init__.py`` file.

.. tab:: setup.py

    .. code-block:: python

        from setuptools import find_namespace_packages, setup

        setup(
            # ...
            packages=find_namespace_packages(where='src'),
            package_dir={"": "src"}
            # ...
        )

    When you use ``find_packages()``, all directories without an
    ``__init__.py`` file will be ignored.
    On the other hand, ``find_namespace_packages()`` will scan all
    directories.

.. tab:: pyproject.toml

    .. code-block:: toml

        [tool.setuptools.packages.find]
        where = ["src"]

    When using ``tool.setuptools.packages.find`` in ``pyproject.toml``,
    setuptools will consider :pep:`implicit namespaces <420>` by default when
    scanning your project directory.

After installing the package distribution, timmins.foo would become available to your interpreter.

Warning

Please keep in mind that find_namespace: (setup.cfg), find_namespace_packages() (setup.py) and find (pyproject.toml) will scan all folders that you have in your project directory if you use a :ref:`flat-layout`.

If used naïvely, this might result in unwanted files being added to your final wheel. For example, with a project directory organized as follows:

foo
├── docs
│   └── conf.py
├── timmins
│   └── foo
│       └── __init__.py
└── tests
    └── tests_foo
        └── __init__.py

end users will end up installing not only timmins.foo, but also docs and tests.tests_foo.

A simple way to fix this is to adopt the aforementioned :ref:`src-layout`, or to configure include and/or exclude accordingly.
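
To illustrate why this happens, here is a simplified, standard-library-only sketch of a directory scan applied to the flat-layout tree shown above. It is not the setuptools implementation: the function ``sketch_find`` and its parameters are hypothetical, and it assumes that every folder counts as a package when namespaces are enabled and that patterns follow :mod:`fnmatch`-style rules.

.. code-block:: python

    import os
    import tempfile
    from fnmatch import fnmatchcase
    from pathlib import Path


    def sketch_find(where, namespaces=True, include=("*",), exclude=()):
        """List dotted package names found under ``where`` (illustrative only)."""
        found = []
        for root, dirs, _files in os.walk(where):
            keep = []
            for name in sorted(dirs):
                full = os.path.join(root, name)
                is_regular = os.path.isfile(os.path.join(full, "__init__.py"))
                # Without namespaces, folders lacking __init__.py are not scanned.
                if "." in name or not (namespaces or is_regular):
                    continue
                keep.append(name)
                package = os.path.relpath(full, where).replace(os.sep, ".")
                included = any(fnmatchcase(package, p) for p in include)
                excluded = any(fnmatchcase(package, p) for p in exclude)
                if included and not excluded:
                    found.append(package)
            dirs[:] = keep  # only descend into folders that count as packages
        return found


    with tempfile.TemporaryDirectory() as project:
        for rel in ("docs/conf.py", "timmins/foo/__init__.py", "tests/tests_foo/__init__.py"):
            path = Path(project, rel)
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch()
        everything = sorted(sketch_find(project))
        regular_only = sorted(sketch_find(project, namespaces=False))
        filtered = sorted(sketch_find(project, include=["timmins*"]))

    print(everything)
    print(regular_only)
    print(filtered)

In this sketch the unrestricted namespace scan picks up ``docs`` and ``tests`` alongside ``timmins``, while restricting ``include`` to ``timmins*`` keeps only the intended packages.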

Tip

After :ref:`building your package <building>`, you can check that all the files are correct (nothing missing or extra) by running the following commands:

tar tf dist/*.tar.gz
unzip -l dist/*.whl

This requires tar and unzip to be installed on your system. On Windows you can also use a GUI program such as 7zip.
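
If neither tool is available, the same check can be sketched with the Python standard library, since source distributions are tar archives and wheels are zip archives. The helper ``list_archive`` is hypothetical, and the loop assumes the built files live in a ``dist`` directory, as in the commands above.

.. code-block:: python

    import tarfile
    import zipfile
    from pathlib import Path


    def list_archive(path):
        """Return the member names of a wheel (zip) or sdist (tar) archive."""
        if zipfile.is_zipfile(path):
            with zipfile.ZipFile(path) as archive:
                return archive.namelist()
        with tarfile.open(path) as archive:
            return archive.getnames()


    # Print the contents of every sdist and wheel found in ./dist (if any).
    for artifact in sorted(Path("dist").glob("*")):
        if artifact.name.endswith((".tar.gz", ".whl")):
            print(artifact.name)
            for member in list_archive(artifact):
                print("   ", member)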

Legacy Namespace Packages

The ease with which you can create namespace packages above is thanks to PEP 420. Achieving the same result used to be more cumbersome. Historically, there were two methods to create namespace packages: the pkg_resources style, supported by setuptools, and the pkgutil style, offered by the pkgutil module in the Python standard library. Both are now considered deprecated, although they still linger in many existing packages. The two differ in many subtle yet significant aspects; you can find out more in the Python Packaging User Guide.


[1] https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure
[2] https://blog.ionelmc.ro/2017/09/25/rehashing-the-src-layout/