Clarifications needed re packaging flow from the perspective of a build backend #1779

Open
@zahlman

Issue Description

I'm writing a build backend which aims to deliver these key features relevant to the discussion (among others):

  1. Sdists will only contain static metadata;
  2. The code for sdist-building and wheel-building is separable, such that, when an sdist is downloaded and installed automatically (e.g. by Pip), only the wheel-building code is needed as a build dependency;
  3. There is no legacy support - sdists contain a PKG-INFO specifying a core metadata version of 2.2 or higher (most likely 2.4) and a pyproject.toml.

I've found several points confusing in the descriptions of pyproject.toml and of the core metadata format, and in how the two are used in source trees (pyproject.toml only), sdists (both) and wheels (core metadata only). I want to verify that I can deliver the features above while remaining standards-compliant.

The main conceptual problem I'm having is that pyproject.toml and core metadata are described as canonical metadata formats, yet a non-legacy sdist is expected to contain both. I have many questions as a result.

First, regarding sdist creation: my understanding is that in this process, the build backend:

  • MUST faithfully represent static metadata (if any) from the source tree's pyproject.toml in the PKG-INFO;
  • MAY compute values for dynamic metadata and include these in the PKG-INFO as well.
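To make that understanding concrete, here is a minimal sketch of the sdist-time step, assuming the [project] table has already been parsed into a dict. The function name `render_pkg_info`, the `computed` argument and the small `FIELD_MAP` subset are hypothetical, chosen for illustration only; a real backend would cover every core metadata field. Because the goal is a fully static sdist, the sketch refuses to continue if any dynamic field has no computed value.

```python
# Hypothetical subset of the pyproject.toml -> core metadata field mapping.
FIELD_MAP = {
    "version": "Version",
    "description": "Summary",
    "requires-python": "Requires-Python",
    "dependencies": "Requires-Dist",
}


def render_pkg_info(project: dict, computed: dict) -> str:
    """Render PKG-INFO text from a parsed [project] table.

    `computed` holds values the backend worked out for fields that the
    source tree listed under project.dynamic.
    """
    dynamic = set(project.get("dynamic", []))
    unresolved = dynamic - set(computed)
    if unresolved:
        # Assumption: this backend only emits fully static sdists.
        raise ValueError(f"dynamic fields left unresolved: {sorted(unresolved)}")

    lines = ["Metadata-Version: 2.4", f"Name: {project['name']}"]
    for key, header in FIELD_MAP.items():
        if key in project:
            value = project[key]      # static: copied as-is
        elif key in dynamic:
            value = computed[key]     # dynamic: resolved at sdist time
        else:
            continue
        values = value if isinstance(value, list) else [value]
        lines.extend(f"{header}: {item}" for item in values)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    table = {"name": "demo", "description": "A demo", "dynamic": ["version"]}
    print(render_pkg_info(table, {"version": "1.0"}), end="")
```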

The question is, what happens to the version of pyproject.toml that ends up in the sdist?

It seems to me that it cannot in general be an exact copy of the source tree's pyproject.toml, because if I compute dynamic metadata then there is a conflict - the field is marked dynamic in pyproject.toml but provided statically in PKG-INFO.

Am I at liberty to create an entirely new pyproject.toml, as long as it follows the spec? For example, can I remove the [project] table (since in general this table isn't required to be present, and I've already fully "compiled" its information into PKG-INFO)? Can I change the [build-system] table, such that a different build backend will be used to create the wheel? (One implementation idea I had was to incorporate an in-tree, wheel-specific backend into the sdist.) Should [project] at least be edited to reflect the dynamic metadata values that were calculated (e.g. add the computed values as static keys, and remove the corresponding names from project.dynamic)?
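The last option (writing the computed values back as static keys) could look roughly like the sketch below. It works on an already-parsed pyproject.toml dict; `freeze_project_table` and `computed` are hypothetical names, and whether the standards actually permit this rewrite is exactly what is being asked here.

```python
import copy


def freeze_project_table(pyproject: dict, computed: dict) -> dict:
    """Return a copy of the parsed pyproject.toml with resolved dynamic
    fields moved into [project] as static keys.

    Names still listed in project.dynamic afterwards are the ones for
    which no computed value was supplied.
    """
    frozen = copy.deepcopy(pyproject)   # leave the source tree's data untouched
    project = frozen.get("project")
    if project is None:
        return frozen                   # nothing to freeze without [project]

    remaining = []
    for name in project.get("dynamic", []):
        if name in computed:
            project[name] = computed[name]
        else:
            remaining.append(name)

    if remaining:
        project["dynamic"] = remaining
    else:
        project.pop("dynamic", None)
    return frozen


if __name__ == "__main__":
    source = {
        "build-system": {"requires": ["some-backend"], "build-backend": "some_backend"},
        "project": {"name": "demo", "dynamic": ["version"]},
    }
    print(freeze_project_table(source, {"version": "1.0"})["project"])
```

Writing the result back out as TOML needs a third-party writer, since the standard library's tomllib only parses.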


Next, regarding wheel building: the core metadata specification says that "Fields defined in the following specification should be considered valid, complete and not subject to change."

Does that imply that the wheel's METADATA MUST be a copy of the sdist's PKG-INFO?

Doesn't that prevent computing metadata values at wheel creation time? (Not applicable to me, but still worth raising the question.)

Doesn't that in turn imply that non-legacy sdists need to have all the dynamic metadata values computed, and that these can't be deferred to wheel-building? (I think this is intentional, so that e.g. installers can figure out basic information about the package without building it. But as of Pip 24.3.1, Pip still does the build first anyway, even when PKG-INFO declares the latest metadata version.)
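One way to make the "must the wheel's METADATA match PKG-INFO?" question testable is a small comparison helper. The sketch below assumes the rule, as PEP 643 is commonly read, that any field not listed under `Dynamic` in the sdist has to carry the same value in a wheel built from it. The function name `mismatched_fields` is hypothetical, and the comparison is deliberately naive (exact string equality, order-insensitive for repeated fields).

```python
from email.parser import HeaderParser


def mismatched_fields(pkg_info: str, wheel_metadata: str) -> list[str]:
    """List (lowercased) field names whose values differ between an sdist's
    PKG-INFO and a wheel's METADATA, ignoring fields the sdist marks Dynamic."""
    sdist = HeaderParser().parsestr(pkg_info)
    wheel = HeaderParser().parsestr(wheel_metadata)

    # Fields the sdist explicitly declares as allowed to change.
    dynamic = {name.strip().lower() for name in sdist.get_all("Dynamic", [])}

    names = {key.lower() for key in sdist.keys()} | {key.lower() for key in wheel.keys()}
    mismatches = []
    for name in sorted(names):
        if name == "dynamic" or name in dynamic:
            continue
        # get_all() matches header names case-insensitively.
        if sorted(sdist.get_all(name, [])) != sorted(wheel.get_all(name, [])):
            mismatches.append(name)
    return mismatches


if __name__ == "__main__":
    pkg_info = "Metadata-Version: 2.4\nName: demo\nVersion: 1.0\nRequires-External: libfoo\n"
    metadata = "Metadata-Version: 2.4\nName: demo\nVersion: 1.0\n"
    print(mismatched_fields(pkg_info, metadata))
```

The example input mirrors the vendored-library scenario: the sdist names an external requirement that the wheel drops, so the field is reported unless the sdist also marks it as `Dynamic`.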

Doesn't that cause a problem for PEP 725 – Specifying external dependencies in pyproject.toml, since it proposes to give semantics to Requires-External metadata whereby the wheel's value could differ from the sdist's? (In particular: the wheel-building process could use a tool like cibuildwheel to vendor a compiled shared C library whose source is not included in the sdist; by my reading of the PEP, the intent is that PKG-INFO would describe the library as an external requirement, but METADATA would not.)

Also: when building the wheel, is it required to look at pyproject.toml at all, or to validate it? My understanding is that the only mandatory purpose pyproject.toml actually serves at this point in the process is to tell an installer what build backend to use (and what its statically-known dependencies are); the backend itself is free to use other files for configuration (i.e. the config isn't required to be in [tool], and other tools simply won't be invoked at this point), and the [project] metadata is either redundant with PKG-INFO or erroneous.


Bonus round:

Given that PEP 725 isn't accepted yet, is there any circumstance in which it would make sense for a modern build backend to output Requires-External or Supported-Platform values in core metadata? I can't think of any.

Code of Conduct

  • I am aware that participants in this repository must follow the PSF Code of Conduct.

Metadata

    Labels

    type: discussion (Discussion of general ideas, design, etc.)
    type: question (A user question that needs/needed an answer)
