HTMLPage PEP 503 compliance

In `pip._internal.index`, links on a simple API page are processed in two steps:

1. [`a[href]` values are collected](https://github.com/pypa/pip/blob/b6a2be0e0ecf0d67c524b7e94a2ebacc1f7f792d/src/pip/_internal/index.py#L980)
2. [the filename part is parsed for metadata](https://github.com/pypa/pip/blob/b6a2be0e0ecf0d67c524b7e94a2ebacc1f7f792d/src/pip/_internal/index.py#L814)
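To make the second step concrete, here is a minimal sketch of deriving a filename from a link's URL. It is not pip's actual code: the helper name `filename_from_href` and the example URL are made up for illustration, and it only assumes the filename is the last path segment of the `href`.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_href(href: str) -> str:
    """Return the last path segment of a URL (hypothetical helper)."""
    # urlsplit separates the query and fragment (e.g. "#sha256=...") from the path.
    path = urlsplit(href).path
    # Percent-decode the path, then take everything after the final "/".
    return posixpath.basename(unquote(path))


# Hypothetical project-page link, used only for illustration.
href = "https://files.example.org/packages/ab/cd/demo-1.0-py3-none-any.whl#sha256=0123abcd"
print(filename_from_href(href))
```

Anything parsed for metadata this way depends entirely on how the index lays out its download URLs.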

But here’s how PEP 503 describes the `a` tags on an individual project page:

> The text of the anchor tag MUST be the filename of the file and the href attribute MUST be an URL that links to the location of the file for download.

A linked file’s name is specified by the anchor text, and nothing guarantees that the filename part of the link’s URL matches it. So pip should parse the `a` tag’s text for metadata, not the URL.
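A rough way to see the difference is to collect both the `href` and the anchor text of each link and compare them. The sketch below uses only the standard library; `AnchorCollector`, `find_mismatches`, and the sample page are hypothetical names and data for illustration, not part of pip.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import unquote, urlsplit


class AnchorCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.anchors = []   # completed (href, text) pairs
        self._href = None   # href of the anchor currently open, if any
        self._text = []     # text chunks seen inside the open anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_mismatches(html):
    """Return (anchor_text, url_filename) pairs where the two differ."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    mismatches = []
    for href, text in collector.anchors:
        url_name = posixpath.basename(unquote(urlsplit(href).path))
        if url_name != text:
            mismatches.append((text, url_name))
    return mismatches


# Hypothetical project page: the second link is allowed by the PEP wording
# quoted above, but its URL carries no usable filename.
PAGE = """
<a href="https://files.example.org/demo-1.0.tar.gz#sha256=00">demo-1.0.tar.gz</a>
<a href="https://files.example.org/download?id=42">demo-1.0-py3-none-any.whl</a>
"""

for text, url_name in find_mismatches(PAGE):
    print(f"anchor text {text!r} != URL filename {url_name!r}")
```

For the second link, a parser that only looks at the URL would see `download`, while the anchor text still names the wheel.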

Am I interpreting the PEP text correctly? If so, should pip be fixed to follow the spec, or should we just fix the PEP to say the URL’s filename part must match the text (since existing implementations already do this in order to work)?


HTMLPage PEP 503 compliance #6272
