Titles in search index can contain HTML and escaped characters #13355

Open
@wlach

Description

Describe the bug

The parser seems to be passing escaped HTML to the search indexer, so titles in the search index can contain HTML markup and escaped characters.

You can see this in the searchindex.js for the Python docs if you search for, say, "<code":

https://docs.python.org/3.14/searchindex.js
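For anyone who wants to check an index without searching by hand, here is a minimal sketch of a scanner. It assumes the file is a JSON payload wrapped in a `Search.setIndex(...)` call, with titles stored under a `titles` list and, in newer indexes, an `alltitles` mapping. Those layout details are assumptions on my part, and the function names and regular expression are only illustrative.

```python
import json
import re

# Heuristic (an assumption, not Sphinx's own check): an HTML tag such as
# <code ...> or </span>, or a common escaped entity such as &lt; or &#60;.
MARKUP = re.compile(r"<[A-Za-z/][^>]*>|&(?:lt|gt|amp|quot|#\d+);")


def load_search_index(js_text: str) -> dict:
    """Strip the assumed Search.setIndex(...) wrapper and parse the JSON inside."""
    prefix = "Search.setIndex("
    start = js_text.index(prefix) + len(prefix)
    end = js_text.rindex(")")
    return json.loads(js_text[start:end])


def suspicious_titles(index: dict) -> list[str]:
    """Return the titles that look like they contain HTML or escaped characters."""
    titles = list(index.get("titles", []))
    # "alltitles" is assumed to map each title string to its locations.
    titles += list(index.get("alltitles", {}))
    return sorted({t for t in titles if MARKUP.search(t)})
```

To use it, read a locally saved copy of searchindex.js as text, pass it to `load_search_index`, and print the result of `suspicious_titles`.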

I have a PR to address this, will post.

How to Reproduce

This rst file trivially reproduces the issue:

`escaped` title with < and > in it
==================================

this document has escaped content in the title but also the characters < and > in it
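A minimal sketch of turning the file above into a buildable project, in case it saves someone a few minutes. It assumes an empty `conf.py` is enough and that the document is saved as `index.rst`. The helper names are hypothetical, and the build step needs Sphinx installed.

```python
from pathlib import Path

# The reproduction document from this report, unchanged.
RST = """\
`escaped` title with < and > in it
==================================

this document has escaped content in the title but also the characters < and > in it
"""


def write_repro_project(root: Path) -> Path:
    """Write an empty conf.py and the reproduction document as index.rst."""
    root.mkdir(parents=True, exist_ok=True)
    (root / "conf.py").write_text("", encoding="utf-8")
    (root / "index.rst").write_text(RST, encoding="utf-8")
    return root


def build_html(src: Path, out: Path) -> int:
    """Run an HTML build; imported lazily because it requires Sphinx."""
    from sphinx.cmd.build import build_main
    return build_main(["-b", "html", str(src), str(out)])


# Example usage (not run here):
#   src = write_repro_project(Path("repro"))
#   build_html(src, src / "_build")
#   then inspect repro/_build/searchindex.js
```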

Environment Information

Sphinx main as of Feb 17 2024

Sphinx extensions

Additional context

No response
