Compile, sort, and verify the reference database at build time #42447

Open
orlitzky wants to merge 6 commits into sagemath:develop from orlitzky:build-references

@orlitzky

@orlitzky orlitzky commented Jun 28, 2026

Contributor

Build the main reference database page from a "database" containing only references, at build time.

  • Convert src/doc/en/reference/references/index.rst to a template with @REFS_<LETTER>@ placeholders for the content.
  • Leave a dummy index.rst behind, so that the docbuild expects the generated one to (eventually) be there.
  • Add a new script src/sage_docbuild/generate-references.py to "compile" the final page from the reference database and template.
  • Add a custom target in src/doc/en/reference/references/meson.build to (re)build the references page as needed, when either the database or template changes.
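The "compile" step described above can be sketched roughly as follows. This is not the code from generate-references.py: the function name `fill_template`, the regex-based label extraction, and the assumption that each entry is a single string starting with `.. [Label]` are all hypothetical, and the real script uses docutils to parse the citations.

```python
import re
from collections import defaultdict


def fill_template(template: str, entries: list[str]) -> str:
    """Replace each @REFS_<LETTER>@ placeholder with that letter's entries."""
    groups = defaultdict(list)
    for entry in entries:
        # Hypothetical convention: an entry looks like ".. [Label] text".
        match = re.match(r"\.\. \[([^\]]+)\]", entry)
        if match is None:
            raise ValueError(f"not a citation: {entry!r}")
        label = match.group(1)
        # Group by the upper-cased first character of the label.
        groups[label[0].upper()].append((label, entry))

    def substitute(placeholder: re.Match) -> str:
        # Sort each group by label; an unused letter yields an empty string.
        group = sorted(groups.get(placeholder.group(1), []))
        return "\n\n".join(entry for _, entry in group)

    return re.sub(r"@REFS_([A-Z])@", substitute, template)
```

Passing a function to `re.sub` means the citation text is inserted literally, so backslashes in an entry are not interpreted as escape sequences.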

This approach has several advantages:

  • It catches syntax errors; this is how I found the problems fixed by "src/doc/en/reference/references/index.rst: fix syntax" (#42352).
  • It ensures that the input/output is sorted.
  • Later on it can be extended to enforce a standardized format.
  • We do not have to parse (and reconstruct) a free-form ReST file, only a list of references.
  • The reference database can live in a more discoverable location, src/doc/reference-database.rst.
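To illustrate the kind of check that becomes easy once the database contains only references, here is a minimal sketch that extracts the labels and rejects duplicates or unsorted input. It is an assumption-laden stand-in: `check_database` and the `CITATION` regex are hypothetical names, a regex replaces the docutils parsing that the PR actually uses, and the plain code-point ordering is only a placeholder for whatever ordering is finally agreed on.

```python
import re

# Hypothetical pattern: a citation starts a line with ".. [Label]".
CITATION = re.compile(r"^\.\. \[([^\]]+)\]", re.MULTILINE)


def check_database(text: str) -> list[str]:
    """Return the labels found in text, raising ValueError on problems."""
    labels = CITATION.findall(text)
    duplicates = sorted({label for label in labels if labels.count(label) > 1})
    if duplicates:
        raise ValueError(f"duplicate labels: {duplicates}")
    # Placeholder ordering: plain (case-sensitive) string comparison.
    if labels != sorted(labels):
        raise ValueError("references are not sorted by label")
    return labels
```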

orlitzky added 2 commits June 28, 2026 12:26
This is a sorted list of references from the existing bibliographic
database page. By separating the references from the rest of the page,
it becomes much easier to process (in particular, to sort) the list.
@orlitzky orlitzky marked this pull request as draft June 28, 2026 16:35
@github-actions

github-actions Bot commented Jun 28, 2026

Documentation preview for this PR (built with commit dc3700b; changes) is ready! 🎉
This preview will update shortly after each push to this PR.

orlitzky added 4 commits June 28, 2026 18:29
This is a new script, meant to sort, verify, and "compile" a list of
(ReST) citations into the main bibliographic reference page. It uses
docutils to parse the citations that are now stored in a dedicated
file. For the layout of the page, a template (passed as input to the
script) is used instead.

This will help keep our references in order, while avoiding the need
to parse and/or generate a fully free-form ReST document.

We need a placeholder for the generated index.rst in the source tree
during "meson setup", otherwise the documentation build won't expect
one to be there after we generate it.

Add a custom_target() to run src/sage_docbuild/generate-references.py,
generating the full reference database page from the database of
references and a template. We no longer copyfile(index.rst) because
that is now a placeholder.

We use it to parse/sort the references database.
@orlitzky orlitzky marked this pull request as ready for review June 29, 2026 01:24
@gmou3

gmou3 commented Jun 29, 2026

Contributor

Should, e.g., [Ab...] go to the top with [AB...]?

@orlitzky

Contributor Author

> Should, e.g., [Ab...] go to the top with [AB...]?

I decided on case-sensitive within each letter group because it keeps the same authors together, so long as the labels are chosen consistently. For example:

Case-sensitive:

  • [ABCD2024]
  • [AbCd2023]
  • [AbCd2025]
  • [AbCd2026]

versus case-insensitive:

  • [AbCd2023]
  • [ABCD2024]
  • [AbCd2025]
  • [AbCd2026]

I don't feel too strongly about it though. Whatever people want is fine.
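For anyone who wants to reproduce the two orderings, a small Python sketch using the example labels from this comment follows. The list name and the choice of `str.casefold` as the case-insensitive key are assumptions made for illustration, not necessarily what the script does.

```python
labels = ["AbCd2025", "ABCD2024", "AbCd2023", "AbCd2026"]

# Case-sensitive: strings compare by code point, so "B" sorts before "b"
# and the all-caps label comes first.
case_sensitive = sorted(labels)

# Case-insensitive: compare the casefolded labels, so only the year
# distinguishes these four entries.
case_insensitive = sorted(labels, key=str.casefold)

print(case_sensitive)
print(case_insensitive)
```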

@gmou3

gmou3 commented Jun 29, 2026

Contributor

I would vote for case-insensitive because otherwise the distance between AB and Ab is big. We can also sort by the tuple (prefix/case-insensitive letters, prefix/case-sensitive letters, suffix/date).
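A minimal sketch of that tuple key, assuming every label is a run of ASCII letters followed by a suffix such as a year; the helper name `sort_key` and the sample labels are hypothetical:

```python
import re


def sort_key(label: str) -> tuple[str, str, str]:
    """Key: (case-insensitive prefix, case-sensitive prefix, suffix)."""
    # Assumption: the prefix is the leading run of ASCII letters and the
    # suffix is everything after it (typically a year).
    prefix, suffix = re.match(r"([A-Za-z]*)(.*)", label, re.DOTALL).groups()
    return (prefix.casefold(), prefix, suffix)


labels = ["AC2020", "AbCd2025", "Abd2020", "ABCD2024", "AbCd2023", "AbCd2026"]
ordered = sorted(labels, key=sort_key)
print(ordered)
```

With this key, labels that differ only in case stay adjacent and the case-sensitive prefix breaks the tie, so the same spelling of an author prefix is still grouped together before the suffix is compared.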

@orlitzky

Contributor Author

I just mentioned this on the mailing list because it can be considered a big change; maybe others will chime in. If not, I'll just switch it to case-insensitive.

@dimpase

dimpase commented Jun 29, 2026

Member

I've been saying for years that references should come in .bib files; these can be processed just fine with sphinx.
https://sphinxcontrib-bibtex.readthedocs.io/en/latest/

@orlitzky

Contributor Author

> I've been saying for years that references should come in .bib files, these can be processed just fine with sphinx. https://sphinxcontrib-bibtex.readthedocs.io/en/latest/

That would be much better in the long term, but we have thousands of existing entries that would need to be converted first.

@vincentmacri

Member

> > I've been saying for years that references should come in .bib files, these can be processed just fine with sphinx. https://sphinxcontrib-bibtex.readthedocs.io/en/latest/
>
> That would be much better in the long term, but we have thousands of existing entries that would need to be converted first.

I agree with these points.

@orlitzky changing this would be out of scope for this PR, but would this new setup make it possible to do some kind of hybrid system where the old references remain as they are (at least unless someone has the time to go through them), and new references are added to a .bib file? Then this script merges it all together into one index.rst?

@orlitzky

Contributor Author

> @orlitzky changing this would be out of scope for this PR, but would this new setup make it possible to do some kind of hybrid system where the old references remain as they are (at least unless someone has the time to go through them), and new references are added to a .bib file? Then this script merges it all together into one index.rst?

It should be possible: sphinxcontrib-bibtex uses pybtex under the hood, and there is a pybtex-docutils package that looks like it does what we need. The example in its README shows it parsing a bibtex file (well, a string in this case) and then looping through the entries.
