[doc] add ipython3 lexer hook for notebooks with shell/magic cells #63515

Merged
elliot-barn merged 3 commits into ray-project:master from Aydin-ab:docs/add-lexer-hook
May 20, 2026
Conversation

@Aydin-ab

Contributor

Description

Sphinx falls back to the python3 lexer on .ipynb files that don't declare language_info.pygments_lexer, but python3 can't tokenise !shell or %magic cells. The pygments warning is fatal under Readthedocs -W and breaks the docs build.

This source-read hook injects pygments_lexer = "ipython3" into notebooks matching configurable glob patterns (ipython3_lexer_patterns / _exclude_patterns). Globs cover the in-tree example layouts plus _collections/**/*.ipynb for notebooks fetched at build time by sphinx_collections.

Review feedback from the previous round on #59984 (early-return pattern, no broad except) is preserved.

Related issues

Related to #59984 (closed-stale 2026-02-07).

Additional information

Verified locally with make html against upstream/master plus this commit and a fixture notebook at _collections/lexer-fixture/test.ipynb containing a !pip install cell and no language_info: build succeeded, no WARNING.*lexer matches in the log, fixture confirmed source-read.
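
For anyone repeating the local check, the fixture and the log scan could look something like the sketch below. It assumes a minimal nbformat 4 notebook; the package name in the `!pip install` cell and the helper names `make_fixture_notebook` and `lexer_warnings` are hypothetical, and the regular expression is the `WARNING.*lexer` pattern mentioned above.

```python
import json
import re


def make_fixture_notebook():
    """Build a minimal notebook with a shell cell and no language_info."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                # Hypothetical package name, used only for illustration.
                "source": ["!pip install example-package"],
            }
        ],
        "metadata": {},  # deliberately no language_info entry
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def lexer_warnings(log_text):
    """Return the build-log lines that match WARNING.*lexer."""
    pattern = re.compile(r"WARNING.*lexer")
    return [line for line in log_text.splitlines() if pattern.search(line)]


# Serialized form, ready to be written to _collections/lexer-fixture/test.ipynb.
fixture_json = json.dumps(make_fixture_notebook(), indent=1)
```

An empty list from `lexer_warnings` on the captured `make html` output corresponds to the "no `WARNING.*lexer` matches" result reported here.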

Today most _collections/ notebooks are excluded from the build via exclude_patterns in favour of their .md counterparts, so the new _collections/**/*.ipynb glob is mostly defensive. The exclude design is upstream of this PR.

@Aydin-ab Aydin-ab requested a review from a team as a code owner May 19, 2026 18:22

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces a mechanism to force the ipython3 pygments lexer on specific Jupyter notebooks during the Sphinx documentation build process. This addresses issues where notebooks without explicit language metadata default to the standard Python lexer, causing failures when encountering shell commands or magic functions. The implementation includes new configuration patterns and a hook to modify notebook metadata on the fly. Feedback suggests optimizing the notebook processing by skipping re-serialization if the lexer is already correctly set and using ensure_ascii=False to better support international characters.

Comment thread doc/source/conf.py Outdated
@dstrodtman

Contributor

@Aydin-ab We're actively updating all of the Sphinx dependencies, including the notebook parsers. Just confirming: have you ruled out solving this with a standard upgrade to one of our installed dependencies? Many of them are several major versions behind.

@ray-gardener ray-gardener Bot added the docs and core labels May 19, 2026
@Aydin-ab

Aydin-ab commented May 19, 2026

Contributor Author

@dstrodtman Do you have a branch or PR where I can test against those upgrades? We're going to sync a subset of Ray examples from the templates repo (the in-progress _collections/), and I want to make sure that whatever .ipynb we send to the Ray docs doesn't break the makethedocs CI.

Also, I'm keeping this change non-invasive: it only targets the notebooks captured by ipython3_lexer_patterns.

@dstrodtman dstrodtman left a comment

Contributor

@Aydin-ab Aydin-ab force-pushed the docs/add-lexer-hook branch from 4a42ad3 to 98efa1b Compare May 19, 2026 22:28
@Aydin-ab Aydin-ab requested review from a team as code owners May 19, 2026 22:28

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 98efa1ba00774cbcd19bfcc3525294637ffa5b09.

Comment thread doc/source/data/examples/pytorch_resnet_batch_prediction.ipynb
@Aydin-ab Aydin-ab marked this pull request as draft May 19, 2026 22:42
@Aydin-ab Aydin-ab force-pushed the docs/add-lexer-hook branch from 1156fcb to d87a535 Compare May 19, 2026 22:57
Aydin-ab added 2 commits May 19, 2026 17:17
Sphinx falls back to the python3 lexer on .ipynb files that don't declare
language_info.pygments_lexer, but python3 can't tokenise !shell or %magic
cells. The warning is fatal under Readthedocs -W and breaks CI today.

Register a source-read handler that gates on configurable glob patterns
(ipython3_lexer_patterns / _exclude_patterns) and injects pygments_lexer
= "ipython3" into the notebook JSON before Sphinx parses it. Globs cover
the in-tree example layouts plus _collections/**/*.ipynb for notebooks
fetched at build time by sphinx_collections.

Revives PR ray-project#59984 (closed-stale 2026-02-07). Review feedback from that
round — early-return pattern, no broad except — is preserved.

Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Use ensure_ascii=False so re-serializing notebook JSON doesn't escape CJK
or emoji content just to set one metadata field. Skip the re-serialize
when language_info.pygments_lexer is already "ipython3".

Signed-off-by: Aydin Abiar <aydin@anyscale.com>
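
The effect of this follow-up commit can be illustrated with a small sketch. The function name `set_ipython3_lexer` is hypothetical and the code is not taken from the PR; it only shows the two behaviours the commit message names: returning the input untouched when the lexer is already `ipython3`, and serializing with `ensure_ascii=False` so non-ASCII text is not rewritten as `\uXXXX` escapes.

```python
import json


def set_ipython3_lexer(notebook_json):
    """Set pygments_lexer to ipython3, skipping notebooks that already have it."""
    notebook = json.loads(notebook_json)
    language_info = notebook.get("metadata", {}).get("language_info", {})
    if language_info.get("pygments_lexer") == "ipython3":
        # Nothing to change, so return the original text without re-serializing.
        return notebook_json
    notebook.setdefault("metadata", {}).setdefault("language_info", {})[
        "pygments_lexer"
    ] = "ipython3"
    # ensure_ascii=False writes CJK and emoji characters as-is.
    return json.dumps(notebook, ensure_ascii=False)


# Illustrative notebook with non-ASCII markdown content.
raw = json.dumps(
    {
        "cells": [{"cell_type": "markdown", "metadata": {}, "source": ["日本語 🚀"]}],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    },
    ensure_ascii=False,
)
patched = set_ipython3_lexer(raw)
```

With the default `ensure_ascii=True`, the same notebook would come back with every non-ASCII character escaped, which is a noisy change for a hook whose only job is to set one metadata field.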
@Aydin-ab Aydin-ab force-pushed the docs/add-lexer-hook branch from 78c5a55 to bab2889 Compare May 20, 2026 00:17
ray-core/examples/**/content/**.ipynb and train/examples/**/content/**.ipynb
match zero in-tree notebooks. Ray Core's example notebooks live flat at
ray-core/examples/<name>.ipynb, and Ray Train's anyscale templates use the
<framework>/<template>/README.ipynb layout without a content/ segment.

Keep _collections/**/*.ipynb (defensive for future templates added to
_TEMPLATE_COLLECTIONS) and the remaining four globs that each cover at
least one live template directory.

Signed-off-by: Aydin Abiar <aydin@anyscale.com>
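
A check along the following lines would surface globs that match nothing. It is only a sketch: the notebook paths are hypothetical stand-ins for the layouts described in the commit message, `unmatched_patterns` is an illustrative name, and `fnmatchcase` is assumed as the matcher (the hook's real matching logic may differ).

```python
from fnmatch import fnmatchcase


def unmatched_patterns(patterns, relpaths):
    """Return the patterns that match none of the given relative paths."""
    return [
        pattern
        for pattern in patterns
        if not any(fnmatchcase(path, pattern) for path in relpaths)
    ]


# Hypothetical paths mirroring the layouts named in the commit message.
notebooks = [
    "ray-core/examples/example_name.ipynb",
    "train/examples/framework/template/README.ipynb",
    "_collections/lexer-fixture/test.ipynb",
]
patterns = [
    "ray-core/examples/**/content/**.ipynb",
    "train/examples/**/content/**.ipynb",
    "_collections/**/*.ipynb",
]

dead = unmatched_patterns(patterns, notebooks)
```

Here `dead` holds the two `content/` globs, since none of the sample paths contains a `content/` segment, while `_collections/**/*.ipynb` still matches the fixture path.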
@Aydin-ab Aydin-ab marked this pull request as ready for review May 20, 2026 01:15
@Aydin-ab Aydin-ab added the go label May 20, 2026
@elliot-barn elliot-barn merged commit fa5037c into ray-project:master May 20, 2026
7 checks passed
@Aydin-ab Aydin-ab deleted the docs/add-lexer-hook branch May 20, 2026 19:27
Neelansh-Khare pushed a commit to Neelansh-Khare/ray-clone that referenced this pull request Jun 5, 2026
…ay-project#63515)

Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Signed-off-by: Neelansh Khare <kharen@uci.edu>
Labels

core: Issues that should be addressed in Ray Core
docs: An issue or change related to documentation
go: add ONLY when ready to merge, run all tests

3 participants