[doc] add ipython3 lexer hook for notebooks with shell/magic cells #63515
Code Review
This pull request introduces a mechanism to force the ipython3 pygments lexer on specific Jupyter notebooks during the Sphinx documentation build process. This addresses issues where notebooks without explicit language metadata default to the standard Python lexer, causing failures when encountering shell commands or magic functions. The implementation includes new configuration patterns and a hook to modify notebook metadata on the fly. Feedback suggests optimizing the notebook processing by skipping re-serialization if the lexer is already correctly set and using ensure_ascii=False to better support international characters.
@Aydin-ab We're actively updating all of the Sphinx dependencies, which will include notebook parsers. Just confirming that you ruled out the possibility of solving this with a standard upgrade to one of our installed dependencies? Many are many major versions behind.
@dstrodtman Do you have a branch or PR where I can test against those upgrades? The issue is we're going to sync a subset of ray examples from the templates repo (the in-progress Also I'm making this change non-invasive; it only targets the notebooks captured by
dstrodtman left a comment
Stamping; noting follow up convo in Slack: https://anyscale.enterprise.slack.com/archives/C02UGNK9080/p1779220770490799?thread_ts=1779200523.101949&cid=C02UGNK9080
4a42ad3 to 98efa1b
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 98efa1ba00774cbcd19bfcc3525294637ffa5b09.
1156fcb to d87a535
Sphinx falls back to the python3 lexer on .ipynb files that don't declare language_info.pygments_lexer, but python3 can't tokenise !shell or %magic cells. The warning is fatal under Readthedocs -W and breaks CI today. Register a source-read handler that gates on configurable glob patterns (ipython3_lexer_patterns / _exclude_patterns) and injects pygments_lexer = "ipython3" into the notebook JSON before Sphinx parses it. Globs cover the in-tree example layouts plus _collections/**/*.ipynb for notebooks fetched at build time by sphinx_collections. Revives PR ray-project#59984 (closed-stale 2026-02-07). Review feedback from that round — early-return pattern, no broad except — is preserved. Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Use ensure_ascii=False so re-serializing notebook JSON doesn't escape CJK or emoji content just to set one metadata field. Skip the re-serialize when language_info.pygments_lexer is already "ipython3". Signed-off-by: Aydin Abiar <aydin@anyscale.com>
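A small sketch of both behaviours from this commit, assuming the notebook is handled as a JSON string. The helper name `set_ipython3_lexer` is hypothetical and not taken from the PR.

```python
import json


def set_ipython3_lexer(notebook_text):
    """Return notebook JSON text with pygments_lexer set to "ipython3"."""
    notebook = json.loads(notebook_text)
    language_info = notebook.setdefault("metadata", {}).setdefault("language_info", {})
    if language_info.get("pygments_lexer") == "ipython3":
        # Already correct: hand back the original text, no re-serialization.
        return notebook_text
    language_info["pygments_lexer"] = "ipython3"
    # ensure_ascii=False keeps CJK and emoji as literal characters.
    return json.dumps(notebook, ensure_ascii=False)


raw = '{"cells": [{"source": "日本語 🎉"}], "metadata": {}}'
patched = set_ipython3_lexer(raw)

# With the default ensure_ascii=True the same content becomes \uXXXX escapes.
escaped = json.dumps(json.loads(raw))
```

Here `patched` still contains the literal `日本語`, while `escaped` carries sequences such as `\u65e5`. Both decode to the same data, so the flag only affects how the text looks after the round trip.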
78c5a55 to bab2889
ray-core/examples/**/content/**.ipynb and train/examples/**/content/**.ipynb match zero in-tree notebooks. Ray Core's example notebooks live flat at ray-core/examples/<name>.ipynb, and Ray Train's anyscale templates use the <framework>/<template>/README.ipynb layout without a content/ segment. Keep _collections/**/*.ipynb (defensive for future templates added to _TEMPLATE_COLLECTIONS) and the remaining four globs that each cover at least one live template directory. Signed-off-by: Aydin Abiar <aydin@anyscale.com>
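One way to check a claim like "matches zero in-tree notebooks" is to run candidate paths through the globs. The sketch below uses `fnmatch` as a stand-in matcher, which may differ from the one the hook uses. The example paths are hypothetical placeholders that follow the layouts described above, not real files.

```python
import fnmatch


def matching_patterns(path, patterns):
    """Return the subset of `patterns` that match `path`."""
    return [pattern for pattern in patterns if fnmatch.fnmatchcase(path, pattern)]


# The two globs this commit drops.
dropped = [
    "ray-core/examples/**/content/**.ipynb",
    "train/examples/**/content/**.ipynb",
]

# Hypothetical paths following the layouts described in the commit message.
flat_core_notebook = "ray-core/examples/some_example.ipynb"
train_template_notebook = "train/examples/some_framework/some_template/README.ipynb"
collections_notebook = "_collections/some_template/README.ipynb"

core_hits = matching_patterns(flat_core_notebook, dropped)
train_hits = matching_patterns(train_template_notebook, dropped)
collections_hits = matching_patterns(collections_notebook, ["_collections/**/*.ipynb"])
```

Neither path has a `content/` segment, so `core_hits` and `train_hits` come back empty under this matcher, while the `_collections` glob still matches its path.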
…ay-project#63515) ## Description Sphinx falls back to the `python3` lexer on `.ipynb` files that don't declare `language_info.pygments_lexer`, but `python3` can't tokenise `!shell` or `%magic` cells. The pygments warning is fatal under Readthedocs `-W` and breaks the docs build. This source-read hook injects `pygments_lexer = "ipython3"` into notebooks matching configurable glob patterns (`ipython3_lexer_patterns` / `_exclude_patterns`). Globs cover the in-tree example layouts plus `_collections/**/*.ipynb` for notebooks fetched at build time by `sphinx_collections`. Review feedback from the previous round on ray-project#59984 (early-return pattern, no broad `except`) is preserved. ## Related issues Related to ray-project#59984 (closed-stale 2026-02-07). ## Additional information Verified locally with `make html` against `upstream/master` plus this commit and a fixture notebook at `_collections/lexer-fixture/test.ipynb` containing a `!pip install` cell and no `language_info`: build succeeded, no `WARNING.*lexer` matches in the log, fixture confirmed source-read. Today most `_collections/` notebooks are excluded from the build via `exclude_patterns` in favour of their `.md` counterparts, so the new `_collections/**/*.ipynb` glob is mostly defensive. The exclude design is upstream of this PR. --------- Signed-off-by: Aydin Abiar <aydin@anyscale.com> Signed-off-by: Neelansh Khare <kharen@uci.edu>

Description
Sphinx falls back to the `python3` lexer on `.ipynb` files that don't declare `language_info.pygments_lexer`, but `python3` can't tokenise `!shell` or `%magic` cells. The pygments warning is fatal under Readthedocs `-W` and breaks the docs build.

This source-read hook injects `pygments_lexer = "ipython3"` into notebooks matching configurable glob patterns (`ipython3_lexer_patterns` / `_exclude_patterns`). Globs cover the in-tree example layouts plus `_collections/**/*.ipynb` for notebooks fetched at build time by `sphinx_collections`.

Review feedback from the previous round on #59984 (early-return pattern, no broad `except`) is preserved.

Related issues
Related to #59984 (closed-stale 2026-02-07).

Additional information
Verified locally with `make html` against `upstream/master` plus this commit and a fixture notebook at `_collections/lexer-fixture/test.ipynb` containing a `!pip install` cell and no `language_info`: build succeeded, no `WARNING.*lexer` matches in the log, fixture confirmed source-read.

Today most `_collections/` notebooks are excluded from the build via `exclude_patterns` in favour of their `.md` counterparts, so the new `_collections/**/*.ipynb` glob is mostly defensive. The exclude design is upstream of this PR.
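A rough sketch of how a comparable fixture and log check could be scripted. Only the fixture path and the `WARNING.*lexer` pattern come from the description. The package name in the cell, the notebook schema details (`nbformat_minor` 4) and both helper names are placeholders, and the fixture is written under a temporary directory instead of the docs tree.

```python
import json
import re
import tempfile
from pathlib import Path


def write_fixture_notebook(path):
    """Write a minimal notebook with one shell cell and no language_info."""
    notebook = {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                "source": ["!pip install some-package"],  # placeholder package name
            }
        ],
        "metadata": {},  # deliberately no language_info
        "nbformat": 4,
        "nbformat_minor": 4,
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(notebook, indent=1), encoding="utf-8")


def lexer_warnings(log_text):
    """Return log lines matching the WARNING.*lexer pattern from the description."""
    return [line for line in log_text.splitlines() if re.search(r"WARNING.*lexer", line)]


with tempfile.TemporaryDirectory() as tmp:
    fixture = Path(tmp) / "_collections" / "lexer-fixture" / "test.ipynb"
    write_fixture_notebook(fixture)
    written = json.loads(fixture.read_text(encoding="utf-8"))
```

After a build, passing the captured log text to `lexer_warnings` and getting an empty list corresponds to the "no `WARNING.*lexer` matches" check reported above.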