Commit 8e6d5a4

Merge pull request #40 from snap-cloud/cycomachead/ai/9/1
Iterate on LaTeX PDF: Fix index duplication, code formatting, emoji rendering, use lualatex, and add metadata
2 parents b91c268 + 9305fbd commit 8e6d5a4

8 files changed

Lines changed: 232 additions & 36 deletions

.github/workflows/myst.yml

Lines changed: 16 additions & 11 deletions
@@ -30,17 +30,22 @@ jobs:
3030
- name: Print MyST version
3131
run: myst --version
3232

33-
- name: Install LaTeX
34-
run: |
35-
sudo apt-get update
36-
sudo apt-get install -y --no-install-recommends \
37-
latexmk \
38-
texlive-latex-recommended \
39-
texlive-latex-extra \
40-
texlive-fonts-recommended \
41-
texlive-fonts-extra \
42-
texlive-luatex \
43-
texlive-xetex
33+
# The PDF builds compile under LuaLaTeX and rely on
34+
# \DocumentMetadata's tagged-PDF hooks, which require TeX Live
35+
# 2024+. Ubuntu's apt packages still ship 2023, so install
36+
# TeX Live 2025 from upstream via teatimeguest's action.
37+
- name: Install TeX Live 2025
38+
uses: teatimeguest/setup-texlive-action@v3
39+
with:
40+
version: 2025
41+
packages: >-
42+
scheme-basic latexmk
43+
collection-latexrecommended collection-latexextra
44+
collection-fontsrecommended collection-fontsextra
45+
collection-luatex collection-xetex
46+
imakeidx fontawesome5 sourceserifpro koma-script
47+
iftex hyperref tikz textpos titlesec multicol
48+
fontspec luaotfload
4449
4550
# Run npm build to compile the custom SASS
4651
- name: Build HTML

_latex-template/template.tex

Lines changed: 51 additions & 4 deletions
@@ -2,6 +2,19 @@
22
% Vendored from https://github.com/myst-templates/plain_latex_book
33
% See README.md in this directory for the list of local adaptations.
44

5+
% Declare the PDF version and document language up front so the LaTeX
6+
% kernel (TeX Live 2022+) can emit XMP metadata, structure roots, and
7+
% the /Lang attribute that screen readers rely on. We deliberately
8+
% don't enable the LaTeX team's experimental `testphase` keys here:
9+
% in TeX Live 2023/2024 they trigger spurious "tag/tool/<x>" key
10+
% errors when chapter titles contain commas (e.g. "Blocks, Scripts,
11+
% and Sprites"). The lang + pdfversion declarations alone already
12+
% improve assistive-tech behaviour without breaking the build.
13+
\DocumentMetadata{
14+
pdfversion = 2.0,
15+
lang = en-US,
16+
}
17+
518
% Body font size 11pt; TOC and footnotes drop to 10pt below.
619
\documentclass[11pt,oneside]{scrbook}
720
\usepackage[paperheight=11in,
@@ -27,15 +40,29 @@
2740
\usepackage[noautomatic]{imakeidx}
2841
\makeindex[intoc, columns=2, options={-s index-style.ist}]
2942

30-
\usepackage[T1]{fontenc}
31-
\usepackage[utf8]{inputenc}
43+
% Engine-aware font setup. We build with LuaLaTeX, which is native
44+
% UTF-8 and uses fontspec; pdflatex still works as a fallback (used
45+
% in some local debug runs) and needs the legacy fontenc/inputenc
46+
% pair plus T1 encoding.
47+
\usepackage{iftex}
48+
\ifPDFTeX
49+
\usepackage[T1]{fontenc}
50+
\usepackage[utf8]{inputenc}
51+
\fi
3252
% Body font: Adobe Source Serif (the texlive `sourceserifpro` package
3353
% ships Source Serif Pro, the same family as Source Serif 4). Source
3454
% Serif is designed for screen + body legibility and reads well at
3555
% small sizes. Latin Modern stays loaded as the typewriter fallback
3656
% since Source Serif doesn't ship a monospaced family.
3757
\usepackage[default]{sourceserifpro}
3858
\usepackage{lmodern}
59+
% \snaplightning renders the lightning-bolt symbol used as a marker
60+
% for "compiled" / experimental Snap! blocks. Source Serif (and most
61+
% of the body fonts that ship with TeX Live) have no glyph for ⚡, so
62+
% the latex-shims plugin substitutes \snaplightning{} for ⚡ in the
63+
% rendered text and we draw a fontawesome bolt icon here.
64+
\usepackage{fontawesome5}
65+
\newcommand{\snaplightning}{\faBolt}
3966
% Slightly relaxed line spacing (1.07x) — a touch easier on the eyes
4067
% than the default 1.0 without adding a meaningful number of pages.
4168
\linespread{1.07}
@@ -50,6 +77,9 @@
5077
\usepackage{framed}
5178
\usepackage{hyperref}
5279
\usepackage{amssymb}
80+
% myst-to-tex emits \uline{...} for underlined text; ulem provides it
81+
% (and the [normalem] option keeps \emph behaving the LaTeX-standard way).
82+
\usepackage[normalem]{ulem}
5383
\usepackage{ifthen}
5484
\usepackage{calc}
5585
\usepackage{tikz}
@@ -201,7 +231,19 @@
201231
colorlinks,
202232
linkcolor={black},
203233
citecolor={black},
204-
urlcolor={black}
234+
urlcolor={black},
235+
% Accessibility / discoverability metadata. pdfdisplaydoctitle makes
236+
% readers show the document title rather than the file name; pdflang
237+
% announces the document language to assistive tech.
238+
unicode=true,
239+
pdftitle={[-doc.title-]},
240+
pdfauthor={[# for author in doc.authors #][-author.name-][# if not loop.last #], [# endif #][# endfor #]},
241+
pdfsubject={Reference manual for the Snap! programming language},
242+
pdfkeywords={Snap!, programming languages, blocks-based programming, visual programming, computer science education},
243+
pdflang={en-US},
244+
pdfdisplaydoctitle=true,
245+
bookmarksnumbered=true,
246+
bookmarksopen=true,
205247
}
206248

207249
% Style quotes
@@ -306,6 +348,11 @@
306348
\bibliography{[- doc.bibliography | join(", ") -]}
307349
[# endif #]
308350

309-
% \printindex is emitted by `manual-index.md`, the last chapter in the toc.
351+
% The index is emitted directly here rather than via a "manual-index.md"
352+
% chapter so we don't end up with both myst's `show-index` definition
353+
% list and \printindex's makeindex output stacked back-to-back. The
354+
% `intoc` option to \makeindex (set above) adds the "Index" heading to
355+
% the TOC.
356+
\printindex
310357

311358
\end{document}

_latex-template/template.yml

Lines changed: 8 additions & 0 deletions
@@ -12,6 +12,11 @@ authors:
1212
website: https://mball.co
1313
tags:
1414
- book
15+
build:
16+
# Compile under LuaLaTeX. Required by \DocumentMetadata's tagged-PDF
17+
# hooks and by fontawesome5's modern font-loading path. MyST's
18+
# default is xelatex; this flag is forwarded to latexmk verbatim.
19+
engine: -lualatex
1520
parts:
1621
- id: abstract
1722
required: false
@@ -38,10 +43,12 @@ files:
3843
- snap-logo.png
3944
packages:
4045
- imakeidx
46+
- iftex
4147
- fontenc
4248
- inputenc
4349
- lmodern
4450
- sourceserifpro
51+
- fontawesome5
4552
- graphicx
4653
- caption
4754
- natbib
@@ -50,6 +57,7 @@ packages:
5057
- framed
5158
- hyperref
5259
- amssymb
60+
- ulem
5361
- enumitem
5462
- geometry
5563
- ifthen

_support/plugins/latex-shims.mjs

Lines changed: 121 additions & 2 deletions
@@ -116,6 +116,103 @@ function walkImages(root, fn) {
116116
if (root.children) walkImages(root.children, fn);
117117
}
118118

119+
// Apply `fn` to every node in the tree (mutating in place). Unlike
120+
// walkImages this visits every node type, not just images.
121+
function walkAll(node, fn) {
122+
if (!node) return;
123+
if (Array.isArray(node)) {
124+
node.forEach((c) => walkAll(c, fn));
125+
return;
126+
}
127+
fn(node);
128+
if (Array.isArray(node.children)) walkAll(node.children, fn);
129+
}
130+
131+
// makeindex treats `! @ " |` as control characters; literal occurrences
132+
// inside an \index{...} argument must be prefixed with `"` (the default
133+
// quote char) so makeindex doesn't try to split them into sub-entries
134+
// or alternate-rendering markers.
135+
function quoteForMakeindex(s) {
136+
return s.replace(/(["@!|])/g, '"$1');
137+
}
138+
139+
// Convert a single index entry string so that runs of `code` render as
140+
// \texttt{...} in the printed index, while still sorting on the plain
141+
// text. We rewrite the string into makeindex's `sort@display` form,
142+
// which tells makeindex to alphabetize on the part before the `@` but
143+
// typeset the part after it. Without this, leading backticks would
144+
// sort the entry under "Symbols" and render as curly quotes.
145+
//
146+
// `⚡` (and the variation-selector form `⚡️`) is similarly recoded so
147+
// it sorts under "lightning bolt" and is typeset via \snaplightning,
148+
// which is defined in the preamble (the body font has no glyph for
149+
// ⚡, so passing it through verbatim renders as a missing-glyph box).
150+
151+
function rewriteIndexEntry(value) {
152+
if (typeof value !== 'string') return value;
153+
// Trailing backslashes on index entries (sometimes left over from
154+
// markdown line-continuation syntax in the source) would escape the
155+
// closing brace of \index{...} when written to LaTeX. Strip them
156+
// unconditionally — there's never a legitimate use for them inside
157+
// an index term.
158+
value = value.replace(/\\+\s*$/, '').trim();
159+
const hasCode = value.includes('`');
160+
const hasBolt = /\u26A1/.test(value);
161+
if (!hasCode && !hasBolt) return value;
162+
// Display: `set` -> \texttt{set}; ⚡ (with optional VS-16) -> \snaplightning{}.
163+
// # / % / & are parameter / comment / tab-alignment characters in LaTeX
164+
// and would break the .ind file makeindex emits if left bare inside the
165+
// \texttt{...} group. _ is already typically escaped by myst upstream
166+
// but we escape any still-unescaped occurrence defensively.
167+
const display = quoteForMakeindex(
168+
value
169+
.replace(/`([^`]+)`/g, (_m, code) =>
170+
`\\texttt{${code.replace(/(?<!\\)([#%&_])/g, '\\$1')}}`,
171+
)
172+
.replace(/\u26A1\uFE0F?/g, '\\snaplightning{}'),
173+
);
174+
// Sort key: drop the formatting markers entirely, collapse the
175+
// resulting whitespace, and substitute "lightning bolt" for ⚡ so
176+
// the entry alphabetizes near "L" rather than the symbol section.
177+
const sort = quoteForMakeindex(
178+
value
179+
.replace(/`/g, '')
180+
.replace(/\u26A1\uFE0F?\s*/g, 'lightning bolt ')
181+
.replace(/\s+/g, ' ')
182+
.trim(),
183+
);
184+
return `${sort}@${display}`;
185+
}
186+
187+
// Replace ⚡ (with optional VS-16) inside a text node with raw TeX
188+
// pointing at \snaplightning. Returns null when the node is not a
189+
// text node or contains no bolt; otherwise returns an array of text
190+
// nodes interleaved with raw-TeX nodes, one per bolt.
191+
function expandLightningInTextNode(node) {
192+
if (node.type !== 'text' || typeof node.value !== 'string') return null;
193+
if (!/\u26A1/.test(node.value)) return null;
194+
const parts = node.value.split(/\u26A1\uFE0F?/);
195+
const out = [];
196+
parts.forEach((part, idx) => {
197+
if (part) out.push({ type: 'text', value: part });
198+
if (idx < parts.length - 1) {
199+
out.push({ type: 'raw', lang: 'tex', tex: '\\snaplightning{}' });
200+
}
201+
});
202+
return out;
203+
}
204+
205+
function rewriteLightningInChildren(parent) {
206+
if (!Array.isArray(parent.children)) return;
207+
for (let i = 0; i < parent.children.length; i++) {
208+
const replacement = expandLightningInTextNode(parent.children[i]);
209+
if (replacement) {
210+
parent.children.splice(i, 1, ...replacement);
211+
i += replacement.length - 1;
212+
}
213+
}
214+
}
215+
119216
const latexShimsTransform = {
120217
name: 'latex-shims',
121218
stage: 'document',
@@ -184,7 +281,29 @@ const latexShimsTransform = {
184281
gridNode.children = newChildren;
185282
});
186283

187-
// 3. Image sizing.
284+
// 3. Index entries: rewrite `code` and ⚡ inside index entries into
285+
// a makeindex sort@display string so the printed index uses
286+
// \texttt{...} / \snaplightning instead of literal backticks
287+
// or missing-glyph boxes (and so the entries sort properly).
288+
walkAll(tree, (node) => {
289+
if (!Array.isArray(node.indexEntries)) return;
290+
node.indexEntries.forEach((ie) => {
291+
if (typeof ie?.entry === 'string') {
292+
ie.entry = rewriteIndexEntry(ie.entry);
293+
}
294+
if (ie?.subEntry && typeof ie.subEntry.value === 'string') {
295+
ie.subEntry.value = rewriteIndexEntry(ie.subEntry.value);
296+
}
297+
});
298+
});
299+
300+
// 4. Lightning bolt in body text. Source Serif Pro has no glyph
301+
// for ⚡, so we splice in a \snaplightning{} raw-TeX node
302+
// wherever the emoji appears. Index-entry strings were already
303+
// handled above.
304+
walkAll(tree, (node) => rewriteLightningInChildren(node));
305+
306+
// 5. Image sizing.
188307
// Inline images get a sentinel width that the custom \includegraphics
189308
// redefinition in the preamble decodes back into a height-based,
190309
// raisebox'd \includegraphics. Block images that have no explicit
@@ -207,6 +326,6 @@ const latexShimsTransform = {
207326
};
208327

209328
export default {
210-
name: 'LaTeX shims (kbd, grid, image)',
329+
name: 'LaTeX shims (kbd, grid, image, index, lightning)',
211330
transforms: [latexShimsTransform],
212331
};
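The `sort@display` rewriting described in the comments above can be tried in isolation. The following is a minimal sketch, not the plugin's actual code path: `toSortDisplay` is a hypothetical helper name, it handles only backtick code spans (not the lightning bolt), and it assumes makeindex's default quote character `"`.

```javascript
// Hypothetical standalone sketch of makeindex's `sort@display` form.

// Prefix makeindex's control characters (! @ " |) with the default
// quote character `"` so they are treated as literals.
function quoteForMakeindex(s) {
  return s.replace(/(["@!|])/g, '"$1');
}

// Build a `sort@display` entry: backtick code spans become \texttt{...}
// in the display half and are dropped from the sort half.
function toSortDisplay(entry) {
  if (!entry.includes('`')) return entry;
  const display = quoteForMakeindex(
    entry.replace(/`([^`]+)`/g, (_m, code) =>
      // Escape LaTeX specials unless they are already backslash-escaped.
      `\\texttt{${code.replace(/(?<!\\)([#%&_])/g, '\\$1')}}`),
  );
  const sort = quoteForMakeindex(
    entry.replace(/`/g, '').replace(/\s+/g, ' ').trim(),
  );
  return `${sort}@${display}`;
}

console.log(toSortDisplay('`set` block')); // set block@\texttt{set} block
console.log(toSortDisplay('`a_b`'));       // a_b@\texttt{a\_b}
console.log(toSortDisplay('plain term'));  // plain term
```

Entries without backticks pass through unchanged, so makeindex sorts and typesets them from the same string, as it would without the rewrite.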

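The body-text lightning-bolt substitution can be sketched the same way. This example assumes the bolt is matched as U+26A1 followed by an optional variation selector U+FE0F; `splitOnBolt` is a hypothetical name, and the raw-node shape (`type`, `lang`, `tex`) simply mirrors the one used in the diff above rather than a documented MyST schema.

```javascript
// Hypothetical sketch: split a string on the lightning bolt and
// interleave raw-TeX placeholder nodes between the text segments.
const BOLT = /\u26A1\uFE0F?/; // U+26A1 with optional VS-16 (U+FE0F)

function splitOnBolt(value) {
  const parts = value.split(BOLT);
  const out = [];
  parts.forEach((part, idx) => {
    // Skip empty segments (bolt at the very start or end of the string).
    if (part) out.push({ type: 'text', value: part });
    // One raw-TeX node per bolt, i.e. between consecutive segments.
    if (idx < parts.length - 1) {
      out.push({ type: 'raw', lang: 'tex', tex: '\\snaplightning{}' });
    }
  });
  return out;
}

const nodes = splitOnBolt('fast \u26A1\uFE0F mode');
console.log(nodes.map((n) => n.type).join(',')); // text,raw,text
```

A string consisting of only the bolt yields a single raw node, which is why the caller has to splice an array into the parent's children rather than swap one node for another.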
_support/scripts/generate-pdf-exports.py

Lines changed: 11 additions & 6 deletions
@@ -34,6 +34,9 @@
3434
MYST_YML = ROOT / "myst.yml"
3535

3636
BLOCKS_TITLE = "Blocks"
37+
# Filtered from the toc when present: the LaTeX template now emits
38+
# \printindex itself, so listing this file alongside \printindex would
39+
# render the index twice.
3740
INDEX_FILE = "manual-index.md"
3841

3942
BEGIN_MARK = " # === BEGIN GENERATED PDF VARIANTS ==="
@@ -80,13 +83,15 @@ def main() -> int:
8083
print(f"error: could not find '{BLOCKS_TITLE}' branch in {TOC}", file=sys.stderr)
8184
return 1
8285

83-
# No-blocks variant: everything except the Blocks branch, plus the index.
84-
no_blocks = flatten(other_entries) + [{"file": INDEX_FILE}]
86+
# No-blocks variant: everything except the Blocks branch. The LaTeX
87+
# template's \printindex emits the index itself, so manual-index.md is
88+
# not appended here.
89+
no_blocks = flatten(other_entries)
8590

86-
# Blocks-only variant: just the Blocks branch (rooted at level 0), plus
87-
# the index. We expose the Blocks subtree's *children* directly so the
88-
# palette parts (Motion Blocks, Looks Blocks, …) sit at the top level.
89-
blocks_only = flatten(blocks_entry.get("children", [])) + [{"file": INDEX_FILE}]
91+
# Blocks-only variant: just the Blocks branch (rooted at level 0). We
92+
# expose the Blocks subtree's *children* directly so the palette parts
93+
# (Motion Blocks, Looks Blocks, …) sit at the top level.
94+
blocks_only = flatten(blocks_entry.get("children", []))
9095

9196
block = render_variants(no_blocks, blocks_only)
9297
if not patch_myst_yml(block):

docs/latex.md

Lines changed: 21 additions & 10 deletions
@@ -84,19 +84,31 @@ contains:
8484

8585
Snap!-specific adaptations on top of upstream `plain_latex_book`:
8686

87-
- KOMA-Script `scrbook` document class at 12pt, oneside.
87+
- KOMA-Script `scrbook` document class at 11pt, oneside.
8888
- US Letter page geometry (8.5&times;11in) with tighter Snap-style margins.
89+
- Compiled under **LuaLaTeX** (set via `template.yml`'s `build.engine`),
90+
which is what `\DocumentMetadata`'s tagged-PDF hooks need. The
91+
template guards `fontenc`/`inputenc` with `\ifPDFTeX` so a pdflatex
92+
fallback still works for local debugging.
93+
- `\DocumentMetadata{...}` declares the PDF version and document language
94+
for assistive tech; the LaTeX team's experimental `testphase` tagging
95+
keys are deliberately left off. Requires TeX Live 2024+.
8996
- `imakeidx` is loaded with the `noautomatic` option (which disables its
9097
shell-escape `makeindex` run) and we issue `\makeindex[options=-s
9198
index-style.ist]` ourselves. That delegates the actual `makeindex`
9299
invocation to `latexmk`, which we configure via `latexmkrc` to apply
93-
our [`index-style.ist`][ist] style file. `\printindex` is rendered as
94-
part of [`manual-index.md`][mi], the last chapter in the toc.
100+
our [`index-style.ist`][ist] style file. `\printindex` is then emitted
101+
by the template itself (just before `\end{document}`); the legacy
102+
[`manual-index.md`][mi] remains in the HTML toc but is excluded from
103+
PDF builds so the index doesn't render twice.
95104

96105
[ist]: ../_latex-template/index-style.ist
97106
[mi]: ../manual-index.md
98107
- Snap! brand colors (`snapblue`, `snaporange`) defined for use in custom
99108
LaTeX content.
109+
- `\snaplightning` macro (driven by `fontawesome5`'s `\faBolt`) used by
110+
the `latex-shims` MyST plugin to swap in a renderable lightning bolt
111+
for the `⚡` emoji, which has no glyph in the body font.
100112

101113
These adaptations were carried over from the legacy Quarto preamble at
102114
`_support/tex/latex-preamble.tex` and the PDF section of
@@ -150,19 +162,18 @@ Outputs are placed at:
150162
- **Missing LaTeX packages.** The packages listed in
151163
[`_latex-template/template.yml`](../_latex-template/template.yml) under
152164
`packages:` are the ones MyST will warn about if missing. Install them via
153-
`tlmgr` (TeX Live) or your distro package manager.
165+
`tlmgr` (TeX Live) or your distro package manager. TeX Live 2024 or
166+
newer is required for `\DocumentMetadata`.
154167
- **Index entries not appearing.** The index is generated by `latexmk`
155168
invoking `makeindex -s index-style.ist`, configured via the bundled
156169
`latexmkrc`. If `\printindex` shows up empty in the PDF, confirm
157170
`latexmkrc` is in the temp build directory and that `imakeidx` is
158171
loaded with the `noautomatic` option (otherwise its shell-escape
159172
`makeindex` runs and races with latexmk's).
160-
- **Fonts or characters look wrong.** MyST drives the PDF build with
161-
`xelatex` by default, which is what the GitHub Actions workflow installs.
162-
If you want to switch to `lualatex` locally, set `$pdf_mode = 4` in a
163-
`.latexmkrc` and override MyST's engine (currently not configurable from
164-
`myst.yml`, so this requires manual `latexmk` invocations on the
165-
generated `.tex`).
173+
- **Engine override.** The template selects LuaLaTeX via
174+
`template.yml`'s `build.engine: -lualatex`, which MyST forwards to
175+
`latexmk`. Edit that field if you need to drop back to xelatex /
176+
pdflatex while debugging.
166177

167178
## CI
168179
