Problem
The user manual currently follows a "no alt text" convention (DOCUMENTATION-STYLE-GUIDE.md: ). This has two drawbacks as AI-driven documentation workflows mature:
- Accessibility: screen reader users get no description beyond the auto-generated Figure X.Y caption.
- AI maintenance metadata is missing: the docs-screenshot-capturer, docs-update-writer, and docs-update-reviewer agents have no machine-readable description of what each PNG is supposed to show. File names like session_type_batch.png are ambiguous, so agents cannot reliably decide whether to reuse, recapture, or replace an image.
The capture agent already knows the intent of every screenshot it takes (which wizard step, which toggle is on, which action result is being shown). That ground-truth intent should be recorded as alt at capture time — not reverse-engineered from the rendered pixels later.
Constraint: figcaption / alt conflict
markdown-processor.ts currently renders <figcaption>Figure X.Y — {alt}</figcaption> whenever alt is non-empty. Filling alt with a 1–2 sentence intent description would make every visible caption noisy. The renderer must be updated so alt (rich, screen-reader + AI metadata) and figcaption (short visible label) are independent.
Scope
Renderer changes (packages/backend.ai-docs-toolkit)
- markdown-processor.ts + markdown-processor-web.ts: default figcaption to Figure X.Y only. An optional visible caption uses the existing markdown title syntax; titleAttr is already parsed by the renderer, so promote it to the caption source.
- alt is always emitted as the <img alt="..."> attribute regardless of caption presence.
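A minimal sketch of the decoupled rendering described above, not the actual markdown-processor.ts code. The ImageRef shape, renderFigure, and escapeHtml are hypothetical names introduced for illustration; only the alt/title split and the "Figure X.Y" caption formats come from this issue.

```typescript
// Hypothetical shape: the real renderer's types are not shown in this issue.
interface ImageRef {
  src: string;
  alt: string; // rich intent description (screen readers + agents)
  title?: string; // optional short visible label from the markdown title
}

// Separator used in this issue's caption examples (an em dash with spaces).
const CAPTION_SEPARATOR = " \u2014 ";

// Escape text before placing it in HTML attributes or element content.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// alt always goes to <img alt>; the caption depends only on the title.
function renderFigure(image: ImageRef, figureNumber: string): string {
  const label = image.title?.trim();
  const caption = label
    ? `Figure ${figureNumber}${CAPTION_SEPARATOR}${escapeHtml(label)}`
    : `Figure ${figureNumber}`;
  return [
    "<figure>",
    `  <img src="${escapeHtml(image.src)}" alt="${escapeHtml(image.alt)}">`,
    `  <figcaption>${caption}</figcaption>`,
    "</figure>",
  ].join("\n");
}
```

Because the caption is derived only from the title, a long alt never leaks into the visible label, and an image without a title still carries its alt.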
Style/convention docs (packages/backend.ai-webui-docs)
- DOCUMENTATION-STYLE-GUIDE.md: replace the "No alt text (current convention)" rule with "alt text required (intent-based, 1–2 sentences)" plus authoring guidance: state what is on screen plus the step/state context the agent knows from the capture flow; do not describe user actions.
- SCREENSHOT-GUIDELINES.md: document the alt authoring rule and that the capture agent must emit alt at capture time (intent → alt, NOT image → inferred description).
- TRANSLATION-GUIDE.md: declare alt as a translatable string. English first, then ko / ja / th synchronized by the translation pipeline.
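The "alt text required" rule could be checked mechanically. The sketch below is an assumption, not part of the existing toolkit: findImageReferences and findMissingAlt are hypothetical helpers, and the regular expression covers only the common inline form `![alt](src "title")`, not every image syntax CommonMark allows (no nested brackets, angle-bracket destinations, or single-quoted titles).

```typescript
interface ImageReference {
  line: number; // 1-based line number in the markdown source
  alt: string;
  src: string;
  title?: string;
}

// Simplified pattern for inline images: ![alt](src "optional title")
const IMAGE_PATTERN = /!\[([^\]]*)\]\(\s*(\S+?)(?:\s+"([^"]*)")?\s*\)/;

// Collect every inline image reference with its alt, src, and title.
function findImageReferences(markdown: string): ImageReference[] {
  const refs: ImageReference[] = [];
  markdown.split("\n").forEach((text, index) => {
    // Fresh global regex per line so lastIndex never carries over.
    const pattern = new RegExp(IMAGE_PATTERN.source, "g");
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
      refs.push({
        line: index + 1,
        alt: (match[1] ?? "").trim(),
        src: match[2] ?? "",
        title: match[3],
      });
    }
  });
  return refs;
}

// Image references that violate the proposed "alt required" rule.
function findMissingAlt(markdown: string): ImageReference[] {
  return findImageReferences(markdown).filter((ref) => ref.alt.length === 0);
}
```

A check like this only confirms that alt is present. Whether the text reflects capture-time intent still needs the reviewer agent or a human.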
Agent updates (.claude/agents/)
- docs-screenshot-capturer.md: add an explicit workflow step: before calling browser_take_screenshot, record the captured screen state (page, wizard step, active filters, action just performed) and write it as the alt text in the corresponding markdown reference. Capture the four language versions sequentially using the same intent string, then translate (or hand off to the translator).
- docs-update-writer.md: when authoring or refreshing a docs page, fill alt for any new image reference based on the intent it represents.
- docs-update-reviewer.md: add a review rule: any PR that changes an image file must also revise the corresponding alt if the intent changed; alt-only changes are valid when the intent description needs sharpening.
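One possible way to turn capture-time state into an alt string is sketched below. Everything here is hypothetical: the CaptureIntent fields mirror the state listed above (page, wizard step, active filters, result of the action just performed), but the field names, the sentence template, and the helper names are assumptions, not an existing agent API.

```typescript
// Hypothetical record of what the capture agent knows before the screenshot.
interface CaptureIntent {
  page: string;
  wizardStep?: string;
  activeFilters?: string[];
  resultOnScreen?: string; // visible outcome of the action just performed
}

// Compose a 1-2 sentence alt: where we are, then what state is visible.
function buildAltText(intent: CaptureIntent): string {
  const location = intent.wizardStep
    ? `${intent.page}, ${intent.wizardStep} step`
    : intent.page;
  const details: string[] = [];
  if (intent.activeFilters && intent.activeFilters.length > 0) {
    details.push(`Active filters: ${intent.activeFilters.join(", ")}`);
  }
  if (intent.resultOnScreen) {
    // Drop trailing periods so the template adds exactly one.
    details.push(intent.resultOnScreen.trim().replace(/\.+$/, ""));
  }
  return details.length > 0
    ? `${location}. ${details.join("; ")}.`
    : `${location}.`;
}

// Emit the markdown reference with alt filled from intent, not from pixels.
function toImageReference(intent: CaptureIntent, src: string): string {
  const alt = buildAltText(intent)
    .replace(/\s+/g, " ") // alt must stay on one line
    .replace(/[\[\]]/g, "\\$&"); // escape brackets that would end the alt
  return `![${alt}](${src})`;
}
```

Keeping the intent as structured data until the last step means the same record can feed all four language captures and be handed to the translator unchanged.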
Out of scope
- No bulk backfill of existing image alts. Adoption is incremental: alt is filled when an image is recaptured or its referencing page is edited. A separate maintenance issue can drive a one-shot backfill once the policy is live.
Dependencies
- Builds on the renderer/style work in the sibling issue (matte + auto size cap + capture guidance). The two issues touch overlapping files (markdown-processor*.ts, SCREENSHOT-GUIDELINES.md, docs-screenshot-capturer.md), so this issue is sequenced after the sibling lands to avoid merge conflicts.
Verification
- Build a sample docs page that uses both the no-caption form and the explicit-caption form. Confirm rendered HTML has <img alt="Rich alt text describing intent"> in both cases and figcaption text reflects "Figure X.Y" vs "Figure X.Y — Short label" respectively.
- Run the capture agent on a single page and verify the produced markdown references include alt populated from capture-time intent, not from after-the-fact inspection of the PNG.
- bash scripts/verify.sh passes.
JIRA Issue: FR-2908