docs: introduce intent-based alt text for screenshots and separate figcaption #7452

@yomybaby

Description

Problem

The user manual currently follows a "no alt text" convention (DOCUMENTATION-STYLE-GUIDE.md: ![](images/file.png)). This has two drawbacks as AI-driven documentation workflows mature:

  1. Accessibility — screen reader users get no description beyond the auto-generated Figure X.Y caption.
  2. AI maintenance metadata is missing — the docs-screenshot-capturer, docs-update-writer, and docs-update-reviewer agents have no machine-readable description of what each PNG is supposed to show. File names like session_type_batch.png are ambiguous; agents cannot reliably decide whether to reuse, recapture, or replace an image.

The capture agent already knows the intent of every screenshot it takes (which wizard step, which toggle is on, which action result is being shown). That ground-truth intent should be recorded as alt at capture time — not reverse-engineered from the rendered pixels later.

Constraint: figcaption / alt conflict

markdown-processor.ts currently renders <figcaption>Figure X.Y — {alt}</figcaption> whenever alt is non-empty. Filling alt with a 1–2 sentence intent description would make every visible caption noisy. The renderer must be updated so alt (rich, screen-reader + AI metadata) and figcaption (short visible label) are independent.

Scope

Renderer changes (packages/backend.ai-docs-toolkit)

  • markdown-processor.ts + markdown-processor-web.ts: default figcaption to Figure X.Y only. An optional visible caption uses the existing markdown title syntax ![alt](url "caption"). titleAttr is already parsed by the renderer; promote it to the caption source.
  • alt is always emitted as the <img alt="..."> attribute regardless of caption presence.
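The intended split between alt and figcaption can be sketched as follows. This is a minimal illustration, not the toolkit's actual code: the `FigureInput` shape, `renderFigure`, `escapeHtml`, and the `figureNumber` parameter are all hypothetical names, and the real renderer may build its HTML differently.

```typescript
// Hypothetical input shape; the real renderer's token fields may differ.
interface FigureInput {
  src: string;
  alt: string; // rich intent description (screen readers + agents)
  title?: string; // markdown title: ![alt](src "title")
  figureNumber: string; // e.g. "3.2", assumed to be computed elsewhere
}

// Escape text so it is safe inside HTML attributes and element content.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// alt always goes to <img alt>; only the markdown title feeds the caption.
function renderFigure(input: FigureInput): string {
  const label = `Figure ${input.figureNumber}`;
  const visible = (input.title ?? "").trim();
  // "\u2014" is the dash separator used in the existing caption format.
  const caption = visible ? `${label} \u2014 ${escapeHtml(visible)}` : label;
  const img = `<img src="${escapeHtml(input.src)}" alt="${escapeHtml(input.alt)}">`;
  return `<figure>${img}<figcaption>${caption}</figcaption></figure>`;
}
```

Under these assumptions, a reference with no title yields a caption of only "Figure X.Y", while the alt attribute is emitted unchanged in both forms.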

Style/convention docs (packages/backend.ai-webui-docs)

  • DOCUMENTATION-STYLE-GUIDE.md: replace the "No alt text (current convention)" rule with "alt text required (intent-based, 1–2 sentences)" plus authoring guidance — state what is on screen + the step/state context the agent knows from the capture flow; do not describe user actions.
  • SCREENSHOT-GUIDELINES.md: document the alt authoring rule and that the capture agent must emit alt at capture time (intent → alt, NOT image → inferred description).
  • TRANSLATION-GUIDE.md: declare alt as a translatable string. English first, then ko / ja / th synchronized by the translation pipeline.

Agent updates (.claude/agents/)

  • docs-screenshot-capturer.md: add an explicit workflow step — before calling browser_take_screenshot, record the captured screen state (page, wizard step, active filters, action just performed) and write it as the alt text in the corresponding markdown reference. Capture the four language versions sequentially using the same intent string, then translate (or hand off to the translator).
  • docs-update-writer.md: when authoring or refreshing a docs page, fill alt for any new image reference based on the intent it represents.
  • docs-update-reviewer.md: add a review rule — any PR that changes an image file must also revise the corresponding alt if the intent changed; alt-only changes are valid when the intent description needs sharpening.
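A review or lint step could flag image references that still use the empty-alt form. The sketch below is one possible approach, not part of the existing agents: `extractImageRefs` and `findMissingAlt` are hypothetical helpers, and the regular expression only covers simple inline images (no nested brackets in alt, no reference-style images, no awareness of code fences).

```typescript
// Hypothetical record for one inline markdown image reference.
interface ImageRef {
  alt: string;
  url: string;
  title: string | undefined; // optional markdown title ("caption")
}

// Collect simple inline images of the form ![alt](url) or ![alt](url "title").
function extractImageRefs(markdown: string): ImageRef[] {
  const pattern = /!\[([^\]]*)\]\(\s*(\S+?)(?:\s+"([^"]*)")?\s*\)/g;
  const refs: ImageRef[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    refs.push({
      alt: (match[1] ?? "").trim(),
      url: match[2] ?? "",
      title: match[3],
    });
  }
  return refs;
}

// Return the URLs of images whose alt text is empty.
function findMissingAlt(markdown: string): string[] {
  return extractImageRefs(markdown)
    .filter((ref) => ref.alt === "")
    .map((ref) => ref.url);
}
```

Because adoption is incremental, a check like this would presumably run only on image references touched by a PR rather than on the whole manual.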

Out of scope

  • No bulk backfill of existing image alts. Adoption is incremental: alt is filled when an image is recaptured or its referencing page is edited. A separate maintenance issue can drive a one-shot backfill once the policy is live.

Dependencies

  • Builds on the renderer/style work in the sibling issue (matte + auto size cap + capture guidance). The two issues touch overlapping files (markdown-processor*.ts, SCREENSHOT-GUIDELINES.md, docs-screenshot-capturer.md), so this issue is sequenced after the sibling lands to avoid merge conflicts.

Verification

  • Build a sample docs page that uses both the no-caption form ![Rich alt text describing intent](images/foo.png) and the explicit-caption form ![Rich alt](images/foo.png "Short label"). Confirm rendered HTML has <img alt="Rich alt text describing intent"> in both cases and figcaption text reflects "Figure X.Y" vs "Figure X.Y — Short label" respectively.
  • Run the capture agent on a single page and verify the produced markdown references include alt populated from capture-time intent, not from after-the-fact inspection of the PNG.
  • bash scripts/verify.sh passes.

JIRA Issue: FR-2908
