Standardizing Extended Descriptions: User Stories, Testing Results, and Current Limitations
This document outlines use cases for extended descriptions in EPUB and web content, addressing the need for standardized semantic markup and consistent reading system behavior. Extended descriptions provide detailed textual alternatives for complex visual content including images, diagrams, tables, videos, and mathematical formulas. Standardizing extended descriptions is important for interoperability and for a consistent user experience across the digital publishing ecosystem.
This document first presents important use cases from various stakeholder perspectives, then elaborates the problem statement based on testing results and current technical limitations.
Web standards provide multiple mechanisms for extended descriptions (summary/details elements, links, aria-details attribute), but each presents limitations in publishing contexts (both in EPUB and on the web). The summary/details pattern becomes problematic in paginated content, as these elements can break across page boundaries, disrupting the user experience.
The aria-details attribute serves a valuable function in helping assistive technologies locate extended description content. However, a critical gap emerges when content creators place extended descriptions in separate HTML files to maintain content organization and reduce file size. In such cases, aria-details can reference a link (a element) to the external file, but the link itself lacks specific semantics to identify its purpose. Without semantic markers on the link, assistive technologies, text-to-speech engines and user agents cannot programmatically determine that the link leads to an extended description rather than to other supplementary content like footnotes, glossaries, or general references.
This semantic gap means that while aria-details successfully points to the link element, the chain of programmatic relationships breaks at that point. User agents and reading systems cannot identify with certainty what type of content the link references, limiting their ability to provide specialized user interface treatments or enhanced navigation features for extended descriptions.
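The break in the chain can be illustrated with a minimal Python sketch that uses only the standard library. The sample markup, file names, ids, and the names `DetailsCollector` and `resolve_details` are assumptions made for illustration and are not taken from any tested publication. The sketch follows aria-details to its target and shows that the target is a plain link with nothing to tell it apart from the glossary link next to it.

```python
from html.parser import HTMLParser

# Hypothetical sample content: file names, ids, and link text are illustrative only.
SAMPLE = """
<figure>
  <img src="chart.png" alt="Sales chart" aria-details="desc-link">
  <a id="desc-link" href="descriptions.xhtml#chart-desc">Description of the chart</a>
</figure>
<p>See the <a id="gloss-link" href="glossary.xhtml#sales">glossary</a>.</p>
"""


class DetailsCollector(HTMLParser):
    """Records every element that has an id and every aria-details reference."""

    def __init__(self):
        super().__init__()
        self.by_id = {}         # id -> (tag name, attribute dict)
        self.details_refs = []  # ids named by aria-details, in document order

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if attributes.get("id"):
            self.by_id[attributes["id"]] = (tag, attributes)
        # split() tolerates a single id or a space-separated list of ids
        self.details_refs.extend((attributes.get("aria-details") or "").split())


def resolve_details(html):
    """Return (tag, attributes) for each element that aria-details points to."""
    collector = DetailsCollector()
    collector.feed(html)
    return [
        collector.by_id[ref]
        for ref in collector.details_refs
        if ref in collector.by_id
    ]


for tag, attributes in resolve_details(SAMPLE):
    # The chain stops here: a plain <a> with an href and no role.
    print(tag, attributes.get("href"), attributes.get("role"))
```

In this sample the resolved element carries only an id and an href, so a consumer that wants to know what kind of content lies behind the link has to guess from the link text or the file name.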
Similarly, extended description containers in the target documents are commonly marked up as generic section elements without distinctive semantics. This absence of specific semantic roles makes it impossible for automated tools, quality assurance systems, or adaptive user interfaces to distinguish extended description content from other sections within the document structure. Establishing clear semantic roles for both the linking elements and the description containers would address this gap and enable richer, more accessible user experiences.
It is worth noting that similar semantic identification challenges have been successfully addressed in other contexts. For example, DPUB ARIA roles provide the doc-footnote and doc-noteref roles to clearly identify notes and their references, enabling assistive technologies and text-to-speech engines to announce them appropriately and user agents to implement specialized navigation features. This precedent demonstrates the value of specific semantic roles for different types of linked supplementary content.
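For comparison, the footnote precedent can be sketched as follows. This is a minimal illustration, assuming hypothetical sample markup and a hypothetical helper named `index_roles`. It shows that a consumer can find note references and notes by reading the role attribute alone, while the link to a description in the same paragraph stays invisible to that lookup.

```python
from html.parser import HTMLParser

# Hypothetical sample content: ids, hrefs, and text are illustrative only.
NOTES_SAMPLE = """
<p>Some text<a role="doc-noteref" href="#fn1">1</a> and a
<a href="descriptions.xhtml#chart-desc">description of the chart</a>.</p>
<aside id="fn1" role="doc-footnote">Text of the note.</aside>
"""


class RoleIndex(HTMLParser):
    """Groups elements by the tokens found in their role attribute."""

    def __init__(self):
        super().__init__()
        self.by_role = {}  # role token -> list of (tag name, attribute dict)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        for role in (attributes.get("role") or "").split():
            self.by_role.setdefault(role, []).append((tag, attributes))


def index_roles(html):
    """Return a dict mapping each role token to the elements that carry it."""
    index = RoleIndex()
    index.feed(html)
    return index.by_role


roles = index_roles(NOTES_SAMPLE)
print(sorted(roles))  # only the footnote roles appear; the description link has none
```

The lookup is a single pass over the markup with no cross-referencing, which is the kind of direct identification that is currently missing for extended descriptions.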
The following user stories demonstrate requirements that extend beyond EPUB to encompass the broader web ecosystem. While digital publishing formats like EPUB face unique challenges such as pagination and offline reading environments, the fundamental accessibility needs for extended descriptions are universal across all digital content platforms. Web applications, educational platforms, news websites, and online documentation systems all serve users with diverse abilities who require detailed descriptions of complex visual content. The semantic markup patterns and programmatic identification methods needed for extended descriptions must therefore work consistently across web standards and publishing formats to ensure a seamless accessible experience. Publishers often repurpose content across multiple channels (from EPUB to web to mobile applications), making cross-platform compatibility essential. Additionally, many modern reading systems and content management platforms are web-based, meaning that solutions developed for web accessibility directly impact the publishing workflow and user experience in digital books.
Text-To-Speech in this context refers to the audio rendering functionality built into reading systems (such as read-aloud features in EPUB readers), in addition to dedicated screen readers.
- As a text-to-speech or screen reader user I want extended descriptions to be programmatically identifiable, so that the technology can announce their availability
- As a text-to-speech or screen reader user I want to return to the exact location where I accessed an extended description, so that I maintain my reading flow and don't lose context
- As a text-to-speech or screen reader user I want links to extended descriptions and the containers of extended descriptions to be programmatically identifiable, so that my technology can announce a richer experience
- As a text-to-speech or screen reader user I want to know when an extended description begins and ends, so as not to confuse it with the main content or other nearby extended descriptions
- As a dyslexic visual reader I want to easily identify when images have extended descriptions available, so that I can access additional context and information when needed
- As a visual reader I want to access extended descriptions without losing my place in the content, so that I can return to my reading position efficiently
User Agents in the publishing world are EPUB Reading Systems, which are usually built on web views, developed as web applications, or delivered as browser plugins that enhance the browser's functionality for reading publications.
- As a user agent developer I want standardized semantic roles for extended description markup, so that I can implement consistent user interface behaviors
- As a user agent developer I want programmatically identifiable extended descriptions, so that I can develop innovative presentation methods (sidebars, overlays, detail views)
- As a user agent developer I want programmatically identifiable extended descriptions, so that I can allow readers to escape and skip them
Content Creators in the publishing world are mostly publishers, who usually create digital publications (in EPUB) that are distributed and consumed in different EPUB Reading Systems. User experience management is normally left to EPUB Reading Systems, which enable content features based on semantic structure (e.g. footnote management).
- As a content creator I want clear, standardized markup patterns for extended descriptions, so that I can ensure consistency and accessibility across my content
- As a content creator I want extended descriptions to work consistently across different user agents, so that all users receive the same accessible experience
- As a content creator I want extended description patterns that work in both EPUB and general web contexts, so that I can maintain consistency across different publication formats
- As a content creator I want to programmatically identify which images have extended descriptions, so that I can edit my content with specific tools and UI for managing extended descriptions
- As a quality assurance tool (like Ace by DAISY) I want to systematically identify extended descriptions, so that I can provide accurate extended description reports
- As a content management system I want to generate inventories of all extended descriptions in a collection of content, so that content creators can track and maintain their accessible content
We conducted practical tests to evaluate how current web standards could address the Extended Description User Stories requirements. These tests revealed significant limitations and gaps in existing approaches.
The fundamental limitation across all tested approaches is the inability to programmatically identify extended descriptions with certainty. This identification challenge manifests in two critical areas:
- Link Semantics: when a link points to an extended description (whether referenced by aria-details or used standalone), there is no standardized semantic marker to distinguish it from other types of links (footnotes, glossaries, general references, or external resources). This prevents assistive technologies from announcing the link's purpose appropriately and limits user agents' ability to provide specialized navigation or presentation features.
- Container Semantics: extended description content itself lacks a specific semantic role to distinguish it from generic sections, making it impossible for quality assurance tools, content management systems, or adaptive interfaces to automatically locate and process extended descriptions within a document or across a collection of documents.
Testing revealed three additional constraints when attempting workarounds using declarative approaches:
- Cross-Document Chain Breakage: XPath-based identification systems function only within single documents. When extended descriptions reside in separate files, the programmatic relationship chain breaks, leaving containers semantically unidentifiable.
- Computational Overhead: XPath approaches require user agents to perform complex document preprocessing and DOM traversal operations for each extended description discovery. This creates significant computational burden, particularly in documents with multiple images or when processing large content collections.
- Implementation Complexity: XPath-based solutions demand sophisticated parsing logic from user agents, making implementation more error-prone and resource-intensive compared to direct, imperative markup approaches.
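The constraints above can be illustrated with a minimal Python sketch of an XPath-style reverse check. It is not the test setup used for this document. The two sample documents, their ids, and the helper `links_to_descriptions` are assumptions for illustration, and the markup omits the XHTML namespace to keep the queries short. The sketch relies on the limited XPath support in the standard library's ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical content document: the image refers to a link through aria-details.
CONTENT = """<body>
  <img src="chart.png" alt="Sales chart" aria-details="desc-link"/>
  <a id="desc-link" href="descriptions.xhtml#chart-desc">Description</a>
  <a id="gloss-link" href="glossary.xhtml#sales">Glossary</a>
</body>"""

# Hypothetical target document: the description lives in a generic section.
DESCRIPTIONS = """<body>
  <section id="chart-desc"><p>Sales rose in each quarter.</p></section>
  <section id="other"><p>An unrelated section.</p></section>
</body>"""


def links_to_descriptions(root):
    """Reverse check: keep only links whose id is named by some aria-details."""
    referenced = set()
    # First traversal: collect every id referenced through aria-details.
    for element in root.findall(".//*[@aria-details]"):
        referenced.update(element.get("aria-details").split())
    # Second traversal: test every link with an id against that set.
    return [a for a in root.findall(".//a[@id]") if a.get("id") in referenced]


content_root = ET.fromstring(CONTENT)
descriptions_root = ET.fromstring(DESCRIPTIONS)

found_in_content = links_to_descriptions(content_root)
found_in_target = links_to_descriptions(descriptions_root)
target_section = descriptions_root.find(".//*[@id='chart-desc']")

print([a.get("id") for a in found_in_content])
print(found_in_target, target_section.get("role"))
```

Inside the content document the check needs two traversals to single out one link. Run against the target document alone, it finds nothing, and the section that holds the description looks the same as the unrelated section beside it.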
The analysis of current implementations and user requirements reveals several open questions that need to be addressed to achieve a comprehensive solution for extended descriptions:
- How can we reduce the computational overhead associated with the reverse check of aria-details to identify links to extended descriptions?
- How can programmatic relationships be maintained when extended descriptions reside in separate HTML files?
- How can we identify extended description containers across document boundaries?