Tweak understanding for 1.2.3 and 1.2.5 #1790
* turn the "During existing pauses..." sentence around, so as not to suggest that an exception exists
* add mention of audio ducking (quack)
* add a note for 1.2.3 that explicitly says that a lack of gaps is not an excuse/exemption
Any news on this @alastc ? |
any chance this could be considered/discussed at some point? |
Suggestion to replace the slightly odd "redone": "as it would either require the audio to be edited to have sufficient pauses for audio description"
IMO, this change to include audio ducking risks over-reaching the current guidance and requirements of 1.2.5. I am moving this PR back to the Drafted project column, until we have time to address that Response. |
Co-authored-by: Mike Gower <[email protected]>
In standard audio description, narration is added during existing pauses in dialogue.
I don't know of a "standard" to cite. Maybe "traditional"?
This may require lowering the volume of background music...
Any concerns for use of "require" here?
I will revisit this, as I may have run head-first into changes - particularly for the 1.2.3 case. Will reconsider. |
After discussion in the WCAG 2.x TF call, and mulling this over further, I'd suggest reworking my PR here to:
|
2616bd9 to 3103888
This technique only applies if there is a single talking head and if there is no other important visual information. Neither of those is the case in most of the videos we test. |
WCAG does not have an adequate definition for "synchronized media". I raised the question a couple of years ago (possibly on WebAIM rather than here) and opinion was split. The consensus was that a video with a music track, but no spoken content, was still synchronized media. The rationale was that a single control is used to start and stop the visual and audio content, and this constitutes them being synchronized. This means that AD is required at level AA. I can't say I am happy with that interpretation, but it was the majority view. Even if you reject that argument and say that that video is video-only, there are other difficulties. What if the video lasts several minutes and there are only a few seconds of spoken content? Does that mean it's now synchronised media? What if the audio track contains all the visual information but they are substantially unsynchronised? That would be really confusing for assistive technology users who have some sight. We really need a solid definition because the meaning of "synchronized media" isn't as obvious as it might first appear. |
I'd agree with that. The goal when we replaced "multimedia" (which is even worse to try to define) with "synchronized media" was to highlight that we are talking about the combination of at least one time-based media (audio or video) with other content (more audio or video, images, etc) where the additional content and the time-based media are synchronized to each other. Video with a music track is definitely synchronized media. Not all of the content that is synchronized with the original media is carrying needed information, but it may be.
Yes.
What does "substantially unsynchronized" mean? If both are playing at the same time then it may be poorly edited and confusing content, but it is still synchronized media. |
I mean that audio and the visual content it relates to occur at different times in the video. Some of the videos we test comprise a sequence of numerous short video clips. Sometimes the audio for a clip needs to be significantly longer than the visual content, which leads to the audio and video becoming out of sync with each other. From your comments it appears that there is a clear and straightforward definition for synchronized media, but it is not specified in WCAG. It would be good if that can be done. |
In what way do you not find this definition clear?
It feels like you are implying that because audio is out of sync with the video (due to technology gaffes, poor sound editing or even intentional techniques) it is no longer synchronized media. But that is confusing the medium with the message. We are defining the medium. Whether we're talking about Jean-Luc Godard's unconventional combinations of asynchronous images and sound or the latest novel approaches, it is still a combination of image and sound designed to be played in concert over time, which is what is defined as synchronized media. This is to differentiate it from an audio-only or video-only delivery. Would a note something like this help?
|
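To make the reading discussed above concrete, here is a minimal sketch in Python of how a tester might sort content into the three categories. It is an illustration only, not normative text: the `MediaItem` type and `classify` function are hypothetical names, and the sketch assumes the simplified view taken in this thread that any audio track (speech, music, or other sound) played together with video makes the content synchronized media, whether or not the two are well aligned.

```python
from dataclasses import dataclass


@dataclass
class MediaItem:
    """Hypothetical description of a piece of time-based content."""
    has_video: bool  # a time-based visual track is present
    has_audio: bool  # any audio track: speech, music, or other sound


def classify(item: MediaItem) -> str:
    """Classify by medium only; quality of alignment is deliberately ignored."""
    if item.has_video and item.has_audio:
        return "synchronized media"
    if item.has_video:
        return "video-only"
    if item.has_audio:
        return "audio-only"
    return "not time-based media"


# A video with a music track but no speech is still synchronized media
# under the reading in this thread.
print(classify(MediaItem(has_video=True, has_audio=True)))
```

The function has no input for how closely the audio tracks the visuals, which mirrors the point that the definition concerns the medium rather than the message. It also leaves out other combinations the WCAG definition covers, such as audio paired with synchronized still images.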
I don't think technique G203 should be sufficient for SC 1.2.5. It seems sufficient for SC 1.2.3 and perhaps SC 1.1.1. Audio description needs to be synchronized when used and a static text alternative is not. It may be possible that a talking head video already passes 1.2.5 because of no pauses - but this technique confuses the situation and implies a video with audio can be treated like video-only content for 1.2.5 and I don't think that is the case. |
I was trying to find an awkward but workable way out of the impasse, but I'm not clear why you think that isn't acceptable either. So in effect you're saying: yes, by design, if a video has NO media alternative and NO sufficient gaps in the audio that would have allowed AD, it passes both 1.2.3 and 1.2.5 in your view?

and then you say that this is not nothing? the end result is that you have a video that has visual information not currently conveyed in the existing audio, but because there's no sufficient audio gaps for AD you're handwaving it through as a pass for both 1.2.3 and 1.2.5?

that, to me, IS nothing, sorry. the end result is still that a blind user won't get any of the visual information, but this conforms happily to WCAG? I really fail to see the logic here, other than "it's easier for content providers to pass WCAG" instead of "we're trying to provide a baseline access to information for blind/VI users with this"

yes, the weird interconnection between 1.2.3 and 1.2.5 (unless i'm mistaken, the only case in WCAG where this happens? where the aspect of one SC is actually a separate SC as well) is tricky. but as we're trying to at least mitigate the problems this has caused without outright changing the normative wording, I would have thought my proposal was the least worst option.

then at the very least if we accept (begrudgingly) that no sufficient audio gaps gives you a free pass for AD, then at least the need for media alternative remains as a last resort. otherwise we may as well just auto-pass these SCs and be done with them.... |
Yes.
To recap what I said, I wrote: "I understand that people may think that this situation is doing nothing, but I would say that to pass 1.2.5 a site owner needs to assess a video to determine if there is any information that is only provided via video visuals, assess the ability to incorporate descriptions, incorporate the descriptions (if any are possible), and assume any risk associated with their decisions. That isn't nothing."
I do object to "happily" in your characterization. When talking about conformance there is no happy or sad, good or bad, just whether something meets the success criteria. I don't think that this is enough to provide a good experience for B/VI users either, but do think that it conforms, yes. And I'm not happy about that, either, but want to follow the wording in the standard and the original intent.
1.2.3 and 1.2.8 share the same relationship, as do 3.3.4 and 3.3.6. Possibly 1.3.4 and 1.4.6 also. But this seems to be the only pairing where neither SC is AAA. We should make sure that this is on the list of items to work on in WCAG 3. |
Totally agree, @mraccess77. |
So, the author "assumes the risk". When challenged, they can say "i'm fully conformant with WCAG SCs 1.2.3 and 1.2.5, so I don't see what the problem is". Again then, if people are convinced that this is indeed the intention, let's add explicit, clearly written out notes in the understanding documents for 1.2.3 and 1.2.5, where this interpretation is explicitly stated: "if there aren't sufficient gaps in the audio for AD, you can automatically pass 1.2.3 and 1.2.5 (but you then need to assume any risk associated with your decision)". |
Either way this goes, I agree with Patrick. Be more explicit, even if that means gaps are pointed out rather than glossed over / left to differing interpretations. State that what is required is not necessarily the best user experience. And then strive for addressing that gap with WCAG 3 |
I agree we want to strive for some clarity here. I guess I just feel like we have a fairly profound disagreement about interpretation in this discussion. All the incremental PRs for 1.2 I've created, and which are currently working through the TF, are an attempt to chip away at stuff that I believe we can agree on. Once the dust has settled from those, maybe we'll have a bit more clarity/alignment on audio description. |
From another recent discussion thread about correcting glossary formatting:
Should we consider moving the substance of Notes 1 to 3 into the definition for audio description? That might look something like:
Parentheses added to help with parsing. Also, this approach implies it might be better to move the “not already provided in existing audio” condition into the 1.2.3/5/7 SC phrasing. |
@bruce-usab Normative changes that could cause folks to reinterpret the standard go in a different bucket. I'm not saying we can't go that route; just that it cannot be incorporated without a new WCAG 2.x release. It would end up in our "Future version updates" column of the TF project board. My point in the other discussion you quoted from is that folks are using notes in 1.2.5 to justify an interpretation which cannot be arrived at from the normative text alone. |
Had some late night thoughts on this which may or may not be helpful. Assumption: Audio description is primarily for those who cannot see the video |
what you're describing here is 1.2.7 Extended Audio Description (AAA) https://www.w3.org/WAI/WCAG22/Understanding/extended-audio-description-prerecorded.html yes, that's how you'd solve it (assuming you can't do anything better), but that doesn't help with the 1.2.3/1.2.5 determination of whether or not absence of sufficient gaps is a get-out-of-jail pass or fail |
Bad synchronisation between the audio and visual content isn't the main problem, although I do think the definition should explicitly state that it is permitted, because definitions should be as unambiguous as possible. The bigger issue is that the definition does not address whether a video with no spoken content but some music or other noises (which may have a shorter duration than the visual content) is synchronised media. Andrew has confirmed that it is, but it is far from obvious. Your proposed definition works for me. |
I think Understanding (somewhere) includes an example of classic silent movies — which were traditionally paired with dramatically timed music — as being good candidates for both audio description (including ducking of the music) and static alternatives (i.e., a screenplay-like document). That example could address this question. |
In a quick review of the examples in Understanding 1.2.x SC the closest I could find was this in 1.2.1:
|
the question is “is it important information to understanding the movie”
|
While the silent movie example is fine, as a general rule I don't find such positive examples particularly useful because they generally describe situations where the developer has done the right thing. As a tester who is invariably testing something that has not been designed optimally, it's useful to have examples that demonstrate non-conformances or clarify non-obvious situations. For example, a video with no spoken content in which the music bears no relation at all to the visual content - we are agreed that this is synchronized media even though that's rather counter-intuitive to people who are not intimately familiar with the SC and the relevant definitions. |
@TestPartners I've tweaked the definition some more and put it in PR #4371 |
My thoughts/ opinion on this
|
coming back in late, but as i see some folks really getting hung up on "zero gaps is an edge case" ... we're talking about zero usable gaps. there may be a few half second gaps in narration here and there, but i'd still count those as not being "gaps" that are in any way usable to cram in AD. think the majority of tiktok videos with voiceovers. those are the ones I take issue with claiming that they are magically exempt and PASS the SC |
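The distinction between any gap and a usable gap can be sketched as follows. This is a hedged illustration in Python, not something WCAG specifies: the function name `usable_gaps`, the input format, and the 2-second `min_gap` default are all assumptions chosen for the example, since WCAG does not define a minimum pause length. The sketch also counts silence at the very start or end of the video as a candidate gap, which is an open question in this discussion rather than a settled point.

```python
def usable_gaps(speech, duration, min_gap=2.0):
    """Return (start, end) pauses of at least min_gap seconds.

    speech   -- list of (start, end) times in seconds for dialogue/narration
    duration -- total length of the video in seconds
    min_gap  -- hypothetical threshold; not defined anywhere in WCAG
    """
    gaps = []
    cursor = 0.0  # end of the latest speech segment seen so far
    for start, end in sorted(speech):
        if start - cursor >= min_gap:
            gaps.append((cursor, start))
        cursor = max(cursor, end)  # tolerate overlapping segments
    if duration - cursor >= min_gap:
        gaps.append((cursor, duration))
    return gaps


# Voiceover with one half-second breath and one four-second pause.
segments = [(0.0, 10.0), (10.5, 20.0), (24.0, 30.0)]
print(usable_gaps(segments, duration=30.0))  # [(20.0, 24.0)]
```

With these inputs the half-second pause is ignored and only the four-second pause is reported. A video whose result is an empty list is the "zero usable gaps" case being debated, regardless of how many shorter pauses it contains.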
It is my understanding that music did enhance silent films and helped to convey the situations occurring in the movie. Almost all silent films had accompanying music - so it would seem to be synchronized in my opinion. We should poll whether having room at the front or end of the video counts as a pause as it seems like some of us think it could. |
@GreggVan I agree that where there are any audio descriptions and no available gaps to add more, the video meets 1.2.5. But I disagree fundamentally that a movie with zero audio descriptions can pass 1.2.5. That is not supported by the normative language. As I've mentioned elsewhere, you are citing a non-normative note to support that argument, not the normative text which states clearly that audio descriptions must be provided unless it is a media alternative. If there is no audio description, how can we say we've met the normative wording:
Here is that wording with all relevant normative definitions (in bold) incorporated, with minor editorial tweaks and swizzling to make it comprehensible:
I do not understand how a video that has ZERO narration added can pass that language. |
The original film of the silent movie obviously had no sound. If someone takes that source material and creates a video-only record, that is not synchronized media; it could meet 1.2.3 through a media alternative, and pass 1.2.5 (i.e., N/A) because it is not synchronized media. However, if someone attempts to replicate the original movie theatre experience of the time by adding sound to the video, they are now providing synchronized media (it has images and audio) -- at which point it is entirely possible to add audio descriptions and it needs to pass 1.2.5 with audio descriptions. |
@mbgower You are right that the note clarifying that audio description is added during existing pauses is not normative. So there is some ambiguity there. I do think that the definition for extended audio description helps clarify this.
Of course, the note there is also clarifying, and also non-normative. But I believe that the presence of both of these notes signals the intent of the WG at the time (and the memories of @GreggVan, @bruce-usab, and myself are in agreement on this point also). You say that you are ok with only a single audio description if there is no additional space for other audio description. Why just one? Would you fail a video with a single audio description if there was space for additional description and content that would be helpful to clarify in a description? Are you pointing out that there is no actual normative language that says that audio description needs to include information about all visual information necessary for comprehension of the video? |
for clarity

Closes #1768