refactor(packages): improve webkit presentation API handling and button availability #1362
mihar-22 wants to merge 14 commits
…ature
- Add missing `available` field to test helpers for FullscreenButtonState and PiPButtonState
- Pass `media` argument to `exitFullscreen()` in cast feature (signature changed)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…st button
Aligns cast button with fullscreen and pip buttons: adds derived `available` boolean state and `data-available` data attribute.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…nsition
Listen for webkitpresentationmodechanged on the video element instead of the media proxy, and chain requestPictureInPicture after exitFullscreen so PiP activates once fullscreen exits. Also removes a stray console.log from the fullscreen request path.

Covers the rationale for using aria-disabled over HTML disabled, HTML hidden for unsupported features, and separate data-disabled/data-hidden styling hooks across cast, fullscreen, and pip buttons.

…idden`
Feature buttons now expose `disabled` (non-interactive) and `hidden` (unsupported) instead of a single `available` boolean. Unsupported buttons get the HTML hidden attribute; disabled buttons stay visible with aria-disabled and data-disabled for styling. Aligns with WAI-ARIA APG toolbar guidance on keeping disabled controls focusable.

Listen for the `play` event so pip state updates when playback starts in picture-in-picture mode (e.g., after returning from background on iOS Safari).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Replace `has-data-[availability]:not-data-[available]:hidden` with `data-[disabled]` styling classes to match the new disabled/hidden button state model. Hidden buttons use the native HTML hidden attribute; disabled buttons get reduced opacity and grayscale via data-disabled.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…button docs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

PiP is unsupported on WebKit so the button receives the `hidden` attribute and is removed from the DOM. Only assert `data-availability` when the pip button is visible.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Check standard `requestPictureInPicture` before webkit fallback in `isPictureInPictureEnabled`, matching `requestPictureInPicture` order.
- Guard `exitPictureInPicture` webkit path with presentation mode check to avoid accidentally exiting fullscreen.
- Update fullscreen button test to expect error propagation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
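The last commit's two guards (standard API before the WebKit fallback, and a presentation mode check before exiting PiP) might look roughly like the sketch below. The `webkitPresentationMode`, `webkitSupportsPresentationMode`, and `webkitSetPresentationMode` members are Safari's real `<video>` extensions; every other name (`WebKitVideoLike`, `isWebKitVideoLike`, `isPiPEnabledSketch`, `exitPiPSketch`) is hypothetical, and the real `isPictureInPictureEnabled` in this PR is not shown here and may check more than this.

```typescript
// Structural stand-ins for the WebKit-only members Safari adds to <video>,
// so the sketch runs without a DOM.
type WebKitPresentationMode = 'inline' | 'fullscreen' | 'picture-in-picture';

interface WebKitVideoLike {
  readonly webkitPresentationMode: string;
  webkitSupportsPresentationMode(mode: WebKitPresentationMode): boolean;
  webkitSetPresentationMode(mode: WebKitPresentationMode): void;
}

function hasMethod(value: unknown, name: string): boolean {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as Record<string, unknown>)[name] === 'function'
  );
}

// Hypothetical guard; the PR's own isWebKitVideoElement is not shown in this thread.
function isWebKitVideoLike(value: unknown): value is WebKitVideoLike {
  return (
    hasMethod(value, 'webkitSetPresentationMode') &&
    hasMethod(value, 'webkitSupportsPresentationMode')
  );
}

// Standard API first, WebKit fallback second (simplified assumption).
function isPiPEnabledSketch(video: unknown): boolean {
  if (hasMethod(video, 'requestPictureInPicture')) return true;
  if (isWebKitVideoLike(video)) {
    return video.webkitSupportsPresentationMode('picture-in-picture');
  }
  return false;
}

// Returns true only when the element was actually in PiP and was sent back inline.
function exitPiPSketch(video: unknown): boolean {
  if (!isWebKitVideoLike(video)) return false;
  // If the element is in fullscreen (or already inline), leave it alone.
  if (video.webkitPresentationMode !== 'picture-in-picture') return false;
  video.webkitSetPresentationMode('inline');
  return true;
}
```

Without the presentation mode check, setting `'inline'` while the element is in fullscreen would exit fullscreen, which is the accident the commit message describes.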
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit b4cdc2d.
```diff
+export function isHTMLMediaElementHost(value: unknown): value is HTMLMediaElementHost<HTMLMediaElement, any> {
+  return isObject(value) && MEDIA_ELEMENT_HOST_SYMBOL in value;
 }
 
-export function isMediaVolumeCapable(value: unknown): value is MediaVolumeCapability {
-  return isObject(value) && 'volume' in value && 'muted' in value;
+export function isHTMLVideoElementHost(value: unknown): value is HTMLVideoElementHost {
+  return isObject(value) && VIDEO_ELEMENT_HOST_SYMBOL in value;
 }
 
-export function isMediaPlaybackRateCapable(value: unknown): value is MediaPlaybackRateCapability {
-  return isObject(value) && 'playbackRate' in value;
+export function isHTMLAudioElementHost(value: unknown): value is HTMLAudioElementHost {
+  return isObject(value) && AUDIO_ELEMENT_HOST_SYMBOL in value;
 }
 
-export function isMediaBufferCapable(value: unknown): value is MediaBufferCapability {
-  return isObject(value) && 'buffered' in value && 'seekable' in value;
+export function isHlsMedia(value: unknown): value is HlsMedia {
+  return isObject(value) && HLS_MEDIA_SYMBOL in value;
 }
 
-export function isMediaErrorCapable(value: unknown): value is MediaErrorCapability {
-  return isObject(value) && 'error' in value;
+export function isDashMedia(value: unknown): value is DashMedia {
+  return isObject(value) && DASH_MEDIA_SYMBOL in value;
 }
 
-export function isMediaTextTrackCapable(value: unknown): value is MediaTextTrackCapability {
-  return isObject(value) && 'textTracks' in value;
-}
+// -- Resolve helpers --
 
-export interface RemotePlaybackLike extends EventTarget {
-  readonly state: string;
-  prompt(): Promise<void>;
-  watchAvailability(callback: (available: boolean) => void): Promise<number>;
-  cancelWatchAvailability(id?: number): Promise<void>;
+export function resolveMediaRemote(media: Media): MediaRemotePlaybackCapability['remote'] | null {
+  if (isMediaRemotePlaybackCapable(media)) {
+    return media.remote;
+  }
+
+  return null;
 }
 
-export interface MediaRemotePlaybackCapability {
-  readonly remote: RemotePlaybackLike;
+export function resolveHTMLMediaElement(media: Media): HTMLMediaElement | null {
+  if (media instanceof HTMLMediaElement) return media;
+  if (isHTMLMediaElementHost(media)) return media.target;
+  return null;
 }
 
-export function isMediaRemotePlaybackCapable(value: unknown): value is MediaRemotePlaybackCapability {
-  return isObject(value) && 'remote' in value && isObject((value as Record<string, unknown>).remote);
+export function resolveHTMLVideoElement(media: Media): HTMLVideoElement | null {
+  if (media instanceof HTMLVideoElement) return media;
+  if (isHTMLVideoElementHost(media)) return media.target;
+  return null;
 }
 
-export function isQuerySelectorAllCapable<T extends string>(
-  value: unknown
-): value is {
-  querySelectorAll: (selectors: T) => NodeListOf<HTMLElementTagNameMap[Extract<T, keyof HTMLElementTagNameMap>]>;
-} {
-  return (
-    isObject(value) && 'querySelectorAll' in value && isFunction((value as Record<string, unknown>).querySelectorAll)
-  );
+export function resolveHTMLAudioElement(media: Media): HTMLAudioElement | null {
+  if (media instanceof HTMLAudioElement) return media;
+  if (isHTMLAudioElementHost(media)) return media.target;
+  return null;
```
There was a problem hiding this comment.
This feels like it's going against the uniform media API design. We shouldn't have to do this. I might need more context.
There was a problem hiding this comment.
No problem, happy to discuss more! I get where that sense is coming from, valid. I'll do my best to lay my thoughts out below.
As you know, we have an interesting and unique challenge of wanting to support native browser media out of the box (e.g., <video>) and also have a flexible enough Media interface that anyone can implement easily. The "easy" part comes from the last wave of changes we made to untangle assumptions about what subset of DOM-like APIs we support, and how any "Media" can satisfy all, or some, of the complete interface through "capabilities."
Based on our definition of "Media", we can't assume how it's implemented. At the end of the day, how someone satisfies the interface is up to them. This means we don't actually know contractually if it's an HTMLMediaElement that's been directly given to the store, something of the shape { target: HTMLMediaElement }, rendering inside an <iframe>, rendering to a <canvas>, or however they've decided to build it out depending on the platform, environment, or tech they want to support. As another example, moq.dev is built on WebTransport, WebCodecs, and WebAudio. Another cool library that someone might build upon is Mediabunny. React Native won't have a DOM.
Basically this all means, we need a way to resolve DOM media elements if they're needed in our store DOM features to support certain APIs like fullscreen and pip out of the box. Normally, we would move that logic into the HTMLVideoElementHost but because we want to support <video> itself for spec-compliance, we simply can't. This would be true for external store feature authors too, they might need access to these specific elements when available.
The media type guards like isHlsMedia and isDashMedia are for lower-level access to media/engine APIs without bundling the media themselves via instanceof checks or some other direct reference (keeping it treeshakeable). Important to remember the store is a base/generic Media interface, it needs to be narrowed for deeper access. You can imagine ourselves or someone else externally building out a store feature that might need it for something like analytics, logging, debugging, testing, etc. It could even be an odd case where we need a small workaround in a store feature for a specific media. This happens all the time.
In summary, the resolvers and media type guards don't invalidate the uniform Media API design. They're just there to support the flexible contracts we've set up.
Something I haven't tackled in this PR is that Media authors should be able to provide their own fullscreen and pip interfaces separately from the DOM. This means notifying the store of availability, handling request/exit, dispatching events, and whatever else is needed. This can ultimately be a separate conversation in another PR. Thought it was worth noting here.
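To make the resolver argument in this comment concrete, here is a minimal sketch of how a store feature could narrow a generic media object without importing any host class. It avoids DOM globals so it runs anywhere; `EXAMPLE_VIDEO_HOST`, `VideoElementLike`, `ExampleVideoHost`, `resolveVideoLike`, and `describeFullscreenPath` are all hypothetical stand-ins, not the PR's actual symbols or helpers.

```typescript
// Hypothetical brand symbol; the PR's VIDEO_ELEMENT_HOST_SYMBOL is defined elsewhere.
const EXAMPLE_VIDEO_HOST = Symbol('example.video-element-host');

// Structural stand-in for HTMLVideoElement so the sketch needs no DOM.
interface VideoElementLike {
  readonly tagName: 'VIDEO';
}

interface ExampleVideoHost {
  readonly [EXAMPLE_VIDEO_HOST]: true;
  readonly target: VideoElementLike | null;
}

// Brand check: no `instanceof`, so the host class itself is never imported here.
function isExampleVideoHost(value: unknown): value is ExampleVideoHost {
  return typeof value === 'object' && value !== null && EXAMPLE_VIDEO_HOST in value;
}

function isVideoElementLike(value: unknown): value is VideoElementLike {
  return (
    typeof value === 'object' &&
    value !== null &&
    (value as { tagName?: unknown }).tagName === 'VIDEO'
  );
}

// Same order as the resolvers in the diff: direct element, then host target, then null.
function resolveVideoLike(media: unknown): VideoElementLike | null {
  if (isVideoElementLike(media)) return media;
  if (isExampleVideoHost(media)) return media.target;
  return null;
}

// A feature can then degrade gracefully for media with no element behind it
// (canvas renderers, WebCodecs pipelines, non-DOM runtimes).
function describeFullscreenPath(media: unknown): 'element' | 'unavailable' {
  return resolveVideoLike(media) ? 'element' : 'unavailable';
}
```

The `null` branch is the point: a feature that gets `null` back knows it cannot use element-level APIs and has to rely on whatever the media itself exposes.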
There was a problem hiding this comment.
I was thinking about this a bit more today and wanted to illustrate the problem and a potential way we could decide to move forward, now or in the future, to strengthen our Media API design.
So the problem I was describing above is because as of today we're expecting:
In HTML via fallback:

```html
<media-container>
  <video />
</media-container>
```

Which will essentially do the following in JS:

```js
store.attach(video); // video = <video>
```

This is what leads to the awkwardness and expectations in our store features. Technically it's also why we have to design our Media API a certain way.
Basically it can be the <video> element or it could be a target (e.g., HTMLVideoElementHost).
There's a world where the default expectation could be:

```js
import { HTMLVideoElementHost } from '@videojs/html';

const media = new HTMLVideoElementHost();
media.attach(video); // video = <video>
store.attach(media);
```

This is no different to playback tech in VJS 8.

This means we now control the Media API completely, not partially by the DOM. But it also means in order to do this via HTML we need a custom element that attaches native media so host class is treeshaken out:

```html
<media-container>
  <html-video> <!-- performs the code above -->
    <video />
  </html-video>
</media-container>
```

A bit ugly if someone wants to use native media elements, but it does mean we can move native fullscreen/pip handling out of store features and into media layer (where it belongs imo). Why I like this:
- Media API becomes solely responsible for media-related behaviour - it's not snuck into store features or living in potentially multiple places. It also means the store features are further disconnected from the DOM.
- Maybe `document` fullscreen and pip handling could be composed in via classes or mixins so others can take advantage of it too. Clearer boundary there too between what is specifically native media handling vs. document.
- Consistent with our other media in terms of how context is used for discovery and attaching. Technically anyone could pass a `<video>`-compliant element in and it would work.
Anyway, all food for thought :) I just wanted to make sure I illustrated some of the challenges we have. It all kind of hinges on how strongly we believe passing in <video> with no adapters should work and, more importantly, why.
There was a problem hiding this comment.
There's some fundamental misalignment happening here that is hard to nail down and we should talk IRL, but I think it might boil down to the following. I can't fully tell if we disagree on something or if we didn't implement something as planned.
> Basically this all means, we need a way to resolve DOM media elements if they're needed in our store DOM features to support certain APIs like fullscreen and pip out of the box. Normally, we would move that logic into the `HTMLVideoElementHost` but because we want to support `<video>` itself for spec-compliance, we simply can't. This would be true for external store feature authors too, they might need access to these specific elements when available.
Specifically "need a way to resolve DOM media elements if they're needed in our store"
If the store needs access to the internal DOM media element then we're not doing the Media API well. Nothing should have to reach through the Media API contract and grab the internal media. We might do that for debugging, but any normal feature operation that needs to reach through actually needs an extension of the Media API.
> This is no different to playback tech in VJS 8
I intentionally moved away from techs when building Media Chrome because it was completely redundant. We had a <video> API (vjs player) wrapping a <video> API (tech) wrapping a <video> API (<video>). We definitely haven't missed techs in that codebase. It does add needed intentionality in extending the media API. But I would rather get more aggressive in extending the API than go back to Techs (i.e. HTMLVideoElementHost). We should be intentional, naturally, because it's an API devs will work with directly. But at the same time we shouldn't see it as a heavy thing we can't touch and have to work around. We're in control of our ecosystem.
@mihar-22 This is where I think I'm feeling some hesitancy from you, at least from the suggestion of requiring a <html-video> wrapper for <video>.
> but it does mean we can move native fullscreen/pip handling out of store features and into media layer
This is where I get confused about where the misalignment is happening, because I'd agree with this and assumed this was already the case. When the store triggers PiP and <hls-video> is being used, does it call requestPictureInPicture on <hls-video>, or does it reach deeper and call it directly on the internal <video>? If it's the former, that's intended; if it's the latter, it goes against the Media API contract.
So help me understand if you can see where the misalignment is happening more clearly.
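The "former" case in the question above could be sketched like this: the store only ever calls `requestPictureInPicture` on the media it was handed, and a wrapper such as `<hls-video>` decides how to forward the call. All class and function names below (`PiPCapableMedia`, `InnerVideoStub`, `WrapperMediaStub`, `requestPiPViaContract`) are hypothetical stubs for illustration, not code from this PR.

```typescript
interface PiPCapableMedia {
  requestPictureInPicture(): Promise<void>;
}

function isPiPCapableMedia(value: unknown): value is PiPCapableMedia {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as { requestPictureInPicture?: unknown }).requestPictureInPicture === 'function'
  );
}

// Stand-in for the inner <video>; `log` records who was called, in order.
class InnerVideoStub implements PiPCapableMedia {
  private readonly log: string[];

  constructor(log: string[]) {
    this.log = log;
  }

  requestPictureInPicture(): Promise<void> {
    this.log.push('inner');
    return Promise.resolve();
  }
}

// Stand-in for a wrapper such as <hls-video>: same method, forwarded internally.
class WrapperMediaStub implements PiPCapableMedia {
  private readonly log: string[];
  private readonly inner: InnerVideoStub;

  constructor(log: string[]) {
    this.log = log;
    this.inner = new InnerVideoStub(log);
  }

  requestPictureInPicture(): Promise<void> {
    this.log.push('wrapper');
    return this.inner.requestPictureInPicture();
  }
}

// Store-side call: talks only to the media contract, never to `inner`.
function requestPiPViaContract(media: unknown): Promise<void> {
  if (!isPiPCapableMedia(media)) {
    return Promise.reject(new Error('Picture-in-picture is not supported by this media'));
  }
  return media.requestPictureInPicture();
}
```

In this shape, anything the store cannot do through the contract is a signal that the contract needs another method, which is the "extend the Media API" position argued in this comment.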

Summary
Refactor WebKit fullscreen and picture-in-picture handling to use the modern presentation mode API (`webkitSetPresentationMode`) instead of deprecated methods. Replace the `available` boolean on buttons with `disabled` and `hidden` states that follow standard ARIA patterns: `aria-disabled` for non-interactive controls and HTML `hidden` for unsupported features.

Changes
- Use `webkitSetPresentationMode('fullscreen')` instead of `webkitEnterFullscreen`/`webkitExitFullscreen` for iOS Safari
- Switch from `async/await` to sync promise chaining to preserve user-activation tokens on iOS Safari
- …(`isWebKitVideoElement`, etc.) replacing loose optional-field interfaces
- …(`isMediaPauseCapable`, etc.) to runtime-agnostic `core/media/predicate.ts`
- New `resolveHTMLMediaElement` helpers
- Replace `available: boolean` + `data-available` with `disabled` and `hidden` states on cast, fullscreen, and PiP buttons
- `aria-disabled` is now set from button state (disabled by prop or unavailability), not just props
- `hidden` attribute applied when a feature is unsupported (e.g., no fullscreen API)
- React buttons return `null` when `state.hidden` is true
- Replace `has-data-[availability]:not-data-[available]:hidden` Tailwind classes with `data-[disabled]` styling (both default and minimal skins)
- …`&[data-disabled]` instead of `&[disabled]`
- Sync on the `play` event for iOS Safari background-to-foreground transitions
- Listen on the video element for `webkitpresentationmodechanged`
- Chain `requestPictureInPicture` after `exitFullscreen` so PiP activates once fullscreen exits

Implementation details
The async-to-sync change preserves the user activation token that iOS Safari requires for fullscreen requests. With `await`, the microtask boundary caused Safari to reject requests silently.

The `disabled`/`hidden` split follows the design doc in `internal/design/ui/disabled-hidden.md`: `aria-disabled` keeps the button visible and discoverable (screen readers still announce it) while `hidden` removes it entirely for unsupported features. This is more semantically correct than the single `available` boolean.

The PiP feature now correctly listens on the video element for `webkitpresentationmodechanged` and chains `requestPictureInPicture` after `exitFullscreen` when transitioning from fullscreen. It also syncs on `play` to handle iOS Safari returning from background while in PiP mode.

Testing
- `pnpm -F @videojs/core test`: unit tests for button cores and feature state
- `pnpm -F @videojs/react test`: React button integration
- …`aria-disabled` when unavailable and `hidden` when unsupported

Note
Medium Risk
Touches core presentation (fullscreen/PiP/cast) flows and changes button rendering/attributes across HTML, React, skins, and tests; behavior differs per browser and now propagates errors instead of swallowing them.
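The two behavior changes flagged here (requests issued without an intervening `await`, and errors propagated rather than swallowed) can be illustrated with a small sketch. It is not the PR's code: `FullscreenTarget`, `enterFullscreenWithAwait`, and `enterFullscreenSync` are hypothetical, and which awaited step preceded the request in the original implementation is an assumption.

```typescript
interface FullscreenTarget {
  requestFullscreen(): Promise<void>;
}

// With an `await` before the request, everything after it runs in a later
// microtask, outside the call stack of the click handler that started it.
async function enterFullscreenWithAwait(
  target: FullscreenTarget,
  prepare: () => Promise<void>
): Promise<void> {
  await prepare();
  await target.requestFullscreen();
}

// With sync chaining, the request is issued in the same call stack as the
// caller. The returned promise still carries any rejection to the caller.
function enterFullscreenSync(target: FullscreenTarget, afterEnter: () => void): Promise<void> {
  return target.requestFullscreen().then(afterEnter);
}
```

A caller that wants the old swallow-errors behavior would now need its own `.catch(...)` on the returned promise, which is presumably why the fullscreen button test was updated to expect error propagation.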
Overview
Refactors fullscreen and picture-in-picture support detection and transitions to use WebKit's presentation mode API (via new `presentation/webkit.ts`) and new `resolveHTML*` helpers/symbol-based host guards, updating cast/fullscreen/PiP features and tests to match.

Updates Cast/Fullscreen/PiP buttons to model `disabled` vs `hidden` explicitly: `aria-disabled` now reflects both prop-based disabling and feature unavailability, unsupported features set native `hidden` (and React buttons return `null` via `isSupported`), and styling shifts from `[disabled]`/`data-availability` hiding to `[data-disabled]`, with docs/e2e/tests adjusted accordingly.

Reviewed by Cursor Bugbot for commit b4cdc2d.
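As a rough illustration of the `disabled` vs `hidden` model described above, the sketch below derives both flags from feature support, availability, and a `disabled` prop, then maps them to attributes. The input field names, the function names, and the decision to compute `disabled` independently of `hidden` are assumptions for this example; the PR's button cores may derive state differently.

```typescript
// Hypothetical inputs: `supported` = the platform has the API at all,
// `available` = it can be used right now, `disabledProp` = the consumer's prop.
interface FeatureButtonInput {
  supported: boolean;
  available: boolean;
  disabledProp?: boolean;
}

interface FeatureButtonState {
  disabled: boolean;
  hidden: boolean;
}

function deriveFeatureButtonState(input: FeatureButtonInput): FeatureButtonState {
  return {
    // Unsupported features are hidden outright.
    hidden: !input.supported,
    // Disabled by prop or by unavailability; the button stays visible.
    disabled: input.disabledProp === true || !input.available,
  };
}

function toButtonAttributes(state: FeatureButtonState): Record<string, string> {
  const attrs: Record<string, string> = {};
  if (state.hidden) {
    attrs['hidden'] = '';
    attrs['data-hidden'] = '';
  }
  if (state.disabled) {
    // aria-disabled (not the HTML `disabled` attribute) keeps the control focusable.
    attrs['aria-disabled'] = 'true';
    attrs['data-disabled'] = '';
  }
  return attrs;
}
```

For example, a supported but currently unavailable feature yields only `aria-disabled="true"` and `data-disabled`, so skins can dim the button while screen readers still announce it.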