feat: handle video-to-audio downgrade for incoming calls before answer (WT-713) #1394
Open
SERDUN wants to merge 9 commits into
A video call downgraded to audio by the caller while still ringing left the callee with a stale video icon: the video flag is derived once from the initial SDP offer and soft mute changes no SDP, so nothing reaches the callee before answer. Introduce a lightweight media_state signaling request and event (no SDP, no renegotiation). The caller sends the camera state after a soft-mute toggle and re-sends it on ringing or progress when the toggle happened before the first provisional response, which the upstream rejects as too early. The callee applies it only to an unanswered incoming call - where the video flag still represents the remote offer rather than the local camera intent - updating the in-app icon and the native call UI via reportUpdateCall. Answered and outgoing calls ignore the event, and older clients are unaffected: unknown events and requests are dropped.
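The callee-side rule from this commit can be sketched as follows. This is a minimal illustration, not the PR's code: `CallSketch`, its fields, and `applyMediaState` are hypothetical names standing in for the app's call model and handler.

```dart
/// Hypothetical, simplified stand-in for the app's call model.
class CallSketch {
  const CallSketch({
    required this.isIncoming,
    required this.wasAnswered,
    required this.video,
  });

  final bool isIncoming;
  final bool wasAnswered;

  /// For an unanswered incoming call this still mirrors the remote offer.
  final bool video;

  CallSketch copyWith({bool? video}) => CallSketch(
        isIncoming: isIncoming,
        wasAnswered: wasAnswered,
        video: video ?? this.video,
      );
}

/// Applies a reported remote camera state only to an unanswered incoming
/// call; answered and outgoing calls are returned unchanged.
CallSketch applyMediaState(CallSketch call, {required bool remoteVideo}) {
  if (!call.isIncoming || call.wasAnswered) return call;
  return call.copyWith(video: remoteVideo);
}
```

With this gate a ringing incoming call flips its video flag, while the same report on an answered or outgoing call is a no-op.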
The downgrade path (soft mute on an existing video track) already sent the media state, but enabling the camera on an audio-started call goes through the upgrade path (track added plus renegotiation) which sent nothing - the remote side learned about the video only after the renegotiation completed, post-answer for a ringing call. Send the same signal there, and re-send the actual track state on ringing or progress in both directions instead of only the camera-off case.
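A rough sketch of the sender side after this change, assuming a hypothetical `SendMediaState` callback in place of the real signaling request; the class and method names are illustrative only.

```dart
/// Hypothetical callback standing in for the media_state request.
typedef SendMediaState = void Function(bool video);

class MediaStateReporterSketch {
  MediaStateReporterSketch(this._send, {required bool cameraEnabled})
      : _cameraEnabled = cameraEnabled;

  final SendMediaState _send;
  bool _cameraEnabled;

  /// Called after a soft-mute toggle or a camera upgrade: remembers the
  /// state and reports it right away (the upstream may reject it as too
  /// early if no provisional response has arrived yet).
  void onCameraChanged(bool enabled) {
    _cameraEnabled = enabled;
    _send(enabled);
  }

  /// Called on ringing or progress: repeats the actual current state,
  /// whether the camera is on or off.
  void onRingingOrProgress() => _send(_cameraEnabled);
}
```

Re-sending unconditionally keeps the sketch simple: a duplicate report is harmless because the receiver only stores the latest value.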
A call downgraded to audio before answer still negotiates the video m-line (soft mute keeps the track alive with black frames), so right after accepting the UI flipped back into a video call with a black remote tile - contradicting the audio icon shown while ringing. Track the reported remote camera state on the call (media_state event, now applied in-call as well) and let it override the video presentation: an explicit camera-off report wins over the black-frame track, so the call looks and behaves as an audio call end to end. When the remote camera comes back the frames are already flowing and the UI switches to video instantly, no renegotiation involved.
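The override described here might look like the following sketch. `presentAsVideo` and `hasRemoteVideoTrack` are hypothetical names; only `remoteCameraEnabled` comes from the PR, and it is assumed to be nullable (unset until a report arrives).

```dart
/// Decides whether the call is presented as a video call.
///
/// [remoteCameraEnabled] is null until the peer reports its camera state.
bool presentAsVideo({
  required bool hasRemoteVideoTrack,
  bool? remoteCameraEnabled,
}) {
  // An explicit camera-off report wins over the black-frame track.
  if (remoteCameraEnabled == false) return false;
  // No report, or camera on: fall back to what the media layer delivers.
  return hasRemoteVideoTrack;
}
```

Because the track stays negotiated, a later camera-on report only flips this decision; no renegotiation is needed for the video to appear.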
The camera at answer followed the raw SDP offer, which still advertises m=video after a soft-mute downgrade: the callee answered a call shown as audio with the camera silently on, streaming video to the caller. Drive the answer camera from the presented state instead - the video flag updated by media_state, with the explicit remote camera-off report also guarding against incoming-event replays that reset the flag from the jsep.
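As a sketch of the answer-time decision, assuming the hypothetical helper `cameraEnabledOnAnswer` and a nullable `remoteCameraEnabled`: the raw SDP offer is deliberately not an input.

```dart
/// Chooses the local camera state when the callee answers.
///
/// [presentedVideo] is the video flag as shown to the user; it may have
/// been reset from the jsep by a replayed incoming-call event.
bool cameraEnabledOnAnswer({
  required bool presentedVideo,
  bool? remoteCameraEnabled,
}) {
  // The explicit camera-off report guards against a replay-reset flag.
  if (remoteCameraEnabled == false) return false;
  return presentedVideo;
}
```

In the replay case the flag reads true again, yet the stored camera-off report still keeps the callee's camera off.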
4a1618d to 0121443
…app state (WT-713)
Replace the media_state-specific request/event with a generic peer_message
envelope carrying an opaque {type, data} pair. media_state (camera state,
data {video}) becomes the first carried type; the client routes inbound
peer_message by type and ignores unknown ones, so future in-call signals reuse
the same channel without protocol changes. The envelope is named "message"
rather than "signal" to avoid confusion with the SIP/WS signaling plane: this
is an app-to-app message.
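Routing by type could be sketched as below. `PeerMessageRouter`, `on`, and `dispatch` are hypothetical names; the only parts taken from the commit message are the `{type, data}` envelope and the `media_state` type with `data {video}`.

```dart
/// Handler for one peer_message type; receives the opaque data map.
typedef PeerMessageHandler = void Function(Map<String, dynamic> data);

class PeerMessageRouter {
  final Map<String, PeerMessageHandler> _handlers = {};

  /// Registers a handler for a message type such as 'media_state'.
  void on(String type, PeerMessageHandler handler) => _handlers[type] = handler;

  /// Routes an inbound {type, data} envelope.
  /// Returns true when a handler consumed it, false when it was ignored.
  bool dispatch(Map<String, dynamic> envelope) {
    final type = envelope['type'];
    final data = envelope['data'];
    // Malformed envelopes are dropped rather than thrown on.
    if (type is! String || data is! Map<String, dynamic>) return false;
    final handler = _handlers[type];
    if (handler == null) return false; // unknown type: ignore
    handler(data);
    return true;
  }
}
```

A new in-call signal then only needs another `on(...)` registration; clients without that registration fall through the unknown-type branch.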
0121443 to 5463c8e
UnknownPeerMessageEvent.fromJson cast type/data straight from JSON, so a non-string type or non-map data threw a TypeError - exactly the payloads the Unknown fallback exists to absorb. Guard the casts so that invalid values fall back to null. This also covers a known type (media_state) that carries non-map data. WT-713
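The guarded casts can be illustrated with a type test in place of `as`. The class below is a hypothetical stand-in (`UnknownPeerMessageSketch`), not the real `UnknownPeerMessageEvent`, whose other fields are not shown in this PR text.

```dart
/// Hypothetical stand-in for the Unknown fallback event.
class UnknownPeerMessageSketch {
  const UnknownPeerMessageSketch({this.type, this.data});

  final String? type;
  final Map<String, dynamic>? data;

  factory UnknownPeerMessageSketch.fromJson(Map<String, dynamic> json) {
    final type = json['type'];
    final data = json['data'];
    return UnknownPeerMessageSketch(
      // `json['type'] as String?` would throw on a non-string value;
      // the `is` test yields null instead.
      type: type is String ? type : null,
      data: data is Map<String, dynamic> ? data : null,
    );
  }
}
```

A payload such as `{'type': 7, 'data': 'oops'}` then decodes to an event with both fields null instead of raising a TypeError.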
The signaling layer already decodes media_state into a typed MediaStatePeerMessageEvent with a bool video, but the internal bloc event re-boxed it into Map<String, dynamic> and the handler re-extracted + re-checked is! bool. Carry the typed bool through instead, dropping the redundant re-box and runtime check. WT-713
_CallSignalingEventMediaState did not convey whose state it carries. The event delivers a peer's reported media (camera) state for the call and drives remoteCameraEnabled, so name it peerMediaState - matching the signaling-layer MediaStatePeerMessageEvent and the peer_message channel. WT-713
The remoteVideo fallback reads the local-intent video flag, which is easy to misread as a remote signal next to remoteCameraEnabled. Spell out that video is the local camera intent used only as a provisional proxy, and that remoteCameraEnabled is the authoritative remote signal. WT-713
Overview
When a caller starts a video call and turns the camera off while the call is still ringing, the callee keeps seeing a stale video icon: the video flag is derived once from the SDP offer, and a soft-muted track leaves the offer unchanged, so no signaling reaches the callee before answer.
This PR introduces an informational media_state signaling exchange (no SDP renegotiation; media negotiation stays driven by the offer/answer as is):
- webtrit_signaling: new MediaStateRequest and MediaStateEvent (call-scoped, media: {video: bool}).
- On an unanswered incoming call, the in-app icon and the native call UI (reportUpdateCall) flip between video and audio.
- The ActiveCall.remoteCameraEnabled flag makes an explicit remote camera-off report win over the black-frame track that soft mute keeps delivering, so a downgraded call looks and behaves as an audio call end to end. The answer flow drives the camera from the presented state instead of the raw offer (the offer still advertises video after a downgrade, which used to put the callee on air in a call shown as audio).
The feature requires the corresponding core-side delivery (separate PR). Without it nothing changes: the event never arrives, remoteCameraEnabled stays unset, and every decision degrades to the current offer-driven behavior; older clients ignore the unknown event and request.
Out of scope (follow-ups): Android background isolate (needs an update-call API in callkeep), callee-to-caller direction of the same signal.