feat: handle video-to-audio downgrade for incoming calls before answer (WT-713) #1394

Open
SERDUN wants to merge 9 commits into develop from feat/WT-713-media-state


SERDUN commented Jun 11, 2026


Overview

When a caller starts a video call and turns the camera off while the call is still ringing, the callee keeps seeing a stale video icon: the video flag is derived once from the SDP offer and a soft-muted track keeps the offer unchanged, so no signaling reaches the callee before answer.

This PR introduces an informational media_state signaling exchange (no SDP renegotiation - media negotiation stays driven by the offer/answer as is):

  • webtrit_signaling: new MediaStateRequest and MediaStateEvent (call-scoped, media: {video: bool}).
  • Caller side: the camera toggle reports the actual camera state - after a soft mute and on the upgrade path (track added); the state is re-sent on ringing/progress to cover a toggle made before the first provisional response.
  • Callee side: the event updates the presented state of the incoming call in real time - the incoming-call UI and the native call screen (reportUpdateCall) flip between video and audio.
  • Post-answer consistency: a new ActiveCall.remoteCameraEnabled flag makes an explicit remote camera-off report win over the black-frame track that soft mute keeps delivering, so a downgraded call looks and behaves as an audio call end to end; the answer flow drives the camera from the presented state instead of the raw offer (the offer still advertises video after a downgrade, which used to put the callee on air in a call shown as audio).
  • When the remote camera comes back, frames are already flowing, so the UI switches to video instantly - no renegotiation involved.
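The caller-side behaviour described above can be sketched roughly as follows. This is an illustrative TypeScript sketch, not the Dart code in this PR: `CameraStateReporter`, `Send`, the `callId` field, and the method names are hypothetical, and only the `media: {video: bool}` payload comes from the description. It assumes the re-send happens once, on the first ringing/progress, and only if a toggle was made before it.

```typescript
// Hypothetical request shape; only `media: { video }` is taken from the PR text.
interface MediaStateRequest {
  callId: string;
  media: { video: boolean };
}

type Send = (request: MediaStateRequest) => void;

// Reports the actual camera state and repeats it after the first provisional
// response, because a report sent earlier may have been rejected upstream.
class CameraStateReporter {
  private readonly callId: string;
  private readonly send: Send;
  private video = false;
  private provisionalSeen = false;
  private resendPending = false;

  constructor(callId: string, send: Send) {
    this.callId = callId;
    this.send = send;
  }

  // Call after a soft mute/unmute, or after a video track was added (upgrade path).
  onCameraToggled(video: boolean): void {
    this.video = video;
    if (!this.provisionalSeen) {
      this.resendPending = true; // toggle made before ringing/progress
    }
    this.send({ callId: this.callId, media: { video: this.video } });
  }

  // Call on ringing or progress.
  onProvisionalResponse(): void {
    this.provisionalSeen = true;
    if (this.resendPending) {
      this.resendPending = false;
      this.send({ callId: this.callId, media: { video: this.video } });
    }
  }
}
```

A toggle made after the first provisional response is sent once and never repeated, since nothing suggests it would be rejected at that point.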

The feature requires the corresponding core-side delivery (separate PR). Without it nothing changes: the event never arrives, remoteCameraEnabled stays unset and every decision degrades to the current offer-driven behavior; older clients ignore the unknown event and request.
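The fallback described above could look roughly like the sketch below (TypeScript for illustration only; the PR itself is Dart). `ActiveCallView`, `hasRemoteVideoTrack`, and `showsRemoteVideo` are hypothetical names; only `remoteCameraEnabled` comes from the description. The assumption is that an unset flag means no report has arrived, so the decision falls back to what the negotiated media says.

```typescript
// Hypothetical view of a call; only `remoteCameraEnabled` is named in the PR.
interface ActiveCallView {
  // undefined until a media_state report arrives (e.g. core without delivery support)
  remoteCameraEnabled?: boolean;
  // assumed to be derived from the offer/answer, as before this PR
  hasRemoteVideoTrack: boolean;
}

function showsRemoteVideo(call: ActiveCallView): boolean {
  // An explicit camera-off report wins over the black-frame track.
  if (call.remoteCameraEnabled === false) {
    return false;
  }
  // Unset or true: fall back to the negotiated media; video still needs a track.
  return call.hasRemoteVideoTrack;
}
```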

Out of scope (follow-ups): Android background isolate (needs an update-call API in callkeep), callee-to-caller direction of the same signal.

This comment was marked as resolved.

SERDUN added 4 commits June 18, 2026 11:15
A video call downgraded to audio by the caller while still ringing left
the callee with a stale video icon: the video flag is derived once from
the initial SDP offer and soft mute changes no SDP, so nothing reaches
the callee before answer.

Introduce a lightweight media_state signaling request and event (no SDP,
no renegotiation). The caller sends the camera state after a soft-mute
toggle and re-sends it on ringing or progress when the toggle happened
before the first provisional response, because the upstream rejects a
request sent that early. The callee applies it only to an unanswered
incoming call - where
the video flag still represents the remote offer rather than the local
camera intent - updating the in-app icon and the native call UI via
reportUpdateCall. Answered and outgoing calls ignore the event, and
older clients are unaffected: unknown events and requests are dropped.

The downgrade path (soft mute on an existing video track) already sent
the media state, but enabling the camera on an audio-started call goes
through the upgrade path (track added plus renegotiation) which sent
nothing - the remote side learned about the video only after the
renegotiation completed, post-answer for a ringing call. Send the same
signal there, and re-send the actual track state on ringing or progress
in both directions instead of only the camera-off case.

A call downgraded to audio before answer still negotiates the video
m-line (soft mute keeps the track alive with black frames), so right
after accepting the UI flipped back into a video call with a black
remote tile - contradicting the audio icon shown while ringing.

Track the reported remote camera state on the call (media_state event,
now applied in-call as well) and let it override the video presentation:
an explicit camera-off report wins over the black-frame track, so the
call looks and behaves as an audio call end to end. When the remote
camera comes back the frames are already flowing and the UI switches
to video instantly, no renegotiation involved.

The camera at answer followed the raw SDP offer, which still advertises
m=video after a soft-mute downgrade: the callee answered a call shown as
audio with the camera silently on, streaming video to the caller. Drive
the answer camera from the presented state instead - the video flag
updated by media_state, with the explicit remote camera-off report also
guarding against incoming-event replays that reset the flag from the
jsep.
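A minimal sketch of the answer-time decision described in this commit, in TypeScript for illustration (the real code is Dart). `IncomingCall`, `onIncoming`, `onMediaState`, and `answerWithCamera` are hypothetical names. It assumes a replayed incoming event re-derives the video flag from the offer, while the last explicit remote report is kept.

```typescript
interface IncomingCall {
  video: boolean; // presented video flag shown while ringing
  remoteCameraEnabled?: boolean; // last explicit media_state report, if any
}

// Incoming event, possibly a replay: the flag is derived from the offer again.
function onIncoming(prev: IncomingCall | undefined, offerHasVideo: boolean): IncomingCall {
  const next: IncomingCall = { video: offerHasVideo };
  if (prev !== undefined && prev.remoteCameraEnabled !== undefined) {
    next.remoteCameraEnabled = prev.remoteCameraEnabled; // the report survives the replay
  }
  return next;
}

// media_state report from the caller: update both the flag and the explicit report.
function onMediaState(call: IncomingCall, video: boolean): IncomingCall {
  return { ...call, video, remoteCameraEnabled: video };
}

// Decide whether to answer with the local camera on.
function answerWithCamera(call: IncomingCall): boolean {
  if (call.remoteCameraEnabled === false) {
    return false; // guards against a replay that reset `video` from the offer
  }
  return call.video;
}
```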
SERDUN force-pushed the feat/WT-713-media-state branch 2 times, most recently from 4a1618d to 0121443 June 18, 2026 09:51
…app state (WT-713)

Replace the media_state-specific request/event with a generic peer_message
envelope carrying an opaque {type, data} pair. media_state (camera state,
data {video}) becomes the first carried type; the client routes inbound
peer_message by type and ignores unknown ones, so future in-call signals reuse
the same channel without protocol changes. The name uses "message" rather than
"signal" to avoid confusion with the SIP/WS signaling plane - this is an
app-to-app message.
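Routing by type with unknown types ignored might be sketched like this (illustrative TypeScript, not the project's Dart code). `PeerMessageRouter`, `on`, and `dispatch` are hypothetical names; only the `{type, data}` envelope and the `media_state` type come from the commit message.

```typescript
// Envelope as described: an opaque {type, data} pair.
interface PeerMessage {
  type: string;
  data: unknown;
}

type PeerMessageHandler = (callId: string, data: unknown) => void;

class PeerMessageRouter {
  private readonly handlers = new Map<string, PeerMessageHandler>();

  on(type: string, handler: PeerMessageHandler): void {
    this.handlers.set(type, handler);
  }

  // Returns false when no handler is registered, so unknown types are ignored.
  dispatch(callId: string, message: PeerMessage): boolean {
    const handler = this.handlers.get(message.type);
    if (handler === undefined) {
      return false;
    }
    handler(callId, message.data);
    return true;
  }
}
```

Adding a future in-call signal then means registering another handler; the envelope itself stays unchanged.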

This comment was marked as resolved.

SERDUN added 4 commits June 19, 2026 12:47
UnknownPeerMessageEvent.fromJson cast type/data straight from JSON, so a
non-string type or non-map data threw a TypeError - exactly the payloads the
Unknown fallback exists to absorb. Guard the casts so that a mismatched
value becomes null instead of throwing. This also covers a known type
(media_state) that carries non-map data.

WT-713
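The guarded parsing described in this commit can be illustrated with the TypeScript sketch below; it mirrors the idea, not the actual Dart `fromJson`. `ParsedPeerMessage`, `asMap`, and `parsePeerMessage` are hypothetical names.

```typescript
type ParsedPeerMessage =
  | { kind: "media_state"; video: boolean }
  | { kind: "unknown"; type: string | null; data: Record<string, unknown> | null };

// Returns the value as a plain map, or null when it is anything else.
function asMap(value: unknown): Record<string, unknown> | null {
  return typeof value === "object" && value !== null && !Array.isArray(value)
    ? (value as Record<string, unknown>)
    : null;
}

function parsePeerMessage(json: unknown): ParsedPeerMessage {
  const root = asMap(json);
  const rawType = root !== null ? root["type"] : undefined;
  const type = typeof rawType === "string" ? rawType : null; // non-string type -> null
  const data = asMap(root !== null ? root["data"] : undefined); // non-map data -> null

  if (type === "media_state" && data !== null) {
    const video = data["video"];
    if (typeof video === "boolean") {
      return { kind: "media_state", video };
    }
  }
  // Malformed or unrecognised payloads land here instead of throwing.
  return { kind: "unknown", type, data };
}
```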
The signaling layer already decodes media_state into a typed
MediaStatePeerMessageEvent with a bool video, but the internal bloc event
re-boxed it into Map<String, dynamic>, and the handler re-extracted it and
re-checked is! bool. Carry the typed bool through instead, dropping the
redundant re-box and runtime check.

WT-713

_CallSignalingEventMediaState did not convey whose state it carries. The
event delivers a peer's reported media (camera) state for the call and
drives remoteCameraEnabled, so name it peerMediaState - matching the
signaling-layer MediaStatePeerMessageEvent and the peer_message channel.

WT-713

The remoteVideo fallback reads the local-intent video flag, which is easy
to misread as a remote signal next to remoteCameraEnabled. Spell out that
video is the local camera intent used only as a provisional proxy, and that
remoteCameraEnabled is the authoritative remote signal.

WT-713
SERDUN requested a review from digiboridev June 22, 2026 06:56