Feature/serve web view #1726

Merged
doronz88 merged 13 commits into master from feature/serve-web-view
Jun 15, 2026
Conversation

@doronz88

Owner

No description provided.

doronz88 added 13 commits June 15, 2026 18:17
A pure-CSS rounded-rect bezel wraps the video canvas, with a `Frame`
toggle in the util-tray (preference persisted in localStorage). No
device-model awareness -- the frame just hugs whatever aspect ratio
the canvas ends up at. Input mapping is unchanged.
…canvas

Pointer capture is supposed to keep pointerup routed back to the
canvas even outside its bounds, but the browser silently drops capture
on window blur / Cmd-Tab / right-click menu / OS interruptions. The up
event then dispatches to whatever element the cursor sits on, our
canvas-only listener never fires, and the contact stays "held"
device-side.

Mirror pointermove/up/cancel listeners onto `window`, add a
`lostpointercapture` backstop, and treat window blur /
visibilitychange as implicit releases. Cache the last known coords so
the synthetic release has a sensible position even when triggered
without an event.
…stay crisp

CSS max-width/max-height forced the browser to bilinear-downscale the
~1264x2752 source to fit the viewport, which softened text and UI.
Drop the CSS caps and let JS (`fitCanvasToViewport`) size the canvas
in CSS pixels at `backing-store / devicePixelRatio`. On a Retina
display each backing pixel then lands on exactly one device pixel, which
gives the crisp 1:1 look while still fitting inside the viewport.

If even the DPR-divided size doesn't fit we shrink proportionally;
`image-rendering: high-quality` keeps that fallback path from going
visibly blocky in Chrome.
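
The sizing rule above is plain arithmetic, so it can be sketched outside the browser. The following is a minimal Python model of what `fitCanvasToViewport` is described as doing, not the viewer's actual JS; the function name and parameters here are hypothetical.

```python
def fit_canvas_css_size(backing_w, backing_h, dpr, viewport_w, viewport_h):
    """Return the (width, height) in CSS pixels for the canvas.

    Hypothetical model of the rule described above: start from
    backing-store / devicePixelRatio (1:1 backing-to-device pixels) and
    shrink proportionally only if that still overflows the viewport.
    """
    css_w = backing_w / dpr
    css_h = backing_h / dpr
    # Never upscale; pick the tightest constraint when shrinking.
    scale = min(1.0, viewport_w / css_w, viewport_h / css_h)
    return css_w * scale, css_h * scale
```

With a DPR of 2 the 1:1 size is half the backing store in each dimension; the proportional fallback only kicks in when the viewport is smaller than that.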
`screen_stream.py` carried the entire viewer page as a ~760-line raw
string literal -- comfortable to write inline but impossible to edit
with proper syntax highlighting or linting. Move the markup, styling,
and decoder/input/audio JS into separate files under
`pymobiledevice3/resources/serve_web/`, load them via
`importlib.resources` (matching the pattern used by
`automation_session.py`), and add `/viewer.css` / `/viewer.js` routes
to the existing HTTP server. The `__AUDIO_DEFAULT_ON__` per-request
substitution now happens on the JS bytes (where the placeholder
lives) instead of the HTML. Behaviour is unchanged.
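
A minimal sketch of the load-then-substitute flow described above, assuming the placeholder is replaced with a JS boolean literal (the replacement values are an assumption; the commit only names the placeholder). The helper names `load_asset` and `render_viewer_js` are hypothetical.

```python
from importlib import resources

PLACEHOLDER = b"__AUDIO_DEFAULT_ON__"


def load_asset(package: str, name: str) -> bytes:
    """Read a packaged resource file as bytes via importlib.resources."""
    return resources.files(package).joinpath(name).read_bytes()


def render_viewer_js(js_bytes: bytes, audio_default_on: bool) -> bytes:
    """Substitute the per-request placeholder in the JS bytes.

    Assumption: the placeholder stands where a JS boolean literal goes.
    """
    literal = b"true" if audio_default_on else b"false"
    return js_bytes.replace(PLACEHOLDER, literal)
```

A `/viewer.js` handler would call `load_asset(...)` once at startup and `render_viewer_js(...)` per request, leaving the HTML and CSS bytes untouched.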
…ilder

The video + audio mediaBlob templates were 5-byte-replaceable hex
literals captured verbatim from Xcode -- impossible to tweak any field
other than `session_id` without re-capturing.

Reverse-engineer the protobuf schema (cross-referenced against
iShareScreen's AVConference-port) and replace the templates with named
constructors. Defaults reproduce the captured templates byte-for-byte
(asserted by the new `_self_check`, runnable via `python -m
pymobiledevice3.remote.core_device.media_stream_offer`); kwargs expose
the negotiation knobs we'd previously been unable to touch -- notably
the protobuf-level `ltrpEnabled` and `allowRTCPFB` flags. The outer
options-dict equivalents were confirmed ignored by the daemon; these
inner-blob flags haven't been probed yet.
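
To illustrate what a "named constructor" over a protobuf blob looks like, here is a minimal hand-rolled encoder. It is a sketch, not the PR's `media_stream_offer` module: the field numbers (f1=SSRC, f2=allowRTCPFB, f7=ltrpEnabled) come from the later commits in this PR, while the function names and the assumption that all three fields are varint-encoded are hypothetical.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("only non-negative values are supported here")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_varint_field(field_number: int, value: int) -> bytes:
    """Encode one field with wire type 0 (varint): tag, then value."""
    tag = (field_number << 3) | 0
    return encode_varint(tag) + encode_varint(value)


def build_video_settings(*, ssrc: int, allow_rtcp_fb: bool = False,
                         ltrp_enabled: bool = True) -> bytes:
    """Hypothetical three-field VideoSettings builder (varint wire type assumed)."""
    return (
        encode_varint_field(1, ssrc)
        + encode_varint_field(2, int(allow_rtcp_fb))
        + encode_varint_field(7, int(ltrp_enabled))
    )
```

Keeping the defaults equal to the captured values is what makes a byte-for-byte self-check against the old hex templates possible.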
The protobuf-level switches reverse-engineered in the previous commit
were reachable from Python but not from the CLI. Plumb them through
`DisplayService.start_video_stream` (new kwargs), the
`ScreenStreamServer` / `VncStreamServer` constructors, and finally
two `--no-ltrp` / `--rtcp-fb` flags on each of `serve-web` and
`serve-vnc`.

Defaults preserve the existing captured-Xcode behaviour. These are
explicit experiment flags -- `--no-ltrp` should kill the mid-stream
LTRP-driven tearing on iPhone (memory: `project_hevc_motion_tears`),
and `--rtcp-fb` invites the device to close the currently open-loop
rate control (memory: `project_avcrc_ignores_rr`).
On-device probing (iPhone18,4 iOS 27.0) confirmed the protobuf-level
`ltrpEnabled` switch is HONOURED by the device -- setting field 7 to 0
in VideoSettings yields `IsltrpEnabled: false` in the response's
streamConfig. The earlier "LTRP is forced on" memory note was about
the outer options dict; the inner mediaBlob field is a different code
path that actually flows into the encoder.

LTRP-off should eliminate the mid-stream tearing pattern under UDP
loss (no FEC/RTX, occasional loss): with LTRP on, a partially-lost
long-term reference frame stays corrupted for the rest of the GOP;
with LTRP off, references slide forward and corruption self-clears.

Flip the default everywhere (mediaBlob builder, DisplayService,
ScreenStreamServer, VncStreamServer). The CLI now exposes `--ltrp` to
opt back into the captured-Xcode behaviour for regression testing,
replacing the previous `--no-ltrp` (which is moot when LTRP is
already off by default). `--rtcp-fb` stays opt-in.

Also probed (negative results, logged so we don't re-walk them):
- Setting `f4=width` / `f5=height` inside VideoSettings of the offer
  is silently ignored -- the device returns panel-native 1264x2736
  regardless. Those fields are device->client only.
- `allow_rtcp_fb=True` produces no observable change in streamConfig.
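
The flag semantics after this commit (LTRP off unless `--ltrp`, RTCP feedback off unless `--rtcp-fb`) can be sketched with `argparse`. This is an illustration only: the project's real CLI framework and wiring are not shown in this PR text, and the `ltrp_enabled` kwarg name is an assumption (`allow_rtcp_fb` appears in the probe notes above).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the two opt-in experiment flags."""
    parser = argparse.ArgumentParser(prog="serve-web")
    parser.add_argument(
        "--ltrp", action="store_true",
        help="opt back into the captured-Xcode LTRP behaviour",
    )
    parser.add_argument(
        "--rtcp-fb", action="store_true",
        help="ask the device to allow RTCP feedback",
    )
    return parser


def stream_kwargs(argv: list) -> dict:
    """Translate CLI flags into keyword arguments for the stream start call."""
    args = build_parser().parse_args(argv)
    return {"ltrp_enabled": args.ltrp, "allow_rtcp_fb": args.rtcp_fb}
```

Using `store_true` for both flags means an invocation with no arguments yields the new defaults, and each flag is an explicit opt-in.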
…args

Recovered the full field-number → property-name mapping for
VCMediaNegotiationBlobVideoSettings by dumping the __objc_methname
table at 0x83c381..0x83c463. The 14 properties live in adjacent slots,
and offset order matches proto field-tag order at every
previously verified anchor: f1=SSRC, f2=allowRTCPFB, f3=banks,
f4=customVideoWidth, f5=customVideoHeight, f7=ltrpEnabled, and
f12=blackFrameOnClearScreenEnabled.

The new map gives us f6=tilesPerFrame, f8=pixelFormats,
f9=hdrModesSupported, f10=fecEnabled, f11=rtxEnabled,
f13=foveationSupported, f14=enableInterleavedEncoding -- expose
fec_enabled and tiles_per_frame as kwargs on build_media_blob_video()
and DisplayService.start_video_stream().
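
For reference, the recovered mapping can be written down as an enum. The field numbers and property names below are exactly those listed in this commit message; the `VideoSettingsField` class and `name_fields` helper are hypothetical names for illustration, not identifiers from the PR.

```python
from enum import IntEnum


class VideoSettingsField(IntEnum):
    """Field tags of VCMediaNegotiationBlobVideoSettings as recovered above."""
    SSRC = 1
    ALLOW_RTCP_FB = 2
    BANKS = 3
    CUSTOM_VIDEO_WIDTH = 4
    CUSTOM_VIDEO_HEIGHT = 5
    TILES_PER_FRAME = 6
    LTRP_ENABLED = 7
    PIXEL_FORMATS = 8
    HDR_MODES_SUPPORTED = 9
    FEC_ENABLED = 10
    RTX_ENABLED = 11
    BLACK_FRAME_ON_CLEAR_SCREEN_ENABLED = 12
    FOVEATION_SUPPORTED = 13
    ENABLE_INTERLEAVED_ENCODING = 14


def name_fields(raw: dict) -> dict:
    """Replace numeric field tags in a decoded message with readable names."""
    return {VideoSettingsField(tag).name: value for tag, value in raw.items()}
```

A helper like this is handy when diffing an offer against the device's answer, since unknown tags raise `ValueError` instead of being silently skipped.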

On-device probe results (iPhone18,4 iOS 27.0):

- fec_enabled=True: ACCEPTED by the device (not Invalid Parameter)
  but no echo in the answer's VideoSettings and no fecEnabled key in
  streamConfig. Either silently honoured or silently dropped.
- rtx_enabled=True: REJECTED with Invalid Parameter. The
  screen-blob path likely requires separate localRTXSSRC/remoteRTXSSRC
  fields (referenced in AVConference VCVideoStream logs).
- tiles_per_frame=4: ACCEPTED but ignored. The device negotiates AVC
  over our HEVC bank (probably because our HEVC feature string is too
  sparse), and AVC doesn't support tile-level parallelism.

Defaults all left off -- these are research knobs, not perf wins.
The user-visible motion tears at heavy interaction are most likely
the bitrate-cap-driven QP escalation already captured in
project_displayservice_bitrate_cap, not something a negotiation knob
can fix.
The device accepts `f10=fecEnabled` in the offer without rejecting it
(unlike `f11=rtxEnabled` which returns Invalid Parameter), but the
answer doesn't echo it back and `streamConfig` has no `fecEnabled`
key, so we can't observe whether the encoder is actually emitting FEC
NALs. The risk of flipping the default is therefore low (the device
clearly tolerates it) and the potential upside is non-zero (if it
silently honours the request, lost UDP packets get repaired and motion
tears improve). Worst case it's a no-op.

`--no-fec` is not exposed on the CLI yet -- users wanting to test
without FEC can pass `fec_enabled=False` via the Python API.
Both viewers are tools the user wants reachable from other devices on
their LAN -- defaulting to 127.0.0.1 forced an explicit `--bind
0.0.0.0` every time. Flip the default. Help text now spells out the
security implication: the HTTP `/touch`, `/button`, and `/key` endpoints have
no auth, the VNC server has no password, so anyone reaching the port
can both watch and control the iPhone. Pass `--bind 127.0.0.1` to
restore the old behaviour.
WebCodecs is gated on a secure context: browsers refuse to expose the
API on plain http:// from any non-loopback origin. Accessing serve-web
from another LAN host therefore fails with `isConfigSupported threw`
before a single frame can decode.

Generate an ed25519 self-signed cert on startup (SANs cover localhost,
::1, 127.0.0.1, and the bind address when concrete), load it into an
`ssl.SSLContext` via a tempfile that's unlinked as soon as the context
has loaded the PEM, and pass `ssl=` to `asyncio.start_server`. The
browser warns on first visit, the user accepts, and WebCodecs unlocks.

CLI: `--https`. Default off so the localhost-only case stays plain
http (no cert warning prompt).
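
The "tempfile that's unlinked as soon as the context has loaded the PEM" step can be sketched as follows. Certificate generation is left out; the sketch assumes `pem` already holds the private key and certificate concatenated in PEM form, and both function names are hypothetical.

```python
import os
import ssl
import tempfile
from typing import Callable


def call_with_temp_pem(pem: bytes, loader: Callable[[str], None]) -> None:
    """Write PEM bytes to a temp file, hand the path to `loader`, then unlink.

    The file is removed even if the loader raises.
    """
    fd, path = tempfile.mkstemp(suffix=".pem")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(pem)
        loader(path)
    finally:
        os.unlink(path)


def make_server_context(pem: bytes) -> ssl.SSLContext:
    """Build a server-side TLS context from a combined key+cert PEM (assumed)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # load_cert_chain reads the file eagerly, so the path can go away afterwards.
    call_with_temp_pem(pem, ctx.load_cert_chain)
    return ctx
```

The resulting context is what would be passed as `ssl=` to `asyncio.start_server`; because `ssl.SSLContext.load_cert_chain` only accepts file paths, some on-disk hop of this kind is needed.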
…0.0.0

Previously the self-signed cert's SAN list was just the three loopback
entries when `--bind 0.0.0.0`, so connecting to https://<lan-ip>:port/
from another machine produced a CN/SAN mismatch the browser would
either reject outright (Safari) or render as a clickable warning
(Chrome). Some browsers won't allow click-through on self-signed mismatches at
all when the target looks like a public IP.

When bind is the wildcard, enumerate every local interface IP via
`getaddrinfo(gethostname(), ...)` plus a UDP `connect()` probe for the
default route's local address, and add each as a SAN. The browser now
gets a cert that actually claims to be the host the user typed.
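
A stdlib-only sketch of the enumeration described above. The probe target (`192.0.2.1`, a documentation-only address) and the function names are assumptions; a UDP `connect()` sends no packets, it only asks the kernel which local address the default route would use.

```python
import socket


def local_interface_ips() -> list:
    """Collect local IPs via getaddrinfo(gethostname()) plus a UDP route probe."""
    found = set()
    try:
        for info in socket.getaddrinfo(socket.gethostname(), None):
            # Drop any IPv6 zone suffix such as "%en0".
            found.add(info[4][0].split("%", 1)[0])
    except OSError:
        pass  # hostname may not resolve; the probe below can still succeed
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        probe.connect(("192.0.2.1", 9))  # assumed target; nothing is transmitted
        found.add(probe.getsockname()[0])
    except OSError:
        pass  # no default route (e.g. offline machine)
    finally:
        probe.close()
    return sorted(found)


def san_hosts(bind: str) -> list:
    """Loopback names plus either the concrete bind address or all local IPs."""
    hosts = ["localhost", "::1", "127.0.0.1"]
    extra = local_interface_ips() if bind in ("0.0.0.0", "::") else [bind]
    return hosts + [h for h in extra if h not in hosts]
```

Both lookups are best-effort: on a machine with no resolvable hostname and no default route the list simply falls back to the loopback entries.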
Chrome and Safari handle ed25519 server certs inconsistently in 2026
-- some builds reject them with "connection unexpectedly closed"
during the TLS handshake, with no actionable browser warning. Switch
to RSA-2048, which is universally accepted at the cost of slightly
slower key generation (still <200 ms on a modern Mac, once per server
start).
@doronz88 doronz88 merged commit b3d62b6 into master Jun 15, 2026
17 checks passed
@doronz88 doronz88 deleted the feature/serve-web-view branch June 15, 2026 18:06