iphone: coordinate-space handling for tap/swipe + screenshot metadata (#1317)
## Why
`iphone.tap`/`swipe` take **points**; `iphone.screenshot` returns
**pixels** (points × display scale, 3× on this device). A coordinate
read off a screenshot and passed to `tap` as-is lands `scale`× too far
from the origin, often off-screen entirely. Hit this live while driving
a real iPhone: taps read off screenshots silently missed.
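A small worked example of the mismatch. It assumes the 3× scale mentioned above and the 402×874 pt screen from the Verification section; the target coordinate and the `on_screen` helper are made up for illustration.

```python
# Illustrative only: the target coordinate is hypothetical.
SCALE = 3                  # display scale on the test device
POINT_SIZE = (402, 874)    # screen size in points

# Centre of some control, read off a screenshot (pixel coordinates).
px_x, px_y = 600, 1500

# Correct conversion: pixels -> points is a division by the scale.
correct = (px_x / SCALE, px_y / SCALE)

# The bug: pixel values are passed through unchanged, so they are
# interpreted as points.
mistaken = (px_x, px_y)


def on_screen(pt, size=POINT_SIZE):
    """True if a point lies within the screen bounds (in points)."""
    return 0 <= pt[0] <= size[0] and 0 <= pt[1] <= size[1]


print(correct, on_screen(correct))    # (200.0, 500.0) True
print(mistaken, on_screen(mistaken))  # (600, 1500) False
```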
## What
- `tap(x, y, *, space="points")` and `swipe(..., space="points")` accept
`"points"` (default, back-compat), `"pixels"` (screenshot coords), or
`"fraction"` (0..1, robust to image rescaling). Conversion factored into
a pure, unit-tested `_to_points`.
- `screen()` and `window_size()` helpers expose geometry (points,
pixels, scale).
- `screenshot()` stamps `img.info` with `scale` / `point_size` /
`pixel_size` so an image is self-describing.
- UI actions (`tap`/`swipe`/`press`/`type_text`/`home`/`tap_element`)
raise a one-line `IphoneError` naming `wda_start` when WDA is down,
instead of a deep urllib stack.
- Docstrings (the `api()`/`help()` source of truth) steer toward
`tap_element(label=...)` and `space="fraction"` over hand-converted
pixels.
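A minimal sketch of what the three-way conversion might look like. The function name, signature, and error messages here are assumptions for illustration, not the actual `_to_points`; `IphoneError` is a local stand-in for the module's error type.

```python
class IphoneError(Exception):
    """Stand-in for the module's error type."""


def to_points(x, y, space, point_size, scale):
    """Convert (x, y) from `space` into points.

    Hypothetical signature; the real `_to_points` may differ.
    """
    if space == "points":
        return float(x), float(y)
    if space == "pixels":
        # Screenshot pixels are points multiplied by the display scale.
        return x / scale, y / scale
    if space == "fraction":
        if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
            raise IphoneError(f"fraction out of range: ({x}, {y})")
        width, height = point_size
        return x * width, y * height
    raise IphoneError(f"unknown space: {space!r}")


size, scale = (402, 874), 3
print(to_points(603, 1311, "pixels", size, scale))    # (201.0, 437.0)
print(to_points(0.5, 0.5, "fraction", size, scale))   # (201.0, 437.0)
```

Keeping the conversion pure (no device I/O, geometry passed in) is what makes it straightforward to unit-test.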
## Verification
Local: `nix build .#mcp.tests.iphoneTests` → 18 passed;
`.#mcp.tests.iphoneBundled` → ok; `nix run .#lint` clean.
Live experiment (real iPhone, scale 3, 402×874 pt), 8 on-screen targets:
| arm | hit rate |
|---|---|
| baseline (pixels-as-points) | **0/8** |
| `space="pixels"` | **8/8** |
| `space="fraction"` | **8/8** |
End-to-end: tapping a dock icon's pixel center as points did **not**
launch the app (reproduces the bug); `space="pixels"` launched it.
`tap_element(label=...)` and `screenshot().info` stamping confirmed
live.
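To illustrate why `space="fraction"` is robust to image rescaling, the sketch below uses a plain dict standing in for `img.info`. The key names come from this PR; the value shapes (a number for `scale`, `(width, height)` tuples for the sizes) and both helper functions are assumptions.

```python
# Plain dict standing in for `img.info`; value shapes are assumed.
info = {
    "scale": 3,
    "point_size": (402, 874),
    "pixel_size": (1206, 2622),
}

# Sanity check: pixel size should equal point size times scale.
assert info["pixel_size"] == tuple(v * info["scale"] for v in info["point_size"])


def pixel_to_point(px, info):
    """Pixels -> points using the stamped scale (valid for the original image only)."""
    return px[0] / info["scale"], px[1] / info["scale"]


def pixel_to_fraction(px, image_size):
    """Pixels -> 0..1 fraction of whatever image the pixel was read from."""
    return px[0] / image_size[0], px[1] / image_size[1]


full = (603, 1311)       # target in the original 1206x2622 screenshot
half = (301.5, 655.5)    # same target after the image was halved to 603x1311

print(pixel_to_fraction(full, info["pixel_size"]))  # (0.5, 0.5)
print(pixel_to_fraction(half, (603, 1311)))         # (0.5, 0.5)
print(pixel_to_point(full, info))                   # (201.0, 437.0)
print(pixel_to_point(half, info))                   # (100.5, 218.5), wrong after resizing
```

The fraction is the same for both images, while the stamped `scale` only applies to the screenshot at its original size.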
<!-- Macroscope's pull request summary starts here -->
<!-- Macroscope will only edit the content between these invisible
markers, and the markers themselves will not be visible in the GitHub
rendered markdown. -->
<!-- If you delete either of the start / end markers from your PR's
description, Macroscope will append its summary at the bottom of the
description. -->
> [!NOTE]
> ### Add coordinate-space handling for `tap`/`swipe` and screenshot metadata to iPhone module
> - `tap` and `swipe` now accept float coordinates and a `space`
parameter (`'points'`, `'pixels'`, or `'fraction'`), resolving
coordinates via new helpers `_to_points` and `_resolve_points` in
[__init__.py](https://github.com/indexable-inc/index/pull/1317/files#diff-f92c2ac3b6f1cee30bc3c534269f16357522a2d23c4c9e1e984a8f32030227b7).
> - New `screen` and `window_size` async helpers expose screen geometry
(points, pixels, scale) by querying WDA.
> - `screenshot` now populates `Image.info` with `scale`, `pixel_size`,
and `point_size` when captured via WDA.
> - All UI actions (`tap`, `swipe`, `press`, `type_text`, `home`,
`tap_element`) now call `_require_wda()` and raise `IphoneError`
immediately if WDA is unreachable.
> - Cached display scale (`_wda_scale`) is reset on `wda_start` and
`wda_stop` to prevent stale values across sessions or devices.
> - Behavioral Change: `tap` and `swipe` signatures now accept `float`
instead of `int`; passing an unknown space or out-of-range fraction
raises `IphoneError`.
>
> <!-- Macroscope's review summary starts here -->
>
> <sup><a href="https://app.macroscope.com">Macroscope</a> summarized
1720daa.</sup>
> <!-- Macroscope's review summary ends here -->
>
<!-- macroscope-ui-refresh -->
<!-- Macroscope's pull request summary ends here -->
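The fail-fast behavior described for the UI actions (a one-line `IphoneError` naming `wda_start` instead of a deep urllib stack) could look roughly like the sketch below. The `probe` parameter, the message wording, and the placeholder `tap` body are assumptions; the real `_require_wda()` may be structured differently.

```python
class IphoneError(Exception):
    """Stand-in for the module's error type."""


def require_wda(probe):
    """Raise a short IphoneError if WDA cannot be reached.

    `probe` is a hypothetical zero-argument callable that performs the
    status request and raises OSError on failure
    (urllib.error.URLError is a subclass of OSError).
    """
    try:
        probe()
    except OSError as exc:
        # `from None` suppresses the chained low-level traceback.
        raise IphoneError(
            f"WDA is not reachable ({exc}); run wda_start first"
        ) from None


def tap(x, y, probe):
    require_wda(probe)      # fail fast before touching the device
    return ("tap", x, y)    # placeholder for the real WDA call


def dead_probe():
    # Simulates WDA being down.
    raise ConnectionRefusedError("connection refused")


try:
    tap(10, 20, dead_probe)
except IphoneError as err:
    print(err)
```

Injecting the probe keeps the check testable without a device or network access.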