Version
v0.3.0
Model
UI-TARS-1.5-7B
Deployment Method
Local
Issue Description
Summary
On macOS Tahoe 26, the red click prediction marker shown by UI-TARS Desktop can appear offset from where the cursor actually moves/clicks.
The marker positioning appears to be handled separately from the underlying action execution. In practice, this makes it look like the model or operator clicked the wrong place, even when the cursor itself may have moved to a different location than the marker suggests.
Environment
- OS: macOS Tahoe 26.1
- Build: 25B78
- App: UI-TARS Desktop
- Operator: Local Computer Operator
What I Observed
When the VLM returns a click action, the app shows a red prediction marker on screen. On my machine, that marker is visibly offset from where the cursor actually moves.
This makes it difficult to tell whether a bad click came from:
- the VLM predicting the wrong coordinate
- the operator executing the coordinate incorrectly
- the visual prediction marker being rendered in the wrong coordinate space
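One way to narrow down these three cases is to log the marker position and the cursor position for the same action and check whether they differ by a uniform scale factor, which would point at a coordinate-space mismatch rather than a bad prediction. The following is a minimal sketch under that assumption; `Point` and `uniformScaleRatio` are hypothetical names for illustration and are not part of the UI-TARS codebase.

```typescript
// Hypothetical helper for illustration; not taken from the UI-TARS codebase.
interface Point {
  x: number;
  y: number;
}

// Returns the marker/cursor ratio when both axes share roughly the same ratio,
// otherwise null. A ratio near 2 (or 0.5) on a Retina-style display would hint
// that one path uses physical pixels and the other uses logical points.
function uniformScaleRatio(
  marker: Point,
  cursor: Point,
  tolerance = 0.05,
): number | null {
  // A ratio is undefined when the cursor sits on either screen axis.
  if (cursor.x === 0 || cursor.y === 0) {
    return null;
  }
  const ratioX = marker.x / cursor.x;
  const ratioY = marker.y / cursor.y;
  return Math.abs(ratioX - ratioY) <= tolerance ? (ratioX + ratioY) / 2 : null;
}

// Marker drawn at twice the cursor coordinates on both axes: uniform ratio.
const scaled = uniformScaleRatio({ x: 1200, y: 800 }, { x: 600, y: 400 });

// Marker shifted on one axis only: no uniform ratio, so not a pure scaling issue.
const shifted = uniformScaleRatio({ x: 650, y: 400 }, { x: 600, y: 400 });
```

A `null` result does not rule out a rendering bug (a constant offset, for example from a menu bar or window inset, would also produce `null`), so this only helps confirm or exclude the pure scaling case.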
Expected Behavior
The red prediction marker should be centered on the same screen location where the operator moves/clicks the cursor.
Actual Behavior
The red prediction marker appears offset from the actual cursor position.
Suspected Area
This may be related to device pixel ratio (DPR) handling or coordinate-space conversion for the prediction overlay.
The marker path appears to go through:
- apps/ui-tars/src/main/shared/setOfMarks.ts
- apps/ui-tars/src/main/window/ScreenMarker.ts
The execution path appears to go through:
- packages/ui-tars/action-parser/src/actionParser.ts
- packages/ui-tars/sdk/src/utils.ts
- packages/ui-tars/operators/nut-js/src/index.ts
- apps/ui-tars/src/main/agent/operator.ts
The issue may be that the marker overlay and the actual cursor execution are not using the same coordinate space on macOS Tahoe 26 / Retina-style displays.
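To illustrate the kind of mismatch suspected here, the sketch below converts a point from screenshot (physical) pixels to logical screen points using a display scale factor. It is an assumption-based illustration only: `Point`, `DisplayInfo`, and `physicalToLogical` are hypothetical names, and I have not confirmed which coordinate space either path in the app actually uses.

```typescript
// Hypothetical types and helper for illustration; not taken from the UI-TARS codebase.
interface Point {
  x: number;
  y: number;
}

interface DisplayInfo {
  // Ratio of physical pixels to logical points (2 on a typical Retina display).
  scaleFactor: number;
  // Top-left corner of the display in logical points (non-zero on secondary monitors).
  originX: number;
  originY: number;
}

// Convert a point in screenshot (physical) pixels to logical screen points.
function physicalToLogical(p: Point, display: DisplayInfo): Point {
  if (display.scaleFactor <= 0) {
    throw new Error("scaleFactor must be positive");
  }
  return {
    x: display.originX + p.x / display.scaleFactor,
    y: display.originY + p.y / display.scaleFactor,
  };
}

// Assumed example values: a 2x display at the origin and a predicted click
// expressed in screenshot pixels.
const retina: DisplayInfo = { scaleFactor: 2, originX: 0, originY: 0 };
const predicted: Point = { x: 1200, y: 800 };

// If the cursor path applies the conversion but the marker path does not,
// the marker would be drawn at (1200, 800) while the cursor lands at (600, 400).
const cursorPosition = physicalToLogical(predicted, retina);
const unconvertedMarkerPosition = predicted;
```

If both the overlay and the operator went through one shared conversion like this, the marker and cursor should agree regardless of the display's scale factor.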
Impact
This is more than a debugging issue. The VLM can use the visual click indicator in subsequent screenshots to understand where its previous action landed. If the red marker is offset from the actual clicked location or target button, the model may receive misleading feedback and adjust its next action based on an incorrect visual signal.
That can degrade the agent loop by making recovery and self-correction less reliable, especially for tasks that require precise GUI interaction.
Error Logs
No response