fix(control_utils): replace pynput with readchar for Wayland-compatible keyboard listener #3130
…le keyboard listener

pynput relies on X11 global event hooks which are blocked by Wayland's security model. The keyboard listener thread would start silently but never fire on_press callbacks, making arrow keys and Escape inoperable during lerobot-record/lerobot-teleoperate on all modern Ubuntu/Fedora/Arch systems that default to Wayland.

Replace pynput in init_keyboard_listener() with readchar, which reads directly from stdin in POSIX raw mode; no display server required. Also fix is_headless() to detect headless environments via DISPLAY / WAYLAND_DISPLAY env vars instead of using pynput import as a proxy (pynput imports fine on Wayland, so the old proxy gave a false negative).

- readchar works on X11, Wayland, and SSH sessions with a TTY
- daemon thread with threading.Event stop signal; listener.stop() shim preserves the existing call-site in lerobot_record.py unchanged
- Adds readchar>=4.0.0,<5.0.0 to core dependencies; pynput kept for teleop_keyboard.py and gamepad_utils.py
- Docs: remove stale "$DISPLAY workaround" note in il_robots.mdx and lekiwi.mdx; replace with accurate guidance

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pull request overview
This PR updates LeRobot’s runtime keyboard shortcut handling to work reliably on Wayland (where pynput’s global hooks don’t fire), by switching lerobot-record’s keyboard listener implementation to a stdin-based approach.
Changes:
- Replace `pynput` keyboard listener usage in `init_keyboard_listener()` with a `readchar` + background thread implementation.
- Rework `is_headless()` detection to use `DISPLAY`/`WAYLAND_DISPLAY` on Linux rather than relying on `pynput` import behavior.
- Add `readchar` dependency and update docs to remove the outdated `$DISPLAY` workaround guidance.
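The environment-variable check described in the list above could look roughly like the sketch below. It is not the PR's actual code: `looks_headless` and its `environ`/`platform` parameters are hypothetical names introduced so the logic can be exercised without touching the real environment, and treating an empty variable as unset is an assumption.

```python
import os
import sys


def looks_headless(environ=None, platform=None):
    """Guess whether a Linux session has no display server (hypothetical helper).

    Assumption: a session counts as headless only on Linux, and only when
    neither DISPLAY (X11) nor WAYLAND_DISPLAY (Wayland) is set to a
    non-empty value.
    """
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform

    if not platform.startswith("linux"):
        # Other platforms are outside the scope of this sketch.
        return False

    has_display = bool(environ.get("DISPLAY") or environ.get("WAYLAND_DISPLAY"))
    return not has_display
```

Called with no arguments, the helper inspects `os.environ` and `sys.platform`; passing a plain dict makes the decision easy to test in isolation.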
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `src/lerobot/utils/control_utils.py` | Implements the new readchar-based keyboard listener and updates headless detection logic. |
| `pyproject.toml` | Adds readchar to core dependencies with bounded version constraints. |
| `docs/source/lekiwi.mdx` | Updates troubleshooting note to reflect Wayland-compatible keyboard shortcuts. |
| `docs/source/il_robots.mdx` | Updates troubleshooting note to reflect Wayland-compatible keyboard shortcuts. |
| _stop = threading.Event() | ||
|
|
||
| def on_press(key): | ||
| try: | ||
| if key == keyboard.Key.right: | ||
| def listen(): | ||
| while not _stop.is_set(): | ||
| try: | ||
| key = readchar.readkey() | ||
| except Exception: | ||
| break | ||
| if key == readchar.key.RIGHT: | ||
| print("Right arrow key pressed. Exiting loop...") | ||
| events["exit_early"] = True | ||
| elif key == keyboard.Key.left: | ||
| print("Left arrow key pressed. Exiting loop and rerecord the last episode...") | ||
| elif key == readchar.key.LEFT: | ||
| print("Left arrow key pressed. Re-recording episode...") | ||
| events["rerecord_episode"] = True | ||
| events["exit_early"] = True | ||
| elif key == keyboard.Key.esc: | ||
| elif key == readchar.key.ESC: | ||
| print("Escape key pressed. Stopping data recording...") | ||
| events["stop_recording"] = True | ||
| events["exit_early"] = True | ||
| except Exception as e: | ||
| print(f"Error handling key press: {e}") | ||
| break | ||
| if events["stop_recording"]: | ||
| break | ||
|
|
||
| listener = keyboard.Listener(on_press=on_press) | ||
| listener = threading.Thread(target=listen, daemon=True) | ||
| listener.start() | ||
| listener.stop = _stop.set # compatibility shim: lets callers do listener.stop() |
listener.stop() only sets _stop, but the listener thread may remain blocked inside readchar.readkey() and not observe _stop until another key is pressed (or stdin errors). This makes stopping non-deterministic and can leave the keyboard thread running longer than expected. Consider switching to a polling approach (e.g., select/timeout before calling readkey) or another mechanism that makes the read interruptible so stop() reliably terminates the thread quickly.
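One way to make the read interruptible, sketched under assumptions: wait on the file descriptor with `select.select` and a short timeout, and read only when data is ready, so the stop flag is rechecked at least once per timeout. `poll_keys`, `on_data`, and `poll_interval` are hypothetical names that do not appear in the PR. The sketch reads raw bytes with `os.read` instead of calling `readchar.readkey()`, is POSIX-only, and assumes the terminal is already in cbreak or raw mode so single key presses become readable without Enter.

```python
import os
import select
import threading


def poll_keys(fd, stop_event, on_data, poll_interval=0.1):
    """Forward bytes read from fd to on_data until stop_event is set.

    The loop never blocks longer than poll_interval seconds, so setting
    stop_event ends the thread promptly even if no key is pressed.
    """
    while not stop_event.is_set():
        # Wait for input, but give up after poll_interval seconds.
        ready, _, _ = select.select([fd], [], [], poll_interval)
        if not ready:
            continue  # timeout: go back and recheck the stop flag

        data = os.read(fd, 32)  # an arrow key arrives as a short escape sequence
        if not data:
            break  # EOF: the input was closed
        on_data(data)


def start_polling_listener(fd, on_data):
    """Start poll_keys in a daemon thread and return (thread, stop_event)."""
    stop_event = threading.Event()
    thread = threading.Thread(
        target=poll_keys, args=(fd, stop_event, on_data), daemon=True
    )
    thread.start()
    return thread, stop_event
```

With a real terminal, `fd` would be `sys.stdin.fileno()`, and `on_data` would compare the bytes against ANSI escape sequences such as `b"\x1b[C"` (right arrow) or `b"\x1b[D"` (left arrow). The trade-off is a small wake-up cost every `poll_interval` seconds in exchange for a bounded shutdown time.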
| def init_keyboard_listener(): | ||
| """ | ||
| Initializes a non-blocking keyboard listener for real-time user interaction. | ||
|
|
||
| This function sets up a listener for specific keys (right arrow, left arrow, escape) to control | ||
| the program flow during execution, such as stopping recording or exiting loops. It gracefully | ||
| handles headless environments where keyboard listening is not possible. | ||
| Reads directly from stdin using `readchar`, which works on both X11 and Wayland sessions | ||
| without any display-server dependency. Keyboard input is unavailable when stdin is not a | ||
| TTY (e.g. piped input or a truly headless server). | ||
|
|
||
| Returns: | ||
| A tuple containing: | ||
| - The `pynput.keyboard.Listener` instance, or `None` if in a headless environment. | ||
| - A dictionary of event flags (e.g., `exit_early`) that are set by key presses. | ||
| - A ``threading.Thread`` with a ``stop()`` method, or ``None`` if stdin is not a TTY. | ||
| - A dictionary of event flags (``exit_early``, ``rerecord_episode``, ``stop_recording``) | ||
| that are set by the corresponding key presses. | ||
| """ | ||
| # Allow to exit early while recording an episode or resetting the environment, | ||
| # by tapping the right arrow key '->'. This might require a sudo permission | ||
| # to allow your terminal to monitor keyboard events. | ||
| events = {} | ||
| events["exit_early"] = False | ||
| events["rerecord_episode"] = False | ||
| events["stop_recording"] = False | ||
|
|
||
| if is_headless(): | ||
| import readchar | ||
|
|
||
| events = { | ||
| "exit_early": False, | ||
| "rerecord_episode": False, | ||
| "stop_recording": False, | ||
| } | ||
|
|
||
| if not sys.stdin.isatty(): | ||
| logging.warning( | ||
| "Headless environment detected. On-screen cameras display and keyboard inputs will not be available." | ||
| "Stdin is not a TTY. Keyboard inputs will not be available. " | ||
| "You won't be able to change the control flow with keyboard shortcuts." | ||
| ) | ||
| listener = None | ||
| return listener, events | ||
| return None, events | ||
|
|
init_keyboard_listener() now enables keyboard shortcuts whenever sys.stdin.isatty() is true, independent of is_headless(). However, existing callers (e.g., lerobot_record.py) still gate listener.stop() behind not is_headless(), so in "no DISPLAY/WAYLAND_DISPLAY" SSH sessions the listener will be created but never stopped. To keep behavior consistent, either adjust the API/return value here to align with the old is_headless() semantics, or update call sites to stop the listener based solely on listener is not None.
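The second option, stopping the listener based solely on `listener is not None`, might be sketched as follows. `make_listener` is a stand-in for `init_keyboard_listener()` that idles instead of reading keys, and `shutdown_listener` is a hypothetical helper; neither is code from the PR or from `lerobot_record.py`.

```python
import threading


def make_listener(stdin_is_tty):
    """Stand-in for init_keyboard_listener(): returns a thread or None.

    The thread only waits on its stop event; it mimics the PR's
    `listener.stop = _stop.set` shim without reading any keys.
    """
    if not stdin_is_tty:
        return None
    stop = threading.Event()
    listener = threading.Thread(target=stop.wait, daemon=True)
    listener.stop = stop.set  # same compatibility shim as in the PR
    listener.start()
    return listener


def shutdown_listener(listener, timeout=1.0):
    """Stop the listener if one exists; return True if it terminated."""
    if listener is None:
        return False  # no TTY, so nothing was started
    listener.stop()
    listener.join(timeout)
    return not listener.is_alive()
```

Keying the cleanup on the returned object rather than on `is_headless()` means the call site stays correct however the listener decides whether keyboard input is available.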
| while not _stop.is_set(): | ||
| try: | ||
| key = readchar.readkey() | ||
| except Exception: |
The exception handling in the listener loop swallows all errors from readchar.readkey() and exits silently. This can make failures (e.g., stdin being closed or terminal/TTY errors) hard to diagnose. Consider logging the exception (at least at debug level) before breaking so users have some visibility into why keyboard input stopped working.
| except Exception: | |
| except Exception as exc: | |
| logging.debug("Keyboard listener stopped due to exception from readchar.readkey(): %s", exc, exc_info=True) |
Summary
Fixes keyboard shortcuts (arrow keys, Escape) being non-functional during `lerobot-record` and `lerobot-teleoperate` on Wayland sessions (Ubuntu 21.04+, Fedora, Arch, and any distro that defaults to Wayland).

Root cause: `pynput` uses X11 global event hooks, which Wayland's compositor security model intentionally blocks. The listener thread starts silently but `on_press` never fires; raw ANSI codes are echoed to the terminal instead.

Fix: Replace `pynput` in `init_keyboard_listener()` with `readchar`, which reads directly from stdin in POSIX raw mode with no display-server dependency. Also works correctly when the Rerun viewer steals terminal focus (a secondary pynput failure mode).

Changes
- `src/lerobot/utils/control_utils.py`
  - `is_headless()`: detect headless environments via `DISPLAY`/`WAYLAND_DISPLAY` env vars instead of using a pynput import as a proxy (pynput imports successfully on Wayland, making the old proxy give a false negative)
  - `init_keyboard_listener()`: replaced `pynput.keyboard.Listener` with a `readchar` + `threading.Thread` implementation; guards with `sys.stdin.isatty()` instead of `is_headless()` so keyboard shortcuts also work over SSH; `listener.stop = _stop.set` shim keeps the existing `lerobot_record.py` call-site unchanged
- `pyproject.toml`: add `readchar>=4.0.0,<5.0.0`; `pynput` kept (still used by `teleop_keyboard.py` and `gamepad_utils.py`)
- `docs/source/il_robots.mdx` and `docs/source/lekiwi.mdx`: remove stale `$DISPLAY` workaround note (ineffective on Wayland); replace with accurate guidance

Test plan
- `lerobot-record` on Wayland session: Right Arrow, Left Arrow, Escape all fire correctly
- `lerobot-record` on X11 session: keyboard shortcuts unchanged
- `lerobot-record` over SSH with TTY: keyboard shortcuts work
- `lerobot-record` in non-TTY / piped context: graceful warning logged, no crash
- `lerobot-teleoperate` with gamepad: unaffected (pynput path untouched)

Closes #879
🤖 Generated with Claude Code