@Dav1nGen

What this does

Summary

This PR makes dataset frame recording robust to user-triggered interruptions by synchronizing calls to dataset.add_frame in the control loop.

When recording episodes interactively, users may terminate an episode (e.g. by pressing the S key) while image frames are still being written. Previously, this could result in partially written or empty image files, which later caused failures during episode saving or video encoding.

By introducing a lock around dataset.add_frame, this change ensures that frame writes are completed atomically with respect to episode termination, preventing corrupted frame files from being produced.

Problem

Running the following command:

python -m lerobot.rl.gym_manipulator --config_path config/jaka_collect_reward_classifier_task.json

produces this error:

  File "/home/joysonrobot/miniconda3/envs/lerobot/lib/python3.10/site-packages/PIL/ImageFile.py", line 374, in load
    s = read(read_bytes)
  File "/home/joysonrobot/miniconda3/envs/lerobot/lib/python3.10/site-packages/PIL/PngImagePlugin.py", line 996, in load_read
    cid, pos, length = self.png.read()
  File "/home/joysonrobot/miniconda3/envs/lerobot/lib/python3.10/site-packages/PIL/PngImagePlugin.py", line 178, in read
    length = i32(s)
  File "/home/joysonrobot/miniconda3/envs/lerobot/lib/python3.10/site-packages/PIL/_binary.py", line 95, in i32be
    return unpack_from(">I", c, o)[0]
struct.error: unpack_from requires a buffer of at least 4 bytes for unpacking 4 bytes at offset 0 (actual buffer size is 0)

The error is caused by a forced exit after pressing the S key, which interrupts the write of the last frame. The final frame's image file is left corrupted (the traceback reports a buffer size of 0, i.e. no data could be read from it), so loading that file later fails with the error shown here.
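The last step of the traceback can be reproduced in isolation. The sketch below only uses the standard library and mirrors the `unpack_from(">I", c, o)` call shown in the traceback; the variable names are illustrative, and the empty byte string stands in for what a read from an empty image file would return.

```python
import struct

# Stand-in for the bytes read from an image file that was never fully written.
empty_chunk = b""

try:
    # Same call shape as in PIL/_binary.py (i32be) from the traceback:
    # a big-endian unsigned 32-bit integer needs at least 4 bytes.
    struct.unpack_from(">I", empty_chunk, 0)
    raised = False
except struct.error as exc:
    raised = True
    message = str(exc)  # exact wording may differ between Python versions

print("struct.error raised:", raised)
```

A complete 4-byte buffer, such as `b"\x00\x00\x00\x0d"`, unpacks without error, which is why the failure only shows up for the interrupted frame.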

Changes

  • Add a threading.Lock in the control loop
  • Guard dataset.add_frame() with this lock
  • Ensure frame writes are not interrupted by episode termination or forced exit

This change does not alter dataset format, recording semantics, or public APIs. It only enforces proper synchronization during frame recording.
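The synchronization described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ToyDataset`, `control_loop`, `end_episode`, `frame_lock`, and `stop_event` are hypothetical names, the slow write is simulated with `time.sleep`, and it assumes the termination path acquires the same lock before finalizing the episode.

```python
import threading
import time


class ToyDataset:
    """Hypothetical stand-in for the real dataset object (not the lerobot API)."""

    def __init__(self):
        self.frames = []      # frames that were written completely
        self.partial = False  # True while a write is in progress

    def add_frame(self, frame):
        self.partial = True
        time.sleep(0.01)      # simulate a slow image write
        self.frames.append(frame)
        self.partial = False


frame_lock = threading.Lock()    # guards every add_frame call
stop_event = threading.Event()   # set when the user ends the episode


def control_loop(dataset):
    index = 0
    while True:
        with frame_lock:
            # Check the stop flag while holding the lock so that no new
            # write starts once termination has been requested.
            if stop_event.is_set():
                break
            dataset.add_frame({"index": index})
        index += 1


def end_episode(dataset):
    """Assumed termination path: waits for any in-flight write to finish."""
    stop_event.set()
    with frame_lock:
        return dataset.partial, len(dataset.frames)


dataset = ToyDataset()
worker = threading.Thread(target=control_loop, args=(dataset,))
worker.start()
time.sleep(0.05)                 # let a few frames be recorded
was_partial, n_frames = end_episode(dataset)
worker.join()
print("write in progress at termination:", was_partial)
```

Because `end_episode` blocks on the lock, it can only observe the dataset between frame writes. A lock does not protect against the process being killed outright, so this sketch covers only cooperative termination.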
