
Refactor: Safety, Code Health, and Live Video Feed #5

Merged
PaulTR merged 2 commits into google-gemini:main from williamito:refactor
Feb 11, 2026

Conversation

@williamito
Collaborator

Summary

This PR significantly improves the safety, reliability, and developer experience of the robotics pointing sample. It introduces strict safety checks, a live video feed with persistent overlays during robot operation, and comprehensive documentation improvements. It also cleans up technical debt by consolidating on MuJoCo for kinematics and enforcing strict type hinting.

Key Changes

🛡️ Safety & Reliability

  • Automatic Retraction: Added RETRACT_POSE and logic to automatically move the arm to a safe, retracted position before calibration and before every vision capture in the main loop.
  • User Confirmation: Implemented confirm_action to require explicit user approval before autonomous robot movements (calibration, main loop).
  • Safety Limits: Enforced min_z safety floor in perform_move to prevent table collisions.
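
A minimal sketch of how the confirmation gate and the Z floor could fit together. The names `confirm_action`, `min_z`, and `RETRACT_POSE` come from the description above; their signatures, the numeric values, and the `guarded_move` helper are assumptions made for illustration, not the code in workshop.py.

```python
from typing import Callable, Optional, Tuple

Point = Tuple[float, float, float]

# Hypothetical values, chosen only for this sketch.
MIN_Z = 0.02  # assumed safety floor above the table, in metres
RETRACT_POSE = (0.0, -90.0, 90.0, 0.0, 0.0, 0.0)  # assumed joint angles, degrees


def confirm_action(prompt: str, ask: Callable[[str], str] = input) -> bool:
    """Return True only when the user explicitly answers yes."""
    answer = ask(f"{prompt} [y/N]: ").strip().lower()
    return answer in ("y", "yes")


def clamp_target(target: Point, min_z: float = MIN_Z) -> Point:
    """Raise the target to the safety floor if it lies below it."""
    x, y, z = target
    return (x, y, max(z, min_z))


def guarded_move(
    target: Point,
    send: Callable[[Point], None],
    ask: Callable[[str], str] = input,
) -> Optional[Point]:
    """Ask for approval, clamp Z, then hand the target to `send`.

    Returns the target that was actually sent, or None if the user declined.
    """
    if not confirm_action(f"Move to {target}?", ask):
        return None
    safe_target = clamp_target(target)
    send(safe_target)
    return safe_target
```

Passing `ask` and `send` as parameters keeps the sketch testable without a terminal or a robot attached; a real call site would presumably use `input` and the robot's own move command.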

🎥 Live Video & UX

  • Live Feed: Implemented wait_and_show to maintain a live camera feed even during blocking robot movements.
  • Persistent Overlays: Added a draw_callback mechanism to keep target markers and coordinates visible on the video feed while the robot is moving.
  • Status Indicators: Added visual overlays ("LIVE", "PAUSED", "PROCESSING", "FOCUSING") to the video window to communicate system state.
  • Autofocus Dwell: Added a 2.0s dwell time after retraction to allow the camera autofocus to settle before image capture.
  • macOS Fix: Applied cv2.waitKey(1) workaround to fix OpenCV window closing issues on macOS.
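
The live-feed idea can be sketched as a polling loop that keeps redrawing frames while something else (a robot move, an autofocus dwell) is in progress. Only the names `wait_and_show` and `draw_callback` come from the description above; the signature, the injected `read_frame`/`show_frame` callables, and the dict-based frame are assumptions so the sketch runs without a camera or OpenCV.

```python
import time
from typing import Callable, Optional

Frame = dict  # stand-in for an image; the real sample would use camera frames


def wait_and_show(
    duration_s: float,
    read_frame: Callable[[], Optional[Frame]],
    show_frame: Callable[[Frame, str], None],
    draw_callback: Optional[Callable[[Frame], None]] = None,
    status: str = "LIVE",
    poll_s: float = 0.01,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Keep the video window fresh for `duration_s` seconds.

    Each iteration grabs a frame, lets `draw_callback` repaint persistent
    overlays (target marker, coordinates), and displays it with a status
    label. Returns the number of frames shown.
    """
    deadline = clock() + duration_s
    shown = 0
    while clock() < deadline:
        frame = read_frame()
        if frame is not None:
            if draw_callback is not None:
                draw_callback(frame)  # overlays are redrawn on every frame
            show_frame(frame, status)
            shown += 1
        # With OpenCV, a short cv2.waitKey(1) call would typically take the
        # place of this sleep so the GUI event loop keeps being serviced.
        sleep(poll_s)
    return shown
```

Under this assumed design, the 2.0s autofocus dwell would be a call such as `wait_and_show(2.0, ..., status="FOCUSING")`, so the window stays responsive while the camera settles.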

🧹 Code Health

  • Strict Typing: Replaced Any with specific types (e.g., SO101Follower, cv2.VideoCapture) throughout workshop.py.
  • Dependency Management: Created requirements.txt and removed unused imports (requests, subprocess).
  • Kinematics Cleanup: Removed legacy LeRobot and Argo kinematics code to focus solely on MuJoCo.

📚 Documentation

  • Conceptual Walkthrough: Added a high-level explanation of the Vision-Language-Action loop to README.md.
  • Enhanced Docstrings: Updated all docstrings to follow Google Style, with detailed argument descriptions and educational comments (especially for the DLS IK algorithm).

Verification

  • Verified on SO-101 robot hardware.
  • Confirmed safety stops and retraction behavior.
  • Validated live video feed, persistent overlays, and status updates during operation.
  • Checked cross-platform compatibility (Linux/macOS).

@gemini-code-assist
Contributor

Summary of Changes

Hello @williamito, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness and usability of the robotics pointing sample. It introduces critical safety mechanisms to protect both the robot and its environment, alongside a more interactive and informative live video feedback system. The underlying codebase has been modernized through strict typing, streamlined dependency management, and a unified kinematics solution based on MuJoCo, reducing technical debt and improving maintainability. These changes collectively aim to provide a more reliable, safer, and developer-friendly platform for vision-guided manipulation tasks.

Highlights

  • Enhanced Safety Features: Implemented automatic arm retraction to a safe RETRACT_POSE before calibration and vision captures, added explicit user confirmation for autonomous robot movements, and enforced a min_z safety floor to prevent collisions.
  • Improved Live Video Feed and User Experience: Introduced a live camera feed (wait_and_show) that persists during blocking robot movements, enabled persistent visual overlays for target markers and coordinates, added dynamic status indicators ('LIVE', 'PAUSED', 'PROCESSING', 'FOCUSING'), included a 2.0s autofocus dwell time, and applied a macOS-specific fix for OpenCV window handling.
  • Significant Code Health Improvements: Refactored workshop.py to use strict type hinting, created a requirements.txt for better dependency management, and consolidated the kinematics engine to exclusively use MuJoCo, removing legacy LeRobot and Argo kinematics code.
  • Updated Documentation: Added a comprehensive 'Conceptual Walkthrough' to README.md explaining the Vision-Language-Action loop and updated all docstrings to Google Style with detailed argument descriptions and educational comments.
  • MuJoCo-based Inverse Kinematics: The kinematics engine has been refactored to solely utilize MuJoCo, including the implementation of a Damped Least Squares (DLS) method for Inverse Kinematics (IK) to accurately calculate joint angles for target poses.
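
For readers unfamiliar with Damped Least Squares, the standard update is Δq = Jᵀ (J Jᵀ + λ² I)⁻¹ e, where J is the end-effector Jacobian, e the task-space error, and λ a damping factor that keeps the step bounded near singularities. The sketch below applies it to a hypothetical two-link planar arm in pure Python. It does not reproduce the sample's MuJoCo-based solver; the link lengths, damping value, and error clamp are all assumptions.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]

L1, L2 = 1.0, 1.0  # hypothetical link lengths


def fk(q: Vec2) -> Vec2:
    """Forward kinematics: joint angles -> end-effector (x, y)."""
    q1, q2 = q
    return (L1 * math.cos(q1) + L2 * math.cos(q1 + q2),
            L1 * math.sin(q1) + L2 * math.sin(q1 + q2))


def jacobian(q: Vec2) -> Tuple[Vec2, Vec2]:
    """2x2 Jacobian of fk with respect to (q1, q2), as two rows."""
    q1, q2 = q
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return ((-L1 * s1 - L2 * s12, -L2 * s12),
            (L1 * c1 + L2 * c12, L2 * c12))


def dls_step(q: Vec2, target: Vec2, damping: float = 0.1,
             max_err: float = 0.2) -> Vec2:
    """One DLS update: dq = J^T (J J^T + damping^2 I)^-1 e."""
    x, y = fk(q)
    ex, ey = target[0] - x, target[1] - y
    norm = math.hypot(ex, ey)
    if norm > max_err:  # clamp large errors so each step stays small
        ex, ey = ex * max_err / norm, ey * max_err / norm
    (a, b), (c, d) = jacobian(q)
    # A = J J^T + damping^2 I is symmetric positive definite, so det > 0.
    a11 = a * a + b * b + damping ** 2
    a12 = a * c + b * d
    a22 = c * c + d * d + damping ** 2
    det = a11 * a22 - a12 * a12
    y1 = (a22 * ex - a12 * ey) / det   # solve A y = e for the 2x2 case
    y2 = (-a12 * ex + a11 * ey) / det
    return (q[0] + a * y1 + c * y2, q[1] + b * y1 + d * y2)  # q + J^T y


def solve_ik(q0: Vec2, target: Vec2, iters: int = 200,
             tol: float = 1e-6) -> Vec2:
    """Iterate DLS steps until the position error drops below tol."""
    q = q0
    for _ in range(iters):
        x, y = fk(q)
        if math.hypot(target[0] - x, target[1] - y) < tol:
            break
        q = dls_step(q, target)
    return q
```

A larger damping value trades accuracy per step for stability near singular configurations; a smaller one approaches the plain pseudoinverse solution.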

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@williamito williamito marked this pull request as ready for review January 17, 2026 03:33
@williamito williamito requested a review from PaulTR January 17, 2026 03:34
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This is an excellent pull request that brings substantial improvements to the robotics pointing sample. The introduction of safety features like automatic retraction, user confirmation, and Z-height limits is a critical enhancement. The live video feed with persistent overlays and status indicators dramatically improves the user experience and debugging capabilities. I also appreciate the significant code health improvements, including the consolidation to MuJoCo for kinematics, removal of dead code and dependencies, and the move towards stricter type hinting. The documentation updates, especially the conceptual walkthrough in the README, are very clear and helpful for new users.

My review includes a few suggestions to further improve type safety by replacing a few remaining Any types with more specific ones, fixing some type inconsistencies, and removing a small piece of unreachable code left over from the refactoring. Overall, this is a high-quality contribution.

Comment thread workshop.py Outdated
Comment thread workshop.py Outdated
Comment thread workshop.py
Comment on lines +348 to +351
# Check if gripper pos is missing and restore if needed (LeRobot/Argo only compute 5 arm joints)
if q_sol is not None and len(q_sol) == 5:
    # Append current gripper position to the 5 arm joint solutions
    q_sol = np.append(q_sol, current_joints_deg[-1])
Contributor

medium

This block of code appears to be a leftover from a previous implementation that supported multiple kinematics backends. In the current MuJoCo-only implementation, q_sol is constructed with 6 elements, so the condition len(q_sol) == 5 will never be met. This code is unreachable and can be safely removed to improve clarity.

Comment thread workshop.py Outdated
Comment thread workshop.py Outdated
Comment thread workshop.py
Comment on lines +459 to +461
if client is None:
    logger.error("Gemini client is not initialized.")
    return None
Contributor

medium

This check for client is None is unreachable. The main function ensures that setup_gemini returns a valid client, otherwise it exits. Since main_loop is the only caller, client will never be None here. This block can be removed.
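
The reasoning can be illustrated with a small sketch of the call structure the comment describes. The function names `main`, `setup_gemini`, and `main_loop` come from the comment; the bodies, parameters, and the `Client` class are placeholders assumed for illustration, not the code in workshop.py.

```python
import sys
from typing import Optional


class Client:
    """Placeholder for the Gemini client object."""


def setup_gemini(api_key: Optional[str]) -> Optional[Client]:
    """Return a client, or None when no API key is available (assumed behavior)."""
    return Client() if api_key else None


def main_loop(client: Client) -> str:
    """Takes a non-optional Client, so no `client is None` check is needed."""
    return "running"


def main(api_key: Optional[str]) -> str:
    client = setup_gemini(api_key)
    if client is None:
        # The only None check lives here; execution never reaches main_loop.
        sys.exit("Gemini client is not initialized.")
    return main_loop(client)
```

Typing the parameter as `Client` rather than `Optional[Client]` lets a type checker enforce the invariant at the single call site, which is why the inner guard becomes redundant.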

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Comment thread third_party/SO101/so101_new_calib.xml
@PaulTR PaulTR merged commit 912f75f into google-gemini:main Feb 11, 2026
2 checks passed
@williamito williamito deleted the refactor branch February 18, 2026 17:12