Real-World Deployment for VoxPoser

NOT Yet Finished

- Gripper rotation and position control (the current implementation fixes the gripper pose before a task runs)
- Robustness to scene changes (if the target is moved while the robot arm is executing a task, the arm does not update to the new position)
- Whole-arm obstacle avoidance planning
- Multiple depth cameras? (the current implementation only takes the center point of one visible surface)
- Port the whole application to ROS?

🛠️ Setup Instructions

Hardware Setup

- Robot arm (TCP connection)
- Gripper (serial connection)
- Depth cameras (RealSense)
- Workspace (remember to restrict the robot arm's workspace bounds in real_env.py)
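
As an illustration of what restricting the workspace can look like, here is a minimal sketch that clips a target position into an axis-aligned box before it is sent to the arm. The bound values and the function names are assumptions made for this example; they are not taken from real_env.py, and the real limits must be measured for your own setup.

```python
import numpy as np

# Hypothetical bounds in the robot base frame, in metres (x, y, z).
# Replace these with the limits measured for your own workspace.
WORKSPACE_MIN = np.array([0.20, -0.30, 0.02])
WORKSPACE_MAX = np.array([0.60, 0.30, 0.40])


def is_inside_workspace(target_xyz, lo=WORKSPACE_MIN, hi=WORKSPACE_MAX):
    """Return True if the target lies inside the box [lo, hi] on every axis."""
    p = np.asarray(target_xyz, dtype=float)
    return bool(np.all(p >= lo) and np.all(p <= hi))


def clamp_to_workspace(target_xyz, lo=WORKSPACE_MIN, hi=WORKSPACE_MAX):
    """Return the target clipped, axis by axis, into the box [lo, hi]."""
    return np.clip(np.asarray(target_xyz, dtype=float), lo, hi)
```

Clamping keeps the arm moving toward the nearest reachable point, while rejecting an out-of-bounds target outright (using `is_inside_workspace`) is the more conservative choice on real hardware.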

Use the scripts under src/toolbox to test the connections to the external devices.
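
Before running the device-specific scripts, it can help to confirm that the robot controller is reachable over TCP at all. The sketch below is a generic check built on the Python standard library; it is not one of the scripts in src/toolbox, and the address in the usage comment is a placeholder.

```python
import socket


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection resolves the host and applies the timeout
        # to the connection attempt; the socket is closed on exit.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example (placeholder address, replace with your controller's IP and port):
# print(tcp_reachable("192.168.1.10", 30003))
```

A `False` result only says the port could not be opened; it does not distinguish a wrong IP address from a powered-off controller or a firewall rule.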

Initial Configuration

Obtain an OpenAI API key and put it in config.ini.
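
The layout of config.ini is not documented here, so the following is only a sketch of how such a key could be read with the standard `configparser` module. The section name `openai` and the option name `api_key` are assumptions; check the config.ini shipped with the repository for the actual names.

```python
import configparser


def load_api_key(path="config.ini", section="openai", option="api_key"):
    """Read an API key from an INI file and fail loudly if it is missing.

    The section and option names are hypothetical defaults.
    """
    parser = configparser.ConfigParser()
    # read() returns the list of files it managed to parse.
    if not parser.read(path):
        raise FileNotFoundError(f"could not read config file: {path}")
    key = parser.get(section, option, fallback="").strip()
    if not key:
        raise ValueError(f"[{section}] {option} is missing or empty in {path}")
    return key
```

Failing at start-up with a clear message is usually preferable to discovering a missing key only when the first LLM request is made.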

Install required submodules:

# Clone submodules for the vision components (XMem)
git submodule update --init --recursive

Create a conda environment:

conda create -n voxposer-realworld-env python=3.10
conda activate voxposer-realworld-env

Install dependencies:

pip install -r requirements.txt

You may need to run

conda install -c conda-forge libstdcxx-ng

if you encounter the following error:

libGL error: MESA-LOADER: failed to open iris: /usr/lib/dri/iris_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:\$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
libGL error: failed to load driver: iris
libGL error: MESA-LOADER: failed to open iris: /usr/lib/dri/iris_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:\$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
libGL error: failed to load driver: iris
libGL error: MESA-LOADER: failed to open swrast: /usr/lib/dri/swrast_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:\$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
libGL error: failed to load driver: swrast
[Open3D WARNING] GLFW Error: GLX: Failed to create context: GLXBadFBConfig
[Open3D WARNING] Failed to create window
[Open3D WARNING] [DrawGeometries] Failed creating OpenGL window.

Download the following models from Ultralytics and XMem:
- sam2.1_b.pt
- yolo11-seg.pt
- yolo11x.pt
- XMem.pth
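
A small check like the one below can confirm that all four files are in place before the first run. It is a sketch only: the directory is passed in by the caller, and using src/model_weight as that directory is an assumption based on the code structure listed in this README.

```python
from pathlib import Path

# File names taken from the list above.
REQUIRED_WEIGHTS = ("sam2.1_b.pt", "yolo11-seg.pt", "yolo11x.pt", "XMem.pth")


def missing_weights(weight_dir, required=REQUIRED_WEIGHTS):
    """Return the required weight files that are not present in weight_dir."""
    root = Path(weight_dir)
    return [name for name in required if not (root / name).is_file()]


# Example (assumed location of the weights):
# print(missing_weights("src/model_weight"))
```

An empty list means every expected file was found; the check does not verify that the files are complete or uncorrupted.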

Perform camera-to-robot calibration:
```
python src/toolbox/perceptron/d435i/calibration/cal_transform_mat.py
```
- Press Space to capture the current frame and process a calibration sample.
- Move the robot arm to different positions (10 or more captures recommended) and repeat the last step.
- Press d to delete the last collected sample if needed.
- Press r to reset all collected calibration data.
- Press Esc to finish calibration and compute the transformation matrix.
- Copy the resulting transformation matrix into cam2robot.py and test its accuracy.
- Replace the transformation matrix in run.py with the new one.

Note: For a multi-camera setup, perform the calibration separately for each camera.
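
To make the role of the transformation matrix concrete, the sketch below estimates a rigid camera-to-robot transform from paired 3D points and reports the fit error. It uses the Kabsch (SVD) method as a stand-in; this README does not say which method cal_transform_mat.py uses, and all function names here are hypothetical.

```python
import numpy as np


def estimate_rigid_transform(cam_pts, robot_pts):
    """Fit a 4x4 transform T so that robot ≈ R @ cam + t (Kabsch method).

    cam_pts and robot_pts are (N, 3) arrays of corresponding points,
    with N >= 3 and the points not all on one line.
    """
    cam = np.asarray(cam_pts, dtype=float)
    rob = np.asarray(robot_pts, dtype=float)
    cam_c, rob_c = cam.mean(axis=0), rob.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (cam - cam_c).T @ (rob - rob_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = rob_c - R @ cam_c
    return T


def cam_to_robot(T, cam_pts):
    """Map (N, 3) camera-frame points into the robot base frame."""
    pts = np.asarray(cam_pts, dtype=float)
    return pts @ T[:3, :3].T + T[:3, 3]


def rms_error(T, cam_pts, robot_pts):
    """Root-mean-square distance between mapped and measured points."""
    diff = cam_to_robot(T, cam_pts) - np.asarray(robot_pts, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))
```

Evaluating `rms_error` on samples that were not used for the fit gives a more honest estimate of calibration accuracy than the error on the calibration samples themselves.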

Start to play

You may need to adjust the code according to the devices you use.
```
python src/run.py
```

📁 Code Structure

.
├── .vscode/                    # VS Code configuration
│   └── launch.json             # Debug configurations
├── media/
├── src/                        # Main project implementation
│   ├── configs/                # Configuration files
│   ├── model_weight/           # Pre-trained model weights
│   ├── prompts/                # LLM prompt templates
│   │   └── rlbench/            # RLBench prompt templates
│   ├── toolbox/                # Core functionality modules
│   │   ├── perceptron/         # Vision and perception tools
│   │   │   ├── XMem/           # Video object segmentation
│   │   │   ├── d435i/          # RealSense camera tools
│   │   │   │   └── calibration/ # Camera-robot calibration
│   │   │   └── ...             # Other perception tools
│   │   ├── my_prompt/          # Custom prompt templates
│   │   └── real_env.py         # Environment interface
│   ├── envs/                   # Environment definitions
│   └── run.py                  # Main execution entry point
├── requirements.txt            # Python dependencies
├── config.ini                  # API and config keys
└── README.md                   # Project documentation

Acknowledgements

This project is built on top of VoxPoser.

Repository: tkgaolol/VoxPoser-realworld
