A Python script built during Robohack 2025 to enable the Mirokai robot to perceive its surroundings, describe them, and speak fun facts aloud, all in real time.
This project showcases the integration of computer vision, natural language processing, and robotic speech using a combination of OpenCV, BLIP, spaCy, and the Wikipedia API.
- Captures real-time images from Mirokai's RTSP camera feed using OpenCV
- Uses the BLIP image captioning model to describe the image
- Extracts nouns from the caption using spaCy
- Queries the Wikipedia API for fun facts based on detected nouns
- Speaks the result aloud using Mirokai's built-in `robot.say()` method (the steps leading up to this call are sketched below)
- Testing was done with a simulation of Mirokai running in Gazebo
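To make the flow above concrete, here is a minimal sketch of the capture and captioning steps. It assumes an illustrative RTSP URL, the `Salesforce/blip-image-captioning-base` checkpoint, and Pillow being available alongside the listed dependencies; the function names are placeholders, not the actual contents of `Mirokai_Script.py`.

```python
import cv2
from PIL import Image  # Pillow; assumed to be installed with the vision stack
from transformers import BlipProcessor, BlipForConditionalGeneration

# Illustrative RTSP address; the real Mirokai camera URL and credentials will differ.
RTSP_URL = "rtsp://<ROBOT_IP>:8554/camera"

def capture_frame(rtsp_url):
    """Grab a single frame from the RTSP feed with OpenCV and return it as a PIL image."""
    cap = cv2.VideoCapture(rtsp_url)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("Could not read a frame from the RTSP stream")
    # OpenCV delivers BGR arrays; BLIP expects RGB images.
    return Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

def caption_image(image):
    """Describe the image with a BLIP captioning model from Hugging Face Transformers."""
    processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
    model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    frame = capture_frame(RTSP_URL)
    print(caption_image(frame))
```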
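The caption is then mined for nouns and turned into a fun fact. The sketch below assumes Wikipedia's public REST summary endpoint and the `requests` library, and it simply prints the first fact it finds; the actual script hands the text to `robot.say()` instead.

```python
import requests
import spacy

# Requires the small English model: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def extract_nouns(caption):
    """Return the nouns spaCy tags in a caption."""
    return [token.text for token in nlp(caption) if token.pos_ == "NOUN"]

def fun_fact(topic):
    """Fetch a short summary for a topic from the Wikipedia REST API, or None if not found."""
    url = f"https://en.wikipedia.org/api/rest_v1/page/summary/{topic}"
    response = requests.get(url, timeout=10)
    if response.status_code != 200:
        return None
    return response.json().get("extract")

if __name__ == "__main__":
    caption = "a dog sitting on a wooden bench in a park"  # example BLIP-style output
    for noun in extract_nouns(caption):
        fact = fun_fact(noun)
        if fact:
            # In the real script this text would be passed to Mirokai's robot.say().
            print(f"Fun fact about {noun}: {fact}")
            break
```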
Watch the robot in action: https://youtu.be/Rsk4kfzCyzI
- Clone the repository: `git clone https://github.com/Sahin-Halil/Mirokai-Script.git`
- Change into the project directory: `cd Mirokai-Script`
- Install Python dependencies:
  - OpenCV: `pip install opencv-python`
  - Transformers (for BLIP): `pip install transformers`
  - PyTorch (required for BLIP to run): `pip install torch`
  - spaCy: `pip install spacy`
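Since the script uses spaCy's `en_core_web_sm` model (listed under the tech stack below), you will most likely also need to download that model separately with `python -m spacy download en_core_web_sm`.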
To run the script, you have two options:
Option A — Run on the Physical Mirokai Robot
- Make sure the Mirokai robot is powered on and connected to the same network.
- Identify the robot's IP address and API key.
- Run the script: `python Mirokai_Script.py --ip <ROBOT_IP> --api-key <API_KEY>`
Option B — Run with Gazebo Simulation
- If you don't have access to a physical Mirokai robot, you can test the script in a simulated environment using Gazebo.
- Make sure Docker and Docker Compose are installed.
- Follow the official documentation to set up the Mirokai Gazebo simulation: https://gazebosim.org/docs/latest/getstarted/
- Once your simulation is running, use the appropriate simulated IP and run: `python Mirokai_Script.py --ip <SIMULATED_IP> --api-key <API_KEY>` (a sketch of how these flags might be parsed follows this section)
- Note: The Gazebo simulation replicates image capture and processing but does not support audio output, so you won't hear the robot speak.
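For reference, here is a minimal sketch of how the `--ip` and `--api-key` flags could be parsed with argparse. This is an illustrative guess at the script's command-line handling, not the actual implementation, and the pymirokai connection step is deliberately omitted.

```python
import argparse

def parse_args():
    # Hypothetical flag parsing matching the usage shown above; Mirokai_Script.py
    # may handle its arguments differently.
    parser = argparse.ArgumentParser(
        description="Describe Mirokai's surroundings and speak fun facts aloud"
    )
    parser.add_argument("--ip", required=True, help="IP address of the (real or simulated) robot")
    parser.add_argument("--api-key", required=True, help="API key used to authenticate with the robot")
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
    # argparse maps --api-key to args.api_key; the pymirokai connection step is omitted here.
    print(f"Would connect to Mirokai at {args.ip} using the provided API key")
```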
- Image Capture: OpenCV, RTSP
- Captioning: BLIP (Hugging Face Transformers)
- Natural Language Processing: spaCy (`en_core_web_sm`)
- Fun Fact Retrieval: Wikipedia REST API
- Speech Output: `robot.say()` (pymirokai SDK)
- Simulation Environment: Gazebo, Docker Compose
- This project was developed during a 2-day hackathon under time constraints.
- Due to limited documentation, we reverse-engineered functionality from the pymirokai SDK.
- The entire pipeline was tested in a Gazebo simulation before being run live on the robot for the first time during the final demo.
Built by Sahin Halil and Rayyan Parkar at Robohack 2025.