AIfred is an interactive robotic lamp designed to enhance your learning experience. It responds to hand gestures, retrieves real-time information, and seamlessly blends digital and physical spaces using computer vision and AI. Built with ROS, Mediapipe, and Gemini API, it turns study time into an intuitive and focused conversation.
📺 Watch the full demo on YouTube:
https://www.youtube.com/watch?v=L3PLWqSPDGM
This project introduces AIfred, the Clever Lamp, an innovative robotic lighting system designed to be the perfect companion for users. The system features a WX250s robotic arm from Trossen Robotics, equipped with a Kodak mini projector mounted on its end-effector. The Clever Lamp combines advanced robotics with customizable projection capabilities, creating a versatile and interactive lighting solution.
The robotic arm's precise movements allow the projector to illuminate and transform the surrounding environment with tailored images and videos. Users can effortlessly manipulate the lamp's position and projection content, offering a unique and personalized lighting experience. The simplicity of the 3D design ensures ease of use and installation, while the customization options open up a wide range of applications, from mood lighting and entertainment to educational and professional uses.
By integrating cutting-edge robotics with user-centric design, the Clever Lamp offers immense potential for creative and practical applications, making it an indispensable addition to modern living spaces.
The objective of the project is to give the user a friend and a tool at their disposal. With a simple webcam, AIfred can detect what is happening in the user's workspace and provide useful insights such as YouTube videos and Wikipedia links. Moreover, AIfred can see your workspace, interact with you by voice (speech-to-speech), and solve on-paper math for you.
The agent has 3 intelligent modes:

- HOMEWORK - Project examples, wisdom pills, and explanations to help you with your homework and learning process.
- GENERATE IMAGE - Render sketches, diagrams, or hand drawings from physical space into digital space, and project them on the table.
- DRAW - Draw on paper with the robot arm projecting aligned YouTube videos to improve your drawing skills.
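Lifting the TUI cycles through these modes in order (see the usage steps below). A minimal sketch of such a mode switcher; the enum and function names here are hypothetical, not the project's actual node code:

```python
from enum import Enum

class Mode(Enum):
    """AIfred's three intelligent modes; declaration order defines the cycle."""
    HOMEWORK = 0
    GENERATE_IMAGE = 1
    DRAW = 2

def next_mode(current):
    """Return the mode after `current`, wrapping around at the end.

    In the real system this would fire when the TUI is lifted.
    """
    members = list(Mode)
    return members[(members.index(current) + 1) % len(members)]

# Lifting the TUI repeatedly cycles HOMEWORK -> GENERATE IMAGE -> DRAW -> HOMEWORK
mode = Mode.HOMEWORK
mode = next_mode(mode)  # Mode.GENERATE_IMAGE
```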
To run the code you will need some prerequisites:
- Optitrack system: a system of cameras that detects the position and orientation of certain objects thanks to their reflective ball markers.
- Install the natnet_ros_cpp ROS package to send messages from Optitrack to your roscore.
- Install the interbotics_ws ROS package to move and interact with the wx250s robot arm from Trossen Robotics.
- 3D print our universal marker, an object that is easily detected by Optitrack. We call it umh_0.
- 3D print our custom wx250s base for the robot arm. It has M3 screws to hold the marker balls in place and detect the position of the robot base. We call it real_base_wx250s.
Setup:

- Take the Trossen Robotics wx250s and secure it to the table. Connect it to its power supply and connect the signal USB cable to the computer.
- Mount the Kodak projector on the end-effector of the Trossen Robotics robot arm (wx250s) (download this for the attachment).
- Attach a Chromecast to the HDMI port of the Kodak mini projector and power it with a USB cable from the Chromecast to the projector.
- Connect a USB camera to the PC and point it at your workstation.
- Turn on Optitrack:
  a. Turn on the rigid body for the robot base marker (download real_base_wx250s here)
  b. Turn on the rigid body for the universal marker (download umh_0 here)
- Create a Gemini API key and put it in a .env file
- Download the natnet_ros_cpp ROS package
- Download the Trossen Robotics ROS packages (our guide here)
- Download the AIfred ROS package:

```shell
cd ~/catkin_ws/src
git clone https://github.com/IERoboticsAILab/clever_lamp.git
cd ..
catkin build  # OR catkin_make
. devel/setup.bash
```
- Open a Chrome tab (preferably with an account that has YouTube Premium); it will be used for casting generated content
- Open a Firefox tab (it will be used for showing user instructions)
- Cast the Chrome tab to the Chromecast attached to the Kodak projector on the robot arm's end-effector
- Create a virtual environment and download all the computer-vision dependencies, then launch the demo:

```shell
uv sync
roslaunch alfred_clever_lamp demo.launch
```
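The Gemini API key from the setup step above has to be read from the .env file at startup. A minimal stdlib sketch of that loading step, assuming the variable is named GEMINI_API_KEY (the real node may use a library such as python-dotenv instead):

```python
import os

def load_dotenv_minimal(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into os.environ.

    Skips blank lines and comments; does not handle quoting or `export` syntax.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Usage (GEMINI_API_KEY is an assumed variable name):
# load_dotenv_minimal()
# api_key = os.environ["GEMINI_API_KEY"]
```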
The launch file will execute all the necessary nodes to run the full demo, but if you want to run it step by step, here are the instructions:
- Publish messages from Optitrack to ROS:

```shell
roslaunch natnet_ros_cpp gui_natnet_ros.launch
```

OR

```shell
roslaunch natnet_ros_cpp natnet_ros.launch serverIP:=10.205.3.3 clientIP:=10.205.3.150 pub_rigid_body:=true pub_rigid_body_marker:=true serverType:=unicast
```
- Check that the topics have been published using:

```shell
rostopic list
```

If NOT, switch from Multicast to Unicast (and vice versa) and press start again until the messages are published.
- Connect to the Trossen Robotics robot arm. Source the interbotix workspace and run the control package:

```shell
source interbotics_ws/devel/setup.bash
roslaunch interbotix_xsarm_control xsarm_control.launch robot_model:=wx250s
```

- Make the robot follow and point at the marker.
  a. broadcast marker: This part of the project combines the digital space with the real world through a user-friendly interface. In RViz the robot is set at (0,0,0), the world coordinate origin, but in reality the robot sits at a different position in space (it depends on where you place the working table). Here we take the Optitrack coordinates of the real robot base (/natnet_ros/real_base_wx250s/pose) in relation to the real marker (/natnet_ros/umh_2/pose), and we transfer that relation to the digital robot base (wx250s/base_link), publishing a new tf for the marker (umh_2_new)
  b. clever lamp: Look at the tf transform of the universal marker position relative to the digital space and move the end effector accordingly.
- Run the computer vision ROS node:
  a. computer vision: Looks at the webcam, uses MediaPipe to detect if you are pointing at something with your finger, takes a screenshot, and shows a YouTube video and the Wikipedia page for what you are looking at. If it detects some math, it solves it step by step with you, projecting the solution on the paper. To run this script you will need to create a virtual environment with all the dependencies and activate it when launching the alfred node (point at things → webcam → gemini → video casted).
- Move the universal marker and the robot will follow, pointing the projector content at the table.
- Point at the workspace with your finger to trigger the screenshot; it is passed to the Gemini API, which generates personalized content for you. The content is then projected on the table.
- Rotate the marker to show next/previous.
- Lift the TUI to go to the next mode.
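The broadcast-marker step above boils down to one frame change: Optitrack reports both the real robot base and the marker in its own world frame, and the published tf is the marker expressed in the base frame, i.e. T_base_marker = inv(T_world_base) · T_world_marker. A stdlib sketch of that math with illustrative numbers and a yaw-only rotation (the real node operates on full ROS pose messages and quaternions):

```python
import math

def rot_z(yaw):
    """4x4 homogeneous transform: rotation of `yaw` radians about the z axis."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0, 0],
            [s,  c, 0, 0],
            [0,  0, 1, 0],
            [0,  0, 0, 1]]

def translate(x, y, z):
    """4x4 homogeneous transform: pure translation."""
    return [[1, 0, 0, x],
            [0, 1, 0, y],
            [0, 0, 1, z],
            [0, 0, 0, 1]]

def matmul(a, b):
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def inverse_rigid(t):
    """Invert a rigid transform: transpose the rotation, rotate-negate the translation."""
    r = [[t[j][i] for j in range(3)] for i in range(3)]   # R^T
    p = [t[i][3] for i in range(3)]
    ip = [-sum(r[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r[0] + [ip[0]], r[1] + [ip[1]], r[2] + [ip[2]], [0.0, 0.0, 0.0, 1.0]]

# Example poses, both in the Optitrack world frame (made-up numbers):
T_world_base = matmul(translate(1.0, 2.0, 0.0), rot_z(math.pi / 2))
T_world_marker = translate(1.0, 3.0, 0.5)

# Marker expressed in the robot base frame, as published for umh_2_new:
T_base_marker = matmul(inverse_rigid(T_world_base), T_world_marker)
# Translation comes out as (1.0, 0.0, 0.5): the marker is 1 m along the base x axis.
```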
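The pointing trigger from the computer vision node can be reduced to a simple landmark heuristic: the index fingertip extended beyond its middle joint while the other fingers stay curled. This is an illustrative sketch of such a check on MediaPipe-style hand landmarks (21 normalized (x, y) points); it is not the project's actual detection logic:

```python
import math

# MediaPipe hand-landmark indices (wrist = 0, index fingertip = 8, index PIP = 6, ...)
WRIST = 0
TIPS = {"index": 8, "middle": 12, "ring": 16, "pinky": 20}
PIPS = {"index": 6, "middle": 10, "ring": 14, "pinky": 18}

def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def is_pointing(landmarks):
    """Heuristic pointing check on a list of 21 (x, y) landmark tuples.

    Pointing = index fingertip farther from the wrist than its PIP joint
    (finger extended) while middle/ring/pinky fingertips are closer to the
    wrist than their PIP joints (fingers curled).
    """
    wrist = landmarks[WRIST]
    index_extended = dist(landmarks[TIPS["index"]], wrist) > dist(landmarks[PIPS["index"]], wrist)
    others_curled = all(
        dist(landmarks[TIPS[f]], wrist) < dist(landmarks[PIPS[f]], wrist)
        for f in ("middle", "ring", "pinky")
    )
    return index_extended and others_curled
```

In the full pipeline this check would gate the screenshot that gets sent to the Gemini API.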
- Kodak Projector mini 75
- Kodak manual (PDF)
- natnet_ros_cpp ROS package
- interbotics_ws ROS package




