Commit 37d052f

Merge pull request #460 from espressif/blog/add_esp32s3_sparkbot
blog add esp32s3 sparkbot
2 parents b892f7f + 4b6071f commit 37d052f

3 files changed

+185
-0
lines changed

@@ -0,0 +1,185 @@
1+
---
2+
title: "ESP-SparkBot: Large Language Model Robot with ESP32-S3"
3+
date: 2025-04-23
4+
showAuthor: false
5+
featuredAsset: "ESP-SparkBot-featured.webp"
6+
authors:
7+
- "cai-guanhong"
8+
tags: ["ESP32-S3", "ESP-NOW", "Offline Speech Recognition", "Face Recognition", "Motion Detection", "USB Screen Mirror"]
9+
---
10+
> This article provides an overview of the ESP-SparkBot, its features, and its functionality. It also details the hardware design and outlines the software framework that supports its operation.
11+
12+
## Introduction
13+
14+
With the rapid development of generative artificial intelligence, large language models (LLMs) are becoming a core technology in the AI field. They not only power applications such as AI programming, intelligent customer service, and smart office tools, but also enrich interactive experiences and service quality in areas such as smart homes, remote healthcare, online education, and personalized entertainment. However, these technologies typically rely on powerful cloud computing resources, and extending them to edge devices means overcoming constraints on computing power, latency, and power consumption. Espressif Systems, with its leading wireless SoC technology, addresses this challenge and is committed to bringing the capabilities of AI to a wider range of edge devices, making AI technology more accessible in people’s daily lives.
15+
16+
In this article, we introduce the ESP-SparkBot AI robot, a versatile solution designed to meet a range of needs. Whether you want to build a smart home system, use a reliable voice assistant, or find an engaging AI toy for your children, the ESP-SparkBot has you covered.
17+
18+
19+
## Overview of ESP-SparkBot
20+
21+
ESP-SparkBot is a low-cost, multi-functional AI large language model robot built on the ESP32-S3. It is an intelligent device that integrates voice interaction, facial recognition, remote control, motion detection, and multimedia functions.
22+
23+
- In home automation, ESP-SparkBot can be your personal assistant.
24+
- In smart office scenarios, ESP-SparkBot acts as your computer's secondary screen.
25+
- In outdoor entertainment settings, the ESP-SparkBot can seamlessly transform into a compact speaker and portable camera.
26+
27+
In this video you can see the functions and application scenarios of ESP-SparkBot.
28+
29+
{{<youtubeLite id="meZDJf8QTdM" label="ESP-Demo: Large Language Model Robot with ESP32-S3">}}
30+
31+
32+
The ESP-SparkBot can be powered in two ways:
33+
34+
- **Lithium Battery (Default Power Supply):**
35+
The ESP-SparkBot is equipped with a 2000 mAh lithium battery. It runs from this battery, which is charged through the robot's base.
36+
37+
- **USB Power Supply (via ESP32-S3 USB Interface):**
38+
The ESP-SparkBot also features a USB Type-C port, enabling continuous power supply via USB, which simplifies program downloading and debugging. This added functionality expands usage scenarios and enhances overall convenience for users.
39+
40+
41+
## Key Features and Capabilities
42+
43+
This section highlights the ESP-SparkBot's key functionalities and the innovative features that make it a versatile AI Conversation Robot for smart device control and integration.
44+
45+
### Smart Voice Weather Assistant & Time Manager
46+
47+
The ESP-SparkBot serves as a smart voice assistant, offering real-time access to local information such as the current date, time, weather conditions, temperature changes, and air quality, with the location determined from the device's IP address. This makes the ESP-SparkBot both a personal assistant and a smart companion that delivers weather and time information whenever you need it.
48+
49+
50+
51+
<!-- <video controls width="640" height="480"> <source src="./img/clock-weather-assistant.webm" type="video/webm"> -->
52+
53+
{{< youtubeLite id="meZDJf8QTdM" label="Smart Voice Assistant" params="start=23&end=29&controls=0" >}}
54+
55+
56+
### Large Language Model AI Chat Assistant Robot
57+
58+
The ESP-SparkBot utilizes the OpenAI platform and integrates the advanced ChatGPT model to support seamless online voice interaction. This transforms the ESP-SparkBot into not just a smart home assistant, but also an intelligent conversational partner, enabling users to engage in natural language communication and effortlessly retrieve information, thus enhancing both interactivity and convenience in the smart home experience.
59+
60+
<!-- <video controls width="640" height="480"> <source src="./img/chat-with-openAI.webm" type="video/webm"> -->
61+
62+
{{< youtubeLite id="meZDJf8QTdM" label="AI Chat Assistant Robot" params="start=46&end=85&controls=0" >}}
63+
64+
65+
### Relaxation Game
66+
67+
ESP-SparkBot is equipped with a touch button on top, allowing users to tap a virtual zen drum and accumulate merit. With the ESP-NOW broadcast function, multiple ESP-SparkBots can be controlled simultaneously to tap their virtual zen drums, multiplying the accumulated merit.
68+
69+
<!-- <video controls width="640" height="480"> <source src="./img/esp-now.webm" type="video/webm"> -->
70+
71+
{{< youtubeLite id="meZDJf8QTdM" label="Relaxation Game" params="start=86&end=111&controls=0" >}}
72+
73+
74+
### Virtual 3D Die
75+
76+
The ESP-SparkBot also features a built-in accelerometer, enabling it to function as a virtual die. By randomly rotating or shaking ESP-SparkBot, the 3D die displayed on the screen will rotate according to the accelerometer data. Once the movement stops, the on-screen die will gradually come to a halt and display the final result.
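
As a rough illustration of the idea, the host-side Python sketch below shows one plausible way to decide that the robot has stopped moving and which die face is up. It is not the firmware's actual code: the axis-to-face table, the `tolerance_g` threshold, and the function names are all assumptions made for this example, and the real firmware may pick the final face differently.

```python
import math

# Hypothetical mapping from the axis that points up to a die face.
# Opposite faces sum to 7, as on a physical die.
FACE_BY_AXIS = {
    ("x", 1): 1, ("x", -1): 6,
    ("y", 1): 2, ("y", -1): 5,
    ("z", 1): 3, ("z", -1): 4,
}


def is_still(samples, tolerance_g=0.05):
    """Return True when all recent samples stay close to their mean (in g)."""
    if not samples:
        return False
    mean = [sum(axis) / len(samples) for axis in zip(*samples)]
    return all(math.dist(sample, mean) <= tolerance_g for sample in samples)


def face_up(sample):
    """Pick the face whose axis is most closely aligned with gravity."""
    components = dict(zip("xyz", sample))
    axis = max(components, key=lambda name: abs(components[name]))
    sign = 1 if components[axis] >= 0 else -1
    return FACE_BY_AXIS[(axis, sign)]
```

In this sketch, a caller would keep a short window of accelerometer readings, keep animating the die while `is_still` returns `False`, and call `face_up` on the latest reading once the window settles.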
77+
78+
<!-- <video controls width="640" height="480"> <source src="./img/cyber-dice.webm" type="video/webm"> -->
79+
80+
{{< youtubeLite id="meZDJf8QTdM" label="Virtual 3D Die" params="start=111&end=116&controls=0" >}}
81+
82+
83+
### 2048 Game
84+
85+
ESP-SparkBot comes integrated with the 2048 game. After entering the 2048 game interface, users can interact with the game through gesture recognition enabled by the built-in accelerometer. Tapping the touch button on top will reset the game.
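
The gesture-driven gameplay can be sketched in two small steps: classify a tilt as a move, then apply the standard 2048 slide-and-merge rule to each row. The Python below is a minimal host-side sketch, not the project's implementation; the axis-to-direction mapping, the `threshold_g` value, and the function names are assumptions for illustration.

```python
def tilt_to_move(ax, ay, threshold_g=0.3):
    """Classify a tilt (in g) as a move, or None if the device is roughly level.

    Assumption: +x means tilting right and +y means tilting up.
    """
    if max(abs(ax), abs(ay)) < threshold_g:
        return None
    if abs(ax) >= abs(ay):
        return "right" if ax > 0 else "left"
    return "up" if ay > 0 else "down"


def merge_left(row):
    """Slide tiles left and merge equal neighbours, each tile merging at most once."""
    tiles = [value for value in row if value != 0]
    merged = []
    i = 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)  # merge the pair into one tile
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    # Pad with empty cells so the row keeps its original width.
    return merged + [0] * (len(row) - len(merged))
```

For example, `merge_left([2, 0, 2, 4])` returns `[4, 4, 0, 0]`. The other three directions can reuse the same function by reversing or transposing the board before and after the call.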
86+
87+
88+
<!-- <video controls width="640" height="480"> <source src="./img/2048-game.webm" type="video/webm"> -->
89+
90+
{{< youtubeLite id="meZDJf8QTdM" label="2048 Game" params="start=116&end=129&controls=0" >}}
91+
92+
93+
### Offline Speech Recognition, Face Recognition and Motion Detection
94+
95+
In addition to online interactions with cloud-based large models, ESP-SparkBot also supports running various offline AI models locally, such as offline speech recognition, face recognition, and motion detection.
96+
97+
- __Speech Recognition__
98+
By leveraging the [ESP-SR](https://gitee.com/link?target=https://github.com/espressif/esp-sr) library, ESP-SparkBot can perform local speech recognition with ease.
99+
100+
- __Face Recognition__ and __Motion Detection__
101+
The ESP-SparkBot features a foldable camera on top, enabling real-time facial detection and recognition. Users can easily add or remove faces through voice commands. With the [ESP-WHO](https://gitee.com/link?target=https://github.com/espressif/esp-who) library, integrating additional vision AI models is simple, including capabilities such as cat face recognition, human face recognition, motion detection, and pedestrian detection.
102+
103+
104+
<!-- <video controls width="640" height="480"> <source src="./img/speech-face-recognition-motion-detection.webm" type="video/webm"> -->
105+
106+
{{< youtubeLite id="meZDJf8QTdM" label="Offline Speech Recognition & Face Recognition & Motion Detection" params="start=130&end=188&controls=0" >}}
107+
108+
109+
### Remote Controlled Reconnaissance Robot
110+
111+
The ESP-SparkBot can function as a wireless, remote-controlled reconnaissance robot, responding to voice commands to control its movement direction and light displays. Additionally, users can issue voice commands to capture photos while the robot is in motion.
112+
113+
<!-- <video controls width="640" height="480"> <source src="./img/moving-car.webm" type="video/webm"> -->
114+
115+
{{< youtubeLite id="meZDJf8QTdM" label="Remote Controlled Reconnaissance Robot" params="start=213&end=229&controls=0" >}}
116+
117+
118+
By connecting to the robot's WebSocket server, users can achieve two-way interaction with mobile remote control and real-time video streaming. No dedicated app installation is required—users can simply access the remote control interface via a web browser, featuring a simulated joystick design for smooth and intuitive operation.
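
To make the joystick control concrete, the Python sketch below shows how a joystick position received over the WebSocket connection could be turned into left and right wheel speeds using simple arcade-drive mixing. The JSON message shape (`{"x": ..., "y": ...}`), the value range of -1 to 1, and the function names are assumptions for illustration; the actual protocol used by the robot's web interface may differ.

```python
import json


def clamp(value, low=-1.0, high=1.0):
    """Limit a value to the range [low, high]."""
    return max(low, min(high, value))


def joystick_to_wheels(x, y):
    """Arcade-drive mixing: y is forward/backward, x is turning.

    Returns (left, right) wheel speeds, each in the range -1..1.
    """
    left = clamp(y + x)
    right = clamp(y - x)
    return left, right


def handle_message(text):
    """Parse one hypothetical joystick message and return wheel speeds."""
    message = json.loads(text)
    return joystick_to_wheels(float(message["x"]), float(message["y"]))
```

With this mixing, pushing the stick straight forward drives both wheels at the same speed, while pushing it sideways spins the wheels in opposite directions so the robot turns on the spot.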
119+
120+
<!-- <video controls width="640" height="480"> <source src="./img/remote-control-car.webm" type="video/webm"> -->
121+
122+
{{< youtubeLite id="meZDJf8QTdM" label="Robot's WebSocket Server" params="start=231&end=291&controls=0" >}}
123+
124+
125+
### USB Screen Mirror
126+
127+
With just a single USB cable, the ESP-SparkBot transforms into a plug-and-play USB secondary display. It supports bi-directional audio transmission and touch control, enabling it to function as both a speaker and a microphone. In addition to providing smooth video streaming for TV shows, it offers an immersive experience for gaming, including esports and AAA titles.
128+
129+
130+
131+
<!-- <video controls width="640" height="480"> <source src="./img/usb-screen-mirror.webm" type="video/webm"> -->
132+
133+
{{< youtubeLite id="meZDJf8QTdM" label="USB Screen Mirror" params="start=294&end=3301&controls=0" >}}
134+
135+
## Hardware Design
136+
137+
The hardware system of the ESP-SparkBot is composed as follows:
138+
139+
140+
{{< figure
141+
default=true
142+
src="img/sparkbot-hardware.webp"
143+
height=500
144+
caption="ESP-SparkBot Hardware Design"
145+
>}}
146+
147+
148+
### Description of Different Circuit Blocks
149+
150+
- **Main MCU**: [ESP32-S3-WROOM-1-N16R8](https://www.espressif.com/sites/default/files/documentation/esp32-s3-wroom-1_wroom-1u_datasheet_en.pdf) module, responsible for controlling the entire system. It provides both connectivity (Wi-Fi and Bluetooth) and peripheral interfaces (LCD, camera, and audio).
151+
152+
- **Camera**: An [OV2640](https://www.waveshare.com/w/upload/9/92/Ov2640_ds_1.8_.pdf) camera with a DVP interface. It is used to capture images and transmit video streams.
153+
154+
- **Audio**: An audio module is used for both microphone input and speaker output. It transmits digital audio data via the I2S interface and drives the speaker to play audio signals through an audio amplifier circuit.
155+
156+
- **LCD (Display)**: A 1.54-inch display with 240x240 pixel resolution, driven by the ST7789 controller.
157+
158+
- **USB Type-C (USB-C Interface)**: Supports USB Type-C connection for device power supply and data transmission. Supports USB serial communication for debugging and flashing firmware.
159+
160+
- **DC-DC Converter**: Responsible for converting the input voltage to the stable voltage required by the ESP32-S3 and other modules.
161+
162+
- **Connector Pins (External Connectors)**: Connection interfaces for external devices, facilitating interconnection with other modules or development boards. Can be used for expanding functionality and debugging.
163+
164+
- **Touch (Touch Circuit)**: Touch sensing circuit for detecting user touch operations.
165+
166+
- **Microphone Module**: Microphone signal biasing network.
167+
168+
- **Power Switching Circuit**: Responsible for switching between different power inputs (such as battery and USB). Ensures stable operation of the device under multiple power supply methods.
169+
170+
- **Lithium Battery Charging Circuit**: Manages the charging process of the lithium battery, supporting battery overcharge protection and constant current/constant voltage charging. Provides a portable power solution for the device.
171+
172+
- **IMU (BMI270) (Inertial Measurement Unit)**: Used to detect acceleration and angular velocity. It enables motion detection, attitude detection, and gesture recognition.
173+
174+
- **Button**: Detects button input for user interaction, enabling function switching or mode selection.
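
As a back-of-the-envelope check on the display block, the Python sketch below estimates the size of one full frame for the 240x240 LCD and how long it would take to push over a serial bus. The pixel format (RGB565, 2 bytes per pixel) and the 40 MHz single-line bus clock used in the usage note are assumptions for illustration only; the article does not state either value.

```python
WIDTH = 240
HEIGHT = 240
BYTES_PER_PIXEL = 2  # assumption: RGB565 pixel format

# Size of one full frame in bytes.
frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL


def frame_transfer_ms(bus_hz, bits_per_clock=1):
    """Estimate the time to send one full frame, ignoring protocol overhead.

    bus_hz and bits_per_clock describe a hypothetical display bus.
    """
    bits = frame_bytes * 8
    return bits / (bus_hz * bits_per_clock) * 1000
```

Under these assumptions, one frame is 115,200 bytes, and `frame_transfer_ms(40_000_000)` gives roughly 23 ms per frame, which is an upper bound of about 43 full-screen updates per second before any other overhead.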
175+
176+
Complete open-source hardware resources are available at [ESP-SparkBot](https://oshwlab.com/hawaii0707/esp-sparkbot). For more ESP hardware design instructions, please refer to the [ESP Hardware Design Guidelines](https://docs.espressif.com/projects/esp-hardware-design-guidelines/en/latest/esp32s3/index.html#esp-hardware-design-guidelines).
177+
178+
179+
## Software Design
180+
181+
The software code for the ESP-SparkBot AI robot is fully open-source and available at [esp-friends/esp_sparkbot](https://gitee.com/esp-friends/esp_sparkbot). This repository includes a series of sample projects designed to showcase the full capabilities of the ESP-SparkBot. For more details, see: [ESP-SparkBot examples](https://gitee.com/esp-friends/esp_sparkbot/tree/master/example).
182+
183+
## Conclusion
184+
185+
The ESP-SparkBot is a versatile AI-powered robot offering voice interaction, facial recognition, motion detection, and multimedia features for smart home, office, and entertainment use. Both its hardware and software are open-source, and contributions from the community are welcome. This makes advanced AI accessible and practical for everyday applications, enhancing both personal and professional experiences.
