aibgr/Crawling-Robot-with-Reinforcement-Learning

ESP32-RL-Crawler

Fully Autonomous On-Board Q-Learning – 2-Servo Crawling Robot with Ultrasonic Reward & Swarm-Ready OTA

Platform · RL · WiFi/OTA · Swarm · License: MIT · Stars

Zero off-board computation – Complete Q-Learning (35×35 table) runs on ESP32 at 100–200 Hz
Real-time reward directly from HC-SR04 ultrasonic sensor
Supports up to 8 robots simultaneously with unique WiFi AP + OTA hostname
Full source code + schematics included
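
As a rough illustration of how a stored robot ID can map to a per-robot access point name, the sketch below builds the `ESP32-AP-<id>` SSID listed under Key Features. The helper names (`isValidRobotId`, `makeSsid`) and the fallback to ID 1 for out-of-range values are assumptions made for this example, not taken from the repository, and the OTA hostname format is left out because the README does not state it.

```cpp
#include <string>

// Valid ID range from the README: robots 1 through 8.
constexpr int kMinRobotId = 1;
constexpr int kMaxRobotId = 8;

// Hypothetical helper: true when the stored ID is inside the supported range.
bool isValidRobotId(int id) {
    return id >= kMinRobotId && id <= kMaxRobotId;
}

// Hypothetical helper: builds the SSID "ESP32-AP-<id>".
// Falling back to ID 1 for an out-of-range value (for example a byte
// that was never programmed) is an assumption of this sketch.
std::string makeSsid(int id) {
    const int safeId = isValidRobotId(id) ? id : kMinRobotId;
    return "ESP32-AP-" + std::to_string(safeId);
}
```

On the device, the resulting string would be handed to whatever call starts the access point; the code above only covers the naming step.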

Key Features

  • 35 discrete posture states + 35 possible actions → 1225 Q-values stored in RAM
  • Distance-based reward: r = 2.5 × Δdistance (encourages forward motion)
  • ε-greedy with decay (0.8 → 0.1) → converges in ~10 episodes (~2–3 min)
  • Smooth servo trajectories with configurable speed
  • Startup health check (ultrasonic + servo reset)
  • I2C 16×2 LCD real-time feedback
  • EEPROM-persisted robot ID (1–8) → unique SSID: ESP32-AP-1 to ESP32-AP-8
  • Non-blocking OTA updates via FreeRTOS task
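
The reward and exploration features above can be sketched in portable C++ as follows. This is a minimal illustration under stated assumptions, not the firmware's actual code: the table is assumed to hold 4-byte floats (1225 values, about 4.9 KB), Δdistance is assumed to be "current reading minus previous reading" (the sign that counts as forward depends on how the sensor is mounted), and the per-episode multiplicative decay factor `kEpsilonDecay` is hypothetical because the README gives only the start and end values of ε.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <random>

constexpr std::size_t kNumStates = 35;   // discrete posture states
constexpr std::size_t kNumActions = 35;  // possible actions per state
constexpr float kRewardGain = 2.5f;      // r = 2.5 x delta distance
constexpr float kEpsilonStart = 0.8f;    // initial exploration rate
constexpr float kEpsilonMin = 0.1f;      // final exploration rate
constexpr float kEpsilonDecay = 0.8f;    // ASSUMED per-episode decay factor

// 35 x 35 = 1225 Q-values; float storage is an assumption of this sketch.
using QTable = std::array<std::array<float, kNumActions>, kNumStates>;

// Reward from two consecutive distance readings (same unit for both).
// ASSUMPTION: a larger reading means forward progress; flip the sign
// if the sensor faces the direction of travel.
float computeReward(float previousDistance, float currentDistance) {
    return kRewardGain * (currentDistance - previousDistance);
}

// Index of the highest-valued action in the given state (first one on ties).
std::size_t greedyAction(const QTable& q, std::size_t state) {
    const auto& row = q[state];
    return static_cast<std::size_t>(
        std::max_element(row.begin(), row.end()) - row.begin());
}

// Epsilon-greedy: explore with probability epsilon, otherwise exploit.
template <typename Rng>
std::size_t chooseAction(const QTable& q, std::size_t state,
                         float epsilon, Rng& rng) {
    std::uniform_real_distribution<float> coin(0.0f, 1.0f);
    if (coin(rng) < epsilon) {
        std::uniform_int_distribution<std::size_t> pick(0, kNumActions - 1);
        return pick(rng);
    }
    return greedyAction(q, state);
}

// Shrinks epsilon once per episode, never dropping below the floor.
float decayEpsilon(float epsilon) {
    return std::max(kEpsilonMin, epsilon * kEpsilonDecay);
}
```

Starting from `kEpsilonStart` and calling `decayEpsilon` after each episode, the assumed factor of 0.8 reaches the 0.1 floor after ten episodes, which is in line with the roughly ten-episode convergence mentioned above; a different schedule (for example linear decay) would fit the README equally well.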

Core Q-Learning Update Rule (executed every step on ESP32)

Q-Learning formula
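
Written out in text form, the standard tabular Q-learning update is Q(s, a) ← Q(s, a) + α · [r + γ · max over a′ of Q(s′, a′) − Q(s, a)], where α is the learning rate and γ the discount factor. The sketch below shows one way to apply that rule to a 35×35 table in portable C++. The function name `updateQ`, the float storage, and passing α and γ as parameters are assumptions of this example; the README does not state the values the firmware uses.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

constexpr std::size_t kNumStates = 35;
constexpr std::size_t kNumActions = 35;

// 35 x 35 table; float storage is an assumption of this sketch.
using QTable = std::array<std::array<float, kNumActions>, kNumStates>;

// One Q-learning step for the transition (state, action) -> nextState
// that produced `reward`. alpha (learning rate) and gamma (discount
// factor) are supplied by the caller; their values are not from the README.
void updateQ(QTable& q, std::size_t state, std::size_t action,
             float reward, std::size_t nextState,
             float alpha, float gamma) {
    // Best value reachable from the next state, read before any write
    // so the result is also correct when nextState == state.
    const auto& nextRow = q[nextState];
    const float bestNext = *std::max_element(nextRow.begin(), nextRow.end());

    // Temporal-difference target and in-place update.
    const float target = reward + gamma * bestNext;
    q[state][action] += alpha * (target - q[state][action]);
}
```

For example, with an all-zero table except `q[1][2] = 4`, calling `updateQ(q, 0, 5, 1.0f, 1, 0.5f, 0.5f)` gives a target of 1 + 0.5 · 4 = 3 and moves `q[0][5]` halfway from 0 to 3, that is to 1.5.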

Say Hi!

Robot prototype

2-servo crawling robot learning an optimal forward gait completely on-board, using only an ultrasonic sensor as the reward signal

Hardware

  • ESP32 Dev Module
  • 2× SG90/MG90S servo motors
  • HC-SR04 ultrasonic sensor
  • 16×2 I2C LCD
  • Custom 3D-printed linkage mechanism
