
An Elderly Learning Digital Agent that turns using a laptop into a simple, stress-free experience.

ELDA - Elderly Learning Digital Agent

Elda is an intelligent voice assistant specifically designed to help elderly users navigate technology with confidence. Using advanced AI and natural language processing, Elda provides step-by-step visual tutorials triggered by simple voice commands, making technology more accessible for seniors.

🌟 Features

🎤 Voice Interaction

  • Wake Word Detection: Uses Porcupine wake word engine with custom "Hey Elda" trigger
  • Speech-to-Text: Powered by OpenAI Whisper for accurate voice transcription
  • Intent Recognition: AI-powered command understanding using Google Gemini
  • Text-to-Speech: ElevenLabs integration for natural voice responses
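The four stages above form a single loop: listen, transcribe, interpret, respond. A minimal sketch of that loop, with the stage functions injected so any engine (Whisper, Gemini, ElevenLabs, or a plain keyword matcher) can be plugged in — the names here are illustrative, not Elda's actual module API:

```python
def run_pipeline(audio, transcribe, detect_intent, handlers, announce):
    """Hypothetical orchestration of the voice pipeline stages."""
    text = transcribe(audio)        # speech-to-text stage (e.g. Whisper)
    intent = detect_intent(text)    # intent recognition (e.g. Gemini)
    handler = handlers.get(intent)
    if handler is None:
        return announce("Sorry, I didn't understand that.")
    return announce(handler(text))  # spoken response (e.g. ElevenLabs)
```

Because the stages are plain callables, each one can be unit-tested with stubs before wiring in the real services.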

πŸŽ›οΈ System Control

  • Volume Control: Increase/decrease system volume by 50% or custom amounts
  • Brightness Control: Adjust screen brightness with voice commands
  • Screen Zoom: Control macOS accessibility zoom features
  • Smart Defaults: All commands default to 50% adjustments for a better user experience
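The default-and-clamp behavior behind these controls can be sketched as a small helper — a hypothetical function, not the actual `volume.py` API, assuming levels run 0–100:

```python
def apply_adjustment(current, delta=None, default=50):
    """Return a new 0-100 level; fall back to the 50-point default
    when the user gives no amount, and clamp to the valid range."""
    if delta is None:
        delta = default
    return max(0, min(100, current + delta))
```

For example, "increase volume" with no amount moves a level of 30 to 80, while "turn up volume by 25" from 80 clamps at 100.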

📚 Interactive Tutorials

  • AI-Generated Guides: Creates step-by-step tutorials for any topic
  • Visual Interface: Beautiful Electron-based tutorial cards
  • Progress Tracking: Save and resume tutorial progress
  • Detailed Help: Expandable help sections for each step
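Save-and-resume can be as simple as persisting the current step to disk. A minimal sketch — the file layout is hypothetical, not Elda's actual storage format:

```python
import json
import os

def save_progress(path, topic, step):
    """Persist the tutorial topic and current step for later resume."""
    with open(path, "w") as f:
        json.dump({"topic": topic, "step": step}, f)

def load_progress(path):
    """Return the saved progress dict, or None if nothing was saved."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)
```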

🎨 Modern Interface

  • React Frontend: Responsive tutorial interface with smooth animations
  • Always-on-Top Window: Non-intrusive popup that stays accessible
  • Visual States: Different Elda avatars for listening, thinking, and tutorial modes
  • Real-time Updates: Live WebSocket communication between components
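The live updates ride on JSON messages pushed over the WebSocket. A sketch of how the Python side might build one — the field names are illustrative; the real protocol lives in `websocket_client.py` and `main.js`:

```python
import json

def make_state_message(state, payload=None):
    """Build a JSON state update for the Electron frontend,
    e.g. state='listening', 'thinking', or 'tutorial'."""
    return json.dumps({"type": "state", "state": state, "payload": payload or {}})
```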

πŸ—οΈ Architecture

Elda/
├── voice.py                 # Main wake word detection and orchestration
├── speech2text/
│   ├── stt_capture.py      # Speech processing and intent handling
│   └── howto_generator.py  # Flask API for tutorial generation
├── tts_announcer.py        # Text-to-speech with ElevenLabs
├── volume.py               # System volume control
├── brightness.py           # Screen brightness control
├── zoom_controller/        # macOS zoom accessibility features
├── websocket_client.py     # Electron communication
└── elda-app/              # Electron frontend
    ├── main.js            # Electron main process
    ├── renderer/          # React frontend
    │   ├── App.jsx        # Main tutorial interface
    │   ├── components/    # React components
    │   └── styles.css     # Styling
    └── package.json       # Node.js dependencies

🚀 Quick Start

Prerequisites

  • Python 3.9+
  • Node.js 16+
  • macOS (for system controls)
  • Microphone access

1. Clone and Setup

git clone <repository-url>
cd Elda

2. Python Environment

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

3. Node.js Dependencies

cd elda-app
npm install
cd ..

4. Environment Configuration

Create a .env file in the root directory:

# Wake Word Detection
ACCESS_KEY=your_porcupine_access_key

# AI Services
OPENAI_API_KEY=your_openai_api_key
GEMINI_API_KEY=your_gemini_api_key

# Text-to-Speech
ELEVENLABS_API_KEY=your_elevenlabs_api_key
ELEVENLABS_VOICE_ID=your_voice_id
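Since every service above needs its key, it helps to fail fast at startup when one is missing. A sketch of such a check — an illustrative helper, not part of Elda's codebase:

```python
import os

REQUIRED_KEYS = [
    "ACCESS_KEY", "OPENAI_API_KEY", "GEMINI_API_KEY",
    "ELEVENLABS_API_KEY", "ELEVENLABS_VOICE_ID",
]

def missing_keys(env=os.environ):
    """Return the required settings that are unset or empty,
    so startup can abort with a clear message."""
    return [k for k in REQUIRED_KEYS if not env.get(k)]
```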

5. Start the Services

Terminal 1 - Main Voice Assistant:

source .venv/bin/activate
python voice.py

Terminal 2 - Tutorial Generator API:

source .venv/bin/activate
python speech2text/howto_generator.py

Terminal 3 - Electron Frontend:

cd elda-app
npm start

🎯 Voice Commands

System Control

  • "Hey Elda, increase volume" - Increase volume by 50%
  • "Hey Elda, decrease volume" - Decrease volume by 50%
  • "Hey Elda, turn up volume by 25" - Custom volume adjustment
  • "Hey Elda, make it brighter" - Increase screen brightness
  • "Hey Elda, zoom in" - Zoom into screen
  • "Hey Elda, zoom out" - Zoom out of screen

Information & Help

  • "Hey Elda, introduce yourself" - Get Elda's introduction
  • "Hey Elda, show me how to [topic]" - Generate interactive tutorial
  • "Hey Elda, how do I [action]" - Get step-by-step guidance

Examples

  • "Hey Elda, how do I send an email?"
  • "Hey Elda, can you make this video louder?"
  • "Hey Elda, zoom in on the screen please?"

🔧 Configuration

Custom Wake Word

Replace hello_elda.ppn with your custom Porcupine wake word model.

Voice Settings

Modify voice settings in tts_announcer.py:

self.voice_id = os.getenv("ELEVENLABS_VOICE_ID", "your_default_voice")

Tutorial Customization

Adjust tutorial generation in howto_generator.py:

HOWTO_SYSTEM_PROMPT = """Your custom prompt here..."""

πŸ› οΈ Development

Adding New Commands

  1. Add Intent Detection in speech2text/stt_capture.py:

     # In detect_intent_keywords()
     if any(word in text for word in ["your", "keywords"]):
         return "your_new_intent"

  2. Handle the Command:

     elif intent == "your_new_intent":
         print("🆕 New command detected!")
         # Your implementation here

  3. Update the Gemini prompt to recognize the new intent.
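Put together, the first two steps look like this — using a made-up "take_screenshot" intent for illustration, not a real Elda command:

```python
def detect_intent_keywords(text):
    """Keyword fallback used when the AI intent service is unavailable.
    'take_screenshot' is a hypothetical example intent."""
    text = text.lower()
    if any(word in text for word in ["screenshot", "capture"]):
        return "take_screenshot"
    return "unknown"

def handle_command(intent):
    """Dispatch a detected intent to its implementation."""
    if intent == "take_screenshot":
        return "Taking a screenshot now."
    return "Sorry, I don't know that command yet."
```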

Adding New System Controls

Create a new module (e.g., new_feature.py) and integrate it:

from new_feature import your_function
# Add to handle_command()
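A new module can be a single pure function that returns the text Elda should speak; side effects (AppleScript, macOS APIs) stay behind it. A hypothetical new_feature.py, purely for illustration:

```python
# new_feature.py (hypothetical module)
def set_night_mode(enabled):
    """Toggle a made-up 'night mode'. A real module would shell out
    to an AppleScript or a macOS API here, then report the result."""
    return "Night mode on." if enabled else "Night mode off."
```

The returned string is what handle_command() would pass on to the text-to-speech announcer.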

📱 Interface Components

EldaState Component

  • Listening State: Shows Elda avatar when ready to listen
  • Thinking State: Animated thinking gif during processing
  • Tutorial State: Interactive tutorial cards

TutorialCard Component

  • Step Navigation: Next/Previous buttons
  • Progress Tracking: Visual progress indicators
  • Detailed Help: Expandable help sections
  • Completion Tracking: Mark steps as completed

πŸ” Permissions

macOS Accessibility

For zoom and brightness controls, grant Terminal accessibility permissions:

  1. System Preferences → Security & Privacy → Privacy → Accessibility
  2. Add Terminal (or your terminal app)
  3. Enable the checkbox

Microphone Access

Grant microphone permissions for wake word detection and speech recognition.

πŸ› Troubleshooting

Common Issues

"Wake word not detected"

  • Check microphone permissions
  • Verify Porcupine access key
  • Ensure quiet environment

"Intent detection fails"

  • Check Gemini API quota (50 requests/day free tier)
  • System falls back to keyword matching automatically
  • Verify API keys in .env

"Tutorial not showing"

  • Ensure the Flask server is running (python speech2text/howto_generator.py)
  • Check that port 3000 is available
  • Verify the Electron frontend is running

"Volume/Brightness not working"

  • Grant accessibility permissions
  • Try running from terminal with elevated permissions
  • Check macOS version compatibility

Logs and Debugging

Enable debug mode by setting environment variables:

export DEBUG=1
python voice.py
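Inside Python, a DEBUG=1 gate can be as small as one env check. A sketch of such a guard — illustrative only; voice.py may gate its logging differently:

```python
import os

def debug_log(message, stream=print):
    """Emit a diagnostic line only when DEBUG=1 is exported,
    mirroring the shell command above. Returns whether it printed."""
    if os.getenv("DEBUG") == "1":
        stream(f"[debug] {message}")
        return True
    return False
```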

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add your improvements
  4. Test thoroughly
  5. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
