Real-time human activity recognition system using RAG-based classification with LLM reasoning.
Flutter Mobile App
↓ WebSocket
Python Server (RAG-HAR Pipeline)
├─ Feature Extraction (temporal segmentation)
├─ Vector Search (Milvus/Zilliz)
└─ LLM Classification (GPT-5-mini)
↓ WebSocket
Mobile App (activity + reasoning)
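In code, this flow reduces to a message-type dispatch in websocket_server.py. A minimal sketch, assuming a recent version of the websockets package from requirements.txt; the stub functions stand in for the real modules and are purely illustrative:

```python
import asyncio
import json

import websockets  # listed in server/requirements.txt

def save_labeled_sample(msg): ...   # stand-in for data_collector.py
def run_rag_pipeline(): ...         # stand-in for rag_har_pipeline.py
def predict(msg): return None       # stand-in for activity_predictor.py

async def handle_client(websocket):
    """Route each incoming message by its 'type' field (protocol documented below)."""
    async for raw in websocket:
        msg = json.loads(raw)
        if msg["type"] == "collect_data":
            save_labeled_sample(msg)
        elif msg["type"] == "stop_collection":
            run_rag_pipeline()
        elif msg["type"] == "predict_activity":
            result = predict(msg)    # returns a prediction once per full window
            if result is not None:
                await websocket.send(json.dumps(result))

async def main():
    async with websockets.serve(handle_client, "0.0.0.0", 8000):
        await asyncio.Future()  # run until cancelled

asyncio.run(main())
```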
har-demo/
├── mobile/ # Flutter Android app
│ ├── lib/
│ │ ├── models/ # SensorData, ActivityType
│ │ ├── providers/ # State management (Provider pattern)
│ │ ├── services/ # Sensors, WebSocket, permissions, orientation normalizer
│ │ ├── screens/ # Home, Data collection, Workouts, Progress, Settings
│ │ └── widgets/ # UI components
│ └── pubspec.yaml
│
├── server/ # Python WebSocket server + RAG pipeline
│ ├── websocket_server.py # WebSocket handler
│ ├── data_collector.py # Save labeled sensor data
│ ├── activity_predictor.py # Sliding window + RAG classifier
│ ├── rag_har_pipeline.py # Pipeline orchestrator
│ ├── prediction_data_logger.py # Log predictions for analysis
│ ├── rag-har/ # RAG pipeline modules
│ │ ├── preprocessing.py # Train/test split, windowing
│ │ ├── features.py # Statistical feature extraction
│ │ ├── feature_utils.py # Common feature utilities
│ │ ├── timeseries_indexing.py # Vector database indexing
│ │ └── classifier.py # RAG-based classifier
│ └── requirements.txt
│
└── CLAUDE.md # Development guide
- Data Collection Mode: Label and collect sensor data for training
- 50Hz sampling (accelerometer, gyroscope)
- Subject ID and activity labeling
- CSV export with timestamps
- Activity Recognition Mode: Real-time prediction with explanation
- Window-based collection (200 samples = 4 seconds)
- Activity display with LLM reasoning
- History tracking
- Orientation Normalization: Sensor data transformed to world frame (see the sketch after this list)
- Gravity-based rotation to orientation-independent coordinates
- Consistent feature extraction regardless of device orientation
- Demo Mode: Test without physical sensors (works on emulators)
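The orientation transform itself isn't spelled out here, so the following numpy sketch shows one standard way to do gravity-based normalization: low-pass filter the accelerometer to estimate gravity, then rotate every sample so gravity aligns with the world z-axis. The function names and the filter constant are illustrative, not the repository's exact implementation.

```python
import numpy as np

def gravity_rotation(gravity: np.ndarray) -> np.ndarray:
    """Rotation matrix that maps the measured gravity direction onto world +z."""
    g = gravity / np.linalg.norm(gravity)
    z = np.array([0.0, 0.0, 1.0])
    v = np.cross(g, z)                 # rotation axis (unnormalized)
    c = float(np.dot(g, z))            # cosine of the rotation angle
    if np.isclose(c, -1.0):            # device exactly upside down: 180° flip
        return np.diag([1.0, -1.0, -1.0])
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    return np.eye(3) + vx + (vx @ vx) / (1.0 + c)   # Rodrigues' formula

def normalize_window(accel: np.ndarray, gyro: np.ndarray, alpha: float = 0.9):
    """Rotate (N, 3) accel/gyro samples into a gravity-aligned frame.

    Only tilt is removed; heading about the z-axis stays arbitrary, which is
    enough to make the statistical features orientation-independent.
    """
    gravity = accel[0].copy()          # seed the low-pass gravity estimate
    out_a = np.empty_like(accel)
    out_g = np.empty_like(gyro)
    for i, (a, w) in enumerate(zip(accel, gyro)):
        gravity = alpha * gravity + (1.0 - alpha) * a
        R = gravity_rotation(gravity)
        out_a[i] = R @ a
        out_g[i] = R @ w
    return out_a, out_g
```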
Data Collection:
- Saves labeled sensor data to collected_data/subject_timestamp/
- Triggers the RAG pipeline on the stop_collection signal
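A sketch of what the data_collector.py side might look like; the CSV column names and file layout are assumptions, not the repository's actual schema:

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Assumed column layout; the real data_collector.py may differ.
FIELDS = ["timestamp", "acc_x", "acc_y", "acc_z", "gyro_x", "gyro_y", "gyro_z"]

class DataCollector:
    def __init__(self, subject_id: str, activity: str, root: str = "collected_data"):
        stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")
        self.dir = Path(root) / f"{subject_id}_{stamp}"
        self.dir.mkdir(parents=True, exist_ok=True)
        self.file = open(self.dir / f"{activity}.csv", "w", newline="")
        self.writer = csv.DictWriter(self.file, fieldnames=FIELDS)
        self.writer.writeheader()

    def add(self, msg: dict):
        """Append one collect_data message (protocol below) as a CSV row."""
        acc, gyr = msg["data"]["accelerometer"], msg["data"]["gyroscope"]
        self.writer.writerow({
            "timestamp": msg["timestamp"],
            "acc_x": acc["x"], "acc_y": acc["y"], "acc_z": acc["z"],
            "gyro_x": gyr["x"], "gyro_y": gyr["y"], "gyro_z": gyr["z"],
        })

    def close(self):
        """Called on stop_collection; the server then kicks off the RAG pipeline."""
        self.file.close()
```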
RAG Pipeline Stages:
- Preprocessing: Train/test split, windowing
- Feature Extraction: Statistical features with temporal segmentation (whole, start, mid, end)
- Vector Indexing: Embed and store in Milvus/Zilliz
- Classification: Hybrid search + LLM reasoning
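Stages 2–3 compress to a few lines. A sketch assuming the OpenAI embeddings API and pymilvus's MilvusClient; the collection name, field names, and the textual serialization of features are assumptions:

```python
import os

import numpy as np
from openai import OpenAI
from pymilvus import MilvusClient

openai_client = OpenAI()  # reads OPENAI_API_KEY
milvus = MilvusClient(uri=os.environ["ZILLIZ_CLOUD_URI"],
                      token=os.environ["ZILLIZ_CLOUD_API_KEY"])

def describe(segment: np.ndarray) -> str:
    """Serialize per-axis statistics as text so they can be embedded."""
    stats = {"mean": segment.mean(axis=0), "std": segment.std(axis=0),
             "min": segment.min(axis=0), "max": segment.max(axis=0)}
    return "; ".join(f"{k}={np.round(v, 3).tolist()}" for k, v in stats.items())

def index_window(window: np.ndarray, activity: str, collection: str = "har_windows"):
    """Embed the whole window plus start/mid/end thirds and insert into Milvus.

    Assumes 'har_windows' already exists with an auto-id primary key and a
    1536-dim vector field (text-embedding-3-small's dimensionality).
    """
    n = len(window)
    segments = {"whole": window, "start": window[: n // 3],
                "mid": window[n // 3 : 2 * n // 3], "end": window[2 * n // 3 :]}
    rows = []
    for name, seg in segments.items():
        emb = openai_client.embeddings.create(
            model="text-embedding-3-small", input=describe(seg)
        ).data[0].embedding
        rows.append({"vector": emb, "segment": name, "activity": activity})
    milvus.insert(collection_name=collection, data=rows)
```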
Activity Prediction:
- Real-time classification using RAG
- Returns activity label + reasoning explanation
- Non-overlapping 4-second windows
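The non-overlapping window buffer this implies is simple to express; a sketch with illustrative names (the real logic lives in activity_predictor.py):

```python
WINDOW_SIZE = 200  # 50 Hz × 4 s

class WindowBuffer:
    """Accumulates samples and emits one full window at a time, never overlapping."""

    def __init__(self, size: int = WINDOW_SIZE):
        self.size = size
        self.samples: list[dict] = []

    def add(self, sample: dict) -> list[dict] | None:
        """Return a complete 200-sample window, or None while still filling."""
        self.samples.append(sample)
        if len(self.samples) < self.size:
            return None
        window, self.samples = self.samples, []   # flush so windows never overlap
        return window
```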
Supported Activities:
- Walking
- Running
- Sitting
- Standing
- Jumping
- Lying
Server Setup:
cd server
# Install dependencies
pip install -r requirements.txt
# Set environment variables
export OPENAI_API_KEY="your-key"
export ZILLIZ_CLOUD_URI="your-uri"
export ZILLIZ_CLOUD_API_KEY="your-key"
# Run server
python websocket_server.py
The server listens on ws://0.0.0.0:8000 (clients connect to the /ws endpoint).
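A quick connectivity check before touching the app, assuming a recent websockets package; the /ws path matches the client settings shown below:

```python
import asyncio

import websockets

async def smoke_test():
    # The /ws path matches the URL configured in the mobile app's settings.
    async with websockets.connect("ws://127.0.0.1:8000/ws") as ws:
        print("connected to", ws.remote_address)

asyncio.run(smoke_test())
```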
Mobile Setup:
cd mobile
# Install dependencies
flutter pub get
# Run app
flutter run
# Build apk
flutter build apk --release
# Connect the phone over USB, then install
adb install build/app/outputs/flutter-apk/app-release.apk
# ...or push the APK to the device and install it manually
adb push build/app/outputs/flutter-apk/app-release.apk /sdcard/Download/
Android Emulator:
- Settings → WebSocket URL: ws://10.0.2.2:8000/ws
Physical Device:
# Find your local IP
ifconfig | grep "inet " | grep -v 127.0.0.1
# In app settings
ws://YOUR_LOCAL_IP:8000/ws
Collect Training Data:
- Open app → Data Collection
- Select activity label
- Tap Start, perform activity for 1-2 mins
- Tap Stop (triggers RAG pipeline automatically)
- Server processes and indexes data to vector database
Recognize Activities:
- Open app → Activity Recognition
- Tap Start
- Perform activity
- View prediction with LLM reasoning every 4 seconds
WebSocket Message Formats:
Collect Data:
{
  "type": "collect_data",
  "subject_id": "subject0",
  "activity": "walking",
  "timestamp": "2025-01-05T10:30:45.123Z",
  "data": {
    "accelerometer": {"x": 0.123, "y": 9.81, "z": 0.045},
    "gyroscope": {"x": 0.001, "y": -0.002, "z": 0.0}
  }
}
Stop Collection:
{
  "type": "stop_collection",
  "timestamp": "2025-01-05T10:30:45.123Z"
}
Predict Activity:
{
  "type": "predict_activity",
  "timestamp": "2025-01-05T10:30:45.123Z",
  "data": {
    "accelerometer": {"x": 0.123, "y": 9.81, "z": 0.045},
    "gyroscope": {"x": 0.001, "y": -0.002, "z": 0.0}
  }
}
Activity Prediction:
{
  "type": "activity_prediction",
  "activity": "walking",
  "reasoning": "The acceleration patterns show periodic vertical oscillations...",
  "timestamp": "2025-01-05T10:30:45.678Z",
  "window_size": 200,
  "method": "rag_classifier"
}
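Putting the message formats together, a scripted client might look like the following; the dummy sensor values and loop structure are illustrative:

```python
import asyncio
import json
from datetime import datetime, timezone

import websockets

async def stream_samples(url: str = "ws://127.0.0.1:8000/ws"):
    async with websockets.connect(url) as ws:
        for _ in range(200):  # one full window at 50 Hz
            await ws.send(json.dumps({
                "type": "predict_activity",
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "data": {
                    "accelerometer": {"x": 0.0, "y": 9.81, "z": 0.0},  # dummy values
                    "gyroscope": {"x": 0.0, "y": 0.0, "z": 0.0},
                },
            }))
            await asyncio.sleep(1 / 50)  # emulate 50 Hz sampling
        reply = json.loads(await ws.recv())  # one activity_prediction per window
        print(reply["activity"], "-", reply["reasoning"])

asyncio.run(stream_samples())
```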
How it works:
- Feature Extraction: Split the 4-second window into temporal segments (whole, start 33%, mid 33%, end 33%)
- Embedding: Generate vector embeddings for each segment using OpenAI text-embedding-3-small
- Retrieval: Hybrid search in Milvus for top 10 similar training samples (15 per segment → weighted reranking)
- Classification: GPT-5-mini analyzes retrieved examples and predicts activity with reasoning
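The retrieval-and-rerank step can be sketched with pymilvus's MilvusClient; the per-segment weights, field names, and the idea of aggregating weighted scores per activity are illustrative assumptions, not the repository's exact reranking:

```python
from collections import Counter

# Illustrative per-segment weights; the real pipeline's reranking may differ.
SEGMENT_WEIGHTS = {"whole": 2.0, "start": 1.0, "mid": 1.0, "end": 1.0}

def retrieve_and_rerank(milvus, segment_embeddings: dict, collection="har_windows"):
    """Fetch 15 neighbors per segment, then aggregate weighted similarity per activity.

    Assumes a similarity metric where a larger 'distance' means more similar
    (e.g. inner product); returns the 10 strongest candidates for the LLM prompt.
    """
    scores = Counter()
    for name, emb in segment_embeddings.items():
        hits = milvus.search(
            collection_name=collection,
            data=[emb],
            limit=15,
            filter=f'segment == "{name}"',
            output_fields=["activity"],
        )[0]
        for hit in hits:
            scores[hit["entity"]["activity"]] += SEGMENT_WEIGHTS[name] * hit["distance"]
    return scores.most_common(10)
```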
Why RAG?
- ✅ Human-readable explanations for each prediction
- ✅ Semantic understanding vs pattern matching
- ✅ Add new activities without retraining (just update vector DB)
- ✅ Leverages LLM knowledge about physics and motion
- Mobile: Flutter, Provider, sensors_plus
- Server: Python, websockets, asyncio
- RAG: LangChain, OpenAI (embeddings + GPT-5-mini)
- Vector DB: Milvus/Zilliz Cloud
- Features: pandas, numpy, scipy (statistical feature extraction)
For server-specific documentation, see server/README.md.