Real-time video analytics with AI - Natural language queries for YouTube, webcam, and video files

AI Video Query System

A real-time video analytics application built with Next.js that allows users to query video content using natural language. Supports YouTube live streams, video files, and webcam feeds with AI-powered analysis using Google's Gemini API.

🚀 Live Demo

View in AI Portfolio Dashboard | Direct Access

Part of Nihal's AI Portfolio - Unified dashboard featuring 5 cutting-edge AI services

Features

  • AI-Powered Analysis - Gemini 2.0 Flash for video understanding
  • Multiple Input Sources
    • YouTube live streams
    • Video file uploads
    • Webcam capture
  • Natural Language Queries - Ask questions in plain English
  • Real-Time Object Detection - COCO-SSD model for live object detection
  • Advanced Computer Vision - Face detection, hand tracking, and pose estimation
  • Session Analytics - Track and export detection data

Prerequisites

  • Node.js 18 or higher
  • A Gemini API key from Google AI Studio

Installation

  1. Clone the repository
git clone https://github.com/nihal-5/video-query.git
cd video-query
  2. Install dependencies
npm install
  3. Set up environment variables
cp .env.example .env.local

Add your Gemini API key to .env.local:

GEMINI_API_KEY=your_actual_api_key_here
  4. Run the development server
npm run dev
  5. Open http://localhost:3000 in your browser
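
A missing or placeholder key tends to surface only when the first query is sent. The sketch below shows one way to fail fast instead. It is an illustration, not code from the repository: `requireEnv` and the `Env` type are hypothetical names, and the environment is passed in as a parameter so the helper can be tested without touching `process.env`.

```typescript
// Hypothetical helper, not taken from the repository: fail fast when a
// required environment variable is missing or still holds a placeholder.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string, placeholder?: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`${name} is not set; add it to .env.local and restart the dev server`);
  }
  if (placeholder !== undefined && value === placeholder) {
    throw new Error(`${name} still contains the placeholder value`);
  }
  return value;
}

// Possible server-side usage (for example inside a route handler):
// const apiKey = requireEnv(process.env, "GEMINI_API_KEY", "your_actual_api_key_here");
```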

Tech Stack

  • Framework: Next.js 15 with App Router
  • Language: TypeScript
  • Styling: Tailwind CSS
  • AI/ML:
    • Google Gemini 2.0 Flash API
    • TensorFlow.js with COCO-SSD
    • MediaPipe for advanced detection
  • UI Components: Framer Motion, Lucide React

Project Structure

video-query/
├── app/
│   ├── api/query/        # API endpoint for AI queries
│   ├── page.tsx          # Main application page
│   └── layout.tsx        # Root layout
├── components/
│   ├── VideoSelector.tsx
│   ├── LiveYouTubeStream.tsx
│   ├── WebcamCapture.tsx
│   ├── QueryInterface.tsx
│   ├── SessionManager.tsx
│   └── AdvancedSessionManager.tsx
├── hooks/
│   ├── useObjectDetection.ts
│   ├── useHandTracking.ts
│   └── useAdvancedFaceDetection.ts
└── lib/
    ├── gemini.ts         # Gemini API client
    └── publicCameras.ts  # Camera configurations

Usage

Query Live Streams

  1. Select "Live YouTube Stream" mode
  2. Paste a YouTube live stream URL
  3. Ask questions about the video content

Upload Videos

  1. Select "YouTube Videos" mode
  2. Upload a video file or paste a YouTube URL
  3. Pause at any frame and ask questions

Use Webcam

  1. Select "Your Webcam" mode
  2. Allow camera permissions
  3. View real-time object detection and analysis on the live feed
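
Querying a frame needs a still image taken from the playing video. The sketch below shows one common browser approach for a `<video>` element, such as a webcam feed or an uploaded file: draw the current frame onto a canvas and read it back as base64. The function names `captureFrame` and `stripDataUrlPrefix` are hypothetical, and so are the choices of JPEG encoding, a quality of 0.8, and stripping the data URL prefix. The repository may do this differently. YouTube embeds are cross-origin iframes, so this approach does not apply to them.

```typescript
// Remove the "data:image/...;base64," prefix from a data URL, if present.
function stripDataUrlPrefix(dataUrl: string): string {
  const marker = ";base64,";
  const index = dataUrl.indexOf(marker);
  return index === -1 ? dataUrl : dataUrl.slice(index + marker.length);
}

// Browser-only sketch: copy the current frame of a <video> element onto a
// canvas and return it as a base64-encoded JPEG string.
function captureFrame(video: HTMLVideoElement, quality = 0.8): string {
  const canvas = document.createElement("canvas");
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const context = canvas.getContext("2d");
  if (!context) {
    throw new Error("2D canvas context is not available");
  }
  context.drawImage(video, 0, 0, canvas.width, canvas.height);
  return stripDataUrlPrefix(canvas.toDataURL("image/jpeg", quality));
}
```

Whether the query endpoint expects the raw base64 string or the full data URL is not stated here, so check the server code before relying on the prefix stripping.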

Example Queries

  • "What objects are visible in this frame?"
  • "How many people can you see?"
  • "Describe the current scene"
  • "What is happening in the video?"

API Reference

Query Endpoint

POST /api/query

Request body:

{
  "imageData": "base64_encoded_image",
  "query": "What do you see?",
  "cameraName": "Source name"
}

Response:

{
  "result": "AI-generated description..."
}
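
The request and response shapes above can be exercised from a small typed client. This is a sketch under stated assumptions, not the project's own client code: `queryFrame`, `FetchLike`, and `isQueryResponse` are hypothetical names, the `Content-Type: application/json` header is assumed, and failures are assumed to come back as non-2xx status codes. The fetch function is passed in so the client can be stubbed in tests.

```typescript
type QueryRequest = { imageData: string; query: string; cameraName: string };
type QueryResponse = { result: string };

// Minimal shape of the fetch function this sketch relies on.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Runtime check that the parsed JSON matches the documented response shape.
function isQueryResponse(value: unknown): value is QueryResponse {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { result?: unknown }).result === "string"
  );
}

// POST a frame and a question to /api/query and return the generated answer.
async function queryFrame(
  fetchImpl: FetchLike,
  baseUrl: string,
  request: QueryRequest
): Promise<string> {
  const response = await fetchImpl(`${baseUrl}/api/query`, {
    method: "POST",
    headers: { "Content-Type": "application/json" }, // assumed header
    body: JSON.stringify(request),
  });
  if (!response.ok) {
    throw new Error(`Query failed with HTTP ${response.status}`);
  }
  const data = await response.json();
  if (!isQueryResponse(data)) {
    throw new Error("Unexpected response shape from /api/query");
  }
  return data.result;
}

// Possible usage against a local dev server:
// const answer = await queryFrame(fetch, "http://localhost:3000", {
//   imageData: "base64_encoded_image",
//   query: "What do you see?",
//   cameraName: "Source name",
// });
```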

Development

Build for production:

npm run build

Start production server:

npm start

Run linter:

npm run lint

Troubleshooting

Node.js version issues:

  • Ensure you have Node.js 18 or higher: node --version
  • If you are using an older version, update via nodejs.org
  • Consider using nvm for version management
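
The version requirement can also be checked in code, for example in a setup script. A minimal sketch, assuming a hypothetical helper named `isSupportedNodeVersion` that is not part of the repository:

```typescript
// Hypothetical helper: return true when a Node.js version string such as
// "v18.17.0" has a major version at or above the required minimum.
function isSupportedNodeVersion(version: string, minimumMajor = 18): boolean {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) {
    return false; // unrecognised format, treat as unsupported
  }
  return Number(match[1]) >= minimumMajor;
}

// Possible usage in a Node.js script:
// if (!isSupportedNodeVersion(process.version)) {
//   throw new Error(`Node.js 18 or higher is required, found ${process.version}`);
// }
```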

API key errors:

  • Verify .env.local file exists in project root
  • Check the API key value: confirm it was copied in full from Google AI Studio, with no extra spaces or quotes
  • Restart dev server after updating environment variables
  • Get your key from Google AI Studio

Port 3000 already in use:

  • Find and kill process: lsof -ti:3000 | xargs kill -9
  • Or use a different port: npm run dev -- -p 3001

Webcam permission denied:

  • Check browser permissions in settings
  • Ensure you're using HTTPS or localhost
  • Try a different browser if issues persist
  • Some browsers require you to grant camera permission explicitly for each site

Build failures:

  • Clear cache: rm -rf .next node_modules
  • Reinstall dependencies: npm install
  • Check for lint errors: npm run lint (type errors can also be checked with npx tsc --noEmit)
  • Ensure all peer dependencies are compatible

License

MIT License - Free to use for personal and educational purposes

Author

Nihal Gupta
GitHub: @nihal-5
Location: Raleigh/Cary, NC

Acknowledgments

  • Google Gemini AI for powerful video understanding capabilities
  • TensorFlow.js team for COCO-SSD model
  • Next.js team for the framework
