A real-time video analytics application built with Next.js that allows users to query video content using natural language. Supports YouTube live streams, video files, and webcam feeds with AI-powered analysis using Google's Gemini API.
Part of Nihal's AI Portfolio - Unified dashboard featuring 5 cutting-edge AI services
- AI-Powered Analysis - Gemini 2.0 Flash for video understanding
- Multiple Input Sources
  - YouTube live streams
  - Video file uploads
  - Webcam capture
- Natural Language Queries - Ask questions in plain English
- Real-Time Object Detection - COCO-SSD model for live object detection
- Advanced Computer Vision - Face detection, hand tracking, and pose estimation
- Session Analytics - Track and export detection data
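The detection features above map onto the hooks under `hooks/` (see the project structure below). As a rough illustration of how hand tracking can be wired up in the browser, here is a minimal sketch assuming the `@mediapipe/tasks-vision` package; the project's own hooks may configure MediaPipe differently, and the model path below is a placeholder for a hosted `hand_landmarker.task` file.

```ts
import { FilesetResolver, HandLandmarker } from "@mediapipe/tasks-vision";

// Illustrative sketch only: the real setup lives in hooks/useHandTracking.ts.
export async function createHandTracker(): Promise<HandLandmarker> {
  // Load the WASM runtime that MediaPipe Tasks needs in the browser
  // (the jsDelivr CDN path is one common option).
  const vision = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision/wasm"
  );
  return HandLandmarker.createFromOptions(vision, {
    baseOptions: {
      // Placeholder: path to a hand_landmarker.task model file you host.
      modelAssetPath: "/models/hand_landmarker.task",
    },
    runningMode: "VIDEO",
    numHands: 2,
  });
}

// Run detection against the current frame of a playing <video> element.
export function detectHands(tracker: HandLandmarker, video: HTMLVideoElement) {
  const result = tracker.detectForVideo(video, performance.now());
  // result.landmarks holds one array of 21 normalized landmarks per detected hand.
  return result.landmarks;
}
```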
- Node.js 18 or higher
- npm or yarn
- Gemini API key from Google AI Studio
- Clone the repository

```bash
git clone https://github.com/nihal-5/video-query.git
cd video-query
```

- Install dependencies

```bash
npm install
```

- Set up environment variables

```bash
cp .env.example .env.local
```

Add your Gemini API key to `.env.local`:

```
GEMINI_API_KEY=your_actual_api_key_here
```

- Run the development server

```bash
npm run dev
```

- Open http://localhost:3000 in your browser
- Framework: Next.js 15 with App Router
- Language: TypeScript
- Styling: Tailwind CSS
- AI/ML:
  - Google Gemini 2.0 Flash API
  - TensorFlow.js with COCO-SSD
  - MediaPipe for advanced detection
- UI Components: Framer Motion, Lucide React
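For the object-detection side, loading COCO-SSD with TensorFlow.js in the browser looks roughly like the sketch below. This is a minimal illustration assuming the `@tensorflow/tfjs` and `@tensorflow-models/coco-ssd` packages; the project's actual logic lives in `hooks/useObjectDetection.ts`.

```ts
import "@tensorflow/tfjs"; // registers the browser backends
import * as cocoSsd from "@tensorflow-models/coco-ssd";

let modelPromise: Promise<cocoSsd.ObjectDetection> | undefined;

// Load the COCO-SSD weights once and reuse the model across frames.
function getModel() {
  modelPromise ??= cocoSsd.load();
  return modelPromise;
}

// Run detection against the current frame of a playing <video> element.
export async function detectObjects(video: HTMLVideoElement) {
  const model = await getModel();
  const predictions = await model.detect(video);
  // Each prediction carries a class label, a confidence score,
  // and an [x, y, width, height] bounding box in pixels.
  return predictions.map((p) => ({ label: p.class, score: p.score, bbox: p.bbox }));
}
```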
```
video-query/
├── app/
│   ├── api/query/              # API endpoint for AI queries
│   ├── page.tsx                # Main application page
│   └── layout.tsx              # Root layout
├── components/
│   ├── VideoSelector.tsx
│   ├── LiveYouTubeStream.tsx
│   ├── WebcamCapture.tsx
│   ├── QueryInterface.tsx
│   ├── SessionManager.tsx
│   └── AdvancedSessionManager.tsx
├── hooks/
│   ├── useObjectDetection.ts
│   ├── useHandTracking.ts
│   └── useAdvancedFaceDetection.ts
└── lib/
    ├── gemini.ts               # Gemini API client
    └── publicCameras.ts        # Camera configurations
```
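`lib/gemini.ts` wraps the Gemini call used by the `/api/query` route. As a rough sketch of what such a wrapper can look like, assuming the `@google/generative-ai` SDK and the `gemini-2.0-flash` model id (the repository's actual client may differ):

```ts
import { GoogleGenerativeAI } from "@google/generative-ai";

// Illustrative server-side Gemini client; the real one is lib/gemini.ts.
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY ?? "");

export async function describeFrame(imageBase64: string, query: string): Promise<string> {
  const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash" });

  // Send the user's question together with the frame as inline image data.
  const result = await model.generateContent([
    query,
    { inlineData: { data: imageBase64, mimeType: "image/jpeg" } },
  ]);

  return result.response.text();
}
```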
- Select "Live YouTube Stream" mode
- Paste a YouTube live stream URL
- Ask questions about the video content
- Select "YouTube Videos" mode
- Upload a video file or paste a YouTube URL
- Pause at any frame and ask questions
- Select "Your Webcam" mode
- Allow camera permissions
- Real-time object detection and analysis
- "What objects are visible in this frame?"
- "How many people can you see?"
- "Describe the current scene"
- "What is happening in the video?"
POST /api/query

Request body:

```json
{
  "imageData": "base64_encoded_image",
  "query": "What do you see?",
  "cameraName": "Source name"
}
```

Response:

```json
{
  "result": "AI-generated description..."
}
```
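A sketch of calling the endpoint from the client, using a base64 frame like the one produced above (the repo's components may wire this up differently):

```ts
// Illustrative client-side call to the /api/query route.
export async function queryFrame(imageData: string, query: string, cameraName: string) {
  const res = await fetch("/api/query", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ imageData, query, cameraName }),
  });

  if (!res.ok) {
    throw new Error(`Query failed with status ${res.status}`);
  }

  const { result } = (await res.json()) as { result: string };
  return result; // the AI-generated description
}
```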
Build for production:

```bash
npm run build
```

Start the production server:

```bash
npm start
```

Run the linter:

```bash
npm run lint
```

Node.js version issues:
- Ensure you have Node.js 18 or higher: `node --version`
- If you are on an older version, update via nodejs.org
- Consider using nvm for version management
API key errors:
- Verify that a `.env.local` file exists in the project root
- Check the API key format: it should match the key issued by Google AI Studio, copied in full with no surrounding quotes or whitespace
- Restart the dev server after updating environment variables
- Get your key from Google AI Studio
Port 3000 already in use:
- Find and kill the process: `lsof -ti:3000 | xargs kill -9`
- Or use a different port: `npm run dev -- -p 3001`
Webcam permission denied:
- Check browser permissions in settings
- Ensure you're using HTTPS or localhost
- Try a different browser if issues persist
- Some browsers require an explicit permission grant
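To check whether camera access works at all outside the app, you can request a stream directly with the browser's `getUserMedia` API; if this sketch fails, the problem is with browser permissions or the secure-context requirement rather than with the app itself.

```ts
// Quick permission check: request the default camera and release it again.
async function checkCameraAccess(): Promise<boolean> {
  try {
    const stream = await navigator.mediaDevices.getUserMedia({ video: true });
    stream.getTracks().forEach((track) => track.stop()); // release the camera
    return true;
  } catch (err) {
    console.error("Camera access denied or unavailable:", err);
    return false;
  }
}
```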
Build failures:
- Clear the cache: `rm -rf .next node_modules`
- Reinstall dependencies: `npm install`
- Check for TypeScript errors: `npm run lint`
- Ensure all peer dependencies are compatible
MIT License - Free to use for personal and educational purposes
Nihal Gupta
GitHub: @nihal-5
Location: Raleigh/Cary, NC
- Google Gemini AI for powerful video understanding capabilities
- TensorFlow.js team for COCO-SSD model
- Next.js team for the framework