Video LLM Analysis Platform

A platform for analyzing video content with Google's Gemini models. It downloads YouTube videos, uploads them to Google's File API, and uses Gemini 2.0 Flash to answer questions about each video.

Overview

This platform streamlines the process of working with video datasets by:

Fetching datasets from HuggingFace
Downloading videos using pytubefix
Uploading videos to Google's File API
Performing inference with Gemini 2.0 Flash
Saving results to CSV for analysis

Each stage is a separate script that works from the files written by the previous stage, so the stages can be run, debugged, and resumed independently.

Architecture

The system is composed of three main components:

┌────────────────────┐     ┌────────────────────┐     ┌────────────────────┐
│  fetch_dataset.py  │ ──► │ download_upload.py │ ──► │    inference.py    │
└────────────────────┘     └────────────────────┘     └────────────────────┘
          │                          │                          │
          ▼                          ▼                          ▼
┌────────────────────┐     ┌────────────────────┐     ┌────────────────────┐
│    dataset.csv     │     │ videos + metadata  │     │    results.csv     │
└────────────────────┘     └────────────────────┘     └────────────────────┘

Features

Efficient Processing: Checks for previously downloaded/uploaded videos to avoid duplication
Robust Error Handling: Automatic retries with exponential backoff
Interactive Progress: Detailed logging and progress bars
Flexible Configuration: Extensive command-line options
Metadata Tracking: Records the download and upload status of each video in video_metadata.csv and downloaded_videos.json
Modular Design: Separates fetching, downloading, and inference
Resume Support: Can be stopped and resumed at any point
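
The retry behaviour listed above can be illustrated with a small helper. This is a minimal sketch of exponential backoff, not the project's actual code: the function name, the default delays, and the jitter are all assumptions made for illustration.

```python
import random
import time


def retry_with_backoff(fn, retries=3, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay * 2**attempt (capped) and try again.

    Hypothetical helper: names and defaults are not taken from the project.
    """
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Add up to 10% random jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, 0.1 * delay))
```

With the defaults, a call that keeps failing waits roughly 1, 2, and 4 seconds before the final error is raised. The sleep parameter exists only so the delay can be replaced in tests.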

Installation

Prerequisites

Python 3.8+
HuggingFace Account with access to the dataset
Google API Key with access to Gemini models

Steps

Clone this repository:

git clone https://github.com/dadevchia/tiktokllm.git
cd tiktokllm

Install required packages:
```
pip install -r requirements.txt
```
Log in to HuggingFace (required to access datasets):
```
huggingface-cli login
```
Create a .env file with your API credentials:
```
touch .env
```

Add the following to your .env file:

GEMINI_API_KEY=your_google_api_key_here
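
The README does not say how the scripts read this file, so the following is only a sketch of one way to resolve the key using the standard library. The function names and the lookup order (command-line value, then environment, then .env) are assumptions.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=value lines into a dict (strips surrounding quotes, no expansion)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(cli_value=None, env_path=".env"):
    """Assumed order: explicit --api-key value, then process environment, then .env."""
    key = (
        cli_value
        or os.environ.get("GEMINI_API_KEY")
        or load_env_file(env_path).get("GEMINI_API_KEY")
    )
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set; add it to .env or pass --api-key")
    return key
```

Keep .env out of version control so the key is not committed by accident.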

Usage

Step 1: Fetch the Dataset

# Basic usage
python fetch_dataset.py

# Advanced usage
python fetch_dataset.py --split test --output custom_dataset.csv --cache-dir ./hf_cache

Options:

--output: Output CSV file path (default: dataset.csv)
--split: Dataset split to use (default: test)
--cache-dir: Cache directory for HuggingFace datasets
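
For reference, the three options above could be declared with argparse as follows. The script's source is not shown in this README, so the function name build_parser and the help strings are illustrative; only the flag names and defaults come from the list above.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented fetch_dataset.py options."""
    parser = argparse.ArgumentParser(
        description="Fetch a HuggingFace dataset and save it as CSV"
    )
    parser.add_argument("--output", default="dataset.csv", help="Output CSV file path")
    parser.add_argument("--split", default="test", help="Dataset split to use")
    parser.add_argument(
        "--cache-dir", default=None, help="Cache directory for HuggingFace datasets"
    )
    return parser
```

Calling build_parser().parse_args() with no arguments yields the defaults; argparse exposes --cache-dir as the attribute cache_dir.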

Step 2: Download and Upload Videos

# Basic usage
python download_upload.py

# Advanced usage
python download_upload.py --batch-size 20 --input-csv custom_dataset.csv

Options:

--api-key: Your Google Generative AI API key (uses .env by default)
--batch-size: Number of videos to process in one batch (default: 10)
--start-index: Index to start from in the dataset (default: 0)
--max-videos: Maximum number of videos to process (default: all)
--input-csv: Path to input CSV file (default: dataset.csv)
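
How --start-index, --max-videos, and --batch-size interact is easiest to see in code. The sketch below assumes the usual interpretation (slice the dataset first, then split the slice into batches); the helper names are hypothetical and the real script may differ.

```python
def select_rows(rows, start_index=0, max_videos=None):
    """Return the rows covered by --start-index and --max-videos (None means all)."""
    end = None if max_videos is None else start_index + max_videos
    return rows[start_index:end]


def iter_batches(rows, batch_size=10):
    """Yield consecutive slices of at most batch_size rows."""
    for offset in range(0, len(rows), batch_size):
        yield rows[offset:offset + batch_size]
```

Under this interpretation, a 25-row dataset with --start-index 10 and --batch-size 10 is processed as one batch of 10 rows followed by one batch of 5.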

Step 3: Perform Inference

# Basic usage
python inference.py

# Advanced usage
python inference.py --model gemini-2.0-flash --retry 5 --output custom_results.csv

Options:

--api-key: Your Google Generative AI API key (uses .env by default)
--model: Model name to use (default: gemini-2.0-flash)
--retry: Number of retries for failed inferences (default: 3)
--start-index: Index to start from in the list of videos (default: 0)
--max-videos: Maximum number of videos to process (default: all)
--output: Output CSV file for results (default: results.csv)
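
The sketch below shows one way an inference loop could write results incrementally and skip items that already have a row in the output file. The qid and pred column names come from the Output Files section; everything else, including the predict callable that stands in for the Gemini call, is an assumption.

```python
import csv
import os


def load_done_qids(path="results.csv"):
    """Return the set of qids that already have a row in the results file."""
    if not os.path.exists(path):
        return set()
    with open(path, newline="", encoding="utf-8") as f:
        return {row["qid"] for row in csv.DictReader(f)}


def run_inference(items, predict, path="results.csv"):
    """items: iterable of (qid, question); predict: hypothetical stand-in for the model call."""
    done = load_done_qids(path)
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["qid", "pred"])
        if write_header:
            writer.writeheader()
        for qid, question in items:
            if qid in done:
                continue  # already answered in a previous run
            writer.writerow({"qid": qid, "pred": predict(question)})
            f.flush()  # keep finished rows on disk if the run is interrupted
            done.add(qid)
```

Appending and flushing after every row means an interrupted run loses at most the item in progress.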

Output Files

dataset.csv: HuggingFace dataset in CSV format
video_metadata.csv: Metadata about downloaded and uploaded videos
downloaded_videos.json: Tracking info for downloaded and uploaded videos
results.csv: Final inference results with qid and pred columns
videos/: Directory containing downloaded videos organized by QID prefix
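
The schema of downloaded_videos.json is not documented here. As an illustration only, the sketch below assumes a JSON object keyed by QID whose entries hold a local_path field, and shows how such a tracking file can be read, updated safely, and used to decide whether a video still needs downloading.

```python
import json
import os
import tempfile


def load_tracking(path="downloaded_videos.json"):
    """Return the tracking dict, or an empty dict if the file does not exist yet."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def save_tracking(tracking, path="downloaded_videos.json"):
    """Write to a temp file in the same directory, then atomically replace the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(tracking, f, indent=2)
    os.replace(tmp_path, path)


def needs_download(tracking, qid):
    """True if the qid is untracked or its file is missing (local_path is an assumed field)."""
    entry = tracking.get(qid)
    return entry is None or not os.path.exists(entry.get("local_path", ""))
```

Writing through a temporary file avoids leaving a half-written tracking file behind if the process is killed mid-save.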

Example Workflow

A typical workflow might look like:

# 1. Set up environment and fetch dataset
python fetch_dataset.py

# 2. Download first 10 videos for testing
python download_upload.py --max-videos 10

# 3. Run inference on these 10 videos
python inference.py --max-videos 10

# 4. Process the remaining videos in batches
python download_upload.py --start-index 10 --batch-size 20
python inference.py --start-index 10
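
The same sequence can be driven from Python. This is a hypothetical wrapper, not part of the repository: it only assembles the command lines documented above and runs them in order, stopping at the first stage that fails.

```python
import subprocess
import sys


def build_commands(max_videos=None, start_index=0, batch_size=10):
    """Return the three stage commands using the flags documented in this README."""
    limits = ["--start-index", str(start_index)]
    if max_videos is not None:
        limits += ["--max-videos", str(max_videos)]
    return [
        [sys.executable, "fetch_dataset.py"],
        [sys.executable, "download_upload.py", "--batch-size", str(batch_size), *limits],
        [sys.executable, "inference.py", *limits],
    ]


def run_pipeline(**kwargs):
    """Run each stage in order; check=True raises if a stage exits with an error."""
    for cmd in build_commands(**kwargs):
        subprocess.run(cmd, check=True)
```

For example, run_pipeline(max_videos=10) corresponds to steps 1 to 3 of the workflow above.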

Troubleshooting

Common Issues

Video Download Failures
- Check internet connection
- Verify video still exists on YouTube
- Try updating pytubefix: pip install --upgrade pytubefix

Google API Errors
- Verify your API key is correct in the .env file
- Check if you have access to the specified model
- Ensure your Google Cloud billing is properly set up

HuggingFace Access Issues
- Run huggingface-cli login again
- Verify you have access to the dataset
- Check your internet connection

Logs

All scripts write detailed logs to the console. To keep a persistent copy, redirect both stdout and stderr to a file:

python download_upload.py > download_log.txt 2>&1
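
As an alternative to shell redirection, a script can log to the console and a file at the same time. The sketch below uses only the standard logging module; the logger name and default file name are assumptions, and the project's scripts may configure logging differently.

```python
import logging
import sys


def setup_logging(log_path="download_log.txt", level=logging.INFO):
    """Return a logger that writes each message to both stderr and log_path."""
    logger = logging.getLogger("pipeline")  # hypothetical logger name
    logger.setLevel(level)
    logger.propagate = False  # avoid duplicate lines if the root logger is configured
    logger.handlers.clear()  # make repeated calls idempotent
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    handlers = (
        logging.StreamHandler(sys.stderr),
        logging.FileHandler(log_path, encoding="utf-8"),
    )
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

This keeps progress visible in the terminal while still leaving a file to inspect after a long run.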

Project Structure

tiktokllm/
├── fetch_dataset.py     # Script to fetch HuggingFace dataset
├── download_upload.py   # Script to download/upload videos
├── inference.py         # Script for Gemini inference
├── .env                 # Environment variables (API keys)
├── requirements.txt     # Project dependencies
├── dataset.csv          # Generated by fetch_dataset.py
├── video_metadata.csv   # Generated by download_upload.py
├── downloaded_videos.json # Generated by download_upload.py
├── results.csv          # Generated by inference.py
└── videos/              # Downloaded video files
    └── [QID_PREFIX]/    # Organized by QID prefix

Best Practices

Start with a small batch (e.g., --max-videos 5) to test your setup
Run long jobs inside screen or tmux so they keep running if your terminal session disconnects
Regularly back up your metadata and tracking files
Monitor disk space, especially if downloading many videos
For large datasets, consider a cron job that reruns the scripts on a schedule; resume support lets each run continue where the previous one stopped
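
The disk-space advice above can be automated with a check before each batch. This is a minimal sketch using shutil.disk_usage; the function name and the 5 GB threshold are arbitrary assumptions.

```python
import shutil


def has_free_space(path=".", min_free_gb=5.0):
    """Return True if the filesystem containing path has at least min_free_gb free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= min_free_gb * 1024 ** 3
```

A download loop could call has_free_space("videos") before each batch and stop early instead of failing partway through a file.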

Contributors

[Your Name/Organization]

License

[Specify your license here]
