🚀 Quick Start

Clone the repository

git clone https://github.com/IEEE-SB-VIT-Pune/agentPlay.git
cd agentPlay

Set Up the Environment

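Create and activate a virtual environment (the activation command shown is for Windows; on macOS or Linux, run source env/bin/activate instead):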

python -m venv env
env/Scripts/activate

Install the dependencies:

pip install -r backend/requirements.txt

Create a .env file in the project root with your API keys:

GEMINI_API_KEY="<GEMINI_KEY>"
SERPER_API_KEY="<SERPER_KEY>"
MISTRAL_API_KEY="<MISTRAL_KEY>"
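
Before starting the backend, it can help to confirm that the .env file defines all three keys and that no placeholder was left in. The sketch below is a stdlib-only check, not the project's own loader (how the backend reads .env is not shown here); `parse_env` and `missing_keys` are hypothetical helper names.

```python
from pathlib import Path

# Key names taken from the .env example above.
REQUIRED_KEYS = ("GEMINI_API_KEY", "SERPER_API_KEY", "MISTRAL_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes such as "..." or '...'.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return required keys that are absent, empty, or still a <PLACEHOLDER>."""
    return [
        key for key in REQUIRED_KEYS
        if not values.get(key) or values[key].startswith("<")
    ]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    print("Missing keys:", missing_keys(parse_env(text)) or "none")
```

Run it from the project root; an empty result means every key has a non-placeholder value.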

Run the app:
```
python backend/main.py  
```

🧩 Running the Chrome Extension

1. Open Google Chrome.
2. Click the three dots in the top-right corner.
3. Go to Extensions > Manage Extensions.
4. Enable Developer Mode (toggle at the top-right).
5. Click Load Unpacked.
6. Browse to the frontend/extension directory and select it.
7. The extension should now be loaded and active.

Flowchart

The following flowchart illustrates the system's architecture and data flow:

flowchart TD
    classDef clientNode fill:#f9d5e5,stroke:#333,stroke-width:1px,color:#333,font-weight:bold
    classDef processNode fill:#eeeeee,stroke:#333,stroke-width:1px,color:#333
    classDef decisionNode fill:#e3f0f7,stroke:#333,stroke-width:1px,color:#333,font-weight:bold
    classDef storeNode fill:#d0f0c0,stroke:#333,stroke-width:1px,color:#333
    classDef outputNode fill:#ffeb99,stroke:#333,stroke-width:1px,color:#333,font-weight:bold
    classDef crewAINode fill:#d8e1fa,stroke:#333,stroke-width:1px,color:#333
    classDef highlightNode fill:#ffd580,stroke:#ff8c00,stroke-width:2px,color:#333,font-weight:bold
    
    A[User Request] --> B[Video ID & Target Language]
    
    B --> C{Transcript<br>Exists?}
    C -->|No| D[Create Transcript Store]
    D --> E[Fetch YouTube Transcript]
    E --> F{Source Lang<br>is English?}
    
    F -->|Yes| G[Use Original Transcript]
    F -->|No| H[Translate to English]
    
    C -->|Yes| I{Audio Segment<br>Exists?}
    
    G --> J[Store English Transcript]
    H --> J
    
    I -->|No| K[Process Missing Segment]
    I -->|Yes| L[Serve Existing Audio]
    
    K --> M[Extract Context Window]
    M --> N[Translate Segment]
    
    subgraph crewAI [CrewAI Translation Process]
        direction TB
        H1[Split Text into<br>500-word Chunks] --> H2[Process Chunks<br>Concurrently]
        H2 --> H3[Translation Agent]
        H3 --> H4[Join Results]
    end
    
    subgraph context [Contextual Translation]
        direction TB
        N1[Extract 100-word Context] --> N2[CrewAI Translation Agent]
        N2 --> N3[Context-Aware Translation]
    end
    
    H --> crewAI
    crewAI --> H
    
    N --> context
    context --> N
    
    N --> O[Generate Audio in<br>Target Language]
    O --> P[Save Audio Segment]
    P --> L
    
    J --> Q[Create Summary]
    Q --> R[Store Summary]
    
    L --> S[Return Audio to Client]
    R --> T[Return Summary to Client]
    
    class A,B clientNode
    class D,E,G,H,K,M,N,O,P,Q processNode
    class C,F,I decisionNode
    class J,R storeNode
    class S,T outputNode
    class H1,H2,H3,H4,N1,N2,N3 crewAINode
    class H1,N1 highlightNode
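
The audio branch of the flowchart (check for an existing segment, otherwise translate, synthesize, save, then serve) can be sketched as a small cache-or-compute function. This is an illustration under assumptions, not the backend's code: the in-memory dict stands in for whatever store the project uses, and `get_audio_segment`, `translate`, and `synthesize` are hypothetical names.

```python
from typing import Callable

# Assumed cache key: (video_id, target_language, segment_number).
AudioKey = tuple[str, str, int]


def get_audio_segment(
    cache: dict[AudioKey, bytes],
    key: AudioKey,
    translate: Callable[[AudioKey], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Serve a cached segment, or translate, synthesize, and store it first."""
    if key in cache:                  # "Audio Segment Exists?" -> Yes
        return cache[key]             # "Serve Existing Audio"
    translated = translate(key)       # "Translate Segment"
    audio = synthesize(translated)    # "Generate Audio in Target Language"
    cache[key] = audio                # "Save Audio Segment"
    return audio
```

Passing the translation and speech steps in as callables keeps the caching decision separate from the slow work, so a repeated request for the same segment never reaches the translator.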

Key Components

Transcript Extraction: YouTube transcripts are fetched with the YouTube Transcript API.
Translation System: CrewAI agents handle translation with contextual awareness.
Context-Aware Processing:
- Large texts are split into 500-word chunks and processed concurrently.
- Each segment is translated with a 100-word context window for accuracy.
Audio Generation: Edge TTS converts translated text to natural-sounding speech.
Caching System: Processed segments are stored and reused for future requests.
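
The chunking and context-window ideas above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the function names are hypothetical, `translate` is a stand-in for the CrewAI agent call, and the context window is assumed to be the words preceding a segment (the source does not say whether following words are included).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def split_into_chunks(text: str, chunk_size: int = 500) -> list[str]:
    """Split text into chunks of at most chunk_size whitespace-separated words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def translate_text(text: str, translate: Callable[[str], str],
                   chunk_size: int = 500) -> str:
    """Translate chunks concurrently and join the results in original order."""
    chunks = split_into_chunks(text, chunk_size)
    with ThreadPoolExecutor() as pool:
        # pool.map yields results in input order, even if chunks finish out of order.
        return " ".join(pool.map(translate, chunks))


def context_window(segments: list[str], index: int, window: int = 100) -> str:
    """Return up to `window` words that precede segments[index] (assumed direction)."""
    if window <= 0:
        return ""
    preceding = " ".join(segments[:index]).split()
    return " ".join(preceding[-window:])
```

A thread pool suits this step because each translation call mostly waits on a remote model, so the chunks overlap in time without needing separate processes.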

Chrome Extension

The extension adds a user interface overlay to YouTube videos, giving users access to the translation and language-learning features.
Its popup interface detects when the user is on a YouTube video page.
It includes buttons for fetching the English transcript, creating real-time translations, and generating a summary and notes.
A sync system highlights transcript lines as the video plays, helping users follow along.
A language selection form lets users specify their target language for translation.
An audio playback system retrieves translated audio segments from the backend server and plays them in sync with the current video timestamp.
It handles errors such as missing transcripts and connectivity issues.
It caches transcript data locally to speed up repeated operations.
The manifest file must declare the required permissions, including access to YouTube domains and scripting capabilities.

Supported Languages

The system supports translation and audio generation for multiple languages, including English, Hindi, Spanish, French, German, Japanese, Korean, Chinese, and more.

API Endpoints

Video Content

/show_transcript/<video_id>: Returns full transcript with timestamps
- Response: JSON with transcript segments including start/end times
/show_data/<video_id>: Retrieves video metadata
- Response: JSON with title, channel, duration, and other metadata
/concise_summary/<video_id>: Gets AI-generated concise summary
- Response: JSON with concise_summary field
/notes/<video_id>: Retrieves AI-generated structured notes
- Response: JSON with notes field
/listen_audio/<video_id>/<target_language>/<segment_number>: Gets translated audio segment
- Parameters: target_language (e.g., "es"), segment_number (index)
- Response: Audio file of translated segment
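
As a rough client-side sketch, the endpoint paths above can be assembled and fetched with the Python standard library. Several things here are assumptions rather than facts from this README: the base URL and port, that these endpoints accept plain GET requests, and the helper names `endpoint_url` and `fetch_json`.

```python
import json
from typing import Any
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: change this to wherever backend/main.py is actually listening.
BASE_URL = "http://localhost:5000"


def endpoint_url(*parts: object, base_url: str = BASE_URL) -> str:
    """Join path parts into an endpoint URL, percent-encoding each part."""
    path = "/".join(quote(str(part), safe="") for part in parts)
    return f"{base_url.rstrip('/')}/{path}"


def fetch_json(url: str) -> Any:
    """GET a URL and decode its JSON body (assumes the endpoint accepts GET)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    video_id = "VIDEO_ID"  # placeholder, replace with a real YouTube video ID
    print(endpoint_url("show_transcript", video_id))
    print(endpoint_url("listen_audio", video_id, "es", 0))
```

With the backend running, `fetch_json(endpoint_url("concise_summary", video_id))` would return the decoded JSON response; the audio endpoint returns a file, so its body should be read as bytes instead of JSON.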

Interactive Q&A

/precompute/<video_id>: Prepares video data for Q&A
- Response: JSON indicating success/failure
/process: Processes a query about video content
- Method: POST
- Body: {"query": "...", "video_id": "...", "addition_mode": false}
- Response: JSON with answer, video title, and channel
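
The /process call can be sketched with the standard library as well. The body fields come from the description above; the base URL, the `Content-Type: application/json` header, and the names `build_process_request` and `ask` are assumptions made for this example.

```python
import json
from typing import Any
from urllib.request import Request, urlopen

# Assumption: change this to wherever backend/main.py is actually listening.
BASE_URL = "http://localhost:5000"


def build_process_request(query: str, video_id: str,
                          addition_mode: bool = False,
                          base_url: str = BASE_URL) -> Request:
    """Build the POST request for /process without sending it."""
    payload = {
        "query": query,
        "video_id": video_id,
        "addition_mode": addition_mode,
    }
    return Request(
        f"{base_url.rstrip('/')}/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


def ask(query: str, video_id: str) -> Any:
    """Send the query and return the decoded JSON response."""
    with urlopen(build_process_request(query, video_id), timeout=60) as response:
        return json.load(response)
```

Building the request separately from sending it makes the payload easy to inspect or log before any network traffic happens. Per the endpoint list, /precompute/<video_id> prepares the video data for Q&A, so it is presumably called before `ask`.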
