Commit 8f65132

Add files via upload
1 parent 2f7d797 commit 8f65132

12 files changed

+1543
-0
lines changed

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
MIT License
2+
3+
Copyright (c) Microsoft Corporation.
4+
5+
Permission is hereby granted, free of charge, to any person obtaining a copy
6+
of this software and associated documentation files (the "Software"), to deal
7+
in the Software without restriction, including without limitation the rights
8+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9+
copies of the Software, and to permit persons to whom the Software is
10+
furnished to do so, subject to the following conditions:
11+
12+
The above copyright notice and this permission notice shall be included in all
13+
copies or substantial portions of the Software.
14+
15+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21+
SOFTWARE.
Lines changed: 268 additions & 0 deletions
@@ -0,0 +1,268 @@
1+
# Video Translation Agent
2+
3+
## Summary
4+
5+
The Video Translation Agent is an AI-powered agent that enables users to translate videos between languages with customizable speech and subtitle options. It leverages Azure AI Foundry, Azure AI Cognitive Services, and Azure Blob Storage to provide high-quality video translations.
6+
7+
## Demo
8+
9+
<div align="center">
10+
<video width="800" controls>
11+
<source src="./assets/demo.mp4" type="video/mp4">
12+
Your browser does not support the video tag.
13+
</video>
14+
</div>
15+
16+
You can find the demo video at `./assets/demo.mp4` in the repository.
17+
18+
## Use Cases
19+
20+
1. **Corporate Communications** 🏢 - Translate training videos, executive messages, and internal communications to reach global employees in their native languages.
21+
22+
2. **Educational Content** 🎓 - Enable educators to translate course materials and lectures for international students, expanding the reach of educational content.
23+
24+
3. **Marketing and Sales** 📊 - Localize marketing videos and sales presentations for different regions and markets, increasing engagement with international audiences.
25+
26+
## Architecture Overview
27+
28+
The Video Translation Agent uses:
29+
- **Azure AI Agents** - For conversation management and orchestration
30+
- **Semantic Kernel** - For creating the AI Agent and managing plugins
31+
- **Azure AI Speech Video Translation API** - For the core video translation functionality
32+
- **Azure Blob Storage** - For secure file storage and retrieval
33+
34+
<div align="center">
35+
<img src="./assets/architecture-diagram.png" alt="Architecture Diagram" width="800"/>
36+
</div>
37+
38+
## Setup Instructions
39+
40+
### Prerequisites
41+
42+
- [Python > 3.11.10](https://www.python.org/downloads/)
43+
- [Git](https://git-scm.com/downloads)
44+
- Azure resources:
45+
- **Azure subscription**: [Create a free account](https://azure.microsoft.com/free/) if you don't already have one
46+
- **Azure AI Speech Services**:
47+
- Create a [Speech resource](https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices) in a [supported region for video translation](https://learn.microsoft.com/azure/ai-services/speech-service/regions#speech-service)
48+
- **Azure Blob Storage**:
49+
- [Create a storage account](https://learn.microsoft.com/azure/storage/common/storage-account-create) with a container configured for video uploads
50+
- To provide video access, either:
51+
- Generate a [SAS token](https://learn.microsoft.com/azure/storage/common/storage-sas-overview) to provide as a video URL directly to the agent
52+
- Tell the agent where your video is located locally (via file path); the agent will upload the video to the storage account and generate the URL
53+
- **Azure CLI**: [Install](https://learn.microsoft.com/cli/azure/install-azure-cli) and authenticate using `az login`
54+
55+
### Setting up Azure AI Agent Service
56+
57+
Follow the [official Azure AI Agent Service documentation](https://learn.microsoft.com/azure/ai-services/agents/quickstart?pivots=ai-foundry-portal) quickstart. This guides you through creating a hub and project in the Azure AI Foundry Portal and deploying your model. Once completed, you'll have the necessary credentials to configure your environment variables in the next steps.
58+
59+
### Quickstart
60+
61+
1. Clone the repository and, in a terminal, `cd` into the `video-translation-agent` folder.
62+
63+
2. Create and activate a virtual environment:
64+
65+
```bash
python -m venv .venv
source .venv/bin/activate  # On Windows, use: .venv\Scripts\activate
```
69+
70+
3. Install the required dependencies:
71+
72+
```bash
pip install -r requirements.txt
```
75+
76+
4. Create a `.env` file based on the `.env.sample` provided:
77+
78+
```bash
cp .env.sample .env
```
81+
82+
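5. Update the `.env` file with the values for your own Azure resources: the credentials from the Azure AI Agent Service setup above, plus your Speech resource and Blob Storage details.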
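
Before starting the agent, it can help to confirm that every setting listed in `.env.sample` has a value in `.env`. The following is a minimal sketch of such a check, not part of the repository. The helper names `read_env_keys` and `missing_settings` are hypothetical, and the parser assumes simple `KEY=VALUE` lines.

```python
from pathlib import Path


def read_env_keys(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(sample: Path, env: Path) -> list[str]:
    """Return keys named in the sample file that are absent or empty in env."""
    expected = read_env_keys(sample)
    actual = read_env_keys(env)
    return [key for key in expected if not actual.get(key)]
```

Calling `missing_settings(Path(".env.sample"), Path(".env"))` from the project folder returns the names that still need a value; an empty list means every key from the sample has been filled in.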
83+
84+
### Running the Video Translation Agent
85+
86+
Start the agent with:
87+
88+
```bash
python template.py
```
91+
92+
The agent will guide you through the process of translating videos, providing helpful prompts for required information. To exit the agent at any time, simply type `exit` or `quit` in the console.
93+
94+
## Sample Data
95+
96+
### Video Requirements
97+
98+
- .mp4 format video file
99+
- Less than 5 GB in size
100+
- Shorter than 4 hours in duration
101+
- Verify your [source and target languages are supported](https://learn.microsoft.com/azure/ai-services/speech-service/language-support?tabs=video-translation)
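
The format, size, and duration limits above can be checked before a video is handed to the agent. This is a rough sketch under stated assumptions: `check_video` is a hypothetical helper, "5 GB" is interpreted as 5 × 1024³ bytes, and the duration is supplied by the caller because reading it from the file would need a media library.

```python
from pathlib import Path

MAX_SIZE_BYTES = 5 * 1024**3        # "Less than 5 GB" (binary GB is an assumption)
MAX_DURATION_SECONDS = 4 * 60 * 60  # "Shorter than 4 hours"


def check_video(filename: str, size_bytes: int, duration_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the basic checks pass."""
    problems = []
    if Path(filename).suffix.lower() != ".mp4":
        problems.append("file is not in .mp4 format")
    if size_bytes >= MAX_SIZE_BYTES:
        problems.append("file is 5 GB or larger")
    if duration_seconds >= MAX_DURATION_SECONDS:
        problems.append("video is 4 hours or longer")
    return problems
```

The function expects a plain file name or path. A URL with a query string (such as a SAS token) would need the query removed first. Language support still has to be verified separately against the linked documentation.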
102+
103+
### Sample Files for Testing
104+
105+
You can provide the agent with this video URL directly: [es-ES-TryOutOriginal.mp4](https://speechstudioprodpublicsa.blob.core.windows.net/ttsvoice/VideoTranslation/PublicDoc/SampleData/es-ES-TryOutOriginal.mp4).
106+
107+
The repository also includes sample files in the `sample_data/` directory to help you test the agent:
108+
109+
- **Sample Video in English**: `sample_data/en-US-TryoutOriginalTTSIntro.mp4`
110+
- **Sample Video in Spanish**: `sample_data/es-ES-TryOutOriginalElSalvadorMicrosoftIntro.mp4`
111+
112+
## Example Agent Interaction
113+
114+
### Video Translation
115+
116+
Translate videos from one language to another with the following options:
117+
- Source and target languages
118+
- Voice type (Platform Voice or Personal Voice)
119+
- Speaker count customization
120+
- Subtitle character limit adjustment
121+
- Embedded subtitle options
122+
123+
### File Management
124+
125+
The agent can work with:
126+
- Local video files (automatically uploads to Azure Blob Storage)
127+
- Remote video URLs (direct processing)
128+
- WebVTT subtitle files for custom iterations
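
Because local files are uploaded while remote URLs are processed directly, the input has to be classified first. One way this distinction could be made is sketched below. `is_remote_url` is a hypothetical helper, and the agent's actual logic may differ.

```python
from urllib.parse import urlparse


def is_remote_url(source: str) -> bool:
    """Treat only http(s) URLs with a host as remote; everything else is a local path."""
    parsed = urlparse(source)
    # A Windows path such as C:\Videos\a.mp4 parses with scheme "c",
    # so the scheme is checked explicitly instead of testing for any scheme.
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

With this check, `C:\Videos\my_presentation.mp4` is handled as a local file that needs uploading, while a `https://...blob.core.windows.net/...` address is passed through as a URL.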
129+
130+
### Translation Management
131+
132+
Manage your translations with commands to:
133+
- List all translations
134+
- Get details about specific translations
135+
- Create new iterations with customized subtitles
136+
- Delete translations
137+
138+
### Translating a Video
139+
140+
```
141+
You: I'd like to translate a video from English to Japanese
142+
Agent: I can help you translate your video from English to Japanese. First, I need to know if you have a local video file on your computer or a URL to a video stored online.
143+
144+
You: I have a local file at C:\Videos\my_presentation.mp4
145+
Agent: Great! I'll need to upload that file to Azure Blob Storage first so it can be processed.
146+
[Agent uploads file]
147+
Video uploaded successfully!
148+
Access URL: https://yourstorageaccount.blob.core.windows.net/videos/my_presentation.mp4
149+
Secure access URL (valid for 24 hours): https://yourstorageaccount.blob.core.windows.net/videos/my_presentation.mp4?sv=...
150+
151+
Now I need a few more details:
152+
1. Would you like to use "PlatformVoice" or "PersonalVoice" for the translation? PlatformVoice is the standard option.
153+
[...]
154+
```
155+
156+
157+
```
158+
You: I need to translate our new training video from English to Spanish, French, and German for our international facilities. The video is hosted on our secure server.
159+
160+
Agent: I'd be happy to help translate your training video. Do you have a URL for the video, or is it stored locally on your computer?
161+
162+
You: It's available at https://yourstorageaccount.blob.core.windows.net/videos/my_training_video.mp4
163+
164+
Agent: Perfect. I'll use this URL directly for the translation process. Now I need some details for the translation:
165+
1. Would you like to enable subtitles for accessibility?
166+
2. How many speakers are in your video? This helps with speaker identification.
167+
168+
You: Yes, please enable subtitles. We have 2 speakers - a narrator and a demonstrator.
169+
170+
Agent: I'll set up the translations with subtitles and optimize for 2 speakers.
171+
172+
Starting translation process for all three languages:
173+
- English → Spanish (es-ES): Processing...
174+
- English → French (fr-FR): Processing...
175+
- English → German (de-DE): Processing...
176+
```
177+
178+
### Creating an Iteration with Custom Subtitles
179+
180+
```
181+
You: I want to improve the subtitles for my translation
182+
Agent: I can help you create a new iteration with custom subtitles. Do you have the Translation ID for the video?
183+
184+
You: Yes, it's 06202024123045_en-US_ja-JP_PlatformVoice
185+
Agent: Great! Now I need the URL to your WebVTT subtitle file and the type of WebVTT file you're providing.
186+
[...]
187+
```
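
For readers unfamiliar with WebVTT, the sketch below builds a small subtitle file from a list of cues. It only illustrates the general WebVTT cue layout (a `WEBVTT` header followed by timestamped cues). The helper names are hypothetical, and the WebVTT types the service accepts for iterations are not covered here.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues: list[tuple[float, float, str]]) -> str:
    """Build WebVTT text from (start_seconds, end_seconds, text) cues."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)
```

The resulting text can be saved with a `.vtt` extension and uploaded somewhere the agent can reach by URL.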
188+
189+
### Listing Previous Translations
190+
191+
```
192+
You: Could you please provide me with my five most recent translations?
193+
Agent: I'll retrieve your 5 most recent translations:
194+
195+
1. Translation ID: 07152023153045_en-US_ja-JP_PlatformVoice
196+
- Source Language: English (en-US)
197+
- Target Language: Japanese (ja-JP)
198+
- Created: July 15, 2023 15:30:45
199+
- Status: Completed
200+
- Video URL: https://yourstorageaccount.blob.core.windows.net/videos/quarterly_review.mp4
201+
202+
2. Translation ID: 07102023091230_en-US_fr-FR_PlatformVoice
203+
- Source Language: English (en-US)
204+
- Target Language: French (fr-FR)
205+
- Created: July 10, 2023 09:12:30
206+
- Status: Completed
207+
- Video URL: https://yourstorageaccount.blob.core.windows.net/videos/product_launch.mp4
208+
209+
3. Translation ID: 07052023142215_en-US_es-ES_PlatformVoice
210+
- Source Language: English (en-US)
211+
- Target Language: Spanish (es-ES)
212+
- Created: July 5, 2023 14:22:15
213+
- Status: Completed
214+
- Video URL: https://yourstorageaccount.blob.core.windows.net/videos/training_module_3.mp4
215+
216+
4. Translation ID: 06302023103345_en-US_de-DE_PlatformVoice
217+
- Source Language: English (en-US)
218+
- Target Language: German (de-DE)
219+
- Created: June 30, 2023 10:33:45
220+
- Status: Completed
221+
- Video URL: https://yourstorageaccount.blob.core.windows.net/videos/executive_message.mp4
222+
223+
5. Translation ID: 06252023081120_en-US_it-IT_PlatformVoice
224+
- Source Language: English (en-US)
225+
- Target Language: Italian (it-IT)
226+
- Created: June 25, 2023 08:11:20
227+
- Status: Completed
228+
- Video URL: https://yourstorageaccount.blob.core.windows.net/videos/customer_testimonial.mp4
229+
230+
Would you like to get more details about any specific translation or perform any actions with these translations?
231+
```
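
The sample Translation IDs in these transcripts appear to follow the pattern `<MMDDYYYYHHMMSS>_<source locale>_<target locale>_<voice kind>`. That pattern is inferred from the examples only and is not a documented format. Under that assumption, an ID could be split into its parts as follows (`parse_translation_id` and `TranslationId` are hypothetical names):

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TranslationId:
    created: datetime
    source_locale: str
    target_locale: str
    voice_kind: str


def parse_translation_id(translation_id: str) -> TranslationId:
    """Split an ID shaped like 06202024123045_en-US_ja-JP_PlatformVoice.

    Assumes four underscore-separated fields with an MMDDYYYYHHMMSS timestamp;
    raises ValueError if the ID does not match that inferred shape.
    """
    stamp, source, target, voice = translation_id.split("_")
    created = datetime.strptime(stamp, "%m%d%Y%H%M%S")
    return TranslationId(created, source, target, voice)
```

For example, `parse_translation_id("06202024123045_en-US_ja-JP_PlatformVoice")` yields a creation time of 20 June 2024 at 12:30:45 with `en-US` as the source and `ja-JP` as the target.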
232+
233+
## ⚠️ Troubleshooting
234+
235+
Common issues and solutions:
236+
237+
- **Authentication Errors**: Ensure you're logged in with `az login` and have the proper permissions.
238+
- **File Upload Issues**: Check your storage account name and permissions.
239+
- **Translation Failures**: Verify your video format is supported and your Cognitive Services endpoint is correct.
240+
241+
### Logging Configuration
242+
243+
By default, logging is set to the ERROR level. If you need more detailed output for debugging or monitoring, you can adjust the logging level:
244+
245+
```bash
# Options include: DEBUG, INFO, WARNING, ERROR, CRITICAL
export LOG_LEVEL=DEBUG  # On Windows, use: set LOG_LEVEL=DEBUG
```
249+
250+
You can also modify the logging configuration directly in the code by updating the `logging.basicConfig()` call in `template.py`:
251+
252+
```python
import logging

# Example: Change to INFO level logging with more detailed format
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s [%(levelname)s] %(name)s: %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S'
)
```
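
The `LOG_LEVEL` environment variable and the `logging.basicConfig()` call can be tied together. The sketch below shows one way an environment variable might be mapped to a logging level with the documented `ERROR` default. It is an illustration, not the code in `template.py`, and `resolve_log_level` is a hypothetical helper.

```python
import logging
import os


def resolve_log_level(default: str = "ERROR") -> int:
    """Map the LOG_LEVEL environment variable to a logging level constant."""
    name = os.environ.get("LOG_LEVEL", default).upper()
    level = getattr(logging, name, None)
    # Fall back to ERROR when the name is not a known level constant.
    return level if isinstance(level, int) else logging.ERROR


logging.basicConfig(level=resolve_log_level())
```

Falling back to `ERROR` on an unrecognized value keeps a typo in the variable from stopping the agent at startup.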
260+
261+
## Resources
262+
- [Semantic Kernel Official Documentation](https://learn.microsoft.com/semantic-kernel/overview/)
263+
- [Azure AI Agent Service Documentation](https://learn.microsoft.com/azure/ai-services/agents/)
264+
- [Azure AI Speech Service: Video Translation Documentation](https://learn.microsoft.com/azure/ai-services/speech-service/video-translation-overview)
265+
- [Azure AI Foundry Documentation](https://learn.microsoft.com/azure/ai-foundry/)
266+
- [Azure AI Speech Service Documentation](https://learn.microsoft.com/azure/ai-services/speech-service/)
267+
- [Azure Blob Storage Documentation](https://learn.microsoft.com/azure/storage/blobs/)
268+
- [Azure AI Services Overview](https://learn.microsoft.com/azure/ai-services/)
Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
azure-identity
azure-storage-blob
semantic-kernel
semantic-kernel[openai]
python-dotenv
aiohttp
requests
urllib3
orjson
pydantic
termcolor
azure-cognitiveservices-speech
