Enhance README and add Gradio web interface for Audio2Face-3D API client #9

Yash-Kavaiya wants to merge 1 commit into NVIDIA:main
Conversation
Pull request overview
This pull request enhances the Audio2Face-3D API client by introducing a comprehensive Gradio-based web interface and significantly improving documentation. The PR adds interactive functionality for testing and using the Audio2Face-3D API, making it more accessible to users who prefer a visual interface over command-line tools.
Key Changes
- **New Gradio Web Interface**: Adds `app.py` with a full-featured web UI including audio upload/recording, sample audio selection, real-time emotion parameter adjustment, and visualization video generation
- **Enhanced Documentation**: Comprehensive README overhaul with detailed installation instructions, usage examples, configuration details, and well-structured sections covering all aspects of the API client
- **Additional Dependencies**: Introduces Gradio and OpenCV for web interface and visualization capabilities
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 23 comments.
| File | Description |
|---|---|
| `scripts/audio2face_3d_api_client/requirements` | Adds Gradio 6.0.1 and opencv-python-headless 4.12.0.88 dependencies for web interface support |
| `scripts/audio2face_3d_api_client/app.py` | New 637-line Gradio application providing interactive web interface with audio processing, emotion control, blendshape visualization, and multi-format output |
| `scripts/audio2face_3d_api_client/README.md` | Major documentation enhancement from 50 to 225 lines, adding structured sections for features, installation, usage examples, configuration details, and comprehensive API reference |
> ```python
> return None, None, f"Error reading audio: {str(e)}"
> ```

The error message for audio reading failures is too generic. Consider providing more specific guidance based on common issues:

```python
except Exception as e:
    error_msg = f"Error reading audio file: {str(e)}\n"
    error_msg += "Ensure the file is a valid WAV format with PCM encoding."
    return None, None, error_msg
```
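As an illustration of the more specific guidance suggested above, here is a minimal sketch using Python's standard-library `wave` module. The function name and error messages are hypothetical, not taken from the PR's `app.py`:

```python
import wave

def read_wav_or_explain(path: str):
    """Return (sample_rate, n_frames) or raise with actionable guidance.

    Hypothetical helper illustrating specific error messages for
    common WAV-reading failures; not part of the PR's app.py.
    """
    try:
        with wave.open(path, "rb") as wf:
            if wf.getcomptype() != "NONE":
                raise ValueError(
                    f"{path}: compressed WAV ({wf.getcomptype()}); "
                    "re-encode as uncompressed PCM."
                )
            return wf.getframerate(), wf.getnframes()
    except wave.Error as e:
        raise ValueError(
            f"{path}: not a valid WAV file ({e}); "
            "ensure PCM encoding and a .wav container."
        ) from e
```

Distinguishing "not a WAV at all" from "WAV but not PCM" gives users a concrete next step instead of a bare exception string.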
> ```markdown
> - Sample audio selection
> - Real-time emotion parameter adjustment
> - Visual blendshape output preview
> - CSV export functionality
> ```

The description "CSV export functionality" is misleading. Based on the code in app.py (lines 502-508), the web interface exports both CSV and JSON formats. Consider updating to:

```suggestion
- Downloadable animation data in JSON and CSV formats
```
> ```python
> return output_path
>
> async def process_audio_async(audio_path: str, model: str, emotion: str, emotion_strength: float,
> ```

The function name `process_audio_async` could be more descriptive. Consider renaming it to `process_audio_with_a2f_api` or `generate_animation_from_audio` to better describe what the function does (it generates animation, not just processes audio).

```suggestion
async def generate_animation_from_audio(audio_path: str, model: str, emotion: str, emotion_strength: float,
```
> ```python
> if audio_path is None:
>     return None, None, "Please upload an audio file or select a sample audio."
>
> return asyncio.run(process_audio_async(audio_path, model, emotion, emotion_strength, progress))
> ```

[nitpick] Using `asyncio.run()` inside a synchronous wrapper function is fine for simple cases, but Gradio already has async support. Consider making the wrapper function async and using Gradio's native async support for better performance:

```python
async def process_audio(audio_path, sample_audio, model, emotion, emotion_strength, progress=gr.Progress()):
    # ... validation logic ...
    return await process_audio_async(audio_path, model, emotion, emotion_strength, progress)
```

This avoids creating a new event loop for each request.
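A framework-free sketch of the pattern this comment recommends: an async handler that the framework awaits inside one running event loop, instead of spinning up a fresh loop with `asyncio.run()` per request. All names here are illustrative stand-ins, and `gradio` itself is not imported:

```python
import asyncio
from typing import Optional

async def process_audio_async(audio_path: str) -> str:
    # Stand-in for the real gRPC call in app.py (hypothetical body).
    await asyncio.sleep(0)
    return f"animation for {audio_path}"

async def handle_request(audio_path: Optional[str]) -> str:
    # Async handler that a framework with native async support
    # (e.g. Gradio) can await directly -- no per-request asyncio.run().
    if audio_path is None:
        return "Please upload an audio file."
    return await process_audio_async(audio_path)

async def main() -> None:
    # Both requests share one event loop.
    results = await asyncio.gather(
        handle_request("sample.wav"),
        handle_request(None),
    )
    print(results)

if __name__ == "__main__":
    asyncio.run(main())
```

Because the handler is a coroutine, concurrent requests can interleave on the same loop while each awaits I/O, rather than each request paying the cost of loop creation and teardown.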
> ```python
> total_frames = int(duration * fps) + 1
>
> # Create video writer
> fourcc = cv2.VideoWriter_fourcc(*'mp4v')
> ```

Using the deprecated 'mp4v' codec, which has compatibility issues on many platforms. Consider using 'avc1', letting OpenCV choose a platform-appropriate codec, or using H.264 directly:

```python
fourcc = cv2.VideoWriter_fourcc(*'avc1')  # Better compatibility
# or
fourcc = cv2.VideoWriter_fourcc('H', '2', '6', '4')  # H.264 codec
```

Note: The video is later re-encoded with libx264 by ffmpeg, so this temporary codec choice is less critical, but still worth fixing for clarity.

```suggestion
fourcc = cv2.VideoWriter_fourcc(*'avc1')  # Use H.264/AVC for better compatibility
```
|
|
The instructions for launching the web interface are incomplete. The app requires an API key to be set (hardcoded in the current implementation, but it should come from an environment variable). Add documentation about setting the API key before running:

````suggestion
Launch the interactive web interface:

**Prerequisites**: Set your NVIDIA API key as an environment variable:

```bash
export NVIDIA_API_KEY=your-api-key-here
```

Then launch the app:

```bash
python3 ./app.py
```
````
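On the app side, a sketch of how `app.py` could read the key from the environment instead of hardcoding it. The variable name `NVIDIA_API_KEY` follows the comment above; the helper name is illustrative:

```python
import os

def get_api_key() -> str:
    """Fetch the NVIDIA API key from the environment, failing fast
    with a clear message instead of relying on a hardcoded value.

    Hypothetical helper; not part of the PR's app.py.
    """
    key = os.environ.get("NVIDIA_API_KEY")
    if not key:
        raise RuntimeError(
            "NVIDIA_API_KEY is not set. Export it before launching:\n"
            "  export NVIDIA_API_KEY=your-api-key-here"
        )
    return key
```

Failing fast at startup with the exact `export` command to run is friendlier than a gRPC authentication error deep inside a request.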
> ```python
> import gradio as gr
> import asyncio
> import os
> import tempfile
> ```

Import of `tempfile` is not used.

```suggestion
```
> ```python
> import shutil
> import numpy as np
> import scipy.io.wavfile
> import yaml
> ```

Import of `yaml` is not used.

```suggestion
```
> ```python
> # Audio2Face imports
> import a2f_3d.client.auth
> from nvidia_ace.services.a2f_controller.v1_pb2_grpc import A2FControllerServiceStub
> from nvidia_ace.animation_data.v1_pb2 import AnimationData, AnimationDataStreamHeader
> ```

Import of `AnimationData` is not used.
Import of `AnimationDataStreamHeader` is not used.

```suggestion
# from nvidia_ace.animation_data.v1_pb2 import AnimationData, AnimationDataStreamHeader
```
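Unused-import findings like the three above are exactly what linters (flake8's F401, or ruff) catch automatically. As an illustration of the underlying check, a minimal AST-based sketch; the function name is illustrative, and real projects should use a proper linter:

```python
import ast

def unused_imports(source: str) -> list:
    """Report imported names never referenced as a bare name in the module.

    Minimal sketch: collects names bound by import statements, then
    checks which are never loaded via an ast.Name node. Linters such
    as flake8 (rule F401) or ruff do this far more robustly.
    """
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b.c" binds the top-level name "a".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(imported - used)
```

Running it over a toy module with an unused `tempfile` import would report just that name, mirroring the Copilot findings here.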
> ```python
> if animation_frames:
>     duration = animation_frames[-1]["timeCode"]
> else:
>     duration = 1.0
> ```

The `else` branch (`duration = 1.0`) is unreachable, since `animation_frames` is guaranteed to be non-empty at this point, so the conditional can be simplified:

```suggestion
duration = animation_frames[-1]["timeCode"]
```
Improve the README to provide comprehensive information about the Audio2Face-3D API client, including features, installation instructions, and usage examples. Introduce a Gradio web interface for real-time interaction and testing of the API functionalities. Add necessary dependencies for the web interface and enhance the overall structure of the documentation.