This repository contains a Docker setup for the ditto-talkinghead project, which provides Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis.
- Docker with GPU support
- NVIDIA Docker runtime
- NVIDIA GPU with CUDA support
- Docker Compose (optional but recommended)
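Before building, it helps to confirm that Docker can actually reach the GPU. A minimal check, assuming the NVIDIA Container Toolkit is installed (the CUDA image tag below is only an example; any CUDA base image matching your driver works):

# Host-side driver check
nvidia-smi

# GPU passthrough check from a throwaway container
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi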
For deploying on another server, use the setup script:
# Clone with submodules
git clone --recursive https://github.com/your-username/ditto-container.git
cd ditto-container
# Run setup script (handles submodules + Docker build + run)
./setup.sh
# Or step by step:
./setup.sh setup # Setup submodules
./setup.sh build # Build Docker image
./setup.sh run    # Run container

Alternatively, with Docker Compose:

- Build and run the container:
docker-compose up -d --build
- Access the container:
docker-compose exec ditto-talkinghead bash

- Stop the container:
docker-compose down
Using Docker directly:

- Build the image (includes source code):
docker build -t ditto-talkinghead .

- Run the container:
docker run -it --gpus all \
  -v $(pwd)/checkpoints:/app/checkpoints \
  -v $(pwd)/data:/app/data \
  -v $(pwd)/output:/app/output \
  -p 8000:8000 \
  --name ditto-container \
  ditto-talkinghead
Note: When using Docker directly, the source code from the src/ submodule is built into the container at /app/src/.
- Base Image: NVIDIA CUDA 11.8 with Ubuntu 22.04 + manually installed cuDNN8
- Python: 3.10
- GPU Support: Full CUDA and TensorRT support
- Pre-installed Tools: git-lfs, vim
- Source Code:
  - Docker Compose: mounted from the local ./src directory (development mode)
  - Docker Direct: built into the container at /app/src/ (deployment mode)
- Pre-installed Dependencies:
- PyTorch with CUDA support
- TensorRT 8.6.1
- OpenCV
- librosa
- All other required packages from the original repository
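A quick sanity check of these dependencies from inside the container (a sketch; the exact versions printed depend on the image you built):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import tensorrt; print(tensorrt.__version__)"
python -c "import cv2, librosa; print(cv2.__version__, librosa.__version__)"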
In development mode (Docker Compose), volumes are mounted for live development:
./
├── src/ # Git submodule: mounted to /app/src in container
├── checkpoints/ # Model checkpoints: mounted to /app/checkpoints
├── data/ # Input data: mounted to /app/data
├── output/ # Generated outputs: mounted to /app/output
└── docker files...
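To confirm the live mount is active, list the submodule through the running container; the output should match your local ./src checkout:

docker-compose exec ditto-talkinghead ls /app/src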
In deployment mode (Docker direct), only external data needs to be mounted:
./
├── src/ # Git submodule: built into container at /app/src
├── checkpoints/ # Model checkpoints: mounted to /app/checkpoints
├── data/ # Input data: mounted to /app/data
├── output/ # Generated outputs: mounted to /app/output
└── docker files...
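In deployment mode the code is baked into the image, so it should be present even with no volumes mounted. A quick check, assuming the image's entrypoint allows overriding the command:

docker run --rm ditto-talkinghead ls /app/src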
The source code is managed as a git submodule from https://github.com/fciannella/ditto-talkinghead.
After running the container, you'll need to:
- Download the model checkpoints:
cd /app
git lfs install
git clone https://huggingface.co/digital-avatar/ditto-talkinghead checkpoints

- Run inference:
cd /app/src
python inference.py \
    --data_root "/app/checkpoints/ditto_trt_Ampere_Plus" \
    --cfg_pkl "/app/checkpoints/ditto_cfg/v0.4_hubert_cfg_trt.pkl" \
    --audio_path "/app/data/audio.wav" \
    --source_path "/app/data/image.png" \
    --output_path "/app/output/result.mp4"
The source code from your fork is available in /app/src, and any changes you make locally will be reflected in the container.
In addition to batch processing, this container includes real-time streaming services for live talking head generation:
# Inside container
cd /app/src
python streaming_service.py "/app/checkpoints/ditto_cfg/v0.4_hubert_cfg_trt_online.pkl" "/app/checkpoints/ditto_trt_Ampere_Plus"
# Open browser with your server's IP/hostname:
# http://YOUR_SERVER_IP:8000 (for remote server)
# http://localhost:8000 (for local development)
# Built-in web interface - no separate client needed!

- Make sure to put a source image (avatar photo) at /app/data/source_image.png
- For remote servers, replace YOUR_SERVER_IP with your actual server IP or hostname
- The WebSocket URL is automatically detected from the browser location
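Because ./data on the host is mounted to /app/data (see the volume layout above), placing the source image is just a host-side copy; the destination filename matches the note above:

# Run on the host; the file appears inside the container as /app/data/source_image.png
cp /path/to/your_photo.png ./data/source_image.png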
# Inside container
cd /app/src
python rtmp_streaming_service.py "/app/checkpoints/ditto_cfg/v0.4_hubert_cfg_trt.pkl" "/app/checkpoints/ditto_trt_Ampere_Plus"
# Start streaming via API
curl -X POST "http://localhost:8000/start_stream/my_stream" \
-H "Content-Type: application/json" \
-d '{"source_path": "/app/data/avatar.png", "rtmp_url": "rtmp://your_stream_url"}'See STREAMING_GUIDE.md for complete documentation.
The src/ directory is a git submodule pointing to your fork. To work with it:
# Update the submodule to latest from your fork
git submodule update --remote src
# Make changes to the code in ./src/
# Then commit and push from within the src directory
cd src
git add .
git commit -m "Your changes"
git push origin main
# Update the main repository to point to the new commit
cd ..
git add src
git commit -m "Update submodule"
git push

- Clone with submodules:

git clone --recursive https://github.com/your-username/ditto-container.git

- Use the setup script:

cd ditto-container
./setup.sh  # Handles everything automatically

If the src/ directory is empty after cloning:

git submodule update --init --recursive

If the submodule is out of date:

git submodule update --remote --recursive

To update the submodule to the latest commit:
cd src
git pull origin main
cd ..
git add src
git commit -m "Update submodule to latest"The pre-built TensorRT models are compatible with Ampere_Plus GPUs. If your GPU doesn't support this, you'll need to convert the ONNX models to TensorRT inside the container:
cd /app/src
python scripts/cvt_onnx_to_trt.py \
--onnx_dir "/app/checkpoints/ditto_onnx" \
--trt_dir "/app/checkpoints/ditto_trt_custom"Then use --data_root=/app/checkpoints/ditto_trt_custom in your inference command.
Ensure you have:
- NVIDIA drivers installed on the host
- NVIDIA Docker runtime installed
- Used the --gpus all flag or proper docker-compose GPU configuration
The container runs as a non-root user. If you encounter permission issues with mounted volumes, adjust the ownership:
sudo chown -R 1000:1000 ./checkpoints ./data ./output ./src

This model requires significant GPU memory. Ensure your GPU has enough VRAM (recommended: 8GB+).
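To check how much VRAM is available before a run (standard nvidia-smi query on the host):

nvidia-smi --query-gpu=name,memory.total,memory.used --format=csv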
If you get errors about missing libcudnn.so.8, this should be resolved as the Dockerfile installs cuDNN8 manually via apt packages (libcudnn8 and libcudnn8-dev).
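If the error still appears, a quick diagnostic from inside the container shows whether the library is on the loader path and visible to PyTorch:

ldconfig -p | grep libcudnn
python -c "import torch; print(torch.backends.cudnn.version())"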
To build and push the image to the GitLab registry:

docker build -t gitlab-master.nvidia.com/fciannella/ditto-container/ditto-container:0.0.1 -t gitlab-master.nvidia.com/fciannella/ditto-container/ditto-container:latest .
docker push gitlab-master.nvidia.com/fciannella/ditto-container/ditto-container:0.0.1
docker push gitlab-master.nvidia.com/fciannella/ditto-container/ditto-container:latest
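Pushing assumes you are already authenticated against the registry; if not, log in first:

docker login gitlab-master.nvidia.com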
To run the container image interactively on a Slurm cluster:

srun -A llmservice_nemo_mlops -p interactive_singlenode -G 4 --time 04:00:00 --container-mounts /lustre/fsw/portfolios/llmservice/users/fciannella/cache:/root/.cache,/lustre/fsw/portfolios/llmservice/users/fciannella/src:/root/src --container-image gitlab-master.nvidia.com/fciannella/ditto-container/ditto-container:latest --pty bash
python inference.py \
--data_root "./checkpoints/ditto_trt_Ampere_Plus" \
--cfg_pkl "./checkpoints/ditto_cfg/v0.4_hubert_cfg_trt.pkl" \
--audio_path "./tmpcc1gbdw3.wav" \
--source_path "./chris_avatar.png" \
--output_path "./result.mp4"
This Docker setup is provided under the same Apache-2.0 license as the original ditto-talkinghead project.
Optional: redirect temporary files and the pyximport build cache:

export TMPDIR=/root/src/.cache
export TEMP=/root/src/.cache
export TMP=/root/src/.cache
export PYXBLD_DIR=/root/src/.cache/pyxbld
mkdir -p /root/src/.cache/pyxbld
# Clear any stale pyximport build cache
rm -rf /root/.pyxbld/