Introduction

This demo application ("demoDiffusion") showcases the acceleration of the Stable Diffusion pipeline using TensorRT plugins.

Setup

Clone the TensorRT OSS repository

git clone git@github.com:NVIDIA/TensorRT.git -b release/8.5 --single-branch
cd TensorRT
git submodule update --init --recursive

Launch TensorRT NGC container

Install nvidia-docker using these instructions.

docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/tensorrt:22.10-py3 /bin/bash

(Optional) Install latest TensorRT release

python3 -m pip install --upgrade pip
python3 -m pip install --upgrade tensorrt

NOTE: Alternatively, you can download and install TensorRT packages from NVIDIA TensorRT Developer Zone.

Build TensorRT plugins library

Build the TensorRT plugins library using the TensorRT OSS build instructions.

export TRT_OSSPATH=/workspace

cd $TRT_OSSPATH
mkdir -p build && cd build
cmake .. -DTRT_OUT_DIR=$PWD/out
cd plugin
make -j$(nproc)

export PLUGIN_LIBS="$TRT_OSSPATH/build/out/libnvinfer_plugin.so"
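Before launching the demo, it can help to confirm that PLUGIN_LIBS points at a file that actually exists, since a failed build leaves the variable set but the library missing. The following is a minimal sketch, not part of the demo: the helper name `resolve_plugin_lib` is hypothetical, and it only checks that the file is present, not that it loads.

```python
import os
from pathlib import Path


def resolve_plugin_lib(env=None):
    """Return the path stored in PLUGIN_LIBS, or raise if it is unset or missing.

    `env` is any mapping of environment variables; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    value = env.get("PLUGIN_LIBS", "")
    if not value:
        raise RuntimeError("PLUGIN_LIBS is not set; export it after building the plugins")
    path = Path(value)
    if not path.is_file():
        raise FileNotFoundError(f"plugin library not found: {path}")
    return path
```

Calling `resolve_plugin_lib()` in the same shell session where PLUGIN_LIBS was exported returns the library path, or raises with a message naming the problem.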

Install required packages

cd $TRT_OSSPATH/demo/Diffusion
pip3 install -r requirements.txt

# Create output directories
mkdir -p onnx engine output

NOTE: demoDiffusion has been tested on systems with NVIDIA A100, RTX3090, and RTX4090 GPUs and the following software configuration:

cuda-python         11.8.1
diffusers           0.7.2
onnx                1.12.0
onnx-graphsurgeon   0.3.25
onnxruntime         1.13.1
polygraphy          0.43.1
tensorrt            8.5.1.7
tokenizers          0.13.2
torch               1.12.0+cu116
transformers        4.24.0
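To see how an environment differs from the tested configuration, the installed versions can be compared against the list above. This is a sketch using the standard-library `importlib.metadata` module; the names `TESTED_VERSIONS`, `installed_version`, and `version_mismatches` are hypothetical and not part of the demo. A mismatch does not necessarily mean the demo will fail, only that the combination is outside what was tested.

```python
from importlib import metadata

# Versions copied from the tested configuration listed above.
TESTED_VERSIONS = {
    "cuda-python": "11.8.1",
    "diffusers": "0.7.2",
    "onnx": "1.12.0",
    "onnx-graphsurgeon": "0.3.25",
    "onnxruntime": "1.13.1",
    "polygraphy": "0.43.1",
    "tensorrt": "8.5.1.7",
    "tokenizers": "0.13.2",
    "torch": "1.12.0+cu116",
    "transformers": "4.24.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(tested=TESTED_VERSIONS, lookup=installed_version):
    """Map each package whose version differs to a (tested, installed) pair."""
    mismatches = {}
    for name, expected in tested.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Printing the result of `version_mismatches()` inside the container lists every package that is absent or at a different version than the one tested.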

NOTE: Optionally, install the HuggingFace accelerate package for faster and less memory-intensive model loading.

Running demoDiffusion

Review usage instructions

python3 demo-diffusion.py --help

HuggingFace user access token

To download the model checkpoints for the Stable Diffusion pipeline, you will need a read access token. See instructions.

export HF_TOKEN=<your access token>
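A wrapper script can read the token from the environment and fail early with a clear message, rather than discovering the problem when the checkpoint download is rejected. A minimal sketch, assuming the token is exported as HF_TOKEN as shown above; the helper name `read_hf_token` is hypothetical.

```python
import os


def read_hf_token(env=None):
    """Return the token stored in HF_TOKEN, stripped of surrounding whitespace.

    `env` is any mapping of environment variables; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is empty or unset; export a read access token first")
    return token
```

The check only confirms that a value is present. Whether the token is valid and has read access is decided by HuggingFace when the download starts.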

Generate an image guided by a single text prompt

LD_PRELOAD=${PLUGIN_LIBS} python3 demo-diffusion.py "a beautiful photograph of Mt. Fuji during cherry blossom" --hf-token=$HF_TOKEN -v
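The same invocation can be driven from Python, which is convenient when generating images for several prompts in a loop. The sketch below only rebuilds the command shown above: it sets LD_PRELOAD in the child environment and passes the prompt, `--hf-token`, and `-v`. The function names `build_demo_invocation` and `run_demo` are hypothetical, and the script is assumed to be run from the demo/Diffusion directory.

```python
import os
import subprocess


def build_demo_invocation(prompt, hf_token, plugin_libs, base_env=None):
    """Return (argv, env) equivalent to the shell command shown above."""
    # Copy the environment so the caller's mapping is never modified.
    env = dict(os.environ if base_env is None else base_env)
    env["LD_PRELOAD"] = plugin_libs
    argv = ["python3", "demo-diffusion.py", prompt, f"--hf-token={hf_token}", "-v"]
    return argv, env


def run_demo(prompt, hf_token, plugin_libs):
    """Run the demo in the current directory; raise if it exits with a non-zero status."""
    argv, env = build_demo_invocation(prompt, hf_token, plugin_libs)
    subprocess.run(argv, env=env, check=True)
```

Passing the prompt as a single list element avoids shell quoting issues with prompts that contain spaces or punctuation.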

Restrictions

Up to 16 simultaneous prompts (maximum batch size) per inference.
To generate images of dynamic shapes without rebuilding the engines, use --force-dynamic-shape.
Supports image sizes between 256x256 and 1024x1024.
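The batch-size and image-size limits above can be checked before starting a run. This is a sketch under stated assumptions: the bounds are treated as inclusive and applied to height and width independently, which the text above does not spell out, and the name `check_request` is hypothetical. The demo may enforce further constraints that are not listed here.

```python
MAX_BATCH_SIZE = 16    # maximum simultaneous prompts per inference
MIN_IMAGE_DIM = 256    # smallest supported height or width (assumed inclusive)
MAX_IMAGE_DIM = 1024   # largest supported height or width (assumed inclusive)


def check_request(prompts, height, width):
    """Return a list of human-readable problems; an empty list means the request fits."""
    problems = []
    if not prompts:
        problems.append("at least one prompt is required")
    elif len(prompts) > MAX_BATCH_SIZE:
        problems.append(f"{len(prompts)} prompts exceed the maximum batch size of {MAX_BATCH_SIZE}")
    for label, value in (("height", height), ("width", width)):
        if not MIN_IMAGE_DIM <= value <= MAX_IMAGE_DIM:
            problems.append(
                f"{label} {value} is outside the supported range {MIN_IMAGE_DIM}-{MAX_IMAGE_DIM}"
            )
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every violated limit at once.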
