Commit c9c851d

Updating Version to 24.08 and extended support for dashboards and auto scaling
1 parent 64a61ed commit c9c851d

File tree: 16 files changed, +207 −212 lines changed

Popular_Models_Guide/StableDiffusion/README.md
Lines changed: 17 additions & 22 deletions

@@ -29,7 +29,7 @@
 # Deploying Stable Diffusion Models with Triton and TensorRT

 This example demonstrates how to deploy Stable Diffusion models in
-Triton by leveraging the [TensorRT demo](https://github.com/NVIDIA/TensorRT/tree/release/9.2/demo/Diffusion)
+Triton by leveraging the [TensorRT demo](https://github.com/NVIDIA/TensorRT/tree/release/10.4/demo/Diffusion)
 pipeline and utilities.

 Using the TensorRT demo as a base this example contains a reusable
@@ -38,9 +38,9 @@ suitable for deploying multiple versions and configurations of
 Diffusion models.

 For more information on Stable Diffusion please visit
-[stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5),
-[stable-diffusion-xl](https://huggingface.co/docs/diffusers/en/using-diffusers/sdxl). For
-more information on the TensorRT implementation please see the [TensorRT demo](https://github.com/NVIDIA/TensorRT/tree/release/9.2/demo/Diffusion).
+[stable-diffusion-v1-5](https://huggingface.co/benjamin-paine/stable-diffusion-v1-5),
+[stable-diffusion-xl](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0). For
+more information on the TensorRT implementation please see the [TensorRT demo](https://github.com/NVIDIA/TensorRT/tree/release/10.4/demo/Diffusion).

 > [!Note]
 > This example is given as sample code and should be reviewed before use in production settings.
@@ -57,7 +57,7 @@ support matrix](https://docs.nvidia.com/deeplearning/frameworks/support-matrix/i
 ## Building the Triton Inference Server Image

 The example is designed based on the
-`nvcr.io/nvidia/tritonserver:24.01-py3` docker image and [TensorRT OSS v9.2.0](https://github.com/NVIDIA/TensorRT/releases/tag/v9.2.0).
+`nvcr.io/nvidia/tritonserver:24.08-py3` docker image and [TensorRT OSS v10.4](https://github.com/NVIDIA/TensorRT/releases/tag/v10.4).

 A set of convenience scripts are provided to create a docker image
 based on the `nvcr.io/nvidia/tritonserver:24.01-py3` image with the
@@ -99,6 +99,15 @@ directory as `workspace`.

 ### Build Stable Diffusion v 1.5 Engine

+> [!Note]
+>
+> The model
+> [stable-diffusion-v1-5](https://huggingface.co/benjamin-paine/stable-diffusion-v1-5)
+> requires login in to huggingface and acceptance of terms and
+> conditions of use. Please set the environment variable HF_TOKEN
+> accordingly.
+>
+
 ```bash
 ./scripts/build_models.sh --model stable_diffusion_1_5
 ```
@@ -285,27 +294,13 @@ python3 client.py --model stable_diffusion_xl --requests 10 --clients 10

 ## Known Issues and Limitations

-1. When shutting down the server, an invalid memory operation occurs:
-
-> [!Note]
-> This error is also seen in standalone applications outside of the Triton Inference Server
-> and we believe this is due to an interaction between imported python modules. Further
-> we haven't seen any issues related to this error and believe it can be safely
-> ignored.
-
-```
-free(): invalid pointer
-```
-
-2. The diffusion backend doesn't yet support using an optional refiner
+1. The diffusion backend doesn't yet support using an optional refiner
    model unlike the [demo][demo_reference] it's based on. See also
    [demo_txt2img_xl.py][demo_code]

-[demo_code]: https://github.com/NVIDIA/TensorRT/blob/release/9.2/demo/Diffusion/demo_txt2img_xl.py
+[demo_code]: https://github.com/NVIDIA/TensorRT/blob/release/10.4/demo/Diffusion/demo_txt2img_xl.py

-[demo_reference]: https://github.com/NVIDIA/TensorRT/tree/release/9.2/demo/Diffusion#text-to-image-using-sdxl-stable-diffusion-xl
+[demo_reference]: https://github.com/NVIDIA/TensorRT/tree/release/10.4/demo/Diffusion#generate-an-image-with-stable-diffusion-xl-guided-by-a-single-text-prompt
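The note added in this hunk means the v1.5 engine build now needs a Hugging Face token in the environment. A minimal pre-flight guard, not part of the repository's scripts and shown only as a sketch (the token value below is a placeholder), could look like:

```shell
# Hypothetical guard run before ./scripts/build_models.sh --model stable_diffusion_1_5:
# fail fast when HF_TOKEN is unset rather than partway through the build.
HF_TOKEN="hf_placeholder_token"   # placeholder value for illustration only

if [ -z "${HF_TOKEN:-}" ]; then
    echo "HF_TOKEN is not set; stable-diffusion-v1-5 requires a Hugging Face login" >&2
    exit 1
fi
echo "HF_TOKEN is set"
```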

Popular_Models_Guide/StableDiffusion/build.sh
Lines changed: 4 additions & 4 deletions

@@ -39,7 +39,7 @@ DOCKERFILE=${SOURCE_DIR}/docker/Dockerfile

 # Base Images
 BASE_IMAGE=nvcr.io/nvidia/tritonserver
-BASE_IMAGE_TAG_DIFFUSION=24.01-py3
+BASE_IMAGE_TAG_DIFFUSION=24.08-py3

 get_options() {
     while :; do
@@ -141,7 +141,7 @@ get_options() {
     fi

     if [ -z "$TAG" ]; then
-        TAG="tritonserver:r24.01"
+        TAG="tritonserver:r24.08"

         if [[ $FRAMEWORK == "DIFFUSION" ]]; then
             TAG+="-diffusion"
@@ -211,7 +211,7 @@ if [[ $FRAMEWORK == DIFFUSION ]]; then
         set -x
     fi
     $RUN_PREFIX mkdir -p $PWD/backend/diffusion
-    $RUN_PREFIX docker run --rm -it -v $PWD:/workspace $TAG /bin/bash -c "cp -rf /tmp/TensorRT/demo/Diffusion /workspace/backend/diffusion"
+    $RUN_PREFIX docker run --rm -it -v ${SOURCE_DIR}:/workspace $TAG /bin/bash -c "cp -rf /tmp/TensorRT/demo/Diffusion /workspace/backend/diffusion"

     { set +x; } 2>/dev/null

@@ -221,7 +221,7 @@ if [[ $FRAMEWORK == DIFFUSION ]]; then
         set -x
     fi

-    $RUN_PREFIX docker run --rm -it -v $PWD:/workspace $TAG /bin/bash -c "/workspace/scripts/build_models.sh --model $model"
+    $RUN_PREFIX docker run --rm -it -v ${SOURCE_DIR}:/workspace $TAG /bin/bash -c "/workspace/scripts/build_models.sh --model $model"

     { set +x; } 2>/dev/null
 done
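The default-tag hunk above only changes the release prefix; the suffix logic is untouched. The scheme in isolation, rendered as a POSIX-sh sketch of the script's bash logic (not the script itself):

```shell
# Mirrors the TAG default in build.sh: start from the release tag and
# append "-diffusion" when the diffusion framework is selected.
FRAMEWORK=DIFFUSION
TAG="tritonserver:r24.08"
if [ "$FRAMEWORK" = "DIFFUSION" ]; then
    TAG="${TAG}-diffusion"
fi
echo "$TAG"
```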

Popular_Models_Guide/StableDiffusion/docker/Dockerfile
Lines changed: 2 additions & 2 deletions

@@ -29,9 +29,9 @@ ARG BASE_IMAGE_TAG=24.01-py3

 FROM ${BASE_IMAGE}:${BASE_IMAGE_TAG} as tritonserver-stable-diffusion

-RUN pip install --pre --upgrade --extra-index-url https://pypi.nvidia.com tensorrt==9.2.0.post12.dev5
+RUN pip install --pre --upgrade --extra-index-url https://pypi.nvidia.com tensorrt-cu12==10.4.0

-RUN git clone https://github.com/NVIDIA/TensorRT.git -b release/9.2 --single-branch /tmp/TensorRT
+RUN git clone https://github.com/NVIDIA/TensorRT.git -b release/10.4 --single-branch /tmp/TensorRT

 RUN pip3 install -r /tmp/TensorRT/demo/Diffusion/requirements.txt
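The pip pin moves from a TensorRT pre-release build to an exact `tensorrt-cu12==10.4.0` release. A small stdlib-only sketch (hypothetical helper, not part of the repository) of splitting such an exact `==` pin into its name and version:

```python
import re

def parse_exact_pin(requirement: str) -> tuple[str, str]:
    """Split an exact pip pin like 'tensorrt-cu12==10.4.0' into (name, version)."""
    match = re.fullmatch(r"([A-Za-z0-9._-]+)==([0-9][0-9A-Za-z.]*)", requirement)
    if match is None:
        raise ValueError(f"not an exact '==' pin: {requirement!r}")
    return match.group(1), match.group(2)

print(parse_exact_pin("tensorrt-cu12==10.4.0"))  # ('tensorrt-cu12', '10.4.0')
```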

Popular_Models_Guide/StableDiffusion/run.sh
Lines changed: 1 addition & 1 deletion

@@ -99,7 +99,7 @@ get_options() {
     fi

     if [ -z "$IMAGE" ]; then
-        IMAGE="tritonserver:r24.01"
+        IMAGE="tritonserver:r24.08"

         if [[ $FRAMEWORK == "DIFFUSION" ]]; then
             IMAGE+="-diffusion"

Triton_Inference_Server_Python_API/README.md
Lines changed: 10 additions & 10 deletions

@@ -54,30 +54,30 @@ https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html
 ## Installation

 The tutorial and Python API package are designed to be installed and
-run within the `nvcr.io/nvidia/tritonserver:24.01-py3` docker image.
+run within the `nvcr.io/nvidia/tritonserver:24.08-py3` docker image.

 A set of convenience scripts are provided to create a docker image
-based on the `nvcr.io/nvidia/tritonserver:24.01-py3` image with the
+based on the `nvcr.io/nvidia/tritonserver:24.08-py3` image with the
 Python API installed plus additional dependencies required for the
 examples.

-### Triton Inference Server 24.01 + Python API
+### Triton Inference Server 24.08 + Python API

 #### Clone Repository
 ```bash
 git clone https://github.com/triton-inference-server/tutorials.git
 cd tutorials/Triton_Inference_Server_Python_API
 ```

-#### Build `triton-python-api:r24.01` Image
+#### Build `triton-python-api:r24.08` Image
 ```bash
 ./build.sh
 ```

 #### Supported Backends

 The built image includes all the backends shipped by default in the
-tritonserver `nvcr.io/nvidia/tritonserver:24.01-py3` container.
+tritonserver `nvcr.io/nvidia/tritonserver:24.08-py3` container.

 ```
 dali fil identity onnxruntime openvino python pytorch repeat square tensorflow tensorrt
@@ -95,7 +95,7 @@ different data types. The `identity` model copies provided inputs of

 ## Hello World

-### Start `triton-python-api:r24.01` Container
+### Start `triton-python-api:r24.08` Container

 The following command starts a container and volume mounts the current
 directory as `workspace`.
@@ -163,7 +163,7 @@ This example is based on the
 tutorial.

-#### Build `triton-python-api:r24.01-diffusion` Image and Stable Diffusion Models
+#### Build `triton-python-api:r24.08-diffusion` Image and Stable Diffusion Models

 Please note the following command will take many minutes depending on
 your hardware configuration and network connection.
@@ -175,7 +175,7 @@ your hardware configuration and network connection.
 #### Supported Backends

 The built image includes all the backends shipped by default in the
-tritonserver `nvcr.io/nvidia/tritonserver:24.01-py3` container.
+tritonserver `nvcr.io/nvidia/tritonserver:24.08-py3` container.

 ```
 dali fil identity onnxruntime openvino python pytorch repeat square tensorflow tensorrt
@@ -223,13 +223,13 @@ server.models()

 #### Example Output
 ```python
-{('stable_diffusion', 1): {'name': 'stable_diffusion', 'version': 1, 'state': 'READY'}, ('text_encoder', 1): {'name': 'text_encoder', 'version': 1, 'state': 'READY'}, ('vae', 1): {'name': 'vae', 'version': 1, 'state': 'READY'}}
+{('stable_diffusion_1_5', 1): {'name': 'stable_diffusion_1_5', 'version': 1, 'state': 'READY'}, ('stable_diffusion_xl', 1): {'name': 'stable_diffusion_xl', 'version': 1, 'state': 'READY'}}
 ```

 ### Send an Inference Request

 ```python
-model = server.model("stable_diffusion")
+model = server.model("stable_diffusion_xl")
 responses = model.infer(inputs={"prompt":[["butterfly in new york, realistic, 4k, photograph"]]})
 ```
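The updated example output shows that `server.models()` returns a dict keyed by `(name, version)` tuples. A small sketch (hypothetical helper, not part of the tutorial) of pulling the READY model names out of a dict with that shape:

```python
def ready_models(models: dict) -> list[str]:
    """Return sorted names of models whose state is READY, given the
    (name, version)-keyed dict shape shown in the example output."""
    return sorted(name for (name, _version), info in models.items()
                  if info.get("state") == "READY")

# Literal copy of the example output above, used as test data.
models = {
    ("stable_diffusion_1_5", 1): {"name": "stable_diffusion_1_5", "version": 1, "state": "READY"},
    ("stable_diffusion_xl", 1): {"name": "stable_diffusion_xl", "version": 1, "state": "READY"},
}
print(ready_models(models))  # ['stable_diffusion_1_5', 'stable_diffusion_xl']
```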

Triton_Inference_Server_Python_API/build.sh
Lines changed: 18 additions & 33 deletions

@@ -30,7 +30,7 @@ RUN_PREFIX=
 BUILD_MODELS=

 # Frameworks
-declare -A FRAMEWORKS=(["DIFFUSION"]=1 ["TRT_LLM"]=2 ["IDENTITY"]=3)
+declare -A FRAMEWORKS=(["DIFFUSION"]=1 ["IDENTITY"]=3)
 DEFAULT_FRAMEWORK=IDENTITY

 SOURCE_DIR=$(dirname "$(readlink -f "$0")")
@@ -39,9 +39,8 @@ DOCKERFILE=${SOURCE_DIR}/docker/Dockerfile

 # Base Images
 BASE_IMAGE=nvcr.io/nvidia/tritonserver
-BASE_IMAGE_TAG_IDENTITY=24.01-py3
-BASE_IMAGE_TAG_DIFFUSION=24.01-py3
-BASE_IMAGE_TAG_TRT_LLM=24.01-trtllm-python-py3
+BASE_IMAGE_TAG_IDENTITY=24.08-py3
+BASE_IMAGE_TAG_DIFFUSION=24.08-py3

 get_options() {
     while :; do
@@ -138,11 +137,7 @@ get_options() {
     fi

     if [ -z "$TAG" ]; then
-        TAG="triton-python-api:r24.01"
-
-        if [[ $FRAMEWORK == "TRT_LLM" ]]; then
-            TAG+="-trt-llm"
-        fi
+        TAG="triton-python-api:r24.08"

         if [[ $FRAMEWORK == "DIFFUSION" ]]; then
             TAG+="-diffusion"
@@ -186,7 +181,7 @@ get_options "$@"

 if [[ $FRAMEWORK == DIFFUSION ]]; then
     BASE_IMAGE="tritonserver"
-    BASE_IMAGE_TAG="r24.01-diffusion"
+    BASE_IMAGE_TAG="r24.08-diffusion"
 fi

 # BUILD RUN TIME IMAGE
@@ -207,17 +202,18 @@ if [[ $FRAMEWORK == DIFFUSION ]]; then
     if [ -z "$RUN_PREFIX" ]; then
         set -x
     fi
-    $RUN_PREFIX mkdir -p backend/diffusion
-    $RUN_PREFIX $SOURCE_DIR/../Popular_Models_Guide/StableDiffusion/build.sh --framework diffusion --tag tritonserver:r24.01-diffusion
-    $RUN_PREFIX cp $SOURCE_DIR/../Popular_Models_Guide/StableDiffusion/backend/diffusion/model.py backend/diffusion/model.py
-    $RUN_PREFIX mkdir -p diffusion-models/stable_diffusion_1_5/1
-    $RUN_PREFIX cp $SOURCE_DIR/../Popular_Models_Guide/StableDiffusion/diffusion-models/stable_diffusion_1_5/config.pbtxt diffusion-models/stable_diffusion_1_5/config.pbtxt
-    $RUN_PREFIX cp $SOURCE_DIR/../Popular_Models_Guide/StableDiffusion/diffusion-models/stable_diffusion_1_5/1/.gitkeep diffusion-models/stable_diffusion_1_5/1/.gitkeep
-    $RUN_PREFIX mkdir -p diffusion-models/stable_diffusion_xl/1
-    $RUN_PREFIX cp $SOURCE_DIR/../Popular_Models_Guide/StableDiffusion/diffusion-models/stable_diffusion_xl/config.pbtxt diffusion-models/stable_diffusion_xl/config.pbtxt
-    $RUN_PREFIX cp $SOURCE_DIR/../Popular_Models_Guide/StableDiffusion/diffusion-models/stable_diffusion_xl/1/.gitkeep diffusion-models/stable_diffusion_xl/1/.gitkeep
-    $RUN_PREFIX mkdir -p scripts/stable_diffusion
-    $RUN_PREFIX cp $SOURCE_DIR/../Popular_Models_Guide/StableDiffusion/scripts/build_models* scripts/stable_diffusion/
+    $RUN_PREFIX mkdir -p ${SOURCE_DIR}/backend/diffusion
+    $RUN_PREFIX $SOURCE_DIR/../Popular_Models_Guide/StableDiffusion/build.sh --framework diffusion --tag tritonserver:r24.08-diffusion
+    $RUN_PREFIX docker run --rm -it -v ${SOURCE_DIR}:/workspace tritonserver:r24.08-diffusion /bin/bash -c "cp -rf /tmp/TensorRT/demo/Diffusion /workspace/backend/diffusion"
+    $RUN_PREFIX cp $SOURCE_DIR/../Popular_Models_Guide/StableDiffusion/backend/diffusion/model.py ${SOURCE_DIR}/backend/diffusion/model.py
+    $RUN_PREFIX mkdir -p ${SOURCE_DIR}/diffusion-models/stable_diffusion_1_5/1
+    $RUN_PREFIX cp $SOURCE_DIR/../Popular_Models_Guide/StableDiffusion/diffusion-models/stable_diffusion_1_5/config.pbtxt ${SOURCE_DIR}/diffusion-models/stable_diffusion_1_5/config.pbtxt
+    $RUN_PREFIX cp $SOURCE_DIR/../Popular_Models_Guide/StableDiffusion/diffusion-models/stable_diffusion_1_5/1/.gitkeep ${SOURCE_DIR}/diffusion-models/stable_diffusion_1_5/1/.gitkeep
+    $RUN_PREFIX mkdir -p ${SOURCE_DIR}/diffusion-models/stable_diffusion_xl/1
+    $RUN_PREFIX cp $SOURCE_DIR/../Popular_Models_Guide/StableDiffusion/diffusion-models/stable_diffusion_xl/config.pbtxt ${SOURCE_DIR}/diffusion-models/stable_diffusion_xl/config.pbtxt
+    $RUN_PREFIX cp $SOURCE_DIR/../Popular_Models_Guide/StableDiffusion/diffusion-models/stable_diffusion_xl/1/.gitkeep ${SOURCE_DIR}/diffusion-models/stable_diffusion_xl/1/.gitkeep
+    $RUN_PREFIX mkdir -p ${SOURCE_DIR}/scripts/stable_diffusion
+    $RUN_PREFIX cp $SOURCE_DIR/../Popular_Models_Guide/StableDiffusion/scripts/build_models* ${SOURCE_DIR}/scripts/stable_diffusion/

 fi

@@ -231,25 +227,14 @@ $RUN_PREFIX docker build -f $DOCKERFILE $BUILD_OPTIONS $BUILD_ARGS -t $TAG $SOUR
 { set +x; } 2>/dev/null

-if [[ $FRAMEWORK == TRT_LLM ]]; then
-    if [ -z "$RUN_PREFIX" ]; then
-        set -x
-    fi
-
-    $RUN_PREFIX docker build -f $SOURCE_DIR/docker/Dockerfile.trt-llm-engine-builder $BUILD_OPTIONS $BUILD_ARGS -t trt-llm-engine-builder $SOURCE_DIR $NO_CACHE
-
-    { set +x; } 2>/dev/null
-
-fi;
-
 if [[ $FRAMEWORK == IDENTITY ]] || [[ $BUILD_MODELS == TRUE ]]; then

     if [[ $FRAMEWORK == DIFFUSION ]]; then
         if [ -z "$RUN_PREFIX" ]; then
             set -x
         fi

-        $RUN_PREFIX docker run --rm -it -v $PWD:/workspace $TAG /bin/bash -c "/workspace/scripts/stable_diffusion/build_models.sh --model stable_diffusion_1_5"
+        $RUN_PREFIX docker run --gpus all --rm -it -v ${SOURCE_DIR}:/workspace $TAG /bin/bash -c "/workspace/scripts/stable_diffusion/build_models.sh --model stable_diffusion_xl"

         { set +x; } 2>/dev/null
     fi
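Several hunks above replace relative paths and `$PWD` mounts with `${SOURCE_DIR}`, which the script derives from its own path via `readlink -f`, so the build behaves the same regardless of the working directory it is invoked from. A standalone sketch of that resolution trick (hypothetical throwaway script in a temp dir, assuming GNU `readlink`):

```shell
# Write a demo script that resolves its own directory the same way
# build.sh does, then invoke it from a different working directory.
demo_dir=$(mktemp -d)
cat > "$demo_dir/where.sh" <<'EOF'
#!/bin/sh
SOURCE_DIR=$(dirname "$(readlink -f "$0")")
echo "$SOURCE_DIR"
EOF
chmod +x "$demo_dir/where.sh"
cd / && RESOLVED=$("$demo_dir/where.sh")
echo "$RESOLVED"
```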

Triton_Inference_Server_Python_API/deps/requirements.txt
Lines changed: 1 addition & 9 deletions

@@ -24,14 +24,6 @@
 # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

-awscli
-fastapi==0.97.0
-ftfy
-mypy
 pyright
 pytest
-ray[all]==2.9
-scipy
-sphinx
-sphinx-markdown-builder
-starlette==0.27.0
+ray[all]==2.36.0

Binary file not shown.

Triton_Inference_Server_Python_API/docker/Dockerfile
Lines changed: 9 additions & 19 deletions

@@ -25,37 +25,27 @@
 # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

 ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver
-ARG BASE_IMAGE_TAG=24.01-py3
+ARG BASE_IMAGE_TAG=24.08-py3

 FROM ${BASE_IMAGE}:${BASE_IMAGE_TAG} as triton-python-api

 RUN apt-get update; apt-get install -y gdb

-COPY ./deps/requirements.txt /tmp/requirements.txt
-
-RUN pip install --timeout=2000 -r /tmp/requirements.txt
+RUN --mount=type=bind,source=./deps/requirements.txt,target=/tmp/requirements.txt \
+    pip install --timeout=2000 --requirement /tmp/requirements.txt

 # Finish pyright install

 RUN pyright --help

-COPY ./deps/tritonserver-2.41.0.dev0-py3-none-any.whl /tmp/tritonserver-2.41.0.dev0-py3-none-any.whl
-
 RUN find /opt/tritonserver/python -maxdepth 1 -type f -name \
-    "tritonserver-*.whl" | xargs -I {} pip3 install --force-reinstall --upgrade {}[all]
+    "tritonserver-*.whl" | xargs -I {} pip3 install --upgrade {}[all]

-RUN pip3 show tritonserver 1>/dev/null || \
-    if [ $? != 0 ]; then \
-        pip3 install /tmp/tritonserver-2.41.0.dev0-py3-none-any.whl[all] ;\
-    fi
+# grafana
+RUN apt-get install -y adduser libfontconfig1 musl && \
+    wget https://dl.grafana.com/enterprise/release/grafana-enterprise_11.2.0_amd64.deb && \
+    dpkg -i grafana-enterprise_11.2.0_amd64.deb && \
+    rm -rf grafana-enterprise_11.2.0_amd64.deb

 RUN ln -sf /bin/bash /bin/sh

-COPY . /workspace
-
-ARG RUN_TESTS=FALSE
-
-RUN if [[ "$RUN_TESTS" == "TRUE" ]] ; then cd /tmp && git clone -b r23.12-python-api https://github.com/triton-inference-server/core.git && cp -rf /tmp/core/python/test /workspace/deps/ ; fi
-
-RUN if [[ "$RUN_TESTS" == "TRUE" ]] ; then pytest /workspace/deps ; fi
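With the bundled-wheel fallback removed, the Dockerfile relies entirely on the `find | xargs` pattern to install whatever tritonserver wheel ships inside the image. The pattern in isolation, using a dummy file tree with a hypothetical wheel name (no pip involved):

```shell
# Create a dummy directory mimicking /opt/tritonserver/python and locate
# the wheel the same way the Dockerfile's find invocation does.
wheel_dir=$(mktemp -d)
touch "$wheel_dir/tritonserver-0.0.0-py3-none-any.whl" "$wheel_dir/README.md"
FOUND=$(find "$wheel_dir" -maxdepth 1 -type f -name "tritonserver-*.whl")
echo "$FOUND"
```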
