Commit c85cfae

TRT-LLM Multi-Node Tutorial: initial check-in
This change creates the TensorRT-LLM Multi-Node tutorial and guide. Includes:
- Instructions with explanations (README)
- Helm chart
- Container definitions
- Server-side Python script
- Various helpful YAML files
1 parent d459ddd commit c85cfae

27 files changed, 3115 additions and 0 deletions
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+.vscode/
+**/.vscode/
+
+dev_*
+**/dev_*

Deployment/Kubernetes/TensorRT-LLM_Multi-Node_Distributed_Models/README.md

Lines changed: 678 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+dev_values.yaml
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+apiVersion: v2
+appVersion: 0.1.0
+description: Generative AI Multi-Node w/ Triton and TensorRT-LLM Guide/Tutorial
+icon: https://www.nvidia.com/content/dam/en-zz/Solutions/about-nvidia/logo-and-brand/[email protected]
+name: triton_trt-llm_multi-node_example
+version: 0.1.0
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+gpu: Tesla-V100-SXM2-16GB
+
+model:
+  name: gpt2
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# See values.yaml for reference values.
+
+gpu: NVIDIA-A10G
+
+model:
+  name: llama-2-70b
+  tensorrtLlm:
+    conversion:
+      gpu: 8
+      memory: 256Gi
+    parallelism:
+      tensor: 8
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# See values.yaml for reference values.
+
+gpu: Tesla-V100-SXM2-16GB
+
+model:
+  name: llama-2-7b-chat
+  tensorrtLlm:
+    conversion:
+      gpu: 2
+      memory: 64Gi
+    parallelism:
+      tensor: 2
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# See values.yaml for reference values.
+
+gpu: Tesla-V100-SXM2-16GB
+
+model:
+  name: llama-2-7b
+  tensorrtLlm:
+    conversion:
+      gpu: 2
+      memory: 64Gi
+    parallelism:
+      tensor: 2
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# See values.yaml for reference values.
+
+gpu: NVIDIA-A10G
+
+model:
+  name: llama-3-70b-instruct
+  tensorrtLlm:
+    conversion:
+      gpu: 8
+      memory: 256Gi
+    parallelism:
+      tensor: 8
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# See values.yaml for reference values.
+
+gpu: Tesla-V100-SXM2-16GB
+
+model:
+  name: llama-3-8b-instruct
+  tensorrtLlm:
+    conversion:
+      gpu: 4
+      memory: 128Gi
+    parallelism:
+      tensor: 4
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# See values.yaml for reference values.
+
+gpu: Tesla-V100-SXM2-16GB
+
+model:
+  name: llama-3-8b
+  tensorrtLlm:
+    conversion:
+      gpu: 2
+      memory: 64Gi
+    parallelism:
+      tensor: 2
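
Review note: in every override file above that sets `tensorrtLlm`, `conversion.gpu` equals `parallelism.tensor`. The Python sketch below shows one way a reader might sanity-check that relationship before installing. It rests on assumptions not visible in this diff: that `tensorrtLlm` nests under `model`, that the reference `values.yaml` also offers a `pipeline` key that defaults to 1, and that the GPU count an engine needs is the product of the two degrees. The function name `gpus_required` is hypothetical.

```python
def gpus_required(values: dict) -> int:
    """Return the GPU count implied by the parallelism settings.

    Assumes the layout model.tensorrtLlm.parallelism.{tensor,pipeline};
    the `pipeline` key is an assumption, since only `tensor` appears in
    the override files of this commit. Missing keys default to 1.
    """
    model = values.get("model") or {}
    trt_llm = model.get("tensorrtLlm") or {}
    parallelism = trt_llm.get("parallelism") or {}
    tensor = int(parallelism.get("tensor") or 1)
    pipeline = int(parallelism.get("pipeline") or 1)  # assumed key
    return tensor * pipeline


# Values mirroring the llama-2-70b override shown in this commit.
llama_2_70b = {
    "gpu": "NVIDIA-A10G",
    "model": {
        "name": "llama-2-70b",
        "tensorrtLlm": {
            "conversion": {"gpu": 8, "memory": "256Gi"},
            "parallelism": {"tensor": 8},
        },
    },
}

# The conversion job's GPU request matches the implied GPU count.
conversion_gpus = llama_2_70b["model"]["tensorrtLlm"]["conversion"]["gpu"]
print(gpus_required(llama_2_70b) == conversion_gpus)
```

Files without a `tensorrtLlm` section, such as the gpt2 and opt125m overrides, fall through to a single GPU under these assumptions.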
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# See values.yaml for reference values.
+
+gpu: Tesla-V100-SXM2-16GB
+
+model:
+  name: opt125m
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+{{- $create_account := true }}
+{{- $create_job := true }}
+{{- $create_service := true }}
+{{- with $.Values.model }}
+{{- if .skipConversion }}
+{{- $create_job = false }}
+{{- end }}
+{{- end }}
+{{- with $.Values.kubernetes }}
+{{- if .noService }}
+{{- $create_service = false }}
+{{- end }}
+{{- if .serviceAccount }}
+{{- $create_account = false }}
+{{- end }}
+{{- end }}
+
+{{ $.Chart.Name }} ({{ $.Chart.Version }}) installation complete.
+
+Release Name: {{ $.Release.Name }}
+Namespace: {{ $.Release.Namespace }}
+Deployment Name: {{ $.Release.Name }}
+{{- if $create_job }}
+Conversion Job: {{ $.Release.Name }}
+{{- end }}
+{{- if $create_service }}
+Service Name: {{ $.Release.Name }}
+{{- end }}
+{{- if $create_account }}
+ServiceAccount Name: {{ $.Release.Name }}
+{{- end }}
+
+Helpful commands:
+
+$ helm status --namespace={{ $.Release.Namespace }} {{ $.Release.Name }}
+$ helm get --namespace={{ $.Release.Namespace }} all {{ $.Release.Name }}
+$ kubectl get --namespace={{ $.Release.Namespace }} --selector='app={{ $.Release.Name }}' deployments
+{{- if $create_job -}}
+,jobs
+{{- end -}}
+,pods
+{{- if $create_service -}}
+,services
+{{- end -}}
+,podmonitors
+{{- if $create_account -}}
+,serviceAccounts
+{{- end -}}
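
Review note: the template above starts with three flags set to `true` and clears each one when the matching value is set (`model.skipConversion`, `kubernetes.noService`, `kubernetes.serviceAccount`). Because every `{{- ... -}}` action trims the surrounding whitespace, the trailing fragments collapse into a single comma-separated resource list for `kubectl get`. The Python sketch below models that behavior as I read it; it is not part of the chart, the function names are hypothetical, and Python truthiness is used as an approximation of Go template truthiness.

```python
def created_resources(values: dict) -> dict:
    """Mirror the $create_* flags computed at the top of the template."""
    model = values.get("model") or {}
    kubernetes = values.get("kubernetes") or {}
    return {
        # The conversion job is skipped when model.skipConversion is set.
        "job": not model.get("skipConversion"),
        # No service is created when kubernetes.noService is set.
        "service": not kubernetes.get("noService"),
        # A user-supplied service account suppresses the generated one.
        "account": not kubernetes.get("serviceAccount"),
    }


def kubectl_resource_list(values: dict) -> str:
    """Build the resource list that follows `kubectl get` in the notes."""
    flags = created_resources(values)
    kinds = ["deployments"]
    if flags["job"]:
        kinds.append("jobs")
    kinds.append("pods")
    if flags["service"]:
        kinds.append("services")
    kinds.append("podmonitors")
    if flags["account"]:
        kinds.append("serviceAccounts")
    return ",".join(kinds)


# With no overrides, every optional resource is listed.
print(kubectl_resource_list({}))
```

Passing `{"model": {"skipConversion": True}}` drops `jobs` from the list and leaves the other entries in place.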
