Nvidia NIM on Kubernetes with Spice

Works with v1.0+

This recipe deploys Nvidia NIM infrastructure, on Kubernetes, with GPUs. Specifically, we will:

Deploy the NVIDIA GPU Operator onto Kubernetes so that pods can request GPUs.
Select and deploy an LLM available on Nvidia NIM.
Connect spice to the OpenAI compatible NIM LLM.

Prerequisites

A Kubernetes cluster, with at least 1 GPU node.
- Ensure that the GPU has a compute capability of 8.0 or higher.
Local tools
- helm: install
- kubectl: install
- spice: install

Deploying GPU-operator

Add the Nvidia Helm repository

helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
    && helm repo update

Install the GPU Operator

```bash
helm install --wait --generate-name \
    -n gpu-operator --create-namespace \
    nvidia/gpu-operator
```

- For additional `helm` overrides, see [additional values](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/getting-started.html#common-chart-customization-options

). - Once the command completes (because of the --wait), Kubernetes pods will be able to ask for GPU requests/limits.

For additional details & troubleshooting, see the official documentation.

Configuring NIMs

Get a NGC API key from Nvidia's NGC website.
```
export NGC_API_KEY=""
```

echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin

helm fetch https://helm.ngc.nvidia.com/nim/charts/nim-llm-1.1.2.tgz --username=\$oauthtoken --password=$NGC_API_KEY

Create a secret to use for pulling images from docker registries.

kubectl create secret \
docker-registry ngc-secret \
--docker-server=nvcr.io \
--docker-username='$oauthtoken' \
--docker-password=$NGC_API_KEY

Similar to above, create a secret to pull model weights.

kubectl create secret generic ngc-api --from-literal=NGC_API_KEY=$NGC_API_KEY

Install the Helm chart.

helm install my-nim nim-llm-1.1.2.tgz -f values.yaml

For available models, use NGC CLI and run

ngc registry image list "nvcr.io/nim/*"

Connect Spice

Add the helm repository

helm repo add spiceai https://helm.spiceai.org
helm repo update

Deploy Spice

helm install spiceai spiceai/spiceai -f spiceai.yaml

Connect to Spice

kubectl port-forward deployment/spiceai 8090

Chat with meta/llama3-8b-instruct via NIM.

spice chat

Using model: nim
chat> Tell me a joke about the moon.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Nvidia NIM on Kubernetes with Spice

Prerequisites

Deploying GPU-operator

Configuring NIMs

Connect Spice

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

Nvidia NIM on Kubernetes with Spice

Prerequisites

Deploying GPU-operator

Configuring NIMs

Connect Spice