Commit 247e8f5 (1 parent a4ecc3b)

docs: Edit pass and code example updates (#20200)

1 file changed: llama-index-integrations/llms/llama-index-llms-nvidia (+68, -46 lines)

# LlamaIndex LLMs Integration: NVIDIA NIM for LLMs

The `llama-index-llms-nvidia` package contains LlamaIndex integrations for building applications with [NVIDIA NIM](https://developer.nvidia.com/nim).
With the NVIDIA LLM connector, you can develop LLM-powered systems using [NVIDIA AI Foundation models](https://www.nvidia.com/en-us/ai-data-science/foundation-models/).

NVIDIA NIM for LLMs supports models across domains like chat, reward, and reasoning, from the community as well as from NVIDIA.
Each model is optimized by NVIDIA to deliver the best performance on NVIDIA-accelerated infrastructure and is packaged as a NIM,
an easy-to-use, prebuilt container that deploys anywhere with a single command.
At their core, NIMs for LLMs are containers that provide interactive APIs for running inference on an AI model.

NVIDIA-hosted deployments are available on the [NVIDIA API catalog](https://build.nvidia.com/) so that you can test each NIM.
After you explore, you can download NIM for LLMs, which is included with the NVIDIA AI Enterprise license, from the API catalog.
The ability to run models on-premises or in your own cloud gives your enterprise ownership of your customizations and full control of your IP and AI application.

Use this documentation to learn how to install the `llama-index-llms-nvidia` package
and use it to connect to, and generate content from, compatible LLM models.

## Install the Package

To install the `llama-index-llms-nvidia` package, run the following command.

```shell
pip install llama-index-llms-nvidia
```

## Access the NVIDIA API Catalog

To get access to the NVIDIA API Catalog, do the following:

1. Create a free account on the [NVIDIA API Catalog](https://build.nvidia.com/) and log in.
2. Click your profile icon, and then click **API Keys**. The **API Keys** page appears.
3. Click **Generate API Key**. The **Generate API Key** window appears.
4. Click **Generate Key**. You should see **API Key Granted**, and your key appears.
5. Copy and save the key as `NVIDIA_API_KEY`.
6. To verify your key, use the following code.

```python
import getpass
import os

if os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
    print("Valid NVIDIA_API_KEY already in environment. Delete to reset")
else:
    nvapi_key = getpass.getpass("NVAPI Key (starts with nvapi-): ")
    assert nvapi_key.startswith(
        "nvapi-"
    ), f"{nvapi_key[:5]}... is not a valid key"
    os.environ["NVIDIA_API_KEY"] = nvapi_key
```

You can now use your key to access endpoints on the NVIDIA API Catalog.
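
The connector reads `NVIDIA_API_KEY` from the environment by default. If you prefer to pass the key explicitly, the following is a minimal sketch; the `api_key` constructor parameter is an assumption about the `NVIDIA` class, not shown elsewhere in this README.

```python
from llama_index.llms.nvidia import NVIDIA

# Assumed parameter: pass the key directly instead of reading the environment variable
llm = NVIDIA(api_key="nvapi-...")
```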

## Work with the API Catalog

The following example chats with the default LLM.

```python
from llama_index.llms.nvidia import NVIDIA
from llama_index.core.llms import ChatMessage, MessageRole

# Use the default model
llm = NVIDIA()

messages = [
    ChatMessage(
        role=MessageRole.SYSTEM, content=("You are a helpful assistant.")
    ),
    ChatMessage(
        role=MessageRole.USER,
        content=("What are the most popular house pets in North America?"),
    ),
]

llm.chat(messages)
```
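
Before choosing a model, you can check which ones are reachable with your key. This is a minimal sketch assuming the connector's `available_models` property, which returns model descriptors with an `id` field.

```python
from llama_index.llms.nvidia import NVIDIA

# List the models available to the current credentials (assumes available_models)
for model in NVIDIA().available_models:
    print(model.id)
```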

For models that are not included in the [CHAT_MODEL_TABLE](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/llms/llama-index-llms-nvidia/llama_index/llms/nvidia/utils.py), you must explicitly specify whether the model supports chat endpoints.
Set the `is_chat_model` parameter as follows:

- `False` – Use the `/completions` endpoint. This is the default value.
- `True` – Use the `/chat/completions` endpoint.

The following example chats with the Llama-3.3-Nemotron-Super-49B-v1 LLM.

```python
from llama_index.llms.nvidia import NVIDIA
from llama_index.core.llms import ChatMessage, MessageRole

# Use a specific model
llm = NVIDIA(
    model="nvidia/llama-3.3-nemotron-super-49b-v1", is_chat_model=True
)

messages = [
    ChatMessage(
        role=MessageRole.SYSTEM, content=("You are a helpful assistant.")
    ),
    ChatMessage(
        role=MessageRole.USER,
        content=("What are the most popular house pets in North America?"),
    ),
]

llm.chat(messages)
```
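
The examples above block until the full response is ready. The standard LlamaIndex LLM interface also supports streaming; the following minimal sketch prints tokens as they arrive using `stream_chat`.

```python
from llama_index.llms.nvidia import NVIDIA
from llama_index.core.llms import ChatMessage, MessageRole

llm = NVIDIA()

messages = [
    ChatMessage(role=MessageRole.USER, content="Write a haiku about GPUs.")
]

# stream_chat yields incremental ChatResponse objects; delta is the newly generated text
for chunk in llm.stream_chat(messages):
    print(chunk.delta, end="", flush=True)
```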

## Self-host with NVIDIA NIM for LLMs

When you are ready to deploy your AI application, you can self-host models with NVIDIA NIM for LLMs.
For more information, refer to [NVIDIA AI Enterprise](https://www.nvidia.com/en-us/data-center/products/ai-enterprise/).

The following example code connects to a locally hosted LLM.

```python
from llama_index.llms.nvidia import NVIDIA

# Connect to a chat NIM running at localhost:8080
llm = NVIDIA(base_url="http://localhost:8080/v1")
```
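
After you connect, you use the locally hosted model the same way as a hosted one. As a quick sketch (the prompt is illustrative, and the served model depends on your NIM deployment):

```python
from llama_index.llms.nvidia import NVIDIA

# Run a standard completion call against the local NIM
llm = NVIDIA(base_url="http://localhost:8080/v1")
response = llm.complete("Write a limerick about GPUs.")
print(response.text)
```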

## Related Topics

- [Overview of NVIDIA NIM for Large Language Models (LLMs)](https://docs.nvidia.com/nim/large-language-models/latest/introduction.html)
