@@ -22,40 +22,52 @@ Integrating NeMo Guardrails improves safety and security of an Application LLM,

NeMo Guardrails can also call models for a specific guardrail on behalf of the client. Having guardrail-specific models allows the use of smaller fine-tuned models, which are specialized on the guardrails task. For example the NVIDIA Nemoguard collection of models includes [content-safety](https://build.nvidia.com/nvidia/llama-3_1-nemotron-safety-guard-8b-v3), [topic-control](https://build.nvidia.com/nvidia/llama-3_1-nemoguard-8b-topic-control), and [jailbreak-detect](https://build.nvidia.com/nvidia/nemoguard-jailbreak-detect) models. These models can be accessed on [build.nvidia.com](https://build.nvidia.com/) for rapid prototyping, or on [NGC Catalog](https://catalog.ngc.nvidia.com/) for deployment with NIM Docker containers.

-## Application LLM Providers
-
-The NeMo Guardrails library supports major LLM providers, including:
-
-- OpenAI
-- Azure OpenAI
-- Anthropic
-- Cohere
-- Google Vertex AI
-
-### Self-Hosted
-
-The NeMo Guardrails library supports the following self-hosted LLM providers:
-
-- HuggingFace Hub
-- HuggingFace Endpoints
-- vLLM
-- Generic
-
-### Providers from LangChain
-
-The NeMo Guardrails library supports LLM providers from the LangChain Community, including both text completion and chat completion providers. Refer to [Chat model integrations](https://docs.langchain.com/oss/python/integrations/chat) in the LangChain documentation. You can also use the [`nemoguardrails find-providers`](find-providers-command) CLI command to discover available providers.
-
-## Embedding Providers
-
-The NeMo Guardrails library supports the following embedding providers:
-
-- NVIDIA NIM
-- NVIDIA AI Endpoints
-- FastEmbed
-- OpenAI
-- Azure OpenAI
-- Cohere
-- SentenceTransformers
-- Google
+## Inference Providers
+
+Each engine is served by a framework that manages the underlying HTTP or SDK calls. NeMo Guardrails ships with a built-in framework that talks to OpenAI-compatible endpoints over `httpx` with no LangChain dependency. For engines whose API is not OpenAI-compatible, opt into the LangChain framework by setting `NEMOGUARDRAILS_LLM_FRAMEWORK=langchain` and installing the matching `langchain-<provider>` package. To add a custom framework, implement the `LLMFramework` protocol from `nemoguardrails.types`.
+|`azure`, `azure_openai`| LangChain (opt-in) | yes | yes | yes | Azure OpenAI is OpenAI-compatible at the wire level. The LangChain path (`langchain-openai`) is the convenient default because it handles the deployment-name URL pattern and `api-version` query string for you. Azure is also reachable through the built-in client by setting `parameters.base_url` to the deployment URL and passing `api-version` via `default_query` and `api-key` via `default_headers`. |
+|`openai`| Built-in | yes | yes | yes | OpenAI public API or any OpenAI-compatible endpoint using `parameters.base_url`. For vLLM, TGI, OpenRouter, Together.ai, Fireworks.ai, Groq, DeepSeek, llama.cpp, NVIDIA Nemotron, and similar providers, use `engine: openai` with `parameters.base_url` and `parameters.api_key`. |
+|`vllm_openai`, `deepseek`| LangChain (opt-in) | yes | yes | yes | Legacy LangChain provider engines. They continue to work when you opt into LangChain. For new configurations, use `engine: openai` with `parameters.base_url` when the wire protocol is OpenAI-compatible. |
+|`<provider_name>`| LangChain (opt-in) | varies | varies | varies | Any community provider exposed through LangChain's chat-model integrations. Use the bare provider name as the engine name. |
+
+For migration recipes between the built-in path and the LangChain path, see [Migrating to 0.22](../migration/0.22.md).
+
+## LangChain-Backed Providers
+
+The NeMo Guardrails library supports LLM providers from the LangChain Community, including both text completion and chat completion providers. Refer to [Chat model integrations](https://python.langchain.com/docs/integrations/chat/) in the LangChain documentation. You can also use the [`nemoguardrails find-providers`](find-providers-command) CLI command to discover available providers.
+
+## Embedding Model Providers
+
+The NeMo Guardrails library uses embedding models for vector similarity search in dialog rails, `embeddings_only` intent matching, and knowledge base retrieval. The following table lists the supported embedding model providers and their corresponding engine names.
+
+| Provider | Engine | Notes |
+| --- | --- | --- |
+| NVIDIA NIM |`nim`| NVIDIA NIM microservices |
+| NVIDIA AI Endpoints |`nvidia_ai_endpoints`| Alias for `nim`|
+| FastEmbed |`fastembed`| FastEmbed embedding model provider |
+| OpenAI |`openai`| OpenAI embedding model provider |
+| Azure OpenAI |`azure`| Azure OpenAI embedding model provider |
+| Cohere |`cohere`| Cohere embedding model provider |
+| SentenceTransformers |`sentence_transformers`| SentenceTransformers embedding model provider |
+| Google |`google`| Google embedding model provider |

For more information on configuring embedding providers, refer to [Embedding Search Providers](../configure-rails/other-configurations/embedding-search-providers.md).
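
To make the engine names in this change more concrete, here is a minimal `config.yml` sketch. It is an illustration under stated assumptions, not a configuration taken from the documentation. It assumes the usual `models` list layout with `type`, `engine`, `model`, and `parameters` keys. The model name, the local URL, and the placeholder key are hypothetical values for a self-hosted OpenAI-compatible server, and the embedding entry assumes the FastEmbed `all-MiniLM-L6-v2` model.

```yaml
models:
  # Main application LLM, served through the built-in framework.
  # `engine: openai` is used for any OpenAI-compatible endpoint.
  - type: main
    engine: openai
    model: my-served-model                  # hypothetical model name
    parameters:
      base_url: http://localhost:8000/v1    # hypothetical local endpoint
      api_key: EMPTY                        # placeholder; use a real key for hosted providers

  # Embedding model used for dialog rails and knowledge base retrieval.
  - type: embeddings
    engine: fastembed
    model: all-MiniLM-L6-v2                 # assumed model choice
```

Pointing `parameters.base_url` at a different OpenAI-compatible provider should leave the rest of the fragment unchanged. Engines that are not OpenAI-compatible would instead need the LangChain opt-in (`NEMOGUARDRAILS_LLM_FRAMEWORK=langchain` plus the matching `langchain-<provider>` package).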