
Commit ff279fa
Add support for Nscale (EU-Sovereign) Provider (#10638)
* Add support for nscale provider
* Add image generation support and fix unit tests
* Add docs for nscale
* Fix unit test import issues
* Minor doc improvement
* Remove redundant null tokens from model cost map
* Address PR review comments for doc updates
* Revert changes to large text
1 parent 78c264d commit ff279fa

13 files changed (+552 -1 lines)
docs/my-website/docs/providers/nscale.md

Lines changed: 178 additions & 0 deletions
@@ -0,0 +1,178 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Nscale (EU Sovereign)

| Property | Details |
|-------|-------|
| Description | European-domiciled full-stack AI cloud platform for LLMs and image generation. |
| Provider Route on LiteLLM | `nscale/` |
| Supported Endpoints | `/chat/completions`, `/images/generations` |
| API Reference | [Nscale docs](https://docs.nscale.com/docs/getting-started/overview) |

## Required Variables

```python showLineNumbers title="Environment Variables"
import os

os.environ["NSCALE_API_KEY"] = ""  # your Nscale API key
```

## Supported Models

### Chat Models

| Model Name | Description | Input Cost | Output Cost |
|------------|-------------|------------|-------------|
| nscale/meta-llama/Llama-4-Scout-17B-16E-Instruct | 17B parameter model | $0.09/M tokens | $0.29/M tokens |
| nscale/Qwen/Qwen2.5-Coder-3B-Instruct | 3B parameter coding model | $0.01/M tokens | $0.03/M tokens |
| nscale/Qwen/Qwen2.5-Coder-7B-Instruct | 7B parameter coding model | $0.01/M tokens | $0.03/M tokens |
| nscale/Qwen/Qwen2.5-Coder-32B-Instruct | 32B parameter coding model | $0.06/M tokens | $0.20/M tokens |
| nscale/Qwen/QwQ-32B | 32B parameter model | $0.18/M tokens | $0.20/M tokens |
| nscale/deepseek-ai/DeepSeek-R1-Distill-Llama-70B | 70B parameter distilled model | $0.375/M tokens | $0.375/M tokens |
| nscale/deepseek-ai/DeepSeek-R1-Distill-Llama-8B | 8B parameter distilled model | $0.025/M tokens | $0.025/M tokens |
| nscale/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | 1.5B parameter distilled model | $0.09/M tokens | $0.09/M tokens |
| nscale/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | 7B parameter distilled model | $0.20/M tokens | $0.20/M tokens |
| nscale/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | 14B parameter distilled model | $0.07/M tokens | $0.07/M tokens |
| nscale/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | 32B parameter distilled model | $0.15/M tokens | $0.15/M tokens |
| nscale/mistralai/mixtral-8x22b-instruct-v0.1 | Mixtral 8x22B model | $0.60/M tokens | $0.60/M tokens |
| nscale/meta-llama/Llama-3.1-8B-Instruct | 8B parameter model | $0.03/M tokens | $0.03/M tokens |
| nscale/meta-llama/Llama-3.3-70B-Instruct | 70B parameter model | $0.20/M tokens | $0.20/M tokens |

### Image Generation Models

| Model Name | Description | Cost per Pixel |
|------------|-------------|----------------|
| nscale/black-forest-labs/FLUX.1-schnell | Fast image generation model | $0.0000000013 |
| nscale/stabilityai/stable-diffusion-xl-base-1.0 | SDXL base model | $0.000000003 |
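
Pricing is per output pixel, so image cost scales with resolution. A quick illustration (assuming billed cost is simply pixels × the per-pixel rate):

```python showLineNumbers title="Estimating image cost (illustrative)"
# Illustrative arithmetic only: assumes cost = width * height * per-pixel rate
pixels = 1024 * 1024  # 1,048,576 pixels in a 1024x1024 image

flux_cost = pixels * 0.0000000013  # ~$0.0014 per image
sdxl_cost = pixels * 0.000000003   # ~$0.0031 per image
print(f"FLUX.1-schnell: ${flux_cost:.4f}  SDXL: ${sdxl_cost:.4f}")
```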

## Key Features

- **EU Sovereign**: Full data sovereignty and compliance with European regulations
- **Ultra-Low Cost (starting at $0.01/M tokens)**: Extremely competitive pricing for both text and image generation models
- **Production Grade**: Reliable serverless deployments with full isolation
- **No Setup Required**: Instant access to compute without infrastructure management
- **Full Control**: Your data remains private and isolated

## Usage - LiteLLM Python SDK

### Text Generation

```python showLineNumbers title="Nscale Text Generation"
from litellm import completion
import os

os.environ["NSCALE_API_KEY"] = ""  # your Nscale API key

response = completion(
    model="nscale/meta-llama/Llama-4-Scout-17B-16E-Instruct",
    messages=[{"role": "user", "content": "What is LiteLLM?"}]
)
print(response)
```
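
Streaming is a small change on top of this; a minimal sketch, assuming the standard LiteLLM `stream=True` flag supported across OpenAI-compatible providers:

```python showLineNumbers title="Nscale Text Generation - Streaming (sketch)"
from litellm import completion
import os

os.environ["NSCALE_API_KEY"] = ""  # your Nscale API key

# stream=True yields an iterator of chunks instead of a single response object
response = completion(
    model="nscale/meta-llama/Llama-4-Scout-17B-16E-Instruct",
    messages=[{"role": "user", "content": "What is LiteLLM?"}],
    stream=True
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
```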

### Image Generation

```python showLineNumbers title="Nscale Image Generation"
from litellm import image_generation
import os

os.environ["NSCALE_API_KEY"] = ""  # your Nscale API key

response = image_generation(
    model="nscale/stabilityai/stable-diffusion-xl-base-1.0",
    prompt="A beautiful sunset over mountains",
    n=1,
    size="1024x1024"
)
print(response)
```

## Usage - LiteLLM Proxy

Add the following to your LiteLLM Proxy configuration file:

```yaml showLineNumbers title="config.yaml"
model_list:
  - model_name: nscale/meta-llama/Llama-4-Scout-17B-16E-Instruct
    litellm_params:
      model: nscale/meta-llama/Llama-4-Scout-17B-16E-Instruct
      api_key: os.environ/NSCALE_API_KEY
  - model_name: nscale/meta-llama/Llama-3.3-70B-Instruct
    litellm_params:
      model: nscale/meta-llama/Llama-3.3-70B-Instruct
      api_key: os.environ/NSCALE_API_KEY
  - model_name: nscale/stabilityai/stable-diffusion-xl-base-1.0
    litellm_params:
      model: nscale/stabilityai/stable-diffusion-xl-base-1.0
      api_key: os.environ/NSCALE_API_KEY
```

Start your LiteLLM Proxy server:

```bash showLineNumbers title="Start LiteLLM Proxy"
litellm --config config.yaml

# RUNNING on http://0.0.0.0:4000
```

<Tabs>
<TabItem value="openai-sdk" label="OpenAI SDK">

```python showLineNumbers title="Nscale via Proxy - Non-streaming"
from openai import OpenAI

# Initialize client with your proxy URL
client = OpenAI(
    base_url="http://localhost:4000",  # Your proxy URL
    api_key="your-proxy-api-key"       # Your proxy API key
)

# Non-streaming response
response = client.chat.completions.create(
    model="nscale/meta-llama/Llama-4-Scout-17B-16E-Instruct",
    messages=[{"role": "user", "content": "What is LiteLLM?"}]
)

print(response.choices[0].message.content)
```

</TabItem>

<TabItem value="litellm-sdk" label="LiteLLM SDK">

```python showLineNumbers title="Nscale via Proxy - LiteLLM SDK"
import litellm

# Configure LiteLLM to use your proxy
response = litellm.completion(
    model="litellm_proxy/nscale/meta-llama/Llama-4-Scout-17B-16E-Instruct",
    messages=[{"role": "user", "content": "What is LiteLLM?"}],
    api_base="http://localhost:4000",
    api_key="your-proxy-api-key"
)

print(response.choices[0].message.content)
```

</TabItem>

<TabItem value="curl" label="cURL">

```bash showLineNumbers title="Nscale via Proxy - cURL"
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-proxy-api-key" \
  -d '{
    "model": "nscale/meta-llama/Llama-4-Scout-17B-16E-Instruct",
    "messages": [{"role": "user", "content": "What is LiteLLM?"}]
  }'
```

</TabItem>
</Tabs>
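
The proxy also serves image generation; a minimal sketch with the OpenAI SDK, assuming the proxy forwards the OpenAI-compatible `/v1/images/generations` route for the model configured above:

```python showLineNumbers title="Nscale Image Generation via Proxy (sketch)"
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",  # Your proxy URL
    api_key="your-proxy-api-key"       # Your proxy API key
)

response = client.images.generate(
    model="nscale/stabilityai/stable-diffusion-xl-base-1.0",
    prompt="A beautiful sunset over mountains",
    n=1,
    size="1024x1024"
)
print(response.data[0])  # URL or base64 payload, depending on the backend
```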

## Getting Started

1. Create an account at [console.nscale.com](https://console.nscale.com)
2. Add credit to your account (minimum $5)
3. Create an API key in settings
4. Start making API calls using LiteLLM

## Additional Resources

- [Nscale Documentation](https://docs.nscale.com/docs/getting-started/overview)
- [Blog: Sovereign Serverless](https://www.nscale.com/blog/sovereign-serverless-how-we-designed-full-isolation-without-sacrificing-performance)

docs/my-website/sidebars.js

Lines changed: 1 addition & 0 deletions
@@ -237,6 +237,7 @@ const sidebars = {
       "providers/watsonx",
       "providers/predibase",
       "providers/nvidia_nim",
+      { type: "doc", id: "providers/nscale", label: "Nscale (EU Sovereign)" },
       "providers/xai",
       "providers/lm_studio",
       "providers/cerebras",

litellm/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -1032,6 +1032,7 @@ def add_known_models():
 from .llms.deepseek.chat.transformation import DeepSeekChatConfig
 from .llms.lm_studio.chat.transformation import LMStudioChatConfig
 from .llms.lm_studio.embed.transformation import LmStudioEmbeddingConfig
+from .llms.nscale.chat.transformation import NscaleConfig
 from .llms.perplexity.chat.transformation import PerplexityChatConfig
 from .llms.azure.chat.o_series_transformation import AzureOpenAIO1Config
 from .llms.watsonx.completion.transformation import IBMWatsonXAIConfig

litellm/constants.py

Lines changed: 3 additions & 0 deletions
@@ -162,6 +162,7 @@
     "lm_studio",
     "galadriel",
     "meta_llama",
+    "nscale",
 ]


@@ -223,6 +224,7 @@
     "api.x.ai/v1",
     "api.galadriel.ai/v1",
     "api.llama.com/compat/v1/",
+    "inference.api.nscale.com/v1",
 ]


@@ -254,6 +256,7 @@
     "lm_studio",
     "galadriel",
     "meta_llama",
+    "nscale",
 ]
 openai_text_completion_compatible_providers: List = (
     [  # providers that support `/v1/completions`

litellm/litellm_core_utils/get_llm_provider_logic.py

Lines changed: 10 additions & 0 deletions
@@ -218,6 +218,9 @@ def get_llm_provider(  # noqa: PLR0915
             elif endpoint == "https://api.llama.com/compat/v1":
                 custom_llm_provider = "meta_llama"
                 dynamic_api_key = api_key or get_secret_str("LLAMA_API_KEY")
+            elif endpoint == litellm.NscaleConfig.API_BASE_URL:
+                custom_llm_provider = "nscale"
+                dynamic_api_key = litellm.NscaleConfig.get_api_key()

         if api_base is not None and not isinstance(api_base, str):
             raise Exception(
@@ -597,6 +600,13 @@ def _get_openai_compatible_provider_info(  # noqa: PLR0915
             or f"https://{get_secret('SNOWFLAKE_ACCOUNT_ID')}.snowflakecomputing.com/api/v2/cortex/inference:complete"
         )  # type: ignore
         dynamic_api_key = api_key or get_secret_str("SNOWFLAKE_JWT")
+    elif custom_llm_provider == "nscale":
+        (
+            api_base,
+            dynamic_api_key,
+        ) = litellm.NscaleConfig()._get_openai_compatible_provider_info(
+            api_base=api_base, api_key=api_key
+        )

     if api_base is not None and not isinstance(api_base, str):
         raise Exception("api base needs to be a string. api_base={}".format(api_base))

litellm/litellm_core_utils/get_supported_openai_params.py

Lines changed: 2 additions & 0 deletions
@@ -202,6 +202,8 @@ def get_supported_openai_params(  # noqa: PLR0915
         return litellm.DeepInfraConfig().get_supported_openai_params(model=model)
     elif custom_llm_provider == "perplexity":
         return litellm.PerplexityChatConfig().get_supported_openai_params(model=model)
+    elif custom_llm_provider == "nscale":
+        return litellm.NscaleConfig().get_supported_openai_params(model=model)
     elif custom_llm_provider == "anyscale":
         return [
             "temperature",
litellm/llms/nscale/chat/transformation.py

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
from typing import Optional

from litellm.llms.openai.chat.gpt_transformation import OpenAIGPTConfig
from litellm.secret_managers.main import get_secret_str


class NscaleConfig(OpenAIGPTConfig):
    """
    Reference: Nscale is OpenAI compatible.
    API Key: NSCALE_API_KEY
    Default API Base: https://inference.api.nscale.com/v1
    """

    API_BASE_URL = "https://inference.api.nscale.com/v1"

    @property
    def custom_llm_provider(self) -> Optional[str]:
        return "nscale"

    @staticmethod
    def get_api_key(api_key: Optional[str] = None) -> Optional[str]:
        return api_key or get_secret_str("NSCALE_API_KEY")

    @staticmethod
    def get_api_base(api_base: Optional[str] = None) -> Optional[str]:
        return (
            api_base or get_secret_str("NSCALE_API_BASE") or NscaleConfig.API_BASE_URL
        )

    def _get_openai_compatible_provider_info(
        self, api_base: Optional[str], api_key: Optional[str]
    ) -> tuple[Optional[str], Optional[str]]:
        # This method is called by get_llm_provider to resolve api_base and api_key
        resolved_api_base = NscaleConfig.get_api_base(api_base)
        resolved_api_key = NscaleConfig.get_api_key(api_key)
        return resolved_api_base, resolved_api_key

    def get_supported_openai_params(self, model: str) -> list:
        return [
            "max_tokens",
            "n",
            "temperature",
            "top_p",
        ]
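
Because `get_api_base` checks `NSCALE_API_BASE` before falling back to the default, the endpoint can be overridden per environment. A small sketch of the resolution order (the override URL here is hypothetical):

```python
import os
from litellm.llms.nscale.chat.transformation import NscaleConfig

print(NscaleConfig.get_api_base())  # default: https://inference.api.nscale.com/v1
os.environ["NSCALE_API_BASE"] = "https://gateway.example.eu/v1"  # hypothetical override
print(NscaleConfig.get_api_base())  # env var wins over the default
print(NscaleConfig.get_api_base(api_base="https://other.example/v1"))  # explicit arg wins
```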

litellm/main.py

Lines changed: 1 addition & 1 deletion
@@ -4788,7 +4788,7 @@ def image_generation(  # noqa: PLR0915
         model=model,
         prompt=prompt,
         timeout=timeout,
-        api_key=api_key,
+        api_key=api_key or dynamic_api_key,
         api_base=api_base,
         logging_obj=litellm_logging_obj,
         optional_params=optional_params,
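
This one-line fix is what lets image generation pick up the provider-resolved key: `dynamic_api_key` is populated during provider lookup (for Nscale, from `NSCALE_API_KEY`), and previously it was dropped on the image path. A minimal sketch of the now-working call, with only the environment variable set:

```python
import os
from litellm import image_generation

os.environ["NSCALE_API_KEY"] = ""  # your Nscale API key; no api_key argument needed

# api_key falls back to the dynamic key resolved from the environment
response = image_generation(
    model="nscale/black-forest-labs/FLUX.1-schnell",
    prompt="A lighthouse at dusk"
)
print(response)
```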
