Commit 7b6ffbb

jquinterclaude committed

fix(tests): wrap callbacks cleanup in try/finally and resolve merge conflict

- test_litellm_pre_call_utils.py: wrap test body in try/finally so litellm.callbacks is always restored even when an assertion fails, addressing greptile review comment
- test_langfuse_otel.py: resolve trivial merge conflict in comment ("unpatched" vs "unpatch-ed"), keeping correct spelling

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2 parents 2e0a8b3 + 12f5d64 commit 7b6ffbb

File tree

158 files changed: +14069 −1381 lines changed

.github/workflows/test-litellm-matrix.yml (4 additions, 0 deletions)

```diff
@@ -102,6 +102,10 @@ jobs:
         run: |
           cd enterprise && poetry run pip install -e . && cd ..
+      - name: Generate Prisma client
+        run: |
+          poetry run prisma generate --schema litellm/proxy/schema.prisma
       - name: Run tests - ${{ matrix.test-group.name }}
         run: |
           poetry run pytest ${{ matrix.test-group.path }} \
```
Lines changed: 293 additions & 0 deletions

# Mock Prompt Management Server

A reference implementation of the [LiteLLM Generic Prompt Management API](https://docs.litellm.ai/docs/adding_provider/generic_prompt_management_api).

This FastAPI server demonstrates how to build a prompt management API that integrates with LiteLLM without requiring a PR to the LiteLLM repository.

## Quick Start

### 1. Install Dependencies

```bash
pip install fastapi uvicorn pydantic
```

### 2. Start the Server

```bash
python mock_prompt_management_server.py
```

The server will start on `http://localhost:8080`.

### 3. Test the Endpoints

```bash
# Get a prompt
curl "http://localhost:8080/beta/litellm_prompt_management?prompt_id=hello-world-prompt"

# Get a prompt with authentication
curl "http://localhost:8080/beta/litellm_prompt_management?prompt_id=hello-world-prompt" \
  -H "Authorization: Bearer test-token-12345"

# List all prompts
curl "http://localhost:8080/prompts"

# Get prompt variables
curl "http://localhost:8080/prompts/hello-world-prompt/variables"
```
## Using with LiteLLM

### Configuration

Create a `config.yaml` file:

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: openai/gpt-3.5-turbo
      api_key: os.environ/OPENAI_API_KEY

prompts:
  - prompt_id: "hello-world-prompt"
    litellm_params:
      prompt_integration: "generic_prompt_management"
      api_base: http://localhost:8080
      api_key: test-token-12345
```

### Start LiteLLM Proxy

```bash
litellm --config config.yaml
```

### Make a Request

```bash
curl http://0.0.0.0:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-3.5-turbo",
    "prompt_id": "hello-world-prompt",
    "prompt_variables": {
      "domain": "data science",
      "task": "analyzing customer behavior"
    },
    "messages": [
      {"role": "user", "content": "Please help me get started"}
    ]
  }'
```
## Available Prompts

The server includes several example prompts:

| Prompt ID | Description | Variables |
|-----------|-------------|-----------|
| `hello-world-prompt` | Basic helpful assistant | `domain`, `task` |
| `code-review-prompt` | Code review assistant | `years_experience`, `language`, `code` |
| `customer-support-prompt` | Customer support agent | `company_name`, `customer_message` |
| `data-analysis-prompt` | Data analysis expert | `analysis_type`, `dataset_name`, `data` |
| `creative-writing-prompt` | Creative writing assistant | `genre`, `length`, `topic` |

## Authentication

The server supports optional Bearer token authentication. Valid tokens for testing:

- `test-token-12345`
- `dev-token-67890`
- `prod-token-abcdef`

If no `Authorization` header is provided, requests are allowed (for testing purposes).
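The optional-authentication rule above boils down to a three-way check: no header means allow, a valid token identifies the caller, and anything else is rejected. A minimal sketch of that logic as a plain function — the function name and the use of `ValueError` are illustrative, not the server's actual code (the FastAPI server would raise an `HTTPException` with status 401 instead):

```python
from typing import Optional

# Tokens accepted by the mock server (from the list above)
VALID_TOKENS = {"test-token-12345", "dev-token-67890", "prod-token-abcdef"}

def verify_api_key(authorization: Optional[str]) -> Optional[str]:
    """Return the bare token if valid, or None when no header was sent."""
    if authorization is None:
        return None  # no header: request allowed (testing mode)
    if not authorization.startswith("Bearer "):
        raise ValueError("Malformed Authorization header")
    token = authorization[len("Bearer "):]
    if token not in VALID_TOKENS:
        raise ValueError("Invalid token")
    return token
```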
## API Endpoints

### LiteLLM Spec Endpoints

#### `GET /beta/litellm_prompt_management`

Get a prompt by ID (required by LiteLLM).

**Query Parameters:**

- `prompt_id` (required): The prompt ID
- `project_name` (optional): Project filter
- `slug` (optional): Slug filter
- `version` (optional): Version filter

**Response:**

```json
{
  "prompt_id": "hello-world-prompt",
  "prompt_template": [
    {
      "role": "system",
      "content": "You are a helpful assistant specialized in {domain}."
    },
    {
      "role": "user",
      "content": "Help me with: {task}"
    }
  ],
  "prompt_template_model": "gpt-4",
  "prompt_template_optional_params": {
    "temperature": 0.7,
    "max_tokens": 500
  }
}
```
### Convenience Endpoints (Not in LiteLLM Spec)

#### `GET /health`

Health check endpoint.

#### `GET /prompts`

List all available prompts.

#### `GET /prompts/{prompt_id}/variables`

Get all variables used in a prompt template.

#### `POST /prompts`

Create a new prompt (in-memory only, for testing).
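One way to implement the variables endpoint is with the standard library's `string.Formatter`, which already parses `{placeholder}` fields out of a template string. This is an illustrative sketch of that approach, not the mock server's actual implementation:

```python
from string import Formatter

def extract_variables(prompt_template):
    """Collect the unique {placeholder} names across all template messages, in order."""
    seen = []
    for message in prompt_template:
        # Formatter.parse yields (literal_text, field_name, format_spec, conversion)
        for _, field_name, _, _ in Formatter().parse(message["content"]):
            if field_name and field_name not in seen:
                seen.append(field_name)
    return seen

template = [
    {"role": "system", "content": "You are a helpful assistant specialized in {domain}."},
    {"role": "user", "content": "Help me with: {task}"},
]
print(extract_variables(template))  # → ['domain', 'task']
```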
## Example: Full Integration Test

### 1. Start the Mock Server

```bash
python mock_prompt_management_server.py
```

### 2. Test with Python

```python
from litellm import completion

# The completion will:
# 1. Fetch the prompt from your API
# 2. Replace {domain} with "machine learning"
# 3. Replace {task} with "building a recommendation system"
# 4. Merge with your messages
# 5. Use the model and params from the prompt

response = completion(
    model="gpt-4",
    prompt_id="hello-world-prompt",
    prompt_variables={
        "domain": "machine learning",
        "task": "building a recommendation system"
    },
    messages=[
        {"role": "user", "content": "I have user behavior data from the past year."}
    ],
    # Configure the generic prompt manager
    generic_prompt_config={
        "api_base": "http://localhost:8080",
        "api_key": "test-token-12345",
    }
)

print(response.choices[0].message.content)
```
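The numbered steps in the comment above amount to template substitution followed by a message merge. A minimal client-side sketch of that transformation — illustrative only, not LiteLLM's internal implementation:

```python
def render_prompt(prompt_template, prompt_variables, user_messages):
    """Fill each {placeholder} in the template messages, then append the caller's messages."""
    rendered = [
        {"role": m["role"], "content": m["content"].format(**prompt_variables)}
        for m in prompt_template
    ]
    return rendered + user_messages

template = [
    {"role": "system", "content": "You are a helpful assistant specialized in {domain}."},
    {"role": "user", "content": "Help me with: {task}"},
]
final_messages = render_prompt(
    template,
    {"domain": "machine learning", "task": "building a recommendation system"},
    [{"role": "user", "content": "I have user behavior data from the past year."}],
)
# final_messages: the two rendered template messages followed by the original user message
```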
## Customization

### Adding New Prompts

Edit the `PROMPTS_DB` dictionary in `mock_prompt_management_server.py`:

```python
PROMPTS_DB = {
    "my-custom-prompt": {
        "prompt_id": "my-custom-prompt",
        "prompt_template": [
            {
                "role": "system",
                "content": "You are a {role}."
            },
            {
                "role": "user",
                "content": "{user_input}"
            }
        ],
        "prompt_template_model": "gpt-4",
        "prompt_template_optional_params": {
            "temperature": 0.8,
            "max_tokens": 1000
        }
    }
}
```
### Using a Database

Replace the `PROMPTS_DB` dictionary with database queries:

```python
@app.get("/beta/litellm_prompt_management")
async def get_prompt(prompt_id: str):
    # Fetch from the database; `db` stands in for your async
    # database client (here, one with a Mongo-style find_one)
    prompt = await db.prompts.find_one({"prompt_id": prompt_id})

    if not prompt:
        raise HTTPException(status_code=404, detail="Prompt not found")

    return PromptResponse(**prompt)
```
### Adding Access Control

Use the custom query parameters for access control:

```python
@app.get("/beta/litellm_prompt_management")
async def get_prompt(
    prompt_id: str,
    project_name: Optional[str] = None,
    user_id: Optional[str] = None,
    authorization: Optional[str] = Header(None)
):
    token = verify_api_key(authorization)

    # Check if user has access to this project
    if not has_project_access(token, project_name):
        raise HTTPException(status_code=403, detail="Access denied")

    # Fetch and return prompt
    ...
```
## Production Considerations

Before deploying to production:

1. **Use a real database** instead of in-memory storage
2. **Implement proper authentication** with JWT tokens or API keys
3. **Add rate limiting** to prevent abuse
4. **Use HTTPS** for encrypted communication
5. **Add logging and monitoring** for observability
6. **Implement caching** for frequently accessed prompts
7. **Add versioning** for prompt management
8. **Implement access control** based on teams/users
9. **Add input validation** for all parameters
10. **Use environment variables** for configuration
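Caching for frequently accessed prompts can start as small as a time-stamped in-process dictionary in front of the fetch. An illustrative sketch — the `fetch` callable stands in for whatever database query or HTTP call you use:

```python
import time

CACHE_TTL_SECONDS = 60.0
_prompt_cache = {}  # prompt_id -> (fetched_at, prompt_dict)

def get_prompt_cached(prompt_id, fetch):
    """Return a cached prompt if still fresh, otherwise call fetch(prompt_id) and store it."""
    now = time.monotonic()
    entry = _prompt_cache.get(prompt_id)
    if entry is not None and now - entry[0] < CACHE_TTL_SECONDS:
        return entry[1]
    prompt = fetch(prompt_id)
    _prompt_cache[prompt_id] = (now, prompt)
    return prompt
```

A per-process dictionary like this only helps a single worker; with multiple proxy workers you would typically move the cache to a shared store such as Redis.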
## Related Documentation

- [Generic Prompt Management API Documentation](https://docs.litellm.ai/docs/adding_provider/generic_prompt_management_api)
- [LiteLLM Prompt Management](https://docs.litellm.ai/docs/proxy/prompt_management)
- [Generic Guardrail API](https://docs.litellm.ai/docs/adding_provider/generic_guardrail_api)

## Questions?

This is a reference implementation for the LiteLLM Generic Prompt Management API. For questions or issues, please open an issue on the [LiteLLM GitHub repository](https://github.com/BerriAI/litellm).
