Commit 339fc03
Add model config to prompt (backend and docs change) (mlflow#19174)
1 parent ef54810 commit 339fc03

File tree: 13 files changed, +897 −18 lines


docs/api_reference/api_inventory.txt

Lines changed: 6 additions & 0 deletions
@@ -588,6 +588,9 @@ mlflow.entities.model_registry.ModelVersionSearch.tags
 mlflow.entities.model_registry.ModelVersionTag
 mlflow.entities.model_registry.ModelVersionTag.from_proto
 mlflow.entities.model_registry.ModelVersionTag.to_proto
+mlflow.entities.model_registry.PromptModelConfig
+mlflow.entities.model_registry.PromptModelConfig.from_dict
+mlflow.entities.model_registry.PromptModelConfig.to_dict
 mlflow.entities.model_registry.PromptVersion
 mlflow.entities.model_registry.PromptVersion.convert_response_format_to_dict
 mlflow.entities.model_registry.PromptVersion.format
@@ -613,6 +616,7 @@ mlflow.entities.model_registry.model_version_deployment_job_state.ModelVersionDe
 mlflow.entities.model_registry.model_version_search.ModelVersionSearch
 mlflow.entities.model_registry.model_version_tag.ModelVersionTag
 mlflow.entities.model_registry.prompt.Prompt
+mlflow.entities.model_registry.prompt_version.PromptModelConfig
 mlflow.entities.model_registry.prompt_version.PromptVersion
 mlflow.entities.model_registry.registered_model.RegisteredModel
 mlflow.entities.model_registry.registered_model_alias.RegisteredModelAlias
@@ -761,6 +765,7 @@ mlflow.genai.delete_dataset
 mlflow.genai.delete_dataset_tag
 mlflow.genai.delete_labeling_session
 mlflow.genai.delete_prompt_alias
+mlflow.genai.delete_prompt_model_config
 mlflow.genai.delete_prompt_tag
 mlflow.genai.delete_prompt_version_tag
 mlflow.genai.disable_git_model_versioning
@@ -898,6 +903,7 @@ mlflow.genai.search_datasets
 mlflow.genai.search_prompts
 mlflow.genai.set_dataset_tags
 mlflow.genai.set_prompt_alias
+mlflow.genai.set_prompt_model_config
 mlflow.genai.set_prompt_tag
 mlflow.genai.set_prompt_version_tag
 mlflow.genai.to_predict_fn
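
The inventory above also registers `PromptModelConfig.from_dict` and `PromptModelConfig.to_dict`, which the docs changes below don't demonstrate. A minimal round-trip sketch, assuming `to_dict` returns a plain dict and `from_dict` is its classmethod inverse (neither signature is shown in this commit):

```python
from mlflow.entities.model_registry import PromptModelConfig

# Hypothetical round-trip: serialize a config for storage, then rebuild it.
config = PromptModelConfig(model_name="gpt-4", temperature=0.7)
payload = config.to_dict()  # assumed to return a plain dict of the set fields
restored = PromptModelConfig.from_dict(payload)  # assumed classmethod inverse
assert restored.to_dict() == payload
```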

docs/docs/genai/prompt-registry/index.mdx

Lines changed: 193 additions & 9 deletions
@@ -246,6 +246,7 @@ Key attributes of a Prompt object:
 - `Alias`: A mutable named reference to the prompt. For example, you can create an alias named `production` to refer to the version used in your production system. See [Aliases](/genai/prompt-registry/manage-prompt-lifecycles-with-aliases) for more details.
 - `is_text_prompt`: A boolean property indicating whether the prompt is a text prompt (True) or chat prompt (False).
 - `response_format`: An optional property containing the expected response structure specification, which can be used to validate or structure outputs from LLM calls.
+- `model_config`: An optional dictionary containing model-specific configuration such as model name, temperature, max_tokens, and other inference parameters. See [Model Configuration](#model-configuration) for more details.
 
 ### Prompt Types
 
@@ -312,18 +313,202 @@ mlflow.genai.load_prompt("prompts:/summarization-prompt/1").tags
 mlflow.genai.delete_prompt_version_tag("summarization-prompt", 1, "author")
 ```
 
+## Model Configuration
+
+MLflow Prompt Registry allows you to store model-specific configuration alongside your prompts, ensuring reproducibility and clarity about which model and parameters were used with a particular prompt version. This is especially useful when you want to:
+
+- Version both prompt templates and model parameters together
+- Share prompts with recommended model settings across your team
+- Reproduce exact inference configurations from previous experiments
+- Maintain different model configurations for different prompt versions
+
+### Basic Usage
+
+You can attach model configuration to a prompt by passing a `model_config` parameter when registering:
+
+```python
+import mlflow
+
+# Using a dictionary
+model_config = {
+    "model_name": "gpt-4",
+    "temperature": 0.7,
+    "max_tokens": 1000,
+    "top_p": 0.9,
+}
+
+mlflow.genai.register_prompt(
+    name="qa-prompt",
+    template="Answer the following question: {{question}}",
+    model_config=model_config,
+    commit_message="QA prompt with model config",
+)
+
+# Load and access the model config
+prompt = mlflow.genai.load_prompt("qa-prompt")
+print(f"Model: {prompt.model_config['model_name']}")
+print(f"Temperature: {prompt.model_config['temperature']}")
+```
+
+### Using PromptModelConfig Class
+
+For better type safety and validation, you can use the <APILink fn="mlflow.entities.model_registry.PromptModelConfig" /> class:
+
+```python
+import mlflow
+from mlflow.entities.model_registry import PromptModelConfig
+
+# Create a validated config object
+config = PromptModelConfig(
+    model_name="gpt-4-turbo",
+    temperature=0.5,
+    max_tokens=2000,
+    top_p=0.95,
+    frequency_penalty=0.2,
+    presence_penalty=0.1,
+    stop_sequences=["END", "\n\n"],
+)
+
+mlflow.genai.register_prompt(
+    name="creative-prompt",
+    template="Write a creative story about {{topic}}",
+    model_config=config,
+)
+```
+
+The `PromptModelConfig` class provides validation to catch errors early:
+
+```python
+# This will raise a ValueError
+config = PromptModelConfig(temperature=-1.0)  # temperature must be non-negative
+
+# This will raise a ValueError
+config = PromptModelConfig(max_tokens=-100)  # max_tokens must be positive
+```
+
+### Supported Configuration Parameters
+
+The following standard parameters are supported in `PromptModelConfig`:
+
+- `model_name` (str): The name or identifier of the model (e.g., "gpt-4", "claude-3-opus")
+- `temperature` (float): Sampling temperature for controlling randomness (typically 0.0-2.0)
+- `max_tokens` (int): Maximum number of tokens to generate in the response
+- `top_p` (float): Nucleus sampling parameter (typically 0.0-1.0)
+- `top_k` (int): Top-k sampling parameter
+- `frequency_penalty` (float): Penalty for token frequency (typically -2.0 to 2.0)
+- `presence_penalty` (float): Penalty for token presence (typically -2.0 to 2.0)
+- `stop_sequences` (list[str]): List of sequences that will cause the model to stop generating
+- `extra_params` (dict): Additional provider-specific or experimental parameters
+
+### Provider-Specific Parameters
+
+You can include provider-specific parameters using the `extra_params` field:
+
+```python
+# Anthropic-specific configuration with extended thinking
+# See: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking
+anthropic_thinking_config = PromptModelConfig(
+    model_name="claude-sonnet-4-20250514",
+    max_tokens=16000,
+    extra_params={
+        # Enable extended thinking for complex reasoning tasks
+        "thinking": {
+            "type": "enabled",
+            "budget_tokens": 10000,  # Max tokens for internal reasoning
+        },
+        # User tracking for abuse detection
+        "metadata": {
+            "user_id": "user-123",
+        },
+    },
+)
+
+# OpenAI-specific configuration with reproducibility and structured output
+# See: https://platform.openai.com/docs/api-reference/chat/create
+openai_config = PromptModelConfig(
+    model_name="gpt-4o",
+    temperature=0.7,
+    max_tokens=2000,
+    extra_params={
+        # Seed for reproducible outputs
+        "seed": 42,
+        # Bias specific tokens (token_id: bias from -100 to 100)
+        "logit_bias": {"50256": -100},  # Discourage <|endoftext|>
+        # User identifier for abuse tracking
+        "user": "user-123",
+        # Service tier for priority processing
+        "service_tier": "default",
+    },
+)
+```
+
+### Managing Model Configuration
+
+Model configuration is mutable and can be updated after a prompt version is created. This makes it easy to fix mistakes or iterate on model parameters without creating new prompt versions.
+
+#### Setting or Updating Model Config
+
+Use <APILink fn="mlflow.genai.set_prompt_model_config" /> to set or update the model configuration for a prompt version:
+
+```python
+import mlflow
+from mlflow.entities.model_registry import PromptModelConfig
+
+# Register a prompt without model config
+mlflow.genai.register_prompt(
+    name="my-prompt",
+    template="Analyze: {{text}}",
+)
+
+# Later, add model config
+mlflow.genai.set_prompt_model_config(
+    name="my-prompt",
+    version=1,
+    model_config={"model_name": "gpt-4", "temperature": 0.7},
+)
+
+# Or update existing model config
+mlflow.genai.set_prompt_model_config(
+    name="my-prompt",
+    version=1,
+    model_config={"model_name": "gpt-4-turbo", "temperature": 0.8, "max_tokens": 2000},
+)
+
+# Verify the update
+prompt = mlflow.genai.load_prompt("my-prompt", version=1)
+print(prompt.model_config)
+```
+
+#### Deleting Model Config
+
+Use <APILink fn="mlflow.genai.delete_prompt_model_config" /> to remove model configuration from a prompt version:
+
+```python
+import mlflow
+
+# Remove model config
+mlflow.genai.delete_prompt_model_config(name="my-prompt", version=1)
+
+# Verify removal
+prompt = mlflow.genai.load_prompt("my-prompt", version=1)
+assert prompt.model_config is None
+```
+
+#### Important Notes
+
+- Model config changes are **version-specific** - updating one version doesn't affect others
+- Model config is **mutable** - unlike the prompt template, it can be changed after creation
+- Changes are **immediate** - no need to create a new version to fix model parameters
+- **Validation applies** - The same validation rules apply when updating as when creating
+
 ## Prompt Caching
 
-MLflow automatically caches loaded prompts in memory to improve performance and reduce repeated API
-calls. The caching behavior differs based on whether you're loading a prompt by **version** or
-by **alias**.
+MLflow automatically caches loaded prompts in memory to improve performance and reduce repeated API calls. The caching behavior differs based on whether you're loading a prompt by **version** or by **alias**.
 
 ### Default Caching Behavior
 
-- **Version-based prompts** (e.g., `prompts:/summarization-prompt/1`): Cached with **infinite TTL**
-by default.
-- **Alias-based prompts** (e.g., `prompts:/summarization-prompt@latest` or `prompts:/summarization-prompt@production`): Cached with **60 seconds TTL** by default. Aliases can point to
-different versions over time, so a shorter TTL ensures your application picks up updates.
+- **Version-based prompts** (e.g., `prompts:/summarization-prompt/1`): Cached with **infinite TTL** by default. Since prompt versions are immutable, they can be safely cached indefinitely.
+- **Alias-based prompts** (e.g., `prompts:/summarization-prompt@latest` or `prompts:/summarization-prompt@production`): Cached with **60 seconds TTL** by default. Aliases can point to different versions over time, so a shorter TTL ensures your application picks up updates.
 
 ### Customizing Cache Behavior
 
@@ -368,8 +553,7 @@ export MLFLOW_VERSION_PROMPT_CACHE_TTL_SECONDS=0
 
 ### Cache Invalidation
 
-The cache item is automatically invalidated when you modify the prompt version or alias tags,
-including the following operations:
+The cache is automatically invalidated when you modify the prompt version or alias, including the following operations:
 
 - `mlflow.genai.set_prompt_version_tag`
 - `mlflow.genai.set_prompt_alias`
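
The hunk context above shows `export MLFLOW_VERSION_PROMPT_CACHE_TTL_SECONDS=0` from the elided "Customizing Cache Behavior" section. A minimal sketch of applying the same override in-process, assuming the variable is read when the prompt is loaded rather than at import time:

```python
import os

import mlflow

# Assumption: a TTL of 0 disables caching for version-based prompt loads,
# mirroring the `export MLFLOW_VERSION_PROMPT_CACHE_TTL_SECONDS=0` shell example.
os.environ["MLFLOW_VERSION_PROMPT_CACHE_TTL_SECONDS"] = "0"

# Subsequent loads should bypass the version cache.
prompt = mlflow.genai.load_prompt("prompts:/summarization-prompt/1")
```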

mlflow/entities/model_registry/__init__.py

Lines changed: 2 additions & 1 deletion
@@ -5,7 +5,7 @@
 from mlflow.entities.model_registry.model_version_search import ModelVersionSearch
 from mlflow.entities.model_registry.model_version_tag import ModelVersionTag
 from mlflow.entities.model_registry.prompt import Prompt
-from mlflow.entities.model_registry.prompt_version import PromptVersion
+from mlflow.entities.model_registry.prompt_version import PromptModelConfig, PromptVersion
 from mlflow.entities.model_registry.registered_model import RegisteredModel
 from mlflow.entities.model_registry.registered_model_alias import RegisteredModelAlias
 from mlflow.entities.model_registry.registered_model_deployment_job_state import (
@@ -16,6 +16,7 @@
 
 __all__ = [
     "Prompt",
+    "PromptModelConfig",
     "PromptVersion",
     "RegisteredModel",
     "ModelVersion",
