`docs/docs/genai/prompt-registry/index.mdx` (193 additions, 9 deletions)
Key attributes of a Prompt object:

- `Alias`: A mutable named reference to the prompt. For example, you can create an alias named `production` to refer to the version used in your production system. See [Aliases](/genai/prompt-registry/manage-prompt-lifecycles-with-aliases) for more details.
- `is_text_prompt`: A boolean property indicating whether the prompt is a text prompt (True) or a chat prompt (False).
- `response_format`: An optional property containing the expected response structure specification, which can be used to validate or structure outputs from LLM calls.
- `model_config`: An optional dictionary containing model-specific configuration such as model name, temperature, max_tokens, and other inference parameters. See [Model Configuration](#model-configuration) for more details.
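For orientation, here is a minimal sketch of reading these attributes from a loaded prompt. The prompt name and version are hypothetical, and it assumes the attributes above are exposed as properties on the object returned by `mlflow.genai.load_prompt()`:

```python
import mlflow

# Hypothetical prompt URI; assumes a prompt named "qa-prompt" with a version 1 exists.
prompt = mlflow.genai.load_prompt("prompts:/qa-prompt/1")

print(prompt.is_text_prompt)   # True for a text prompt, False for a chat prompt
print(prompt.response_format)  # expected response structure, or None if not set
print(prompt.model_config)     # dict of model parameters, or None if not set
```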
## Model Configuration

MLflow Prompt Registry allows you to store model-specific configuration alongside your prompts, ensuring reproducibility and clarity about which model and parameters were used with a particular prompt version. This is especially useful when you want to:

- Version both prompt templates and model parameters together
- Share prompts with recommended model settings across your team
- Reproduce exact inference configurations from previous experiments
- Maintain different model configurations for different prompt versions
### Basic Usage
You can attach model configuration to a prompt by passing a `model_config` parameter when registering:
```python
import mlflow

# Using a dictionary
model_config = {
    "model_name": "gpt-4",
    "temperature": 0.7,
    "max_tokens": 1000,
    "top_p": 0.9,
}

mlflow.genai.register_prompt(
    name="qa-prompt",
    template="Answer the following question: {{question}}",
    # Attach the model configuration to this prompt version
    model_config=model_config,
)
```
Model configuration is mutable and can be updated after a prompt version is created. This makes it easy to fix mistakes or iterate on model parameters without creating new prompt versions.
#### Setting or Updating Model Config
Use <APILink fn="mlflow.genai.set_prompt_model_config" /> to set or update the model configuration for a prompt version:
```python
import mlflow
from mlflow.entities.model_registry import PromptModelConfig

# Illustrative call: the argument names and PromptModelConfig fields below are
# assumed for illustration; see the API reference for the exact signature.
mlflow.genai.set_prompt_model_config(
    "qa-prompt", version=1, model_config=PromptModelConfig(model_name="gpt-4", temperature=0.2)
)
```
Note the following about updating model config:

- Model config changes are **version-specific** - updating one version doesn't affect others
- Model config is **mutable** - unlike the prompt template, it can be changed after creation
- Changes are **immediate** - no need to create a new version to fix model parameters
- **Validation applies** - The same validation rules apply when updating as when creating
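To make the first three points concrete, here is a short sketch. It assumes a prompt named `qa-prompt` with two registered versions; the argument names for `set_prompt_model_config`, and its acceptance of a plain dictionary, are assumptions for illustration:

```python
import mlflow

# Update the model config on version 1 only; no new prompt version is created.
mlflow.genai.set_prompt_model_config(
    "qa-prompt", version=1, model_config={"model_name": "gpt-4", "temperature": 0.2}
)

v1 = mlflow.genai.load_prompt("prompts:/qa-prompt/1")
v2 = mlflow.genai.load_prompt("prompts:/qa-prompt/2")
print(v1.model_config)  # reflects the update immediately
print(v2.model_config)  # unchanged: model config updates are version-specific
```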
## Prompt Caching
MLflow automatically caches loaded prompts in memory to improve performance and reduce repeated API calls. The caching behavior differs based on whether you're loading a prompt by **version** or by **alias**.
### Default Caching Behavior
- **Version-based prompts** (e.g., `prompts:/summarization-prompt/1`): Cached with **infinite TTL** by default. Since prompt versions are immutable, they can be safely cached indefinitely.
- **Alias-based prompts** (e.g., `prompts:/summarization-prompt@latest` or `prompts:/summarization-prompt@production`): Cached with **60 seconds TTL** by default. Aliases can point to different versions over time, so a shorter TTL ensures your application picks up updates.
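As a small illustration of the two URI styles (using the prompt name from the examples above; the cache TTLs are handled internally and are not configured in this snippet):

```python
import mlflow

# Version URI: the version is immutable, so repeated loads can be served from
# the in-memory cache indefinitely (infinite TTL).
pinned = mlflow.genai.load_prompt("prompts:/summarization-prompt/1")

# Alias URI: cached for ~60 seconds by default, so when the "production" alias
# is moved to a new version, subsequent loads pick up the change shortly after.
live = mlflow.genai.load_prompt("prompts:/summarization-prompt@production")
```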