Configuration for augmenting documents with LLM-generated fields.

**Stage Category:** APPLY (1-1 Enrichment/Generation)

**Transformation:** N documents → N documents (same count, expanded schema)

**Purpose:** Applies LLM generation to each input document, creating new fields with generated content. Use this for summaries, insights, descriptions, or transformations. Each input document produces exactly one output document with added generated fields.

**When to Use:**
- After FILTER/SORT to enhance final results with generated content
- For summarization of long content
- To extract structured data (entities, insights, key points)
- For content transformation (translation, rephrasing, formatting)
- To generate descriptions, titles, or metadata
- For creative augmentation (suggestions, recommendations)

**When NOT to Use:**
- For removing documents (use FILTER: llm_filter instead)
- For simple field transformations (use direct field mapping)
- For initial document retrieval (use FILTER: hybrid_search)
- For reordering (use SORT stages)
- When fast response time is critical (LLM generation is slow, 200ms-5s)
- When cost is a major concern (LLM generation is very expensive)
- For large batch processing (consider async batch jobs instead)

**Operational Behavior:**
- Applies LLM generation to each input document (1-1 operation)
- Maintains document count: N in → N out
- Expands schema: adds new generated fields to each document
- Makes HTTP requests to the Engine service for LLM inference
- Very slow operation (LLM generation, 200ms-5s per document batch)
- Processes documents in batches to optimize throughput
- Supports concurrent batching for parallel LLM calls

**Common Pipeline Position:** FILTER → SORT → APPLY (this stage)

**Cost & Performance:**
- Very expensive: LLM generation costs per document (10-100x vs. embeddings)
- Very slow: 200ms-5s per batch, depending on the LLM and generation length
- CRITICAL: Use the `when` parameter for selective enrichment (massive cost savings)
- Consider enriching only top-ranked results after a RANK stage
- Smaller batch sizes are often better for latency

**Conditional Enrichment:** Supports the `when` parameter to enrich only specific documents. CRITICAL FOR COST SAVINGS - LLM generation is expensive! For example:
- Only summarize long documents (word_count > 500)
- Only process high-priority items
- Only enrich specific content types (articles, not images)

**Requirements:**
- `provider`: OPTIONAL. LLM provider (openai, google, anthropic). Auto-inferred if not specified.
- `model_name`: OPTIONAL. Specific model name. Uses the provider default if not specified.
- `prompt`: REQUIRED. LLM prompt template (supports `{DOC.field}`, `{INPUT.field}`)
- `output_field`: REQUIRED. Where to store generated content
- `batch_size`: OPTIONAL. Documents per batch (default 5)
- `schema`: OPTIONAL. JSON schema for structured output
- `when`: OPTIONAL but RECOMMENDED for cost control

**Use Cases:**
- Summarization: generate 3-sentence summaries of articles
- Insight extraction: extract key takeaways and insights
- Description generation: create product descriptions from specs
- Translation: translate content into other languages
- Entity extraction: extract people, places, organizations
- Recommendation generation: create personalized suggestions

**Examples:**

Unconditional enrichment:

```json
{
  "provider": "openai",
  "model_name": "gpt-4o-mini",
  "prompt": "Summarize the document",
  "output_field": "metadata.summary"
}
```

Conditional enrichment (only summarize long documents):

```json
{
  "provider": "google",
  "model_name": "gemini-2.5-flash",
  "prompt": "Summarize the document",
  "output_field": "metadata.summary",
  "when": {
    "field": "metadata.word_count",
    "operator": "gt",
    "value": 500
  }
}
```
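To make the pass-through semantics of conditional enrichment concrete, here is a minimal local sketch of how the simple field/operator/value form of `when` selects documents. The `get_path` and `matches` helpers are illustrative only, not part of the Mixpeek SDK; the actual evaluation happens server-side.

```python
# Illustrative re-implementation of the simple {field, operator, value}
# form of `when`, to show which documents an APPLY stage would enrich
# versus pass through unchanged. NOT the SDK's actual implementation.

def get_path(doc: dict, dotted: str):
    """Resolve a dot-path like 'metadata.word_count' against a document."""
    value = doc
    for key in dotted.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value

def matches(doc: dict, when: dict) -> bool:
    """Evaluate a simple field/operator/value condition on one document."""
    actual = get_path(doc, when["field"])
    if actual is None:
        return False
    op, expected = when["operator"], when["value"]
    if op == "gt":
        return actual > expected
    if op == "in":
        return actual in expected
    if op == "eq":
        return actual == expected
    raise ValueError(f"unsupported operator: {op}")

when = {"field": "metadata.word_count", "operator": "gt", "value": 500}
docs = [
    {"id": "a", "metadata": {"word_count": 1200}},  # matches: gets enriched
    {"id": "b", "metadata": {"word_count": 80}},    # no match: passes through
]
to_enrich = [d["id"] for d in docs if matches(d, when)]
print(to_enrich)  # ['a']
```

Documents that fail the condition are not dropped; they flow to the next stage without the generated field, which is why `when` reduces cost without changing document count.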
| Name | Type | Description | Notes |
|---|---|---|---|
| provider | StageDefsLLMProvider | LLM provider to use. Supported providers: - openai: GPT models (GPT-4o, GPT-4o-mini) - google: Gemini models (Gemini 2.5 Flash) - anthropic: Claude models (Claude 3.5 Sonnet/Haiku) If not specified, defaults to 'google'. Can be auto-inferred from model_name. | [optional] |
| model_name | str | Specific LLM model to use. If not specified, uses provider default. Examples: gemini-2.5-flash, gpt-4o-mini, claude-3-5-haiku-20241022 | [optional] [default to 'null'] |
| inference_name | str | DEPRECATED: Use 'provider' and 'model_name' instead. Legacy format: 'provider:model' (e.g., 'gemini:gemini-2.5-flash'). Kept for backward compatibility only. | [optional] [default to 'null'] |
| prompt | str | Prompt template for the LLM (supports doc/input templates). | [optional] [default to 'Summarize the following content in 2-3 sentences: {{DOC.content}}'] |
| output_field | str | Dot-path where the enrichment result should be stored. | [optional] [default to 'metadata.summary'] |
| batch_size | int | Number of documents to enrich per LLM request batch. | [optional] [default to 5] |
| var_schema | Dict[str, object] | Optional JSON schema instructions for the LLM output. | [optional] |
| max_tokens | int | Maximum output tokens for the LLM response. If not specified, uses the provider default (4000). Increase this if output is being truncated. Note: Gemini counts tokens more aggressively than Claude/GPT — a 1000-token limit may produce only ~350 characters with Gemini. | [optional] [default to null] |
| temperature | float | Sampling temperature passed to the LLM. | [optional] [default to 0.2] |
| api_key | str | OPTIONAL. Bring Your Own Key (BYOK) - use your own LLM API key instead of Mixpeek's. How to use: 1. Store your API key as an organization secret via POST /v1/organizations/secrets Example: {"secret_name": "openai_api_key", "secret_value": "sk-proj-..."} 2. Reference it here using template syntax: {{secrets.openai_api_key}} Benefits: - Use your own API credits and rate limits - Keep your API keys secure in Mixpeek's encrypted vault - No changes needed to your retriever when rotating keys If not provided, uses Mixpeek's default API keys (usage charged to your account). | [optional] |
| use_vcache | bool | Whether to use semantic caching (vCache) for LLM calls in this stage. When True, semantically similar prompts return cached responses, reducing cost. When False, every call goes directly to the LLM provider, reducing latency. When None (default), falls back to the global VCACHE_ENABLED setting. | [optional] [default to False] |
| when | StageDefsLogicalOperator | OPTIONAL. Conditional filter that documents must satisfy to be enriched with LLM. Uses LogicalOperator (AND/OR/NOT) for complex boolean logic, or simple field/operator/value for single conditions. Documents NOT matching this condition will SKIP enrichment (pass-through unchanged). CRITICAL FOR COST SAVINGS - LLM calls are expensive! Only enrich documents that need it. When NOT specified, ALL documents are enriched unconditionally (may incur high costs). Use cases: - Only summarize documents with word_count > 500 - Only enrich English articles/blogs - Only process high-priority items Simple condition example: {"field": "metadata.word_count", "operator": "gt", "value": 500} Boolean AND example: {"AND": [{"field": "category", "operator": "in", "value": ["article"]}, ...]} | [optional] |
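The `output_field` dot-path in the table above determines where generated content lands, expanding each document's schema in place. A minimal sketch of that dot-path write, assuming intermediate dicts are created as needed (the `set_path` helper is illustrative, not the SDK's implementation):

```python
def set_path(doc: dict, dotted: str, value) -> dict:
    """Write `value` at a dot-path like 'metadata.summary',
    creating intermediate dicts as needed (illustrative only)."""
    target = doc
    keys = dotted.split(".")
    for key in keys[:-1]:
        target = target.setdefault(key, {})
    target[keys[-1]] = value
    return doc

doc = {"id": "a", "metadata": {"word_count": 1200}}
set_path(doc, "metadata.summary", "A short generated summary.")
print(doc["metadata"]["summary"])  # A short generated summary.
```

Existing fields under the same parent (here `metadata.word_count`) are preserved; only the new field is added.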
```python
from mixpeek.models.stage_params_llm_enrich import StageParamsLlmEnrich

# TODO update the JSON string below
json = "{}"
# create an instance of StageParamsLlmEnrich from a JSON string
stage_params_llm_enrich_instance = StageParamsLlmEnrich.from_json(json)
# print the JSON string representation of the object
print(stage_params_llm_enrich_instance.to_json())

# convert the object into a dict
stage_params_llm_enrich_dict = stage_params_llm_enrich_instance.to_dict()
# create an instance of StageParamsLlmEnrich from a dict
stage_params_llm_enrich_from_dict = StageParamsLlmEnrich.from_dict(stage_params_llm_enrich_dict)
```
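The cost impact of the `when` filter flagged as CRITICAL above can be sketched with rough arithmetic. All numbers below are assumed placeholders for illustration, not actual Mixpeek or provider pricing:

```python
# Rough cost comparison for conditional vs. unconditional enrichment.
# Every figure here is an assumed placeholder, not real pricing.
total_docs = 10_000
cost_per_llm_call = 0.002   # assumed dollars per enriched document
match_rate = 0.2            # assumed fraction of docs with word_count > 500

unconditional_cost = total_docs * cost_per_llm_call
conditional_cost = total_docs * match_rate * cost_per_llm_call
print(unconditional_cost, conditional_cost)  # 20.0 4.0
```

Under these assumptions, skipping the 80% of documents that fail the condition cuts enrichment spend fivefold, which is why `when` is recommended whenever only a subset of documents needs generated fields.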