StageParamsTaxonomyEnrich

Configuration for enriching documents with taxonomy assignments.

Stage Category: APPLY (1-1 or 1-N depending on configuration)

Transformation:
- 1-1 mode (top_k=1): N documents → N documents (same count, expanded schema)
- 1-N mode (top_k>1): N documents → N×M documents (outer join/tagging)

Purpose: Runs each document through a taxonomy search, matching it against predefined taxonomy nodes using vector similarity. Can operate as 1-1 enrichment (single best match) or 1-N expansion (multiple matching tags).

When to Use:
- After FILTER/SORT to classify and tag retrieved documents
- For automatic content categorization (topics, genres, entities)
- When you have labeled reference data (people, products, categories)
- For face recognition (matching faces against enrolled identities)
- To apply hierarchical categorization (parent/child relationships)
- For entity linking (matching content to knowledge base entities)
- 1-1 mode (top_k=1): Single best match enrichment
- 1-N mode (top_k>1): Multi-tag expansion (document multiplication)

When NOT to Use:
- For initial document retrieval from collections (use FILTER: hybrid_search)
- For removing documents (use FILTER stages)
- For reordering results (use SORT stages)
- For general field-based JOINs (use document_enrich instead)
- When you don't have a predefined taxonomy collection

Operational Behavior:
- Runs each input document through a taxonomy vector search
- Performs vector similarity search against the taxonomy collection (Qdrant)
- Document count: N in → N out (top_k=1) or N×M out (top_k>1)
- Expands or maintains schema depending on mode
- Moderate performance (one or more vector similarity searches per document)
- Supports conditional enrichment (via the when parameter, for cost savings)

Common Pipeline Position: FILTER → SORT → APPLY (this stage)

Conditional Enrichment: Supports the when parameter to enrich only documents matching specific criteria. Critical for:
- Cost savings (vector searches are compute-intensive)
- Selective enrichment based on document properties
- Applying different taxonomies to different document types

Requirements:
- taxonomy_id: REQUIRED, ID of the taxonomy to use for enrichment
- fields: OPTIONAL, which taxonomy fields to merge into documents
- top_k: OPTIONAL, max taxonomy matches per document (default 3)
- min_score: OPTIONAL, minimum similarity threshold (default 0.0)
- when: OPTIONAL, condition for selective enrichment

Use Cases:
- Face recognition: Match detected faces to employee directory
- Content classification: Tag articles with topic categories
- Product categorization: Assign products to a taxonomy of categories
- Entity linking: Link mentions to knowledge base entities
- Brand detection: Identify brand logos in images

Examples:

Basic taxonomy enrichment:

    { "taxonomy_id": "tax_abc123", "top_k": 3 }

Conditional enrichment (only enrich documents where category is product and an image is present):

    {
      "taxonomy_id": "tax_product_classifier",
      "top_k": 3,
      "when": {
        "AND": [
          {"field": "metadata.category", "operator": "eq", "value": "product"},
          {"field": "metadata.has_image", "operator": "eq", "value": true}
        ]
      }
    }
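The document-count behavior of the two modes can be illustrated with a small simulation. This is a sketch only, not the Mixpeek implementation: the `enrich` function, the `matches` mapping that stands in for the vector search, and the `taxonomy` output field are all hypothetical. It also assumes, based on the "outer join" wording, that a document with no qualifying match passes through unchanged.

```python
from typing import Any

Doc = dict[str, Any]


def enrich(docs: list[Doc], matches: dict[str, list[tuple[str, float]]],
           top_k: int = 3, min_score: float = 0.0) -> list[Doc]:
    """Simulate enrichment counts; `matches` stands in for the vector search."""
    out: list[Doc] = []
    for doc in docs:
        # Keep candidates at or above min_score, best first, capped at top_k.
        hits = sorted(
            (m for m in matches.get(doc["id"], []) if m[1] >= min_score),
            key=lambda m: m[1],
            reverse=True,
        )[:top_k]
        if not hits:
            # Assumed outer-join behavior: unmatched documents pass through.
            out.append(dict(doc))
            continue
        for label, score in hits:
            # One output document per kept match (hypothetical field name).
            out.append({**doc, "taxonomy": {"label": label, "score": score}})
    return out


docs = [{"id": "a"}, {"id": "b"}]
matches = {
    "a": [("sports", 0.91), ("news", 0.62), ("finance", 0.15)],
    "b": [("music", 0.80), ("film", 0.74)],
}

single = enrich(docs, matches, top_k=1)                # 1-1 mode
multi = enrich(docs, matches, top_k=3, min_score=0.5)  # 1-N mode
print(len(single), len(multi))
```

With top_k=1 the output has one document per input; with top_k=3 and a min_score of 0.5, each input expands to as many documents as it has matches at or above the threshold.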

Properties

Name	Type	Description	Notes
taxonomy_id	str	ID of the taxonomy to use for enrichment. The taxonomy's configured input_mappings determine which vector field from source documents to use for similarity matching. NOTE: You must replace the default placeholder with your actual taxonomy ID.	[optional] [default to '{{TAXONOMY_ID}}']
fields	List[StageDefsEnrichmentField]	Fields from the taxonomy node to merge into the document.	[optional]
top_k	int	Maximum taxonomy assignments to attach per document.	[optional] [default to 3]
min_score	float	Minimum similarity score required to keep an assignment.	[optional] [default to 0]
when	StageDefsLogicalOperator	OPTIONAL. Conditional filter that documents must satisfy to be enriched. Uses LogicalOperator (AND/OR/NOT) for complex boolean logic, or simple field/operator/value for single conditions. Documents NOT matching this condition will SKIP enrichment (pass-through unchanged). Useful for cost savings (only enrich relevant documents) and conditional processing. When NOT specified, ALL documents are enriched unconditionally. Simple condition example: {"field": "metadata.category", "operator": "eq", "value": "product"} Boolean AND example: {"AND": [{"field": "x", "operator": "eq", "value": "y"}, ...]} Boolean OR example: {"OR": [{"field": "x", "operator": "eq", "value": "y"}, ...]}	[optional]
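To make the semantics of the when condition concrete, the sketch below evaluates a condition tree against a document locally. It is an illustration under stated assumptions, not the service's evaluator: the helper names `get_path` and `condition_matches` are hypothetical, only the `eq` operator is covered because it is the only one shown in the source, and NOT is assumed to wrap a single nested condition.

```python
from typing import Any


def get_path(doc: dict[str, Any], path: str) -> Any:
    """Resolve a dotted path such as 'metadata.category'; None if missing."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def condition_matches(doc: dict[str, Any], cond: dict[str, Any]) -> bool:
    """Evaluate an AND/OR/NOT tree or a single field/operator/value condition."""
    if "AND" in cond:
        return all(condition_matches(doc, c) for c in cond["AND"])
    if "OR" in cond:
        return any(condition_matches(doc, c) for c in cond["OR"])
    if "NOT" in cond:
        # Assumption: NOT holds one nested condition, not a list.
        return not condition_matches(doc, cond["NOT"])
    if cond["operator"] != "eq":
        # Only "eq" appears in the source examples, so only it is sketched.
        raise ValueError(f"unsupported operator: {cond['operator']}")
    return get_path(doc, cond["field"]) == cond["value"]


when = {
    "AND": [
        {"field": "metadata.category", "operator": "eq", "value": "product"},
        {"field": "metadata.has_image", "operator": "eq", "value": True},
    ]
}
product_doc = {"metadata": {"category": "product", "has_image": True}}
article_doc = {"metadata": {"category": "article", "has_image": True}}

# Documents that fail the condition would skip enrichment and pass through.
print(condition_matches(product_doc, when), condition_matches(article_doc, when))
```

A check like this can be useful for estimating how many documents a condition would send to the taxonomy search before running the pipeline.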

Example

from mixpeek.models.stage_params_taxonomy_enrich import StageParamsTaxonomyEnrich

# Replace the sample ID with your actual taxonomy ID
json_str = '{"taxonomy_id": "tax_abc123", "top_k": 3}'
# create an instance of StageParamsTaxonomyEnrich from a JSON string
stage_params_taxonomy_enrich_instance = StageParamsTaxonomyEnrich.from_json(json_str)
# print the JSON string representation of the object
# (to_json is called on the instance, not on the class)
print(stage_params_taxonomy_enrich_instance.to_json())

# convert the object into a dict
stage_params_taxonomy_enrich_dict = stage_params_taxonomy_enrich_instance.to_dict()
# create an instance of StageParamsTaxonomyEnrich from a dict
stage_params_taxonomy_enrich_from_dict = StageParamsTaxonomyEnrich.from_dict(stage_params_taxonomy_enrich_dict)
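Because taxonomy_id defaults to the placeholder '{{TAXONOMY_ID}}', it can help to check parameters before constructing the model. The following is a minimal sketch of such a check; `build_params` is a hypothetical helper and not part of the mixpeek package, and the `top_k >= 1` rule is an assumption rather than a documented constraint.

```python
import json
from typing import Any, Optional

PLACEHOLDER = "{{TAXONOMY_ID}}"  # default value listed in the Properties table


def build_params(taxonomy_id: str, top_k: int = 3, min_score: float = 0.0,
                 when: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Assemble a parameter dict, rejecting an unreplaced placeholder ID."""
    if not taxonomy_id or taxonomy_id == PLACEHOLDER:
        raise ValueError("taxonomy_id must be replaced with a real taxonomy ID")
    if top_k < 1:
        # Assumed constraint: at least one match must be requested.
        raise ValueError("top_k must be at least 1")
    params: dict[str, Any] = {
        "taxonomy_id": taxonomy_id,
        "top_k": top_k,
        "min_score": min_score,
    }
    if when is not None:
        # Leaving `when` out means every document is enriched.
        params["when"] = when
    return params


params = build_params("tax_abc123", top_k=1)
print(json.dumps(params))
```

The resulting dict has the same shape as the one accepted by `StageParamsTaxonomyEnrich.from_dict` in the example above.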

