2 changes: 2 additions & 0 deletions _ml-commons-plugin/agents-tools/index.md
@@ -18,3 +18,5 @@ An _agent_ orchestrates and runs ML models and tools. For a list of supported ag

A _tool_ performs a set of specific tasks. Some examples of tools are the [`VectorDBTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/vector-db-tool/), which supports vector search, and the [`ListIndexTool`]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/list-index-tool/), which executes the List Indices API. For a list of supported tools, see [Tools]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/tools/index/).

You can modify and transform tool outputs using [output processors]({{site.url}}{{site.baseurl}}/ml-commons-plugin/agents-tools/output-processors/). Output processors allow you to chain multiple data transformations that execute sequentially on any tool's output.

302 changes: 302 additions & 0 deletions _ml-commons-plugin/agents-tools/output-processors.md
@@ -0,0 +1,302 @@
---
layout: default
title: Output processors
parent: Agents and tools
grand_parent: ML Commons APIs
nav_order: 30
---

# Output processors
**Introduced 3.3**
{: .label .label-purple }

Output processors allow you to modify and transform the output of any tool before it's returned to the agent or user. You can chain multiple output processors together to create complex data transformation pipelines that execute sequentially.

## Overview

You can use output processors to do the following:

- **Transform data formats**: Convert between different data structures (strings, JSON, arrays)
- **Extract specific information**: Use JSONPath or regex patterns to pull out relevant data
- **Clean and filter content**: Remove unwanted fields or apply formatting rules
- **Standardize outputs**: Ensure consistent data formats across different tools

Each tool can have multiple output processors that execute in the order they are defined. The output of one processor becomes the input for the next processor in the chain.

## Configuration

Add output processors to any tool by including an `output_processors` array in the tool's `parameters` section during agent registration:

Example:
```json
{
  "type": "ToolName",
  "parameters": {
    "output_processors": [
      {
        "type": "processor_type",
        "parameter1": "value1",
        "parameter2": "value2"
      },
      {
        "type": "another_processor_type",
        "parameter": "value"
      }
    ]
  }
}
```

### Sequential execution

Output processors execute in the order they appear in the array. Each processor receives the output from the previous processor (or the original tool output for the first processor):

```
Tool Output → Processor 1 → Processor 2 → Processor 3 → Final Output
```
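
The chaining behavior described above can be sketched in a few lines of Python. This is an illustration only, not the plugin's implementation: the `REGISTRY` dictionary and the `run_chain` function are hypothetical names, and the `regex_replace` function is a simplified stand-in for the processor of the same name.

```python
import re
from typing import Any, Callable, Dict, List

Processor = Callable[[Any, Dict[str, Any]], Any]


def regex_replace(value: Any, config: Dict[str, Any]) -> str:
    """Simplified stand-in for the regex_replace processor."""
    return re.sub(config["pattern"], config.get("replacement", ""), str(value))


# Hypothetical registry that maps a processor "type" to a Python function.
REGISTRY: Dict[str, Processor] = {"regex_replace": regex_replace}


def run_chain(tool_output: Any, output_processors: List[Dict[str, Any]]) -> Any:
    """Pass each processor the result of the previous one, in array order."""
    result = tool_output
    for config in output_processors:
        result = REGISTRY[config["type"]](result, config)
    return result


chain = [
    {"type": "regex_replace", "pattern": "ERROR", "replacement": "WARNING"},
    {"type": "regex_replace", "pattern": "WARNING: ", "replacement": ""},
]
print(run_chain("ERROR: Timeout occurred.", chain))  # Timeout occurred.
```

Reversing the two entries in `chain` produces `WARNING: Timeout occurred.` instead, which shows why the order of the array matters.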

### Complete example

**Step 1: Register a flow agent with output processors**

```json
POST /_plugins/_ml/agents/_register
{
  "name": "Index Summary Agent",
  "type": "flow",
  "description": "Agent that provides clean index summaries",
  "tools": [
    {
      "type": "ListIndexTool",
      "parameters": {
        "output_processors": [
          {
            "type": "regex_replace",
            "pattern": "^.*?\n",
            "replacement": ""
          },
          {
            "type": "regex_capture",
            "pattern": "(\\d+,\\w+,\\w+,([^,]+))"
          }
        ]
      }
    }
  ]
}
```

**Step 2: Execute the agent**

Using the `agent_id` returned in the previous step:

```json
POST /_plugins/_ml/agents/{agent_id}/_execute
{
"parameters": {
"question": "List the indices"
}
}
```

**Without output processors, the raw `ListIndexTool` output is the following:**
```
row,health,status,index,uuid,pri,rep,docs.count,docs.deleted,store.size,pri.store.size
1,green,open,.plugins-ml-model-group,DCJHJc7pQ6Gid02PaSeXBQ,1,0,1,0,12.7kb,12.7kb
2,green,open,.plugins-ml-memory-message,6qVpepfRSCi9bQF_As_t2A,1,0,7,0,53kb,53kb
3,green,open,.plugins-ml-memory-meta,LqP3QMaURNKYDZ9p8dTq3Q,1,0,2,0,44.8kb,44.8kb
```

**With output processors, the agent returns:**
```
1,green,open,.plugins-ml-model-group
2,green,open,.plugins-ml-memory-message
3,green,open,.plugins-ml-memory-meta
```

The output processors transform the verbose CSV output into a clean, readable format by:
1. **`regex_replace`**: Removing the CSV header row
2. **`regex_capture`**: Extracting only essential information (row number, health, status, and index name)

## Supported output processor types

The following sections describe each supported output processor type, its parameters, and an example of its input and output.

### to_string

Converts the input to a JSON string representation.

**Parameters:**
- `escape_json` (Boolean, optional): Whether to escape JSON characters. Default: `false`

**Configuration:**
```json
{
  "type": "to_string",
  "escape_json": true
}
```

**Input/Output Example:**
```
Input: {"name": "test", "value": 123}
Output: "{\"name\":\"test\",\"value\":123}"
```
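
As a rough illustration, the following Python sketch reproduces the example above using the standard `json` module. It is one plausible reading of the behavior, not the plugin's code: the `to_string` function and the assumption that `escape_json` wraps the serialized text in a quoted, escaped JSON string are inferred from the example output only.

```python
import json
from typing import Any


def to_string(value: Any, escape_json: bool = False) -> str:
    """Serialize a value to compact JSON text; optionally escape that text."""
    # Strings pass through unchanged; other values are serialized compactly.
    text = value if isinstance(value, str) else json.dumps(value, separators=(",", ":"))
    # Assumption: escaping means encoding the text as a JSON string literal.
    return json.dumps(text) if escape_json else text


print(to_string({"name": "test", "value": 123}, escape_json=True))
# "{\"name\":\"test\",\"value\":123}"
```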

### regex_replace

Replaces text using regular expression patterns.

**Parameters:**
- `pattern` (string, required): Regular expression pattern to match
- `replacement` (string, optional): Replacement text. Default: `""`
- `replace_all` (Boolean, optional): Whether to replace all matches or only the first match. Default: `true`

**Configuration:**
```json
{
  "type": "regex_replace",
  "pattern": "ERROR",
  "replacement": "WARNING",
  "replace_all": true
}
```

**Input/Output Example:**
```
Input: "ERROR: Connection failed. ERROR: Timeout occurred."
Output: "WARNING: Connection failed. WARNING: Timeout occurred."
```
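
The effect of `replace_all` can be approximated with Python's `re.sub`, as in the following sketch. The `regex_replace` function here is a hypothetical stand-in, and Python's regular expression and replacement syntax may differ in details from the engine used by the plugin.

```python
import re


def regex_replace(text: str, pattern: str, replacement: str = "",
                  replace_all: bool = True) -> str:
    """Replace every match, or only the first match when replace_all is False."""
    # In re.sub, count=0 means "replace all matches"; count=1 stops after the first.
    return re.sub(pattern, replacement, text, count=0 if replace_all else 1)


text = "ERROR: Connection failed. ERROR: Timeout occurred."
print(regex_replace(text, "ERROR", "WARNING"))
# WARNING: Connection failed. WARNING: Timeout occurred.
print(regex_replace(text, "ERROR", "WARNING", replace_all=False))
# WARNING: Connection failed. ERROR: Timeout occurred.
```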

### jsonpath_filter

Extracts data using JSONPath expressions.

**Parameters:**
- `path` (string, required): JSONPath expression to extract data
- `default` (any, optional): Default value to return if the path is not found

**Configuration:**
```json
{
  "type": "jsonpath_filter",
  "path": "$.data.items[*].name",
  "default": []
}
```

**Input/Output Example:**
```
Input: {"data": {"items": [{"name": "item1"}, {"name": "item2"}]}}
Output: ["item1", "item2"]
```
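
To make the extraction concrete, the following Python sketch evaluates a small subset of JSONPath (dotted keys and the `[*]` wildcard only). The `jsonpath_filter` function is a hypothetical helper written for this illustration; the plugin presumably uses a full JSONPath library, so filters, indexes, and other syntax are not covered here.

```python
import re
from typing import Any, List


def jsonpath_filter(document: Any, path: str, default: Any = None) -> Any:
    """Evaluate paths such as $.data.items[*].name; return default if nothing matches."""
    if not path.startswith("$"):
        raise ValueError("path must start with $")
    # Each step is either a key name (".name") or a wildcard ("[*]").
    steps = re.findall(r"\.(\w+)|\[(\*)\]", path[1:])
    current: List[Any] = [document]
    used_wildcard = False
    for key, wildcard in steps:
        next_nodes: List[Any] = []
        for node in current:
            if wildcard:
                if isinstance(node, list):
                    next_nodes.extend(node)
            elif isinstance(node, dict) and key in node:
                next_nodes.append(node[key])
        used_wildcard = used_wildcard or bool(wildcard)
        current = next_nodes
    if not current:
        return default
    # A wildcard path yields a list of matches; a plain path yields one value.
    return current if used_wildcard else current[0]


doc = {"data": {"items": [{"name": "item1"}, {"name": "item2"}]}}
print(jsonpath_filter(doc, "$.data.items[*].name", default=[]))  # ['item1', 'item2']
print(jsonpath_filter(doc, "$.data.missing", default=[]))        # []
```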

### extract_json

Extracts JSON objects or arrays from text strings.

**Parameters:**
- `extract_type` (string, optional): Type of JSON to extract - `"object"`, `"array"`, or `"auto"`. Default: `"auto"`
- `default` (any, optional): Default value if JSON extraction fails

**Configuration:**
```json
{
  "type": "extract_json",
  "extract_type": "object",
  "default": {}
}
```

**Input/Output Example:**
```
Input: "The result is: {\"status\": \"success\", \"count\": 5} - processing complete"
Output: {"status": "success", "count": 5}
```
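
One way to picture this extraction is to scan the text for the first position at which a valid JSON value starts. The following Python sketch does that with the standard library's `json.JSONDecoder.raw_decode`. The `extract_json` function is a hypothetical stand-in, and how the plugin chooses among several embedded JSON values is an assumption (this sketch returns the first one that parses).

```python
import json
from typing import Any


def extract_json(text: str, extract_type: str = "auto", default: Any = None) -> Any:
    """Return the first JSON object or array embedded in text, or default."""
    openers = {"object": "{", "array": "[", "auto": "{["}[extract_type]
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char not in openers:
            continue
        try:
            # raw_decode parses one JSON value starting at the given index
            # and ignores any trailing text.
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            continue  # Not valid JSON at this position; keep scanning.
    return default


text = 'The result is: {"status": "success", "count": 5} - processing complete'
print(extract_json(text, extract_type="object", default={}))
# {'status': 'success', 'count': 5}
```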

### regex_capture

Captures specific groups from regex matches.

**Parameters:**
- `pattern` (string, required): Regular expression pattern with capture groups
- `groups` (string or array, optional): Group numbers to capture. Can be a single number, such as `"1"`, or an array, such as `"[1, 2, 4]"`. Default: `"1"`

**Configuration:**
```json
{
  "type": "regex_capture",
  "pattern": "(\\d+),(\\w+),(\\w+),([^,]+)",
  "groups": "[1, 4]"
}
```

**Input/Output Example:**
```
Input: "1,green,open,.plugins-ml-model-group,DCJHJc7pQ6Gid02PaSeXBQ,1,0"
Output: ["1", ".plugins-ml-model-group"]
```
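
The group selection can be sketched in Python as follows. The `regex_capture` function is a hypothetical stand-in. Two behaviors are assumptions that the text above does not specify: the sketch uses only the first match, and it returns `None` when the pattern does not match.

```python
import json
import re
from typing import List, Optional, Union


def regex_capture(text: str, pattern: str,
                  groups: str = "1") -> Optional[Union[str, List[str]]]:
    """Return the requested capture group(s) from the first match."""
    match = re.search(pattern, text)
    if match is None:
        return None  # Assumption: no match yields no value.
    # The groups parameter is a string: "1" parses to 1, "[1, 4]" to [1, 4].
    wanted = json.loads(groups)
    if isinstance(wanted, int):
        return match.group(wanted)
    return [match.group(number) for number in wanted]


line = "1,green,open,.plugins-ml-model-group,DCJHJc7pQ6Gid02PaSeXBQ,1,0"
pattern = r"(\d+),(\w+),(\w+),([^,]+)"
print(regex_capture(line, pattern, groups="[1, 4]"))
# ['1', '.plugins-ml-model-group']
```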

### remove_jsonpath

Removes fields from JSON objects using JSONPath.

**Parameters:**
- `path` (string, required): JSONPath expression identifying fields to remove

**Configuration:**
```json
{
  "type": "remove_jsonpath",
  "path": "$.sensitive_data"
}
```

**Input/Output Example:**
```
Input: {"name": "user1", "sensitive_data": "secret", "public_info": "visible"}
Output: {"name": "user1", "public_info": "visible"}
```
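
The following Python sketch shows the idea for simple dotted paths such as `$.sensitive_data` or `$.a.b`. The `remove_path` function is a hypothetical helper for illustration; it does not support wildcards or filters, which a full JSONPath implementation would handle.

```python
import copy
from typing import Any, Dict


def remove_path(document: Dict[str, Any], path: str) -> Dict[str, Any]:
    """Return a copy of document without the field identified by a dotted path."""
    if not path.startswith("$."):
        raise ValueError("this sketch only supports paths such as $.a.b")
    keys = path[2:].split(".")
    result = copy.deepcopy(document)  # Leave the caller's object untouched.
    parent: Any = result
    for key in keys[:-1]:
        parent = parent.get(key) if isinstance(parent, dict) else None
        if parent is None:
            return result  # Path not present: nothing to remove.
    if isinstance(parent, dict):
        parent.pop(keys[-1], None)
    return result


record = {"name": "user1", "sensitive_data": "secret", "public_info": "visible"}
print(remove_path(record, "$.sensitive_data"))
# {'name': 'user1', 'public_info': 'visible'}
```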

### conditional

Applies different processor chains based on conditions.

**Parameters:**
- `path` (string, optional): JSONPath to extract value for condition evaluation
- `routes` (array, required): Array of condition-processor mappings
- `default` (array, optional): Default processors if no conditions match

**Supported conditions:**
- Exact value match: `"value"`
- Numeric comparisons: `">10"`, `"<5"`, `">="`, `"<="`, `"==5"`
- Existence checks: `"exists"`, `"null"`, `"not_exists"`
- Regex matching: `"regex:pattern"`
- Contains text: `"contains:substring"`

**Configuration:**
```json
{
  "type": "conditional",
  "path": "$.status",
  "routes": [
    {
      "green": [
        {"type": "regex_replace", "pattern": "status", "replacement": "healthy"}
      ]
    },
    {
      "red": [
        {"type": "regex_replace", "pattern": "status", "replacement": "unhealthy"}
      ]
    }
  ],
  "default": [
    {"type": "regex_replace", "pattern": "status", "replacement": "unknown"}
  ]
}
```
```

**Input/Output Example:**
```
Input: {"index": "test-index", "status": "green", "docs": 100}
Output: {"index": "test-index", "healthy": "green", "docs": 100}
```
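
The routing in this example can be sketched in Python as follows. This is a simplified model, not the plugin's implementation: the `select_chain` and `conditional` functions are hypothetical, only exact value matches and `$.field` paths are handled, and the nested `regex_replace` is assumed to operate on the serialized JSON text.

```python
import json
import re
from typing import Any, Dict, List


def select_chain(value: Any, config: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the processors of the first route whose condition equals value."""
    for route in config["routes"]:
        for condition, processors in route.items():
            if str(value) == condition:  # Exact match only in this sketch.
                return processors
    return config.get("default", [])


def conditional(document: Dict[str, Any], config: Dict[str, Any]) -> Any:
    """Pick a chain based on the value at config["path"], then apply it."""
    value = document.get(config["path"][2:])  # Supports "$.field" paths only.
    text = json.dumps(document)
    for processor in select_chain(value, config):
        text = re.sub(processor["pattern"], processor.get("replacement", ""), text)
    return json.loads(text)


config = {
    "type": "conditional",
    "path": "$.status",
    "routes": [
        {"green": [{"type": "regex_replace", "pattern": "status", "replacement": "healthy"}]},
        {"red": [{"type": "regex_replace", "pattern": "status", "replacement": "unhealthy"}]},
    ],
    "default": [{"type": "regex_replace", "pattern": "status", "replacement": "unknown"}],
}
print(conditional({"index": "test-index", "status": "green", "docs": 100}, config))
# {'index': 'test-index', 'healthy': 'green', 'docs': 100}
```

A document with a `status` value that matches no route, such as `yellow`, falls through to the `default` chain.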