Description
The purpose and use-cases of the new component
I would like to propose a new processor for partitioning batches using OTTL. The processor would be configured with a mapping of partitioning keys to OTTL expressions (a minimal sketch follows the list below), and would:
- Infer the relevant context from the expressions (e.g. request, resource, log, datapoint, etc.)
- Evaluate the expressions against each batch
- Partition the batch according to the partitioning keys
- Send each batch partition to the next consumer, adding the partitioning keys & values to the request metadata
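A minimal sketch of what such a mapping could look like (the processor name, key names, and attribute names are illustrative only); the context would be inferred from each expression's path prefix, so resource-scoped and log-scoped keys could be mixed:

```yaml
processors:
  partitioner:
    keys:
      # resource-scoped expression: all data for a given resource shares a partition
      environment: resource.attributes["deployment.environment"]
      # log-scoped expression: individual log records may land in different partitions
      severity: log.severity_text
```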
Some use cases below:
Topic routing and message key/partitioning for message broker exporters
The idea for this came from a PoC I implemented for routing and partitioning data before exporting to Kafka -- see #38888 (comment). This is intended to address several issues:
- At the moment the Kafka exporter supports configuration for determining a topic from a resource attribute, but this does not behave correctly when a batch contains multiple resources -- see #37470 (topic_from_attribute does not work as expected)
- the exporter supports partitioning traces by trace_id, but does not support the same for logs
- the exporter does not support partitioning by arbitrary attributes -- see #38484 ([exporter/kafkaexporter] Enable Partitioning by Specific Attribute for Logs & Metrics)
It's just a matter of time before users ask for the same functionality in other exporters, e.g. Pulsar.
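For illustration, a hedged sketch of how this could compose with the Kafka exporter. The key and attribute names are illustrative, and topic_from_metadata_key is not an existing exporter option -- it stands in for whatever mechanism the exporter would use to read the topic from request metadata:

```yaml
processors:
  partitioner:
    keys:
      kafka_topic: Concat([resource.attributes["tenant.id"], "otlp_logs"], ".")
      kafka_message_key: log.trace_id

exporters:
  kafka:
    # hypothetical setting, illustration only: read the topic from the
    # request metadata key set by the partitioner
    topic_from_metadata_key: kafka_topic
```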
Batching by arbitrary data and metadata attributes
In open-telemetry/opentelemetry-collector#10825 it has been suggested that the exporterhelper batch sender should be enhanced with support for batching by request metadata or by pdata attribute. The former case is simple, while the latter tends towards OTTL, so I think this would be best covered by a separate processor like this one.
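As a sketch of how this could compose with existing metadata-based batching, assuming the partitioner propagates its keys as request metadata as described above (the tenant key and attribute names are illustrative); the batch processor's metadata_keys option already batches separately per distinct metadata value:

```yaml
processors:
  partitioner:
    keys:
      tenant: resource.attributes["tenant.id"]
  batch:
    # create a separate batch per distinct value of the tenant metadata key
    metadata_keys:
      - tenant
    metadata_cardinality_limit: 100
```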
General purpose "groupby" processor
Finally, this processor would provide arbitrary "group by" functionality, which could enable replacing existing processors:
- we could replace the groupbyattrsprocessor (https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/groupbyattrsprocessor) by using the new partitioning processor followed by the transform processor to set resource attributes from the partitioning key (see the sketch after this list)
- we could replace the groupbytraceprocessor (https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/groupbytraceprocessor) by using the new partitioning processor followed by a new "buffering" processor, i.e. one that buffers batches by some request metadata keys and emits them after a configurable duration
(Whether we want to replace them is another matter, just saying it's a possible use case.)
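A hedged sketch of the groupbyattrs replacement idea. This assumes the evaluated partitioning key would be visible to a downstream transform processor via the proposed request metadata; the request.metadata path does not exist in the transform processor today, and the key and attribute names are illustrative:

```yaml
processors:
  partitioner:
    keys:
      host: log.attributes["host.name"]
  transform:
    log_statements:
      - context: resource
        statements:
          # hypothetical: promote the partitioning key to a resource attribute
          - set(attributes["host.name"], request.metadata["host"])
```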
Example configuration for the component
```yaml
processors:
  partitioner:
    keys:
      logs_topic: Concat([request.metadata["X-Tenant-Id"], "otlp_logs"], ".")
      logs_message_key: log.trace_id
```
In this example, requests would be partitioned by the X-Tenant-Id client request header, and the data would be further partitioned by trace ID. Each partitioned batch would carry the request metadata keys logs_topic and logs_message_key. For example, a request with the header X-Tenant-Id: acme would produce partitions whose logs_topic is acme.otlp_logs, split further by trace ID via logs_message_key.
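For completeness, a sketch of where the processor would sit in a pipeline (the receiver and exporter names are illustrative):

```yaml
service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [partitioner]
      exporters: [kafka]
```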
Telemetry data types supported
Logs, metrics, traces. When OTTL supports profiles, profiles too.
Code Owner(s)
axw
Sponsor (optional)
No response
Additional context
No response