
Better ergonomics for (reroute) processor conditions #96452

Open
@felixbarny

Currently, it's very difficult to define robust conditions in processors.

Take this condition as an example: ctx.container?.name?.contains('foo')

It looks safe because it uses the null-safe dereferencing operator, but it's not actually null-safe. I suppose that's because the contains method is defined on String: even though the null-safe operator should protect against a null value, a check runs to see whether the value's class can invoke contains, and that check fails when the value is null.

A version of the same condition that properly guards against null values is (ctx.container?.name ?: '').contains('foo')

Error response:
{
  "docs": [
    {
      "error": {
        "root_cause": [
          {
            "type": "script_exception",
            "reason": "runtime error",
            "script_stack": [
              "ctx.container?.name?.contains('foo')",
              "                   ^---- HERE"
            ],
            "script": "ctx.container?.name?.contains('foo')",
            "lang": "painless",
            "position": {
              "offset": 19,
              "start": 0,
              "end": 36
            }
          }
        ],
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
          "ctx.container?.name?.contains('foo')",
          "                   ^---- HERE"
        ],
        "script": "ctx.container?.name?.contains('foo')",
        "lang": "painless",
        "position": {
          "offset": 19,
          "start": 0,
          "end": 36
        },
        "caused_by": {
          "type": "null_pointer_exception",
          "reason": "Cannot invoke \"Object.getClass()\" because \"value\" is null"
        }
      }
    }
  ]
}

The bigger point here is that defining conditions is not easy and they can be a source of unexpected errors.

We should think about ways to make defining conditions more ergonomic and robust.

Some ideas:

  • Allow the fields API in the conditional context. @stu-elastic has prototyped this: 7e028f9 (what's still missing is ensuring the doc is read-only in that context). I think that's something we should do regardless of whether we also do other things. The condition would simplify to $('container.name', '').contains('foo'). A caveat is that the fields API is experimental, so I'm not sure we can recommend that users and http://github.com/elastic/integrations developers use it.
  • Add query language support to conditions so that users can write container.name: *foo*. The big advantage of using a query language is that there will never be a runtime error if the query definition is valid at creation time: it either matches or doesn't match. As for which query language to support, we could consider KQL. KQL is Kibana-specific and not something we could just use in the context of Elasticsearch, but we could implement KQL or a similar query language on the ES side. Maybe we could even use ESQL for that? @trentm also implemented KQL in Go a while ago for a CLI tool that can filter ECS log files: https://github.com/trentm/go-ecslog/tree/main/internal/kqlog. Another big advantage of supporting KQL in conditions is that it would greatly simplify the UI-driven creation of routing rules.
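
To make the "either matches or doesn't match" property concrete, here is a rough sketch in plain Java. It is only an illustration under stated assumptions: ConditionSketch, field, and matches are hypothetical names, not existing Elasticsearch or Painless APIs, and the wildcard handling supports only * rather than real KQL. field mimics a lookup with a default (in the spirit of $('container.name', '')), and matches evaluates a container.name: *foo* style condition without ever throwing on missing or null fields.

```java
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Elasticsearch or Painless.
final class ConditionSketch {

    // Resolves a dotted path such as "container.name". Returns the fallback
    // when any segment is missing, null, or not a map.
    static Object field(Map<String, Object> doc, String path, Object fallback) {
        Object current = doc;
        for (String segment : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return fallback;
            }
            current = ((Map<?, ?>) current).get(segment);
        }
        return current == null ? fallback : current;
    }

    // Evaluates "path: wildcard". A missing, null, or non-string value is
    // simply a non-match, so evaluation cannot fail at runtime.
    static boolean matches(Map<String, Object> doc, String path, String wildcard) {
        Object value = field(doc, path, null);
        if (!(value instanceof String)) {
            return false;
        }
        return toRegex(wildcard).matcher((String) value).matches();
    }

    // Translates a pattern where '*' means "any characters" into a regex,
    // quoting everything else literally.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        String[] literals = wildcard.split("\\*", -1);
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(literals[i]));
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of("container", Map.of("name", "my-foo-app"));
        System.out.println(matches(doc, "container.name", "*foo*"));      // true
        System.out.println(matches(Map.of(), "container.name", "*foo*")); // false
    }
}
```

The point of the sketch is the shape of the contract: the pattern is validated once when it is compiled, and after that a document with no container object, or with a non-string name, just doesn't match.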

Labels

  • :Core/Infra/Scripting (Scripting abstractions, Painless, and Mustache)
  • :Data Management/Ingest Node (Execution or management of Ingest Pipelines including GeoIP)
  • Team:Core/Infra (Meta label for core/infra team)
  • Team:Data Management (Meta label for data/management team)
