Description
Currently, it's very difficult to define robust conditions in processors.
Take this condition as an example: `ctx.container?.name?.contains('foo')`
It looks safe as it's using the null-safe object dereferencing operator. But it's not actually null-safe. I suppose that's because the `contains` method is defined on `String`, and even though the null-safe dereferencing operator should protect against it, there are checks to see if `null`'s class can invoke the `contains` method.
A version of the same condition that properly guards against null values is `(ctx.container?.name ?: '').contains('foo')`
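To make the intent of the guarded condition concrete, here is a minimal plain-Java analogue. It is not Painless and says nothing about how Painless dispatches method calls internally; the class and method names (`GuardedCondition`, `nameContains`) are made up for illustration. It only shows the behavior the guarded expression is meant to have: a missing `container` or `name` falls back to an empty string instead of failing.

```java
import java.util.Map;

// Hypothetical helper mirroring (ctx.container?.name ?: '').contains('foo').
class GuardedCondition {
    static boolean nameContains(Map<String, ?> ctx, String needle) {
        // ctx.container?.name : stop at the first missing or non-map level
        Object container = ctx.get("container");
        Object name = (container instanceof Map) ? ((Map<?, ?>) container).get("name") : null;
        // ?: '' : substitute an empty string when the value is absent or not a string
        String value = (name instanceof String) ? (String) name : "";
        return value.contains(needle);
    }

    public static void main(String[] args) {
        System.out.println(nameContains(Map.of("container", Map.of("name", "foobar")), "foo")); // true
        System.out.println(nameContains(Map.of("container", Map.of()), "foo"));                 // false
        System.out.println(nameContains(Map.of(), "foo"));                                      // false
    }
}
```

The point of the sketch is that the fallback has to be applied before `contains` is invoked, which is exactly what the `?: ''` in the guarded condition does.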
Error returned for `ctx.container?.name?.contains('foo')`:
{
  "docs": [
    {
      "error": {
        "root_cause": [
          {
            "type": "script_exception",
            "reason": "runtime error",
            "script_stack": [
              "ctx.container?.name?.contains('foo')",
              " ^---- HERE"
            ],
            "script": "ctx.container?.name?.contains('foo')",
            "lang": "painless",
            "position": {
              "offset": 19,
              "start": 0,
              "end": 36
            }
          }
        ],
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
          "ctx.container?.name?.contains('foo')",
          " ^---- HERE"
        ],
        "script": "ctx.container?.name?.contains('foo')",
        "lang": "painless",
        "position": {
          "offset": 19,
          "start": 0,
          "end": 36
        },
        "caused_by": {
          "type": "null_pointer_exception",
          "reason": "Cannot invoke \"Object.getClass()\" because \"value\" is null"
        }
      }
    }
  ]
}
The bigger point here is that defining conditions is not easy and they can be a source of unexpected errors.
We should think about ways to make defining conditions more ergonomic and robust.
Some ideas:
- Allow the fields API in the conditional context. @stu-elastic has prototyped this: 7e028f9 (what's missing is ensuring the doc is read-only in that context). I think that's something we should do regardless of whether we're also doing other things. The condition would simplify to `$('container.name', '').contains('foo')`. A caveat here is that the fields API is experimental, so I'm not sure we can recommend that users and http://github.com/elastic/integrations developers use it.
- Add query language support to the condition so that users can write `container.name: *foo*`. The big advantage of using a query language is that there will never be an error at runtime if the query definition is valid at creation time: it either matches or doesn't match. As for which query language to support, we could think about KQL. KQL is Kibana-specific and not something we could just use in the context of Elasticsearch, but we could implement KQL or a similar query language on the ES side. Maybe we could even use ESQL for that? @trentm also implemented KQL in Go a while ago for a CLI tool that can filter ECS log files: https://github.com/trentm/go-ecslog/tree/main/internal/kqlog. Another big advantage of supporting KQL in conditions is that it would greatly simplify the UI-driven creation of routing rules.
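To illustrate why both ideas avoid runtime errors, here is a rough Java sketch. It is neither the fields API nor a KQL implementation; `FieldCondition`, `field`, and `matches` are hypothetical names, and the only pattern syntax assumed is `*` matching any run of characters. The first helper resolves a dotted path with a default, similar in spirit to `$('container.name', '')`; the second evaluates something like `container.name: *foo*` and simply returns false when the field is absent.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch: null-tolerant field lookup and wildcard matching.
class FieldCondition {
    // Walk a dotted path; return the fallback if any level is missing
    // or the final value is not a string.
    static String field(Map<String, ?> doc, String path, String fallback) {
        Object current = doc;
        for (String key : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return fallback;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return (current instanceof String) ? (String) current : fallback;
    }

    // Evaluate "path: wildcard". A missing field is a non-match, never an exception.
    static boolean matches(Map<String, ?> doc, String path, String wildcard) {
        String value = field(doc, path, null);
        if (value == null) {
            return false;
        }
        // Quote the literal pieces and join them with ".*" so only '*' is special.
        String regex = Arrays.stream(wildcard.split("\\*", -1))
                .map(Pattern::quote)
                .collect(Collectors.joining(".*"));
        return Pattern.matches(regex, value);
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of("container", Map.of("name", "my-foo-app"));
        System.out.println(field(doc, "container.name", "").contains("foo")); // true
        System.out.println(matches(doc, "container.name", "*foo*"));          // true
        System.out.println(matches(Map.of(), "container.name", "*foo*"));     // false
    }
}
```

In both helpers the "field is missing" case is handled inside the lookup rather than by the condition author, which is the ergonomic gain the two proposals are after.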