too_complex_to_determinize_exception is thrown in a few cases due to the patterns in index template #127972

Open
@pawankartik-elastic

Description

Elasticsearch Version

9.0.0 ECH

Installed Plugins

No response

Java Version

bundled

OS Version

Elastic Cloud

Problem Description

After upgrading to Elasticsearch 9.0.0, a user reported that PUT _index_template/logs-production fails with a too_complex_to_determinize_exception caused by the template's index patterns.

For now, we've communicated a workaround: either simplify the patterns or split them across two templates. We still need a proper fix.
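For anyone applying the split workaround by hand, here is a minimal Python sketch of how one template body could be divided into two. The helper name split_template, the "-1"/"-2" name suffixes, and the priority values are hypothetical and not part of the original report. The distinct priorities are an assumption: as far as I'm aware, composable templates of equal priority are rejected when their patterns could match the same index name, which wildcard patterns like these can.

```python
import json


def split_template(name, body, parts=2, base_priority=100):
    """Split one index template body into `parts` bodies with fewer patterns each.

    Hypothetical helper: the name suffixes and priority values are illustrative only.
    """
    patterns = body["index_patterns"]
    size = -(-len(patterns) // parts)  # ceiling division
    result = {}
    for n in range(parts):
        chunk = patterns[n * size:(n + 1) * size]
        if not chunk:
            continue
        # Copy every other field unchanged; only the patterns and priority differ.
        result[f"{name}-{n + 1}"] = {
            **body,
            "index_patterns": chunk,
            "priority": base_priority + n,  # assumption: distinct priorities avoid an overlap conflict
        }
    return result


if __name__ == "__main__":
    original = {
        "index_patterns": [
            "*haproxy-production*",
            "*postgres-production*",
            "*psql_slow_query-production*",
            "*springboot-production*",
            "*tomcat-production*",
            "*security-security*",
            "*alerts-monitoring*",
        ],
        "composed_of": ["logs-mappings", "data-streams-mappings"],
    }
    for template_name, template_body in split_template("logs-production", original).items():
        print(f"PUT _index_template/{template_name}")
        print(json.dumps(template_body, indent=2))
```

Each printed body can then be sent as its own PUT _index_template request. Whether two halves are small enough to stay under the limit depends on the patterns, so this may need more than two parts.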

Judging by the commit history, the error comes from changes introduced in the Lucene 10 upgrade (#114741): Regex#simpleMatchToAutomaton() calls Operations.determinize(Operations.union(automata), Operations.DEFAULT_DETERMINIZE_WORK_LIMIT), and determinizing the union of these patterns exceeds the work limit.
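To illustrate why a union of *literal* patterns gets expensive to determinize, here is a rough Python sketch of subset construction over simple wildcard patterns. It is not Lucene's implementation, and the number of DFA states it counts is not the unit that Lucene's work limit measures. TooComplexError and max_states are hypothetical stand-ins for the real exception and limit. The intuition: the determinized automaton has to remember, for every pattern at once, whether it has already matched and how far a partial match has progressed, so the state count grows quickly as patterns are added.

```python
from collections import deque

OTHER = "\0"  # stands for "any character that appears in no pattern"


class TooComplexError(Exception):
    """Hypothetical stand-in for a too-complex-to-determinize failure."""


def _close(pattern, i, out, p):
    # Position i is reachable; a '*' may match nothing, so the next position is too.
    out.add((p, i))
    while i < len(pattern) and pattern[i] == "*":
        i += 1
        out.add((p, i))


def step(states, ch, patterns):
    """Advance a set of (pattern index, position) NFA states on character ch."""
    nxt = set()
    for p, i in states:
        pat = patterns[p]
        if i == len(pat):
            continue  # end of pattern: nothing left to consume
        if pat[i] == "*":
            _close(pat, i, nxt, p)  # '*' consumes ch and stays put
        elif pat[i] == ch:
            _close(pat, i + 1, nxt, p)
    return frozenset(nxt)


def count_dfa_states(patterns, max_states=None):
    """Count the subset-construction states for the union of the patterns."""
    alphabet = sorted({c for pat in patterns for c in pat if c != "*"} | {OTHER})
    start = set()
    for p, pat in enumerate(patterns):
        _close(pat, 0, start, p)
    start = frozenset(start)
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for ch in alphabet:
            nxt = step(current, ch, patterns)
            if not nxt or nxt in seen:
                continue  # dead state or already explored
            seen.add(nxt)
            if max_states is not None and len(seen) > max_states:
                raise TooComplexError(f"more than {max_states} states")
            queue.append(nxt)
    return len(seen)


if __name__ == "__main__":
    patterns = [
        "*haproxy-production*", "*postgres-production*",
        "*psql_slow_query-production*", "*springboot-production*",
        "*tomcat-production*", "*security-security*", "*alerts-monitoring*",
    ]
    for k in range(1, len(patterns) + 1):
        try:
            print(k, "patterns:", count_dfa_states(patterns[:k], max_states=5000), "states")
        except TooComplexError as err:
            print(k, "patterns:", err)
```

Running it on growing prefixes of the pattern list from the reproduction shows how the state count changes with each added pattern, which is also why simplifying or splitting the list helps.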

Steps to Reproduce

Here's an example:

PUT _index_template/logs-production
{
  "index_patterns": [
    "*haproxy-production*",
    "*postgres-production*",
    "*psql_slow_query-production*",
    "*springboot-production*",
    "*tomcat-production*",
    "*security-security*",
    "*alerts-monitoring*"
  ],
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "logs-production"
        },
        "codec": "best_compression",
        "query": {
          "default_field": [
            "message"
          ]
        }
      }
    }
  },
  "composed_of": [
    "logs-mappings",
    "data-streams-mappings"
  ]
}

Logs (if relevant)

No response
