Index template has excessive dynamic_templates causing mapper_parsing_exception #54

Description

@grdumas

Problem

The OpenSearch index template at src/chronicler/config/opensearch_index_template.json has overly restrictive dynamic_templates that conflict with actual data structures, causing documents to fail indexing with mapper_parsing_exception errors.

Discovered During

Local schema migration testing for RPOPC-1267. When testing the v1→v2 migration path with the index template applied, multiple benchmark types failed to index.

Root Cause

The local template defines 11 dynamic_templates with explicit object mappings, including:

  • run_objects - maps results.runs.* as objects with specific properties
  • timeseries_objects - maps results.runs.*.timeseries.*
  • numa_nodes - maps system_under_test.hardware.numa.*
  • storage_devices - maps system_under_test.hardware.storage.*
  • network_interfaces - maps system_under_test.hardware.network.*

These templates define fields as type: object, but the actual data contains scalar values at those paths, causing conflicts such as:

object mapping for [results.runs.run_0.configuration.variant] tried to parse field [variant] as object, but found a concrete value
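A conflict of this kind can be checked for locally before indexing. The following is a minimal sketch, not part of chronicler: the function names and the sample mapping and document are hypothetical, and it only walks explicit `properties` mappings (it does not evaluate dynamic_templates patterns).

```python
from typing import Any


def is_object_mapping(field_mapping: dict[str, Any]) -> bool:
    """A field counts as an object if it says so or declares sub-properties."""
    return field_mapping.get("type") == "object" or "properties" in field_mapping


def find_object_conflicts(
    doc: dict[str, Any], properties: dict[str, Any], prefix: str = ""
) -> list[str]:
    """Return dotted paths that are mapped as objects but hold scalar values."""
    conflicts: list[str] = []
    for name, field_mapping in properties.items():
        if name not in doc or not is_object_mapping(field_mapping):
            continue
        if field_mapping.get("enabled") is False:
            # Disabled objects are not parsed, so they are skipped here.
            continue
        path = f"{prefix}{name}"
        value = doc[name]
        items = value if isinstance(value, list) else [value]
        for item in items:
            if item is None:
                continue
            if isinstance(item, dict):
                sub_properties = field_mapping.get("properties", {})
                conflicts.extend(find_object_conflicts(item, sub_properties, f"{path}."))
            else:
                conflicts.append(path)
    return conflicts


# Hypothetical mapping and document that mirror the error message above.
SAMPLE_PROPERTIES = {
    "results": {"properties": {"runs": {"properties": {"run_0": {"properties": {
        "configuration": {"properties": {"variant": {"type": "object"}}}
    }}}}}}
}
SAMPLE_DOC = {"results": {"runs": {"run_0": {"configuration": {"variant": "default"}}}}}

if __name__ == "__main__":
    print(find_object_conflicts(SAMPLE_DOC, SAMPLE_PROPERTIES))
```

The sample value "default" is illustrative only; the issue does not state what the variant field actually contains.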

Production vs Local

Production has only 2 dynamic templates:

[
  {
    "test_config_parameters_as_keyword": {
      "path_match": "test_configuration.parameters.*",
      "mapping": {"type": "keyword"}
    }
  },
  {
    "strings_as_keyword": {
      "match_mapping_type": "string",
      "mapping": {
        "ignore_above": 1024,
        "type": "keyword"
      }
    }
  }
]
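To make the effect of these two templates concrete, here is a simplified simulation of how a new field would be resolved: templates are tried in order and the first match wins. This is a sketch under stated assumptions, not OpenSearch's implementation: it ignores date detection and other match options, and the field paths in the usage lines are hypothetical.

```python
import re
from typing import Any, Optional

PRODUCTION_DYNAMIC_TEMPLATES = [
    {"test_config_parameters_as_keyword": {
        "path_match": "test_configuration.parameters.*",
        "mapping": {"type": "keyword"},
    }},
    {"strings_as_keyword": {
        "match_mapping_type": "string",
        "mapping": {"ignore_above": 1024, "type": "keyword"},
    }},
]


def json_type(value: Any) -> Optional[str]:
    """Rough equivalent of the detected JSON type used by match_mapping_type."""
    if isinstance(value, bool):  # bool must be checked before int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "object"
    return None


def glob_matches(pattern: str, path: str) -> bool:
    """Match a full dotted path against a pattern where * matches any characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, path) is not None


def resolve_dynamic_mapping(path: str, value: Any, templates=PRODUCTION_DYNAMIC_TEMPLATES):
    """Return (template_name, mapping) for the first matching template, else None."""
    for entry in templates:
        name, rule = next(iter(entry.items()))
        if "path_match" in rule and not glob_matches(rule["path_match"], path):
            continue
        if "match_mapping_type" in rule and rule["match_mapping_type"] not in ("*", json_type(value)):
            continue
        return name, rule["mapping"]
    return None  # falls through to the default dynamic mapping


if __name__ == "__main__":
    print(resolve_dynamic_mapping("test_configuration.parameters.threads", 8))
    print(resolve_dynamic_mapping("metadata.os_vendor", "rhel"))
    print(resolve_dynamic_mapping("results.score", 1.5))
```

Note that neither template forces a field to be an object, so a scalar at any path is mapped by its own value rather than rejected.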

Production mapping settings:

  • "dynamic": true (the local template had false; now fixed)
  • storage: {"type": "object", "enabled": false} (the local template had enabled: true; now fixed)

Failed Test Cases

When processing sample Azure RHEL 10.2 benchmark data with the template settings corrected as above:

  • ❌ auto_hpl: results.runs.run_0.configuration.variant conflict
  • ❌ coremark: same
  • ❌ coremark_pro: same
  • ❌ passmark: same
  • ❌ pig: same
  • ❌ streams: same
  • ✅ specjbb: succeeded (doesn't have variant field)

Expected Behavior

The index template should match production's simpler approach:

  1. Map test_configuration.parameters.* as keyword
  2. Map all strings as keyword with ignore_above
  3. Use "dynamic": true to allow flexible schema
  4. Use explicit static mappings only for well-known top-level fields
  5. Use "enabled": false for storage to prevent mapping conflicts

Reproduction

# Apply the current template
curl -X PUT "http://localhost:9200/_index_template/zathras-results-template" \
  -H 'Content-Type: application/json' \
  -d @src/chronicler/config/opensearch_index_template.json

# Process any benchmark data
export CHRONICLER_CONFIG=config.yml
python3 -m chronicler.run_postprocessing --input /path/to/results --opensearch

# Observe mapper_parsing_exception errors
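If the documents are sent through the bulk API, the failures can be grouped by reason straight from the response body. The sketch below assumes the standard `_bulk` response shape (an `items` list whose entries carry an `error` object with `type` and `reason`); whether chronicler uses the bulk API is an assumption, and the helper name and sample response are hypothetical.

```python
from collections import Counter
from typing import Any


def summarize_mapping_failures(bulk_response: dict[str, Any]) -> Counter:
    """Count mapper_parsing_exception reasons in a _bulk response body."""
    reasons: Counter = Counter()
    for item in bulk_response.get("items", []):
        # Each item has a single key naming the action (index, create, ...).
        for action_result in item.values():
            error = action_result.get("error")
            if error and error.get("type") == "mapper_parsing_exception":
                reasons[error.get("reason", "unknown")] += 1
    return reasons


# Hypothetical response fragment, shaped like a _bulk reply.
SAMPLE_RESPONSE = {
    "errors": True,
    "items": [
        {"index": {"_id": "a", "status": 201}},
        {"index": {"_id": "b", "status": 400, "error": {
            "type": "mapper_parsing_exception",
            "reason": "object mapping for [results.runs.run_0.configuration.variant] "
                      "tried to parse field [variant] as object, but found a concrete value",
        }}},
    ],
}

if __name__ == "__main__":
    for reason, count in summarize_mapping_failures(SAMPLE_RESPONSE).items():
        print(count, reason)
```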

Proposed Fix

Simplify src/chronicler/config/opensearch_index_template.json to match production:

  1. Remove all object-mapping dynamic_templates, including: run_objects, timeseries_objects, numa_nodes, storage_devices, network_interfaces, cpu_flags, validation_threads
  2. Keep only 2 dynamic_templates: test_config_parameters_as_keyword, strings_as_keyword
  3. Ensure settings:
    • "dynamic": true
    • storage: "enabled": false
    • "total_fields.limit": 5000
  4. Keep static mappings for well-known top-level fields (metadata, test, system_under_test top-level structure)
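The steps above could be applied mechanically along these lines. This is a sketch only, with several assumptions: the file is a composable index template with a top-level "template" key (it is applied through _index_template), the storage object lives at system_under_test.hardware.storage, and the field limit is written as the flat setting key index.mapping.total_fields.limit. The helper name is hypothetical.

```python
import copy
from typing import Any

KEEP_TEMPLATES = {"test_config_parameters_as_keyword", "strings_as_keyword"}


def simplify_template(template: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of an index template reduced to the production-style settings."""
    result = copy.deepcopy(template)
    body = result.setdefault("template", {})
    mappings = body.setdefault("mappings", {})

    # Allow a flexible schema and keep only the two production dynamic templates.
    mappings["dynamic"] = True
    mappings["dynamic_templates"] = [
        entry for entry in mappings.get("dynamic_templates", [])
        if any(name in KEEP_TEMPLATES for name in entry)
    ]

    # Assumed location of the storage object: system_under_test.hardware.storage.
    node = mappings
    for key in ("system_under_test", "hardware"):
        node = node.setdefault("properties", {}).setdefault(key, {})
    node.setdefault("properties", {})["storage"] = {"type": "object", "enabled": False}

    # Assumed flat form of the setting; adjust if the file nests its settings.
    body.setdefault("settings", {})["index.mapping.total_fields.limit"] = 5000
    return result


# Usage sketch (not executed here):
#   import json
#   path = "src/chronicler/config/opensearch_index_template.json"
#   with open(path) as handle:
#       simplified = simplify_template(json.load(handle))
#   with open(path, "w") as handle:
#       json.dump(simplified, handle, indent=2)
```

Working on a deep copy keeps the original template available for a before/after diff.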

Impact

Without fix:

  • Fresh data processing fails with template applied
  • Migration testing blocked
  • Production migration risk (if template is applied incorrectly)

With fix:

  • Template matches production behavior
  • All benchmark types process successfully
  • Safe for production migration

Additional Context

  • Production OpenSearch: v3.2.0
  • Local testing: v3.2.0 (Podman)
  • Field limit: 5000 (matches production)
  • This blocks completing the v1→v2 schema migration testing

Files to Modify

  • src/chronicler/config/opensearch_index_template.json - Simplify dynamic_templates
  • src/chronicler/config/opensearch_timeseries_template.json - Verify it matches production

Testing Checklist

  • Apply simplified template to local OpenSearch
  • Process sample data from all benchmark types (auto_hpl, coremark, coremark_pro, fio, passmark, pig, pyperf, specjbb, speccpu2017, streams, uperf)
  • Verify all documents index successfully
  • Verify java_version field is keyword (not long)
  • Verify storage data is stored but not indexed
  • Run migration script: v1 → v2
  • Compare v2 mapping against production
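The mapping checks in this list (the java_version type and the comparison against production) could be scripted by flattening the "properties" tree of each mapping and diffing the results. A minimal sketch, assuming the mappings have already been fetched (for example from GET <index>/_mapping, where the tree sits under <index>.mappings.properties); the function names and the metadata.java_version path are hypothetical.

```python
from typing import Any


def flatten_mapping(properties: dict[str, Any], prefix: str = "") -> dict[str, str]:
    """Flatten a mapping 'properties' tree into {dotted.path: type}."""
    flat: dict[str, str] = {}
    for name, field in properties.items():
        path = f"{prefix}{name}"
        # Fields without an explicit type are objects with sub-properties.
        flat[path] = field.get("type", "object")
        flat.update(flatten_mapping(field.get("properties", {}), f"{path}."))
    return flat


def diff_mappings(local: dict[str, Any], production: dict[str, Any]) -> dict[str, tuple]:
    """Return {path: (local_type, production_type)} for every path that differs."""
    local_flat = flatten_mapping(local)
    prod_flat = flatten_mapping(production)
    return {
        path: (local_flat.get(path), prod_flat.get(path))
        for path in sorted(local_flat.keys() | prod_flat.keys())
        if local_flat.get(path) != prod_flat.get(path)
    }


# Hypothetical mappings; the real field location may differ.
LOCAL_PROPERTIES = {"metadata": {"properties": {"java_version": {"type": "long"}}}}
PRODUCTION_PROPERTIES = {"metadata": {"properties": {"java_version": {"type": "keyword"}}}}

if __name__ == "__main__":
    for path, (local_type, prod_type) in diff_mappings(LOCAL_PROPERTIES, PRODUCTION_PROPERTIES).items():
        print(f"{path}: local={local_type} production={prod_type}")
```

An empty diff means the two mappings agree on every field path and type; it does not compare other mapping parameters such as ignore_above.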

Labels

bug
