streams_processor: Malformed status field with newlines (RAN\nRAN\nRAN) #40

Description

@grdumas

The streams processor produces malformed document-level status strings containing multiple "RAN" values separated by newlines, instead of a single status value.

Affected documents: 47 documents in production OpenSearch (0.6% of total data)

Evidence from Production

Query of production OpenSearch found 47 streams documents with status containing newlines:

  • Example: status = "RAN\nRAN\nRAN\nRAN" (should be a single value such as "RAN" or "PASS")
  • Run-level status is correct ("PASS"), but document-level status is malformed
  • Occurs across different RHEL versions and instance types
  • Other variations: "RAN\nRAN", "RAN\nRAN\nRAN\nRAN\nRAN\nRAN\nRAN\nRAN\nRAN\nRAN\nRAN\nRAN"

Example Malformed Document

{
  "test": {"name": "streams"},
  "results": {
    "status": "RAN\nRAN\nRAN\nRAN",
    "total_runs": 1,
    "runs": {
      "run_0": {
        "status": "PASS",
        "configuration": {
          "array_sizes": ["", "33792k", "67584k", "135168k", "270336k"],
          "optimization_level": "O2"
        },
        "metrics": { ... }
      }
    }
  }
}
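A minimal sketch of how such documents could be flagged once they are loaded as Python dictionaries. The function name `find_status_mismatch` and the shape of the returned report are hypothetical and not part of chronicler; only the field names (`results`, `status`, `runs`) come from the example above.

```python
from typing import Any, Optional


def find_status_mismatch(doc: dict) -> Optional[dict]:
    """Return details if the document-level status spans multiple lines.

    Hypothetical helper: returns None when the status is a single line.
    """
    results: dict[str, Any] = doc.get("results", {})
    status = results.get("status", "")
    if "\n" not in status:
        return None
    # Collect run-level statuses so they can be compared with the
    # malformed document-level value.
    run_statuses = {
        name: run.get("status")
        for name, run in results.get("runs", {}).items()
    }
    return {
        "document_status_parts": status.split("\n"),
        "run_statuses": run_statuses,
    }


# Trimmed copy of the example document above (metrics and configuration omitted).
example = {
    "test": {"name": "streams"},
    "results": {
        "status": "RAN\nRAN\nRAN\nRAN",
        "total_runs": 1,
        "runs": {"run_0": {"status": "PASS"}},
    },
}

report = find_status_mismatch(example)
print(report)
```

For the example document, the report lists four "RAN" fragments at document level next to a single run-level "PASS", which is the mismatch described under Evidence from Production.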

Impact

  1. Data quality: 47 documents have invalid status values
  2. Query accuracy: Exact-match filters on status (for example, status = "RAN") may miss these documents
  3. Aggregation pollution: Each distinct multiline variant (for example, "RAN\nRAN") appears as its own value in status aggregations
  4. Migration blocker: Schema migration to v2 (RPOPC-1267) may reject multiline status strings

Root Cause (Suspected)

In src/chronicler/processors/streams_processor.py:

  • The processor is likely concatenating multiple "RAN" statuses instead of deduplicating them
  • Status parsing does not strip whitespace or validate that the value is a single line
  • The issue may be related to multiple optimization levels (O2/O3) or array sizes each producing a status output
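The suspected mechanism can be illustrated with a hypothetical reproduction. This is not the code in streams_processor.py, and the assumption that one "RAN" line is emitted per non-empty array size is unverified. It is, however, consistent with the example document, which has four non-empty array sizes and four "RAN" values.

```python
# Hypothetical reproduction only: NOT the code in streams_processor.py.
array_sizes = ["", "33792k", "67584k", "135168k", "270336k"]

# Assumption: one "RAN" status line is emitted per non-empty array size.
per_size_statuses = ["RAN" for size in array_sizes if size]

# Joining the per-size statuses with newlines reproduces the malformed value.
malformed = "\n".join(per_size_statuses)

# Collapsing to the distinct values instead would leave a single status.
distinct = sorted(set(per_size_statuses))

print(repr(malformed))
print(distinct)
```

If the real cause is a join like this, the longer variants would be explained by runs that emit more status lines, but that needs to be confirmed against the processor source.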

Suggested Fix

  1. Validation: Strip newlines from the status field before assignment
  2. Deduplication: If multiple status values exist, deduplicate them
  3. Severity hierarchy: If multiple distinct statuses remain, choose the most severe:
    FAIL > UNKNOWN > RAN > PASS
  4. Unit test: Add a test case for multiline status handling
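The steps above could be combined roughly as follows. This is a sketch under stated assumptions, not the actual fix: `normalize_status` and `SEVERITY` are hypothetical names, and mapping empty or unrecognized values to "UNKNOWN" is an assumption that the issue does not specify.

```python
# Severity ranking from the suggested hierarchy: FAIL > UNKNOWN > RAN > PASS.
SEVERITY = {"FAIL": 3, "UNKNOWN": 2, "RAN": 1, "PASS": 0}


def normalize_status(raw: str) -> str:
    """Collapse a possibly multiline status string into a single value."""
    # Split on line boundaries, trim whitespace, and drop empty fragments.
    # Building a set deduplicates repeated values.
    distinct = {part.strip() for part in raw.splitlines() if part.strip()}
    if not distinct:
        # Assumption: an empty status is reported as UNKNOWN.
        return "UNKNOWN"
    # Assumption: values outside the hierarchy are treated as UNKNOWN.
    mapped = {part if part in SEVERITY else "UNKNOWN" for part in distinct}
    # Choose the most severe of the remaining distinct statuses.
    return max(mapped, key=SEVERITY.__getitem__)


# Checks that could seed the unit test for multiline status handling.
assert normalize_status("RAN\nRAN\nRAN\nRAN") == "RAN"
assert normalize_status("PASS") == "PASS"
assert normalize_status("PASS\nRAN\nFAIL") == "FAIL"
```

Applying the severity ranking after deduplication means a mixed value such as "RAN\nPASS" resolves to "RAN" rather than being rejected, which keeps the document indexable while still surfacing the worst outcome.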

Files to Check

  • src/chronicler/processors/streams_processor.py (primary)
  • Other processors: Verify this issue is unique to streams (production data shows only streams affected)

Context

Discovered during RPOPC-1267 schema migration validation when analyzing production OpenSearch edge cases.


Related to: RPOPC-1267 (OpenSearch schema migration)

Metadata

  • Assignees: none
  • Labels: bug