Skip to content

Simplify observability pipeline configs#575

Open
manuranga wants to merge 1 commit intowso2:mainfrom
manuranga:obs-simplify
Open

Simplify observability pipeline configs#575
manuranga wants to merge 1 commit intowso2:mainfrom
manuranga:obs-simplify

Conversation

@manuranga
Copy link
Copy Markdown
Contributor

@manuranga manuranga commented Apr 8, 2026

Summary

Simplify fluent-bit observability pipeline — fix bugs, remove dead code, merge redundant filters.

Changes

Bug fixes

  • MI metrics pipeline: Removed separate analytics log input (${MI_ANALYTICS_LOG_FILE_NAME}) that reads a file that does not exist. ElasticStatisticsPublisher writes via log.info() to the carbon log, not a separate file. Replaced with rewrite_tag on $message ^SYNAPSE_ANALYTICS_DATA from the existing carbon log input, parsed via lua parse_mi_analytics.
  • general-logs double-write: Removed fallback output with Match_regex ^(mi|wso2apim).* — this also matched rewritten tags mi_app_logs and mi_metrics, causing all MI data to be written twice (once to its proper index, once to general-logs).
  • Index template gaps: Added missing ballerina-metrics-logs-* and mi-metrics-logs-* index patterns.

Dead code removal

  • Parsers: Removed bal_parser, jsonparser, docker (unreferenced). Removed mi_metrics_json_extract, mi_metrics_parse_json (replaced by lua).
  • Index template: Removed unused kubernetes object mapping.

Simplification

  • 3 BI lua filters → 1: Merged extract_app_from_path, enrich_bal_logs, construct_bal_app_name (all matching ballerina.*, running sequentially) into single process_bal_logs.

Line count

File Before After
fluent-bit.conf 312 240
parsers.conf 59 28
scripts.lua 235 233
index-template-request.json 75 60

Testing

Verified in docker-compose dev environment with BI + MI runtimes:

  • All 4 indices populated: ballerina-application-logs, ballerina-metrics-logs, mi-application-logs, mi-metrics-logs
  • general-logs index: 0 docs (was 34 duplicates before fix)
  • Dedup via Id_Key doc_id working (version conflict on duplicate MI metrics confirms it)
  • ICP UI metrics page shows request counts, latency, RPM charts correctly

Summary by CodeRabbit

  • Refactor

    • Streamlined log processing pipeline for Micro Integrator and Ballerina applications.
    • Optimized analytics routing and enrichment for improved performance.
  • Chores

    • Updated observability dashboard configuration to support additional metrics log patterns.
    • Removed redundant parser definitions to simplify configuration.

- Remove separate MI analytics log input; read SYNAPSE_ANALYTICS_DATA
  from carbon log via rewrite_tag + lua parse_mi_analytics instead
- Remove general-logs output that double-wrote MI data (Match_regex
  ^(mi|wso2apim).* also matched rewritten tags mi_app_logs, mi_metrics)
- Remove dead parsers: bal_parser, jsonparser, docker (unreferenced)
- Remove mi_metrics_json_extract, mi_metrics_parse_json (replaced by lua)
- Merge 3 sequential BI lua filters (extract_app_from_path,
  enrich_bal_logs, construct_bal_app_name) into single process_bal_logs
- Add ballerina-metrics-logs-* and mi-metrics-logs-* to index template
- Remove unused kubernetes object from index template

312 -> 240 lines fluent-bit.conf
 59 ->  28 lines parsers.conf
235 -> 233 lines scripts.lua (net +50 for parse_mi_analytics, -52 for merged/removed)
 75 ->  60 lines index-template-request.json
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 8, 2026

Walkthrough

This change refactors the Fluent Bit logging pipeline architecture. MI analytics processing is restructured to derive from the existing general MI input using rewrite rules rather than a dedicated input. Ballerina log enrichment is simplified by consolidating two processing functions into one. Unused parsers are removed, OpenSearch index templates are updated to accommodate new metrics log patterns, and Kubernetes field mappings are removed from the schema.

Changes

Cohort / File(s) Summary
Fluent Bit Log Pipeline
fluent-bit.conf
Removed dedicated MI analytics input and its parser chain. Reorganized MI metrics routing to derive from general MI input via rewrite rules targeting SYNAPSE_ANALYTICS_DATA. Consolidated Ballerina log enrichment from two Lua calls to one. Removed fallback OpenSearch output for unmatched logs.
Parser Definitions
parsers.conf
Removed 5 parser definitions: Ballerina logfmt/JSON parsers, Docker JSON parser, and two MI analytics-specific parsers (mi_metrics_json_extract, mi_metrics_parse_json).
Lua Log Processing
scripts.lua
Renamed extract_app_from_path to process_bal_logs and merged with previous enrich_bal_logs logic. Simplified Ballerina app/module field extraction. Replaced enrich_mi_metrics with new parse_mi_analytics function that parses SYNAPSE_ANALYTICS_DATA messages into structured fields with explicit payload, service_type, and product assignment.
Index Template Schema
index-template-request.json
Added ballerina-metrics-logs-* and mi-metrics-logs-* to index patterns. Removed entire Kubernetes field mappings including nested labels from the template schema.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Poem

🐰 The logs flow cleaner through our streams,
Analytics extracted from SYNAPSE dreams,
Ballerina's paths now parsed with grace,
Kubernetes fields we've cleared from this space—
A simpler pipeline, refactored with care! 📊✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately captures the primary goal of this PR: simplifying the observability pipeline configuration across multiple files.
Description check ✅ Passed The description provides a clear summary, detailed changes including bug fixes and dead code removal, line count comparisons, and testing results, though it omits several optional template sections.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
icp_server/resources/observability/opensearch-observability-dashboard/config/fluent-bit/scripts/scripts.lua (1)

132-133: Potential truncation of decimal latency values.

The pattern (%d+) only captures integer portions. If the source JSON ever contains decimal latency values (e.g., "latency": 123.45), this would capture only 123, losing precision.

Consider using a pattern that handles optional decimals:

♻️ Suggested pattern for decimal support
-    local latency = string.match(json_str, '"latency"%s*:%s*(%d+)')
+    local latency = string.match(json_str, '"latency"%s*:%s*([%d%.]+)')

If the upstream MI analytics data is guaranteed to always emit integer latency values (which the successful testing suggests), this is purely defensive.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@icp_server/resources/observability/opensearch-observability-dashboard/config/fluent-bit/scripts/scripts.lua`
around lines 132 - 133, The current string.match call extracts only integer
latency because it uses '(%d+)' on json_str; update the pattern used in the
string.match(...) expression that assigns local latency so it accepts optional
decimal fractions (e.g., allow an optional '.' and digits after the integer),
then keep the existing tonumber(latency) and payload["latency"] assignment so
decimals are preserved; locate the string.match call that reads into local
latency from json_str and modify its pattern accordingly (string.match, local
latency, json_str, payload["latency"]).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In
`@icp_server/resources/observability/opensearch-observability-dashboard/config/fluent-bit/scripts/scripts.lua`:
- Around line 132-133: The current string.match call extracts only integer
latency because it uses '(%d+)' on json_str; update the pattern used in the
string.match(...) expression that assigns local latency so it accepts optional
decimal fractions (e.g., allow an optional '.' and digits after the integer),
then keep the existing tonumber(latency) and payload["latency"] assignment so
decimals are preserved; locate the string.match call that reads into local
latency from json_str and modify its pattern accordingly (string.match, local
latency, json_str, payload["latency"]).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8ae2dd94-72c0-407c-9704-c7ea075bcecc

📥 Commits

Reviewing files that changed from the base of the PR and between bbb7a77 and 78e16fb.

📒 Files selected for processing (4)
  • icp_server/resources/observability/opensearch-observability-dashboard/config/fluent-bit/fluent-bit.conf
  • icp_server/resources/observability/opensearch-observability-dashboard/config/fluent-bit/parsers.conf
  • icp_server/resources/observability/opensearch-observability-dashboard/config/fluent-bit/scripts/scripts.lua
  • icp_server/resources/observability/opensearch-observability-dashboard/setup/index-template-request.json
💤 Files with no reviewable changes (1)
  • icp_server/resources/observability/opensearch-observability-dashboard/config/fluent-bit/parsers.conf

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant