Gaurab/atp manifest file change #2077

Open
gmanandhar-nr wants to merge 6 commits into master from gaurab/atp_manifest_file_change

gmanandhar-nr commented Jan 20, 2026

Summary

Implements feature-flag-based Adaptive Telemetry Processor (ATP) integration for dynamic process-level metric filtering in the nr-k8s-otel-collector Helm chart, following architectural guidance from @pbeckwith.

Changes

Feature Flag Architecture

  • Added nrdot_plus configuration block with two flags:
    • nrdot_plus.enabled (default: false) - Controls NRDOT Plus collector usage
    • nrdot_plus.atp (default: true) - Controls ATP feature when NRDOT Plus is enabled
  • Backward compatible: Default configuration uses standard New Relic k8s collector without breaking existing deployments

Image Selection

  • Conditional image selection based on nrdot_plus.enabled flag:
    • When false: Uses standard newrelic/nrdot-collector-k8s:1.5.0
    • When true: Uses NRDOT Plus collector gmanandhar321/nrdot-collector-host:2.11
  • Updated _images.tpl helper to support both standard and NRDOT Plus image configurations
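
The image switch described above could be sketched as the following Helm template fragment. This is an illustration only, not the actual contents of _images.tpl: the container name otel-collector is hypothetical, and only the flag name and the two image references come from this description.

```yaml
# Hypothetical container spec fragment; the real chart resolves images via _images.tpl.
containers:
  - name: otel-collector   # hypothetical container name
    {{- if .Values.nrdot_plus.enabled }}
    # NRDOT Plus collector image
    image: "gmanandhar321/nrdot-collector-host:2.11"
    {{- else }}
    # Standard New Relic k8s collector image (default)
    image: "newrelic/nrdot-collector-k8s:1.5.0"
    {{- end }}
```

Keeping the branch in one place means every workload that runs the collector picks up the same image for a given flag value.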

ATP Integration

  • Enabled process and processes scrapers in hostmetrics receiver
  • Added adaptivetelemetryprocessor configuration with:
    • Storage path: /var/lib/nrdot-collector/adaptivetelemetry.db
    • Configurable thresholds for CPU/memory utilization
    • Dynamic threshold adjustment with smoothing
    • Multi-metric composite scoring (CPU + memory weights)
    • Anomaly detection with history tracking
  • Added attributes/atp_identifier processor to tag ATP-processed metrics
  • Processors only defined when both flags are enabled: nrdot_plus.enabled && nrdot_plus.atp
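
A rough sketch of how the scrapers and the flag-gated processors might look in the collector config template. Several details are assumptions: the storage_path key name and the atp.processed attribute are hypothetical, and the threshold, weighting, and anomaly settings are left out because their key names are not given in this description. The attributes processor layout (actions with key, value, action) follows the standard OpenTelemetry Collector contrib processor.

```yaml
receivers:
  hostmetrics:
    scrapers:
      # ... existing system scrapers stay as they are ...
      process:
      processes:

processors:
  # ... existing processors stay unconditional ...
  {{- if and .Values.nrdot_plus.enabled .Values.nrdot_plus.atp }}
  adaptivetelemetryprocessor:
    # Key name is an assumption; the path is the one listed above.
    storage_path: /var/lib/nrdot-collector/adaptivetelemetry.db
  attributes/atp_identifier:
    actions:
      - key: atp.processed   # hypothetical attribute name
        value: "true"
        action: insert
  {{- end }}
```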

Pipeline Architecture

  • Unified metrics/nr pipeline: All metrics (system + process + k8s) flow through a single pipeline
  • ATP processors are conditionally included in the metrics/nr pipeline based on flags
  • ATP filters selectively: It acts only on process-level metrics and passes system/k8s metrics through unchanged
  • Removed separate process pipeline: Consolidated from previous multi-pipeline approach
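
The unified pipeline could look roughly like the sketch below. The receiver list, the batch processor, the otlphttp/newrelic exporter name, and the position of the ATP processors within the list are all assumptions made for illustration. Only the metrics/nr pipeline name, the two ATP processor names, and the flag condition come from this description.

```yaml
service:
  pipelines:
    metrics/nr:
      receivers: [hostmetrics]          # plus the chart's other metric receivers
      processors:
        {{- if and .Values.nrdot_plus.enabled .Values.nrdot_plus.atp }}
        - adaptivetelemetryprocessor    # present only when both flags are true
        - attributes/atp_identifier
        {{- end }}
        - batch                         # assumed; stands in for the existing processors
      exporters: [otlphttp/newrelic]    # hypothetical exporter name
```

With either flag off, the rendered pipeline contains only the pre-existing processors, so the default deployment is unchanged.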

Volume Mounts

  • Conditional ATP storage volume:
    • Volume mount at /var/lib/nrdot-collector is created only when both nrdot_plus.enabled and nrdot_plus.atp are true
    • HostPath volume for ATP database persistence
    • No unnecessary volumes when ATP disabled
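
A minimal sketch of the conditional volume and mount, assuming a volume named atp-storage and a host path that mirrors the mount path (neither is stated in this description). hostPath with type DirectoryOrCreate is standard Kubernetes.

```yaml
# Pod spec fragment; volume name and host path are assumptions.
containers:
  - name: otel-collector   # hypothetical container name
    volumeMounts:
      # ... existing mounts ...
      {{- if and .Values.nrdot_plus.enabled .Values.nrdot_plus.atp }}
      - name: atp-storage
        mountPath: /var/lib/nrdot-collector
      {{- end }}
volumes:
  # ... existing volumes ...
  {{- if and .Values.nrdot_plus.enabled .Values.nrdot_plus.atp }}
  - name: atp-storage
    hostPath:
      path: /var/lib/nrdot-collector
      type: DirectoryOrCreate
  {{- end }}
```

Note that a directory created by DirectoryOrCreate gets 0755 permissions and the kubelet's ownership, so a collector running as a non-root user may not be able to write to it.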

Low Data Mode

  • Whitelisted all process.* and processes.* metrics in low data mode
  • System and k8s metrics continue using existing low data mode filters
  • Ensures process metrics are preserved for ATP analysis
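
One way such an allowlist could be expressed is with the contrib filter processor, which drops a metric when any listed OTTL condition evaluates to true. The processor name and the second regex (standing in for the existing low data mode allowlist) are hypothetical; the chart's real filters may be structured differently.

```yaml
processors:
  filter/low_data_mode_example:   # hypothetical processor name
    error_mode: ignore
    metrics:
      metric:
        # Drop a metric only if it is neither process.* / processes.*
        # nor matched by the (placeholder) existing allowlist.
        - 'not IsMatch(name, "^process(es)?\\.") and not IsMatch(name, "^(system|k8s)\\.")'
```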

Configuration Examples

Default (Standard Collector):

nrdot_plus:
  enabled: false  # Uses newrelic/nrdot-collector-k8s:1.5.0

NRDOT Plus with ATP:

nrdot_plus:
  enabled: true
  atp: true  # Uses gmanandhar321/nrdot-collector-host:2.11 with ATP

NRDOT Plus without ATP:

nrdot_plus:
  enabled: true
  atp: false  # Uses NRDOT Plus image but disables ATP processors

Testing

  • Tested in AKS cluster with custom collector image including ATP
  • Verified process metrics appear in New Relic

Outstanding items

  • The ATP database persistence file isn't being created under /var/lib/nrdot-collector, most likely because of a permissions issue. Needs debugging.

gmanandhar-nr requested a review from a team as a code owner January 20, 2026 17:11
# mute_process_exe_error: true
# mute_process_io_error: true
# mute_process_user_error: true
processes:
gmanandhar-nr (Member, Author) commented:

@dbudziwojskiNR I noticed that one of the checks, "Check Generated file", is failing. Do we also update the "rendered" folder manually, or are those files regenerated automatically once the template changes are merged?
