I only want an Infrastructure Agent #943

Closed
@tormodmacleod

Description

all I want is a single infrastructure agent pod running. i don't need ksm or controlPlane running so i disable them as below.

ksm:
  enabled: false

controlPlane:
  enabled: false

however, i'm still left with an agent and a kubelet container running in a kubelet pod on each node because this has been implemented as a DaemonSet

i just want a single pod running an infrastructure agent container only to get the length of some redis lists

i already have eks covered in my infrastructure monitoring layer. i need to be able to implement application monitoring

Acceptance Criteria

  • allow the kubelet pods to be created as part of a Deployment rather than a DaemonSet - toggle
  • allow the kubelet container NOT to be created within the pod - toggle

Describe Alternatives

a new chart which implements only the infrastructure agent?

Dependencies

it might be worth updating the following documents to make it clear to others how you might go about retrieving the lengths of some of your redis keys. i found this was a little unclear if your redis cluster is managed (aws elasticache) rather than running on the cluster

https://docs.newrelic.com/docs/kubernetes-pixie/kubernetes-integration/advanced-configuration/monitor-services/monitor-services-running-kubernetes/
https://docs.newrelic.com/docs/kubernetes-pixie/kubernetes-integration/get-started/introduction-kubernetes-integration/
https://docs.newrelic.com/docs/infrastructure/host-integrations/host-integrations-list/redis/redis-integration/

Additional context

i am using new relic to monitor at different layers of the stack here

infrastructure

at the infrastructure layer, i have (amongst other things) an eks cluster. i've installed the nri-bundle chart from https://helm-charts.newrelic.com using the terraform helm provider. this includes the newrelic-infrastructure chart with the kubelet, ksm et cetera. this is sending all the telemetry for my eks cluster back to newrelic. i'm getting all of the information about my eks cluster that i want from it and i'm very happy

application

on my eks cluster i'm running a laravel app. this uses redis to store (amongst other things) its queue data. i'm using aws elasticache for redis. i've installed the newrelic-infrastructure chart from https://newrelic.github.io/nri-kubernetes using the terraform helm provider with the following configuration

ksm:
  enabled: false
controlPlane:
  enabled: false

integrations:
  redis:
    integrations:
      - name: nri-redis
        env:
          HOSTNAME: replica.my-elasticache-cluster.cache.amazonaws.com
          KEYS: '{"2":["tenantabc123:queues:default","tenantdef456:queues:default",...]}'
          PASSWORD: somepassword
          PORT: 6379
          REMOTE_MONITORING: true
          USE_TLS: true
          USERNAME: someusername

this is working and sending the lengths of my queues to newrelic. i now can see if i have a problem with one of my queue workers or the queue is filling up for some other reason.
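
the KEYS value in the configuration above is a JSON object that maps a redis database index to an explicit list of key names. as a minimal sketch of how that string could be generated from a list of tenants, assuming the `<tenant>:queues:default` naming and database index 2 shown above (the `build_keys` helper and its parameters are hypothetical, not part of the chart or the integration):

```python
import json


def build_keys(tenant_ids, db_index=2, queue="default"):
    """Build a JSON string shaped like the KEYS value in the config above.

    Hypothetical helper: assumes every tenant's queue key is named
    "<tenant>:queues:<queue>" and lives in a single redis database.
    """
    keys = [f"{tenant}:queues:{queue}" for tenant in tenant_ids]
    # Compact separators keep the string free of spaces, like the original value.
    return json.dumps({str(db_index): keys}, separators=(",", ":"))


if __name__ == "__main__":
    print(build_keys(["tenantabc123", "tenantdef456"]))
```

something like this could run wherever tenants are created and removed, so the list of monitored keys follows the tenants without the infrastructure layer being involved.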

why this matters

  • these two layers shouldn't know about each other
    i could try to implement this all at the infrastructure layer. but then i'd need to tell my infrastructure layer which redis keys to monitor - you can't use wildcards. whenever i create a new tenant, i'll want to add the redis keys for their queues to the list of keys i'm monitoring. when i remove the tenant i'll no longer want to monitor their queues. if i implement this at the infrastructure layer, i'll need to have terraform call the api of the thing that manages my tenants to get a list of the queues for all of my tenants. the terraform that provisions my infrastructure layer is run manually rather than being event driven. this could be changed, but due to its nature i don't particularly want it running off events. it shouldn't know anything about tenants or queues

  • i only need one pod
    even if i did decide to have my infrastructure layer configure the redis integration to monitor my queues, the agent runs within the kubelet pod as part of a DaemonSet. this means that as my cluster scales out and more nodes are added, i'll have more pods. these pods will all be retrieving the same information from redis and sending it to newrelic. this puts unnecessary load on my redis cluster. it also sends duplicate information to newrelic, increasing my costs without providing any additional value
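
to put rough numbers on the duplication described above, here is a small sketch. it assumes one length lookup per monitored key per agent pod per polling interval, which is a simplification (the integration also collects other metrics); the function name and the node and key counts are made up for illustration:

```python
def key_lookups_per_interval(node_count, key_count, as_daemonset=True):
    """Estimate redis key-length lookups made in one polling interval.

    Assumption: each agent pod looks up every monitored key once per
    interval. A DaemonSet runs one pod per node; the requested
    Deployment-style setup would run a single pod.
    """
    pods = node_count if as_daemonset else 1
    return pods * key_count


if __name__ == "__main__":
    # Illustrative figures only: 10 nodes, 50 monitored queue keys.
    print(key_lookups_per_interval(10, 50))                      # DaemonSet
    print(key_lookups_per_interval(10, 50, as_daemonset=False))  # single pod
```

under these assumptions the DaemonSet does node_count times the work of a single pod, and every extra lookup produces a data point that has already been sent.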

Estimates

i'd say S = 1-3 days

For Maintainers Only or Hero Triaging this bug

Suggested Priority (P1,P2,P3,P4,P5):
Suggested T-Shirt size (S, M, L, XL, Unknown):

Labels: feature request
