Description
all I want is a single infrastructure agent pod running. i don't need ksm or controlPlane running, so i disable them as below.
ksm:
  enabled: false
controlPlane:
  enabled: false
however, i'm still left with an agent and a kubelet container running in a kubelet pod on each node, because this has been implemented as a DaemonSet.
i just want a single pod running an infrastructure agent container only to get the length of some redis lists
i already have eks covered in my infrastructure monitoring layer. i need to be able to implement application monitoring
Acceptance Criteria
- allow the kubelet pods to be created as part of a Deployment rather than a DaemonSet - toggle
- allow the kubelet container NOT to be created within the pod - toggle
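to illustrate the two toggles, here is a rough sketch of what the values could look like. the `kubelet.kind` and `kubelet.scraper.enabled` keys are made up for illustration and are not existing chart options; only the `ksm` and `controlPlane` keys come from my current configuration.

```yaml
ksm:
  enabled: false
controlPlane:
  enabled: false
kubelet:
  # hypothetical key: render the pod as a single-replica Deployment
  # instead of a DaemonSet
  kind: Deployment
  # hypothetical key: skip the kubelet container so the pod runs
  # only the agent container
  scraper:
    enabled: false
```

the maintainers may well prefer different names or a different layout; the point is only that each behaviour can be switched independently.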
Describe Alternatives
a new chart which implements only the infrastructure agent?
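as a sketch of what such a chart might render, assuming the agent runs from the `newrelic/infrastructure-bundle` image and reads integration configs from `/etc/newrelic-infra/integrations.d`: the resource, secret and ConfigMap names below are hypothetical, and this has not been tested against a cluster.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-queue-monitor            # hypothetical name
spec:
  replicas: 1                          # one pod, regardless of node count
  selector:
    matchLabels:
      app: redis-queue-monitor
  template:
    metadata:
      labels:
        app: redis-queue-monitor
    spec:
      containers:
        - name: agent
          image: newrelic/infrastructure-bundle   # pin a tag in practice
          env:
            - name: NRIA_LICENSE_KEY
              valueFrom:
                secretKeyRef:
                  name: newrelic-license          # hypothetical secret
                  key: licenseKey
          volumeMounts:
            - name: integrations
              mountPath: /etc/newrelic-infra/integrations.d
      volumes:
        - name: integrations
          configMap:
            # hypothetical ConfigMap holding the nri-redis config
            # shown under "Additional context"
            name: redis-integration-config
```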
Dependencies
it might be worth updating the following documents to make it clear to others how you might go about retrieving the lengths of some of your redis keys. i found this was a little unclear if your redis cluster is managed (aws elasticache) rather than running on the cluster
https://docs.newrelic.com/docs/kubernetes-pixie/kubernetes-integration/advanced-configuration/monitor-services/monitor-services-running-kubernetes/
https://docs.newrelic.com/docs/kubernetes-pixie/kubernetes-integration/get-started/introduction-kubernetes-integration/
https://docs.newrelic.com/docs/infrastructure/host-integrations/host-integrations-list/redis/redis-integration/
Additional context
i am using new relic to monitor at different layers of the stack here
infrastructure
at the infrastructure layer, i have (amongst other things) an eks cluster. i've installed the nri-bundle chart from https://helm-charts.newrelic.com using the terraform helm provider. this includes the newrelic-infrastructure chart with the kubelet, ksm et cetera. this is sending all the telemetry for my eks cluster back to newrelic. i'm getting all of the information about my eks cluster that i want from it and i'm very happy
application
on my eks cluster i'm running a laravel app. this uses redis to store (amongst other things) its queue data. i'm using aws elasticache for redis. i've installed the newrelic-infrastructure chart from https://newrelic.github.io/nri-kubernetes using the terraform helm provider with the following configuration
ksm:
  enabled: false
controlPlane:
  enabled: false
integrations:
  redis:
    integrations:
      - name: nri-redis
        env:
          HOSTNAME: replica.my-elasticache-cluster.cache.amazonaws.com
          KEYS: '{"2":["tenantabc123:queues:default","tenantdef456:queues:default",...]}'
          PASSWORD: somepassword
          PORT: 6379
          REMOTE_MONITORING: true
          USE_TLS: true
          USERNAME: someusername
this is working and sending the lengths of my queues to newrelic. i now can see if i have a problem with one of my queue workers or the queue is filling up for some other reason.
why this matters
- these two layers shouldn't know about each other
i could try to implement this all at the infrastructure layer, but then i'd need to tell my infrastructure layer which redis keys to monitor - you can't use wildcards. whenever i create a new tenant, i'll want to add the redis keys for their queues to the list of keys i'm monitoring. when i remove the tenant, i'll no longer want to monitor their queues. if i implement this at the infrastructure layer, i'll need to have terraform call the api of the thing that manages my tenants to get a list of the queues for all of my tenants. the terraform that provisions my infrastructure layer is run manually rather than being event driven. this could be changed, but due to its nature i don't particularly want it running off events. it shouldn't know anything about tenants or queues
- i only need one pod
even if i did decide to have my infrastructure layer configuring the redis integration to monitor my queues, the agent runs within the kubelet pod as part of a DaemonSet. this means that as my cluster scales out and more and more nodes are added i'll have more and more pods. these pods will all be retrieving the same information from redis and sending it to newrelic. this puts unnecessary load on my redis cluster. it also sends duplicate information to newrelic, increasing my costs without providing any additional value
Estimates
i'd say S = 1-3 days
For Maintainers Only or Hero Triaging this bug
Suggested Priority (P1,P2,P3,P4,P5):
Suggested T-Shirt size (S, M, L, XL, Unknown):