Description
Hi
I'm trying to graph a basic CPU utilization overview using the ecs_cpu_seconds_total metric for my ECS tasks, but I'm struggling to get something that matches what I can see in CloudWatch.
I've noticed that most, if not all, of the examples online relating to node_exporter and its CPU metric rely on also recording the idle time, which I don't think we do here.
Any suggestions on what a PromQL query that accurately graphs CPU utilization as a percentage would look like?
I tried a range of queries similar to the following, with no luck:
avg(rate(ecs_cpu_seconds_total{ecs_service_name="prod-example", container!="ecs-exporter"}[1m]))
100 - (100 - (100 * (avg(rate(ecs_cpu_seconds_total{ecs_service_name="prod-example", container!="ecs-exporter"}[1m])))))
The second one was an attempt to reverse-engineer what the idle metric would be (100% minus current usage should give the idle time?), but it's not right. I can see the right trends, i.e. spikes in CPU are correctly reflected in my own graphs, but the absolute values are incorrect.
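To make the arithmetic behind the two queries concrete, here is a minimal Python sketch. The sample values and the helper names (counter_rate, query_one, query_two) are made up for illustration, and counter_rate is only a simplified stand-in for PromQL's rate() (no extrapolation or counter-reset handling). It suggests that the second query reduces to 100 times the first, i.e. the average number of CPU cores in use expressed as a percentage of one core.

```python
import math


def counter_rate(samples):
    """Per-second increase between the first and last (timestamp, value) sample.

    Simplified stand-in for PromQL rate(); hypothetical helper.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def query_one(rates):
    """Mirrors avg(rate(...)): mean CPU-seconds consumed per second."""
    return sum(rates) / len(rates)


def query_two(rates):
    """Mirrors 100 - (100 - (100 * avg(rate(...))))."""
    return 100 - (100 - (100 * query_one(rates)))


# Made-up counter samples for two containers over a 60-second window.
container_a = [(0, 10.0), (60, 25.0)]  # 15 CPU-seconds used in 60 s
container_b = [(0, 4.0), (60, 34.0)]   # 30 CPU-seconds used in 60 s

rates = [counter_rate(container_a), counter_rate(container_b)]

print(query_one(rates))
print(query_two(rates))
# The double subtraction cancels out, leaving 100 * query_one.
print(math.isclose(query_two(rates), 100 * query_one(rates)))
```

If that reading is right, the numbers are relative to a single core rather than to the CPU the task has reserved. Whether that explains the gap to CloudWatch is an assumption I have not verified.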
Any ideas?