Description
Is there an existing issue for this?
- I have searched the existing issues
YACE version
0.58.0
Config file
apiVersion: v1alpha1
sts-region: us-east-1
discovery:
jobs:
- type: s3
regions:
- us-east-1
searchTags:
- key: customKey
value: customValue
roles:
- roleArn: "aws:role_arn"
delay: 600
metrics:
- name: ReplicationLatency
statistics:
- Maximum
- Minimum
- Average
- Sum
period: 600
length: 600
- name: BytesPendingReplication
statistics:
- Maximum
- Minimum
- Average
- Sum
period: 600
length: 600
- name: OperationsPendingReplication
statistics:
- Maximum
- Minimum
- Average
- Sum
period: 600
length: 600
- name: OperationsFailedReplication
statistics:
- Maximum
- Minimum
- Average
- Sum
- SampleCount
period: 600
length: 600
Current Behavior
I'm attempting to scrape S3 replication metrics, and have observed some odd behaviour with one of the metrics, OperationsFailedReplication
.
On CloudWatch, this metric presents a data point when there is a replication action (triggered when objects are uploaded to the source bucket, a replication rule is in place, and replication metrics are enabled for the rule), and is null, otherwise. If replication is successful, a zero is produced. If replication fails, a value that equals the number of objects that couldn't be replicated, is produced.
YACE appears to emit a the last non-null value that it encounters. This value only changes at the time of the next replication, when the metric assumes the new non-null value, holding it there until it changes again. This is not desired; we would expect YACE to emit nulls when there isn't a value to report.
On hitting the CloudWatch get-metric-statistics
API using an equivalent range (start and end times are set to match YACE's current scraping interval, which is 5 minutes, with a delay of 10 minutes, to match YACE's configs shown above), period, and all the other necessary parameters (metric name, dimension names and values, region, etc.), a []
is returned.
Verified that YACE was, in fact, returning the last non-null value by hitting the /metrics endpoint on the container that YACE is running on, and, sure it enough, it returns the last non-null value for the metric.
What could be causing this? How may this be fixed?
Attached are screenshots from the AWS console (UTC) and Grafana (which relies on metrics coming into Prometheus from YACE, UTC-8).


Expected Behavior
Expect YACE to return null/NaN when there is no data point coming in from CloudWatch.
Steps To Reproduce
- Run YACE with the configs detailed above
- Create two S3 buckets, B1 and B2
- Create a replication rule to copy objects from B1 to B2, but use an IAM role with insufficient permissions to do so
- Upload objects to B1, which fail to be replicated to B2. Metrics take ~ 5 minutes to appear on CloudWatch
Anything else?
No response