staging infra: "Failed to export self-observability metrics" #470

@jku

Description

This is not a rekor-tiles issue specifically, just filing here for tracking.

Staging logs have a lot of errors for "Failed to export self-observability metrics to Cloud Monitoring" with the detail "Points must be written in order. One or more of the points specified had an older end time than the most recent point".
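To make the rule quoted in the error detail concrete, here is a toy model of it. This is only a sketch: `ToySeries` is a hypothetical class, not part of any Cloud Monitoring client library, and it encodes nothing beyond what the error text states (a point with an older end time than the most recent point is rejected). It says nothing about why the exporter ends up sending such points.

```python
from datetime import datetime, timedelta, timezone


class ToySeries:
    """Hypothetical stand-in for a single time series (illustration only)."""

    def __init__(self):
        self.latest_end = None  # end time of the most recent accepted point

    def write(self, end_time):
        # Assumption taken from the error text: a point whose end time is
        # older than the most recent accepted point is rejected.
        if self.latest_end is not None and end_time < self.latest_end:
            raise ValueError("Points must be written in order.")
        self.latest_end = end_time


series = ToySeries()
t0 = datetime(2025, 8, 27, 10, 40, 38, tzinfo=timezone.utc)
series.write(t0)  # accepted: first point in the series
try:
    series.write(t0 - timedelta(seconds=1))  # older end time than t0
except ValueError as err:
    print("rejected:", err)
```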

  • Currently we're seeing 1500 events per day. There were instances of this before, but it really ramped up on Aug 20th (the log freeze happened on that day).
  • This happens on production too, but at a much lower rate.
  • This seems to happen to multiple metrics collectors; the containers are all part of the kube-system namespace.
  • I can't find any metrics with those names in the metrics explorer.

https://cloudlogging.app.goo.gl/V5SdjtBAZqDAGEtn9

Example:

{
  "insertId": "80fnncnv959jxwgz",
  "jsonPayload": {
    "msg": "Failed to export self-observability metrics to Cloud Monitoring",
    "stacktrace": "google3/cloud/kubernetes/metrics/common/gcm/gcm.(*exporter).startSelfObservability\n\tcloud/kubernetes/metrics/common/gcm/export.go:505",
    "caller": "gcm/export.go:505",
    "error": "rpc error: code = InvalidArgument desc = One or more TimeSeries could not be written: timeSeries[0-11] (example metric.type=\"kubernetes.io/internal/metrics_exporter/completed_grpcs\", metric.labels={\"grpc_client_status\": \"OK\", \"metric_type\": \"application\", \"target_name\": \"sidecar\", \"grpc_client_method\": \"google.monitoring.v3.MetricService/CreateServiceTimeSeries\"}): write for resource=k8s_container{container_name:kubedns-metrics-collector,cluster_name:sigstore-staging,pod_name:kube-dns-5f9f9d9597-56spw,namespace_name:kube-system,location:us-central1} failed with: Points must be written in order. One or more of the points specified had an older end time than the most recent point.",
    "level": "error",
    "ts": 1756291238.8434782
  },
  "resource": {
    "type": "k8s_container",
    "labels": {
      "container_name": "kubedns-metrics-collector",
      "cluster_name": "sigstore-staging",
      "pod_name": "kube-dns-5f9f9d9597-56spw",
      "project_id": "projectsigstore-staging",
      "location": "us-central1",
      "namespace_name": "kube-system"
    }
  },
  "timestamp": "2025-08-27T10:40:38.843699249Z",
  "severity": "ERROR",
  "labels": {
    "k8s-pod/k8s-app": "kube-dns",
    "compute.googleapis.com/resource_name": "gke-sigstore-staging-sigstore-node-po-12af3929-5i5y",
    "k8s-pod/pod-template-hash": "5f9f9d9597"
  },
  "logName": "projects/projectsigstore-staging/logs/stderr",
  "receiveTimestamp": "2025-08-27T10:40:40.441293777Z"
}
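For tracking how the rate changes and which collectors are affected, entries shaped like the example above could be tallied per day, container and metric type. The following is a rough sketch, not an existing tool: the `summarize` helper and `METRIC_RE` pattern are hypothetical, and the field layout is assumed from this single example entry (other entries may differ). The sample entry in the snippet is a trimmed copy of the example, with the `error` string shortened.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Assumption: the metric type appears in the error text as metric.type="...",
# as it does in the example entry above.
METRIC_RE = re.compile(r'metric\.type="([^"]+)"')
ERROR_MSG = "Failed to export self-observability metrics to Cloud Monitoring"


def summarize(entries):
    """Count matching errors per (UTC day, container name, metric type)."""
    counts = Counter()
    for entry in entries:
        payload = entry.get("jsonPayload", {})
        if payload.get("msg") != ERROR_MSG:
            continue  # not the error we are tracking
        # jsonPayload.ts is assumed to be seconds since the Unix epoch.
        day = datetime.fromtimestamp(payload["ts"], tz=timezone.utc).date()
        labels = entry.get("resource", {}).get("labels", {})
        container = labels.get("container_name", "unknown")
        match = METRIC_RE.search(payload.get("error", ""))
        metric = match.group(1) if match else "unknown"
        counts[(day.isoformat(), container, metric)] += 1
    return counts


# Trimmed copy of the example entry; the error string is shortened here.
sample = {
    "jsonPayload": {
        "msg": ERROR_MSG,
        "error": 'timeSeries[0-11] (example metric.type='
                 '"kubernetes.io/internal/metrics_exporter/completed_grpcs")',
        "ts": 1756291238.8434782,
    },
    "resource": {"labels": {"container_name": "kubedns-metrics-collector"}},
}

for key, count in summarize([sample]).items():
    print(key, count)
```

Feeding it entries exported from both staging and production would give a like-for-like comparison of the daily rates mentioned above.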

Metadata

Assignees: No one assigned
Labels: bug (Something isn't working)
Status: Todo
Relationships: None yet
Development: No branches or pull requests