Analysis Run Returns Successful Phase When All Measurement Phases Are Either Errored Or Failed #4479

Checklist:

* [X] I've included steps to reproduce the bug.
* [X] I've included the version of argo rollouts.

**Describe the bug**


I have a Datadog analysis metric result that shows the 'Successful' phase even though every one of its measurements has the 'Error' or 'Failed' phase.

Here's the Datadog AnalysisTemplate for dd-analysis2 (the one with the incorrect 'Successful' phase):
```
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: dd-analysis2
  ...
spec:
  args:
    ...
  metrics:
  - name: dd-analysis2
    interval: 30s
    count: 5
    successCondition: "result < 0.02"
    failureLimit: 2
    provider:
      datadog:
        interval: 5m
        query: sum:kubernetes.cpu.limits{service:{{ args.service }},workload:{{ args.workload }},env:{{ args.environment }},rollout_revision:{{ args.rollout_revision }}} by {container_name} * 1000
```
(This is just a test analysis template, and the query is not that useful. I just use a query that will return some data, since my test service receives no traffic.)

Here's the result from argo-rollouts for dd-analysis2, which has the 'Successful' phase:
```
            {
              "name": "dd-analysis2",
              "phase": "Successful",
              "measurements": [
                {
                  "phase": "Error",
                  "message": "invalid operation: < (mismatched types <nil> and float64)",
                  "startedAt": "2025-10-01T04:19:40Z",
                  "finishedAt": "2025-10-01T04:19:40Z"
                },
                {
                  "phase": "Error",
                  "message": "invalid operation: < (mismatched types <nil> and float64)",
                  "startedAt": "2025-10-01T04:19:50Z",
                  "finishedAt": "2025-10-01T04:19:50Z"
                },
                {
                  "phase": "Error",
                  "message": "invalid operation: < (mismatched types <nil> and float64)",
                  "startedAt": "2025-10-01T04:20:00Z",
                  "finishedAt": "2025-10-01T04:20:00Z"
                },
                {
                  "phase": "Failed",
                  "startedAt": "2025-10-01T04:20:10Z",
                  "finishedAt": "2025-10-01T04:20:10Z",
                  "value": "1000"
                }
              ],
              "count": 1,
              "failed": 1,
              "error": 3
            }
```
Here's another Datadog AnalysisTemplate, dd-analysis3, which resulted in the correct 'Error' phase:
```
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: dd-analysis3
  ...
spec:
  args:
    ...
  metrics:
  - name: dd-analysis3
    interval: 30s
    count: 5
    successCondition: "result < 0.05"
    failureLimit: 2
    provider:
      datadog:
        interval: 5m
        query: sum:(
           sum:trace.fastapi.request.hits{service:{{args.service}},http.status_code:500,env:{{args.environment}}}.as_count() /
           sum:trace.http.request.hits{service:{{args.service }},env:{{args.environment }}}.as_count()
         )
```

Here's the argo-rollouts result for dd-analysis3, which correctly shows the 'Error' phase (the errors are due to the query returning no data):
```
            {
              "name": "dd-analysis3",
              "phase": "Error",
              "measurements": [
                {
                  "phase": "Error",
                  "message": "invalid operation: < (mismatched types <nil> and float64)",
                  "startedAt": "2025-10-01T04:19:40Z",
                  "finishedAt": "2025-10-01T04:19:40Z"
                },
                {
                  "phase": "Error",
                  "message": "invalid operation: < (mismatched types <nil> and float64)",
                  "startedAt": "2025-10-01T04:19:50Z",
                  "finishedAt": "2025-10-01T04:19:50Z"
                },
                {
                  "phase": "Error",
                  "message": "invalid operation: < (mismatched types <nil> and float64)",
                  "startedAt": "2025-10-01T04:20:00Z",
                  "finishedAt": "2025-10-01T04:20:00Z"
                },
                {
                  "phase": "Error",
                  "message": "invalid operation: < (mismatched types <nil> and float64)",
                  "startedAt": "2025-10-01T04:20:10Z",
                  "finishedAt": "2025-10-01T04:20:10Z"
                },
                {
                  "phase": "Error",
                  "message": "invalid operation: < (mismatched types <nil> and float64)",
                  "startedAt": "2025-10-01T04:20:20Z",
                  "finishedAt": "2025-10-01T04:20:20Z"
                }
              ],
              "message": "invalid operation: < (mismatched types <nil> and float64)",
              "error": 5,
              "consecutiveError": 5
            },
```

The Rollout for the service (the other analyses, promql-analysis and dd-analysis, succeeded as expected):
```
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: pd-ws-test-workload
 ....
spec:
  strategy:
    canary:
      maxUnavailable: 10%
      maxSurge: 10%
      steps:
        - setWeight: 30
        - analysis:
            args:
            - name: rollout_revision
              valueFrom:
                fieldRef:
                  fieldPath: metadata.annotations['rollout.argoproj.io/revision']
            - name: workload
              value: pd-ws-test-workload
            templates:
            - templateName: promql-analysis
            - templateName: dd-analysis
            - templateName: dd-analysis2
            - templateName: dd-analysis3
        - setWeight: 70
        - pause: {}
        - setWeight: 100
...
  analysis:
    successfulRunHistoryLimit: 10
    unsuccessfulRunHistoryLimit: 10
  revisionHistoryLimit: 20
  minReadySeconds: 0
```

Argo-Rollouts version: 1.6.6

**To Reproduce**




**Expected behavior**
I expect dd-analysis2 to have the 'Failed' phase instead of the 'Successful' phase, since all of its measurements are either 'Error' or 'Failed'.
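
To make the expectation concrete, here is a minimal Python sketch of the check I would expect to hold. It is not part of Argo Rollouts and does not describe how the controller assesses a metric; the helper name `has_unexpected_success` is hypothetical, and the sample data is abbreviated from the dd-analysis2 result above (timestamps and messages omitted).

```python
import json

# Abbreviated copy of the dd-analysis2 metric result from this report
# (timestamps and messages omitted).
METRIC_RESULT = json.loads("""
{
  "name": "dd-analysis2",
  "phase": "Successful",
  "measurements": [
    {"phase": "Error"},
    {"phase": "Error"},
    {"phase": "Error"},
    {"phase": "Failed", "value": "1000"}
  ],
  "count": 1,
  "failed": 1,
  "error": 3
}
""")


def has_unexpected_success(metric_result: dict) -> bool:
    """Return True if the metric is 'Successful' but no measurement is.

    Hypothetical helper for illustration only: it encodes the expectation
    stated in this report, not the controller's actual assessment logic.
    """
    phases = [m["phase"] for m in metric_result.get("measurements", [])]
    return (
        metric_result.get("phase") == "Successful"
        and bool(phases)
        and "Successful" not in phases
    )


if __name__ == "__main__":
    print(has_unexpected_success(METRIC_RESULT))
```

Run against the dd-analysis2 result, the helper flags the metric, because its phase is 'Successful' while its four measurements are three 'Error' and one 'Failed'.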


**Screenshots**



**Version**
1.6.6


**Logs**

```
# Paste the logs from the rollout controller

# Logs for the entire controller:
kubectl logs -n argo-rollouts deployment/argo-rollouts

# Logs for a specific rollout:
kubectl logs -n argo-rollouts deployment/argo-rollouts | grep rollout=<ROLLOUTNAME>
```
```
time="2025-10-01T04:20:20Z" level=info msg="Event(v1.ObjectReference{Kind:\"AnalysisRun\", Namespace:\"pd-ws-test-dev\", Name:\"pd-ws-test-workload-6b447b6ff5-36-1\", UID:\"e6e35c8d-e0dc-46f5-87b0-df26092c3c59\", APIVersion:\"argoproj.io/v1alpha1\", ResourceVersion:\"2375908069\", FieldPath:\"\"}): type: 'Normal' reason: 'MetricSuccessful' Metric 'dd-analysis2' Completed. Result: Successful"
time="2025-10-01T04:20:20Z" level=info msg="Metric 'dd-analysis2' Completed. Result: Successful" analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 event_reason=MetricSuccessful namespace=pd-ws-test-dev
time="2025-10-01T04:20:20Z" level=info msg="Metric 'dd-analysis2' transitioned from Running -> Successful" analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:20:20Z" level=info msg="Metric Assessment Result - Successful: Run Terminated" metric=dd-analysis2
time="2025-10-01T04:20:20Z" level=info msg="Skipping measurement: run is terminating" analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:20:10Z" level=info msg="Measurement Completed. Result: Failed" analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:20:10Z" level=warning msg="Datadog will soon deprecate their API v1. Please consider switching to v2 soon." analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:20:10Z" level=info msg="Running overdue measurement" analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:20:00Z" level=warning msg="Measurement had error: invalid operation: < (mismatched types <nil> and float64)" analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:20:00Z" level=info msg="Measurement Completed. Result: Error" analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:20:00Z" level=warning msg="Datadog will soon deprecate their API v1. Please consider switching to v2 soon." analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:20:00Z" level=info msg="Running overdue measurement" analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:19:50Z" level=warning msg="Measurement had error: invalid operation: < (mismatched types <nil> and float64)" analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:19:50Z" level=info msg="Measurement Completed. Result: Error" analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:19:50Z" level=warning msg="Datadog will soon deprecate their API v1. Please consider switching to v2 soon." analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:19:50Z" level=info msg="Running overdue measurement" analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:19:40Z" level=warning msg="Measurement had error: invalid operation: < (mismatched types <nil> and float64)" analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:19:40Z" level=info msg="Measurement Completed. Result: Error" analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:19:40Z" level=warning msg="Datadog will soon deprecate their API v1. Please consider switching to v2 soon." analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
time="2025-10-01T04:19:40Z" level=info msg="Running initial measurement" analysisrun=pd-ws-test-workload-6b447b6ff5-36-1 metric=dd-analysis2 namespace=pd-ws-test-dev
```
---

**Message from the maintainers**:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.
