| 1 | +## `check_http_metrics` Task |
| 2 | + |
| 3 | +### Description |
| 4 | +The `check_http_metrics` task fetches metrics from an HTTP Prometheus endpoint and evaluates assertions against metric values. |
| 5 | + |
| 6 | +#### Task Behavior |
| 7 | +- The task polls the metrics endpoint at regular intervals. |
| 8 | +- By default, the task returns immediately when all assertions pass. |
| 9 | +- Use `continueOnPass: true` to keep monitoring even after success. |
| 10 | +- Use `failOnCheckMiss: true` to fail immediately when assertions are not met. |
| 11 | + |
| 12 | +### Configuration Parameters |
| 13 | + |
| 14 | +- **`url`**:\ |
| 15 | + HTTP URL of the Prometheus metrics endpoint. Required. |
| 16 | + |
| 17 | +- **`headers`**:\ |
| 18 | + Optional HTTP request headers (e.g., for authentication). Default: `{}`. |
| 19 | + |
| 20 | +- **`pollInterval`**:\ |
| 21 | + Interval between metric scrapes. Default: `10s`. |
| 22 | + |
| 23 | +- **`requestTimeout`**:\ |
| 24 | + Timeout for a single HTTP request. Default: `5s`. |
| 25 | + |
| 26 | +- **`maxResponseSize`**:\ |
| 27 | + Maximum response body size. Must be positive. Default: `10MB`. |
| 28 | + |
| 29 | +- **`failOnCheckMiss`**:\ |
| 30 | + If `true`, fail immediately when assertions are not met. If `false`, keep polling until timeout or success. Default: `false`. |
| 31 | + |
| 32 | +- **`continueOnPass`**:\ |
| 33 | + If `true`, continue checking after all assertions pass. Default: `false`. |
| 34 | + |
| 35 | +- **`missingMetric`**:\ |
| 36 | + Behavior when a metric family is missing: `wait`, `fail`, or `pass`. Default: `wait`. |
| 37 | + |
| 38 | +- **`missingSeries`**:\ |
| 39 | + Behavior when no time series matches the label selector: `wait`, `fail`, or `pass`. Default: `wait`. |
| 40 | + |
| 41 | +- **`resetBehavior`**:\ |
| 42 | + Behavior when a COUNTER metric's value drops below baseline (indicating restart): `fail`, `rebaseline`, or `ignore`. Only applies to COUNTER type metrics. Default: `fail`. |
| 43 | + |
| 44 | +- **`assertions`**:\ |
| 45 | + List of metric assertions. At least one required. |
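As an illustration of how these parameters fit together, here is a minimal sketch of a complete task entry. The endpoint URL, the bearer token placeholder, and the metric and label names are hypothetical and only stand in for real values; the parameter names are the ones listed above.

```yaml
- name: check_http_metrics
  config:
    # Hypothetical endpoint; replace with the real metrics URL.
    url: "http://localhost:9090/metrics"
    headers:
      # Hypothetical credential placeholder.
      Authorization: "Bearer <token>"
    pollInterval: 15s
    requestTimeout: 5s
    # Keep polling until the assertion passes or the task times out.
    failOnCheckMiss: false
    assertions:
      - name: requests_served
        metric: my_requests_total   # hypothetical metric name
        labels:
          code: "200"               # hypothetical label
        operator: gt
        value: 0
```

Parameters that are left out fall back to the defaults listed for each parameter.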
| 46 | + |
| 47 | +#### Assertion Configuration |
| 48 | + |
| 49 | +- **`name`**: Unique assertion name. Required. |
| 50 | +- **`metric`**: Prometheus metric name. Required. |
| 51 | +- **`labels`**: Label selector (subset matching). Must match exactly one series. |
| 52 | +- **`mode`**: `value` (current value) or `delta` (change since baseline). Default: `value`. |
| 53 | +- **`operator`**: Comparison operator: `eq`, `neq`, `gt`, `gte`, `lt`, `lte`. Required. |
| 54 | +- **`value`**: Expected numeric value. Required. |
| 55 | +- **`missingMetric`**: Per-assertion override for global `missingMetric`. |
| 56 | +- **`missingSeries`**: Per-assertion override for global `missingSeries`. |
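The per-assertion overrides can be sketched as follows. The metric names are hypothetical; the first assertion tolerates an absent metric family, while the second treats an unmatched label selector as a failure, and every other assertion would keep the global `wait` behavior.

```yaml
missingMetric: wait   # global behavior for all assertions
missingSeries: wait
assertions:
  - name: optional_metric_ok
    metric: my_optional_metric   # hypothetical metric name
    operator: gte
    value: 0
    missingMetric: pass          # override: a missing family counts as a pass
  - name: prod_series_present
    metric: my_metric
    labels:
      env: prod
    operator: gt
    value: 0
    missingSeries: fail          # override: no matching series is a failure
```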
| 57 | + |
| 58 | +#### Delta Mode |
| 59 | + |
| 60 | +In `delta` mode, the task tracks changes over time: |
| 61 | +1. First scrape: records the current value as baseline (waits, does not evaluate) |
| 62 | +2. Subsequent scrapes: computes `delta = current - baseline` and evaluates |
| 63 | + |
| 64 | +Negative deltas are valid for GAUGE and UNTYPED metrics. For COUNTER metrics, a decrease triggers `resetBehavior`. |
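A possible way to combine `delta` mode with `resetBehavior` is sketched below. It assumes `my_counter` is exposed as a COUNTER; going by its name, `rebaseline` presumably records a new baseline after the value drops instead of failing the task, but the exact semantics are not spelled out here.

```yaml
resetBehavior: rebaseline   # assumed: take a new baseline after a counter reset
assertions:
  - name: counter_grew_by_ten
    metric: my_counter      # assumed to be a COUNTER
    mode: delta
    operator: gte
    value: 10
```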
| 65 | + |
| 66 | +#### Examples |
| 67 | + |
| 68 | +```yaml |
| 69 | +assertions: |
| 70 | +  # Check counter increased by at least 1 |
| 71 | +  - name: counter_increased |
| 72 | +    metric: my_counter |
| 73 | +    labels: |
| 74 | +      env: prod |
| 75 | +    mode: delta |
| 76 | +    operator: gte |
| 77 | +    value: 1 |
| 78 | + |
| 79 | +  # Check gauge decreased (negative delta) |
| 80 | +  - name: gauge_dropped |
| 81 | +    metric: my_gauge |
| 82 | +    mode: delta |
| 83 | +    operator: lte |
| 84 | +    value: -1 |
| 85 | + |
| 86 | +  # Check current value is above threshold |
| 87 | +  - name: value_above_threshold |
| 88 | +    metric: my_metric |
| 89 | +    operator: gt |
| 90 | +    value: 100 |
| 91 | +``` |
| 92 | + |
| 93 | +#### Metric Type Handling |
| 94 | + |
| 95 | +| Type | Value Extracted | |
| 96 | +|------|-----------------| |
| 97 | +| COUNTER | Counter value | |
| 98 | +| GAUGE | Gauge value | |
| 99 | +| UNTYPED | Untyped value | |
| 100 | +| SUMMARY | Sample sum | |
| 101 | +| HISTOGRAM | Sample sum | |
| 102 | + |
| 103 | +Counter reset detection only applies to COUNTER type. SUMMARY and HISTOGRAM use sample sum; bucket/quantile helpers are not supported. |
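For example, an assertion against a histogram might look like the sketch below. The metric name is hypothetical; the point is that the threshold is compared with the histogram's sample sum, not with a bucket count or a quantile.

```yaml
assertions:
  - name: total_latency_bounded
    metric: my_request_duration_seconds   # hypothetical HISTOGRAM metric
    operator: lt
    value: 500                            # compared against the sample sum
```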
| 104 | + |
| 105 | +### Outputs |
| 106 | + |
| 107 | +- **`passedAssertions`**: Array of assertion names that passed. |
| 108 | +- **`failedAssertions`**: Array of assertion names that failed. |
| 109 | +- **`values`**: Map of assertion name to latest observed value. |
| 110 | +- **`deltas`**: Map of assertion name to computed delta (for `delta` mode). |
| 111 | +- **`baselines`**: Map of assertion name to baseline value (for `delta` mode). |
| 112 | +- **`scrapeErrors`**: Number of HTTP/parsing errors. |
| 113 | +- **`assertionErrors`**: Number of assertion evaluation errors. |
| 114 | + |
| 115 | +### Defaults |
| 116 | + |
| 117 | +```yaml |
| 118 | +- name: check_http_metrics |
| 119 | +  config: |
| 120 | +    url: "" |
| 121 | +    headers: {} |
| 122 | +    pollInterval: 10s |
| 123 | +    requestTimeout: 5s |
| 124 | +    maxResponseSize: 10MB |
| 125 | +    failOnCheckMiss: false |
| 126 | +    continueOnPass: false |
| 127 | +    missingMetric: wait |
| 128 | +    missingSeries: wait |
| 129 | +    resetBehavior: fail |
| 130 | +    assertions: [] |
| 131 | +``` |