Commit d2866b4

Get chargeback data results from Loki and validate that the total cost is correct.

Used Gemini and Cursor.

Validates the chargeback total cost:

* uses synthetic data to calculate the total cost via a script
* runs `openstack rating summary get` to get the total cost from Loki
* compares the script totals and the Loki totals; if they match, the job passes

Closes: https://issues.redhat.com/browse/OSPRH-26066

1 parent 9362bef commit d2866b4
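The pass condition described above (script totals match Loki totals) can be sketched as a small check. The `syth_rate`/`total_rate` keys come from the totals script added in this commit; the layout of the Loki-side totals dict is an assumption for illustration:

```python
import math


def totals_match(syn_totals: dict, loki_totals: dict,
                 rel_tol: float = 1e-4) -> bool:
    """Compare the synthetic-data total with the CloudKitty/Loki total."""
    # "syth_rate" / "total_rate" mirror the totals script's YAML output;
    # reading "total_rate" from the Loki totals dict is an assumption.
    script_total = syn_totals["syth_rate"]["total_rate"]
    loki_total = loki_totals["total_rate"]
    return math.isclose(script_total, loki_total, rel_tol=rel_tol)


if __name__ == "__main__":
    syn = {"syth_rate": {"cpu": 12.5, "total_rate": 12.5}}
    print(totals_match(syn, {"total_rate": 12.5}))  # → True
```

A relative tolerance is used rather than exact equality, since the role's totals are rounded to four decimal places.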

19 files changed (+890, -55 lines)
Lines changed: 18 additions & 0 deletions

@@ -0,0 +1,18 @@
---
# Ansible-lint compatible yamllint config for this role only.
# See: https://ansible.readthedocs.io/projects/lint/rules/yaml/
extends: default

rules:
  comments:
    min-spaces-from-content: 1
  comments-indentation: false
  braces:
    min-spaces-inside: 0
    max-spaces-inside: 1
  octal-values:
    forbid-implicit-octal: true
    forbid-explicit-octal: true
  line-length:
    max: 160
    level: warning

roles/telemetry_chargeback/README.md

Lines changed: 50 additions & 7 deletions
@@ -5,7 +5,7 @@ The **`telemetry_chargeback`** role is designed to test the **RHOSO Cloudkitty**
 The role performs two main functions:

 1. **CloudKitty Validation** - Enables and configures the CloudKitty hashmap rating module, then validates its state.
-2. **Synthetic Data Generation** - Generates synthetic Loki log data for testing chargeback scenarios using a Python script and Jinja2 template.
+2. **Synthetic Data Generation & Analysis** - Generates synthetic Loki log data for testing chargeback scenarios and calculates metric totals. The role automatically discovers and processes all scenario files matching `test_*.yml` in the `files/` directory. For each scenario it runs: generate synthetic data, compute syn-totals, ingest to Loki, flush Loki ingester memory, and get cost via CloudKitty rating summary (using begin/end from syn-totals). Retrieve-from-Loki is available but currently commented out in the task flow.

 Requirements
 ------------
@@ -15,14 +15,15 @@ It relies on the following being available on the target or control host:
 * The **OpenStack CLI client** must be installed and configured with administrative credentials.
 * Required Python libraries for the `openstack` CLI (e.g., `python3-openstackclient`).
 * Connectivity to the OpenStack API endpoint.
-* **Python 3** with the following libraries for synthetic data generation:
+* **Python 3** with the following libraries for synthetic data generation and analysis:
   * `PyYAML`
   * `Jinja2`

 It is expected to be run **after** a successful deployment and configuration of the following components:

 * **OpenStack:** A functional OpenStack cloud (RHOSO) environment.
 * **Cloudkitty:** The Cloudkitty service must be installed, configured, and running.
+* **Loki / OpenShift (for ingest and flush):** When using ingest and flush tasks, the control host must have `oc` CLI access, and the Cloudkitty Loki stack (route, certificates, ingester) must be deployed. The role sets Loki push/query URLs and extracts certificates via `setup_loki_env.yml`.

 Role Variables
 --------------
@@ -42,22 +43,64 @@ These variables are used internally by the role and typically do not need to be
 |----------|---------------|-------------|
 | `logs_dir_zuul` | `/home/zuul/ci-framework-data/logs` | Remote directory for log files. |
 | `artifacts_dir_zuul` | `/home/zuul/ci-framework-data/artifacts` | Directory for generated artifacts. |
+| `ck_scenario_dir` | `{{ role_path }}/files` | Directory containing scenario files (`test_*.yml`). |
+| `ck_synth_data_suffix` | `.json` | Suffix for generated synthetic data files. |
+| `ck_loki_data_suffix` | `_loki.json` | Suffix for Loki query result JSON files. |
+| `ck_synth_totals_suffix` | `_syn-totals.yml` | Suffix for generated metric totals files (from synthetic data). |
+| `ck_loki_totals_suffix` | `_loki-totals.yml` | Suffix for CloudKitty rating summary output files (from the loki_rate task). |
+| `ck_begin_end_suffix` | `_begin_end.yml` | Suffix for begin/end timestamp output files. |
 | `ck_synth_script` | `{{ role_path }}/files/gen_synth_loki_data.py` | Path to the synthetic data generation script. |
-| `ck_data_template` | `{{ role_path }}/template/loki_data_templ.j2` | Path to the Jinja2 template for Loki data format. |
-| `ck_data_config` | `{{ role_path }}/files/test_static.yml` | Path to the scenario configuration file. |
-| `ck_output_file_local` | `{{ artifacts_dir_zuul }}/loki_synth_data.json` | Local path for generated synthetic data. |
-| `ck_output_file_remote` | `{{ logs_dir_zuul }}/gen_loki_synth_data.log` | Remote destination for synthetic data. |
+| `ck_data_template` | `{{ role_path }}/templates/loki_data_templ.j2` | Path to the Jinja2 template for Loki data format. |
+| `ck_totals_script` | `{{ role_path }}/files/gen_synth_loki_metrics.totals.py` | Path to the metric totals calculation script. |
+
+### Loki / OpenShift Variables (vars/main.yml)
+
+Used by setup, ingest, flush, and retrieve tasks when running against Loki on OpenShift:
+
+| Variable | Default Value | Description |
+|----------|---------------|-------------|
+| `cert_secret_name` | `cert-cloudkitty-client-internal` | OpenShift secret name for client certificates. |
+| `cert_dir` | `{{ ansible_user_dir }}/ck-certs` | Local directory for extracted ingest/query certs. |
+| `client_secret` | `secret/cloudkitty-lokistack-gateway-client-http` | Secret for flush client certs. |
+| `ca_configmap` | `cm/cloudkitty-lokistack-ca-bundle` | ConfigMap for the CA bundle. |
+| `remote_cert_dir` | `osp-certs` | Directory inside the OpenStack pod for certs. |
+| `local_cert_dir` | `{{ ansible_env.HOME }}/flush_certs` | Local directory for flush certs. |
+| `logql_query` | `{service="cloudkitty"}` (overridable via `loki_query`) | LogQL query for Loki. |
+| `ck_namespace` | `openstack` | OpenShift namespace for Cloudkitty/Loki resources. |
+| `openstackpod` | `openstackclient` | OpenStack client pod name for exec/cp. |
+| `lookback` | `6` | Days of lookback for the Loki query time range. |
+| `limit` | `50` | Limit for Loki query results. |
+
+Loki push/query URLs are set dynamically in `setup_loki_env.yml` from the Cloudkitty Loki route.
+
+### Dynamically Set Variables (gen_synth_loki_data.yml)
+
+These variables are set dynamically for each scenario file during the loop:
+
+| Variable | Description |
+|----------|-------------|
+| `ck_data_file` | Local path for generated JSON data (`{{ artifacts_dir_zuul }}/{{ scenario_name }}.json`) |
+| `ck_synth_totals_file` | Local path for calculated metric totals (`{{ artifacts_dir_zuul }}/{{ scenario_name }}_syn-totals.yml`) |
+| `ck_begin_end_timestamp` | Local path for the begin/end timestamp file (`{{ artifacts_dir_zuul }}/{{ scenario_name }}_begin_end.yml`) |
+| `ck_test_file` | Path to the scenario configuration file (`{{ ck_scenario_dir }}/{{ scenario_name }}.yml`) |

 Scenario Configuration
 ----------------------
-The synthetic data generation is controlled by a YAML configuration file (`files/test_static.yml`). This file defines:
+The synthetic data generation is controlled by YAML configuration files in the `files/` directory. Any file matching `test_*.yml` will be automatically discovered and processed.
+
+Each scenario file defines:

 * **generation** - Time range configuration (days, step_seconds)
 * **log_types** - List of log type definitions with name, type, unit, qty, price, groupby, and metadata
 * **required_fields** - Fields required for validation
 * **date_fields** - Date fields to add to groupby (week_of_the_year, day_of_the_year, month, year)
 * **loki_stream** - Loki stream configuration (service name)

+Example scenario files:
+
+* `test_static_basic.yml` - Basic static values for qty and price
+* `test_dyn_basic.yml` - Dynamic values distributed across time steps
+* `test_all_qty_zero.yml` - All quantities set to zero for testing
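Putting the fields above together, a minimal scenario file might look like this (the top-level sections follow the field list in the README; the metric name, quantities, prices, and groupby keys are purely illustrative):

```yaml
---
generation:
  days: 1
  step_seconds: 3600

log_types:
  - name: cpu
    type: cpu
    unit: vcpu
    qty: 2        # a scalar, or a list such as [1, 2] distributed across steps
    price: 0.5
    groupby:
      project_id: "abc123"   # illustrative groupby key
    metadata: {}

required_fields:
  - type
  - unit
  - qty

date_fields:
  - week_of_the_year
  - day_of_the_year
  - month
  - year

loki_stream:
  service: cloudkitty
```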

 Dependencies
 ------------
 This role has no direct hard dependencies on other Ansible roles.
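The scenario discovery-and-loop behavior the README describes could be expressed roughly as follows. This is a hedged Ansible sketch, not the role's actual task file: only `ck_scenario_dir` and `gen_synth_loki_data.yml` appear in the diff, while the `find`-based discovery and the loop variable name are assumptions:

```yaml
- name: Discover chargeback scenario files
  ansible.builtin.find:
    paths: "{{ ck_scenario_dir }}"
    patterns: "test_*.yml"
  register: ck_scenarios

- name: Run the chargeback pipeline for each scenario
  ansible.builtin.include_tasks: gen_synth_loki_data.yml
  loop: "{{ ck_scenarios.files | map(attribute='path') | map('basename') | list }}"
  loop_control:
    loop_var: scenario_file
```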

roles/telemetry_chargeback/files/gen_synth_loki_data.py

Lines changed: 56 additions & 9 deletions
@@ -5,10 +5,44 @@
 import yaml
 from datetime import datetime, timezone, timedelta
 from pathlib import Path
-from typing import Dict, Any
+from typing import Dict, Any, List, Union
 from jinja2 import Environment


+def _get_value_for_step(
+    values: List[Union[int, float]],
+    step_idx: int,
+    num_steps: int
+) -> Union[int, float]:
+    """
+    Get the appropriate value from a list based on the current step index.
+
+    Values are distributed evenly across all steps. For example, if there are
+    12 steps and 4 values, each value covers 3 steps:
+    - Steps 0-2: values[0]
+    - Steps 3-5: values[1]
+    - Steps 6-8: values[2]
+    - Steps 9-11: values[3]
+
+    Args:
+        values: List of values to choose from.
+        step_idx: Current step index (0-based).
+        num_steps: Total number of steps.
+
+    Returns:
+        The value corresponding to the current step.
+    """
+    num_values = len(values)
+    if num_values == 1:
+        return values[0]
+
+    # Calculate how many steps each value covers
+    steps_per_value = num_steps / num_values
+    # Determine which value index to use, clamping to valid range
+    value_idx = min(int(step_idx // steps_per_value), num_values - 1)
+    return values[value_idx]
+
+
 # --- Configure logging with a default level that can be changed ---
 logging.basicConfig(
     level=logging.INFO,
@@ -200,12 +234,18 @@ def generate_loki_data(
                 f"groupby must be a dictionary for {log_type_name}"
             )

+        # Ensure qty and price are lists for step-based distribution
+        qty_val = log_type_config["qty"]
+        price_val = log_type_config["price"]
+        qty_list = qty_val if isinstance(qty_val, list) else [qty_val]
+        price_list = price_val if isinstance(price_val, list) else [price_val]
+
         log_types[log_type_name] = {
             "type": log_type_config["type"],
             "unit": log_type_config["unit"],
             "description": log_type_config.get("description"),
-            "qty": log_type_config["qty"],
-            "price": log_type_config["price"],
+            "qty": qty_list,
+            "price": price_list,
             "groupby": groupby.copy(),
             "metadata": log_type_config.get("metadata", {})
         }

@@ -231,15 +271,15 @@ def tojson_preserve_order(obj):
     # --- Render the template in one pass with all the data ---
     logger.info("Rendering final output...")

+    # Calculate total number of steps for value distribution
+    num_steps = len(log_data_list)
+    logger.debug(f"Total number of time steps: {num_steps}")
+
     # Pre-calculate log types with date fields for each time step
     log_types_list = []
     for idx, item in enumerate(log_data_list):
-        # For the last entry, use end_time to ensure it shows today's date
-        if idx == len(log_data_list) - 1:
-            dt = end_time
-        else:
-            epoch_seconds = item["nanoseconds"] / 1_000_000_000
-            dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
+        epoch_seconds = item["nanoseconds"] / 1_000_000_000
+        dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

         iso_year, iso_week, _ = dt.isocalendar()
         day_of_year = dt.timetuple().tm_yday

@@ -267,6 +307,13 @@ def tojson_preserve_order(obj):
             log_type_with_dates = log_type_data.copy()
             log_type_with_dates["groupby"] = log_type_data["groupby"].copy()
             log_type_with_dates["groupby"].update(date_fields)
+            # Select qty and price based on step index distribution
+            log_type_with_dates["qty"] = _get_value_for_step(
+                log_type_data["qty"], idx, num_steps
+            )
+            log_type_with_dates["price"] = _get_value_for_step(
+                log_type_data["price"], idx, num_steps
+            )
             log_types_with_dates[log_type_name] = log_type_with_dates

         log_types_list.append(log_types_with_dates)
Lines changed: 131 additions & 0 deletions
@@ -0,0 +1,131 @@
#!/usr/bin/env python3
"""
Calculate metric totals and aggregate total from a Loki JSON file.

Output is in YAML format.
"""
import json
import argparse
import sys
import yaml
from pathlib import Path


def calculate_totals(json_path: Path, output_path: Path):
    """
    Read Loki JSON, calculate step totals (qty * price), and sum them up.

    Args:
        json_path: Path to the input JSON file.
        output_path: Path to the output YAML file.
    """
    try:
        with json_path.open('r') as f:
            data = json.load(f)
    except Exception as e:
        print(f"Error reading JSON file {json_path}: {e}")
        sys.exit(1)

    metric_totals = {}
    aggregate_total = 0.0
    time_steps_set = set()
    # Per-timestamp start/end from log entries (same for all entries at step)
    time_step_bounds = {}

    # Extract values from the Loki JSON structure
    for stream in data.get('streams', []):
        for val_pair in stream.get('values', []):
            try:
                # The first element is the timestamp (nanoseconds)
                timestamp = val_pair[0]
                time_steps_set.add(timestamp)

                # The second element is a JSON string containing the log entry
                entry = json.loads(val_pair[1])

                # Start/end for this time step (same for all entries at step)
                if timestamp not in time_step_bounds:
                    time_step_bounds[timestamp] = {
                        "begin": entry.get("start"),
                        "end": entry.get("end"),
                    }

                m_type = entry.get('type')
                if m_type is None:
                    m_type = 'unknown'

                qty = float(entry.get('qty', 0))
                price = float(entry.get('price', 0))

                step_total = qty * price

                if m_type not in metric_totals:
                    metric_totals[m_type] = 0.0

                metric_totals[m_type] += step_total
                aggregate_total += step_total
            except (json.JSONDecodeError, ValueError, IndexError) as e:
                print(f"Warning: Skipping malformed entry: {e}")
                continue

    # First and last time step timestamps (order by numeric value)
    sorted_ts = (
        sorted(time_steps_set, key=lambda t: int(t)) if time_steps_set else []
    )
    timestamp_begin = (
        time_step_bounds[sorted_ts[0]]["begin"] if sorted_ts else None
    )
    timestamp_end = (
        time_step_bounds[sorted_ts[-1]]["end"] if sorted_ts else None
    )

    # Prepare data for YAML output with time section and rates
    syth_rate = {
        m: round(t, 4) for m, t in sorted(metric_totals.items())
    }
    syth_rate["total_rate"] = round(aggregate_total, 4)

    output_data = {
        "time": {
            "total_time_steps": len(time_steps_set),
            "begin": timestamp_begin,
            "end": timestamp_end,
        },
        "syth_rate": syth_rate,
    }

    # Write to output file in YAML format
    try:
        with output_path.open('w') as f_out:
            f_out.write("---\n")
            yaml.dump(
                output_data, f_out, default_flow_style=False, sort_keys=False
            )
        print(
            f"Successfully calculated totals and wrote YAML to {output_path}"
        )
    except Exception as e:
        print(f"Error writing to output file {output_path}: {e}")
        sys.exit(1)


def main():
    """Main entry point for the script."""
    parser = argparse.ArgumentParser(
        description="Calculate totals from Loki JSON data"
    )
    parser.add_argument(
        "-j", "--json", required=True, type=Path,
        help="Path to the input JSON file."
    )
    parser.add_argument(
        "-o", "--output", required=True, type=Path,
        help="Path to the output YAML file."
    )

    args = parser.parse_args()
    calculate_totals(args.json, args.output)


if __name__ == "__main__":
    main()
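A quick self-contained check of the qty * price aggregation, using a two-entry payload in the same streams/values shape the script parses (timestamps, metric names, and field values are illustrative):

```python
import json

# Two log entries at one time step, mirroring the Loki JSON the script reads.
payload = {
    "streams": [{
        "values": [
            ["1700000000000000000", json.dumps(
                {"type": "cpu", "qty": 2, "price": 0.5,
                 "start": 1700000000, "end": 1700003600})],
            ["1700000000000000000", json.dumps(
                {"type": "memory", "qty": 4, "price": 0.25,
                 "start": 1700000000, "end": 1700003600})],
        ]
    }]
}

# Inline version of the script's core loop: per-type and aggregate qty * price.
totals, aggregate = {}, 0.0
for stream in payload["streams"]:
    for ts, raw in stream["values"]:
        entry = json.loads(raw)
        step_total = float(entry["qty"]) * float(entry["price"])
        totals[entry["type"]] = totals.get(entry["type"], 0.0) + step_total
        aggregate += step_total

print(totals, aggregate)  # → {'cpu': 1.0, 'memory': 1.0} 2.0
```

This aggregate is what lands under `syth_rate.total_rate` in the `_syn-totals.yml` output, which the job later compares against the CloudKitty rating summary.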
