Commit ec89594

Get chargeback data results from Loki and validate that the total cost is correct

Used Gemini and Cursor.

Validates the chargeback total cost:
- uses synthetic data to calculate the total cost via a script
- runs `openstack rating summary get` to get the total cost from Loki
- compares the script totals and the Loki totals; if they are the same, the job passes

Closes: https://issues.redhat.com/browse/OSPRH-26066
1 parent 9362bef commit ec89594

15 files changed: +525 -54 lines
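The pass/fail check described in the commit message (compare the script-computed total against the total retrieved via the CloudKitty rating summary) can be sketched as a small helper. This is a hypothetical illustration, not the role's actual task code; `totals_match` and its tolerance are assumptions:

```python
import math

def totals_match(script_total: float, loki_total: float,
                 rel_tol: float = 1e-4) -> bool:
    """Return True when the synthetic-data total and the retrieved total
    agree within a small relative tolerance (rated costs are floats, so
    an exact == comparison can fail on rounding)."""
    return math.isclose(script_total, loki_total, rel_tol=rel_tol)

# The job passes only when both totals agree.
print(totals_match(123.4567, 123.4567))
print(totals_match(123.4567, 999.0))
```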

roles/telemetry_chargeback/README.md
Lines changed: 26 additions & 7 deletions

```diff
@@ -5,7 +5,7 @@ The **`telemetry_chargeback`** role is designed to test the **RHOSO Cloudkitty**
 The role performs two main functions:
 
 1. **CloudKitty Validation** - Enables and configures the CloudKitty hashmap rating module, then validates its state.
-2. **Synthetic Data Generation** - Generates synthetic Loki log data for testing chargeback scenarios using a Python script and Jinja2 template.
+2. **Synthetic Data Generation & Analysis** - Generates synthetic Loki log data for testing chargeback scenarios and calculates metric totals. The role automatically discovers and processes all scenario files matching `test_*.yml` in the `files/` directory. For each scenario it then runs the load path (ingest to Loki, retrieve from Loki, get cost via CloudKitty rating summary). The ingest and retrieve steps are currently stubs for future implementation.
 
 Requirements
 ------------
@@ -15,7 +15,7 @@ It relies on the following being available on the target or control host:
 * The **OpenStack CLI client** must be installed and configured with administrative credentials.
 * Required Python libraries for the `openstack` CLI (e.g., `python3-openstackclient`).
 * Connectivity to the OpenStack API endpoint.
-* **Python 3** with the following libraries for synthetic data generation:
+* **Python 3** with the following libraries for synthetic data generation and analysis:
   * `PyYAML`
   * `Jinja2`
 
@@ -42,22 +42,41 @@ These variables are used internally by the role and typically do not need to be
 |----------|---------------|-------------|
 | `logs_dir_zuul` | `/home/zuul/ci-framework-data/logs` | Remote directory for log files. |
 | `artifacts_dir_zuul` | `/home/zuul/ci-framework-data/artifacts` | Directory for generated artifacts. |
+| `ck_scenario_dir` | `{{ role_path }}/files` | Directory containing scenario files (`test_*.yml`). |
+| `ck_synth_data_suffix` | `.json` | Suffix for generated synthetic data files. |
+| `ck_synth_totals_suffix` | `_syn-totals.yml` | Suffix for generated metric totals files (from synthetic data). |
+| `ck_loki_totals_suffix` | `_loki-totals.yml` | Suffix for totals retrieved from Loki (reserved for future use). |
 | `ck_synth_script` | `{{ role_path }}/files/gen_synth_loki_data.py` | Path to the synthetic data generation script. |
-| `ck_data_template` | `{{ role_path }}/template/loki_data_templ.j2` | Path to the Jinja2 template for Loki data format. |
-| `ck_data_config` | `{{ role_path }}/files/test_static.yml` | Path to the scenario configuration file. |
-| `ck_output_file_local` | `{{ artifacts_dir_zuul }}/loki_synth_data.json` | Local path for generated synthetic data. |
-| `ck_output_file_remote` | `{{ logs_dir_zuul }}/gen_loki_synth_data.log` | Remote destination for synthetic data. |
+| `ck_data_template` | `{{ role_path }}/templates/loki_data_templ.j2` | Path to the Jinja2 template for Loki data format. |
+| `ck_totals_script` | `{{ role_path }}/files/synth_loki_metrics_totals.py` | Path to the metric totals calculation script. |
+
+### Dynamically Set Variables (gen_synth_loki_data.yml)
+
+These variables are set dynamically for each scenario file during the loop:
+
+| Variable | Description |
+|----------|-------------|
+| `ck_data_file` | Local path for generated JSON data (`{{ artifacts_dir_zuul }}/{{ scenario_name }}.json`) |
+| `ck_synth_totals_file` | Local path for calculated metric totals (`{{ artifacts_dir_zuul }}/{{ scenario_name }}_syn-totals.yml`) |
+| `ck_test_file` | Path to the scenario configuration file (`{{ ck_scenario_dir }}/{{ scenario_name }}.yml`) |
 
 Scenario Configuration
 ----------------------
-The synthetic data generation is controlled by a YAML configuration file (`files/test_static.yml`). This file defines:
+The synthetic data generation is controlled by YAML configuration files in the `files/` directory. Any file matching `test_*.yml` will be automatically discovered and processed.
+
+Each scenario file defines:
 
 * **generation** - Time range configuration (days, step_seconds)
 * **log_types** - List of log type definitions with name, type, unit, qty, price, groupby, and metadata
 * **required_fields** - Fields required for validation
 * **date_fields** - Date fields to add to groupby (week_of_the_year, day_of_the_year, month, year)
 * **loki_stream** - Loki stream configuration (service name)
 
+Example scenario files:
+* `test_static.yml` - Basic static values for qty and price
+* `test_dyn_basic.yml` - Dynamic values distributed across time steps
+* `test_all_qty_zero.yml` - All quantities set to zero for testing
+
 Dependencies
 ------------
 This role has no direct hard dependencies on other Ansible roles.
```
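Putting the documented scenario keys together, a minimal scenario file might look like the following sketch (hypothetical values and field layout; the shipped `test_*.yml` files are the authoritative examples):

```yaml
---
generation:
  days: 1
  step_seconds: 3600

log_types:
  - name: cpu_usage
    type: cpu
    unit: core
    qty: [1, 2, 3, 4]   # a list is distributed evenly across time steps
    price: 0.5          # a scalar applies to every step
    groupby:
      project_id: "abc123"
    metadata: {}

required_fields:
  - type
  - qty
  - price

date_fields:
  - week_of_the_year
  - day_of_the_year
  - month
  - year

loki_stream:
  service: cloudkitty
```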

roles/telemetry_chargeback/files/gen_synth_loki_data.py
Lines changed: 56 additions & 9 deletions

```diff
@@ -5,10 +5,44 @@
 import yaml
 from datetime import datetime, timezone, timedelta
 from pathlib import Path
-from typing import Dict, Any
+from typing import Dict, Any, List, Union
 from jinja2 import Environment
 
 
+def _get_value_for_step(
+    values: List[Union[int, float]],
+    step_idx: int,
+    num_steps: int
+) -> Union[int, float]:
+    """
+    Get the appropriate value from a list based on the current step index.
+
+    Values are distributed evenly across all steps. For example, if there are
+    12 steps and 4 values, each value covers 3 steps:
+    - Steps 0-2: values[0]
+    - Steps 3-5: values[1]
+    - Steps 6-8: values[2]
+    - Steps 9-11: values[3]
+
+    Args:
+        values: List of values to choose from.
+        step_idx: Current step index (0-based).
+        num_steps: Total number of steps.
+
+    Returns:
+        The value corresponding to the current step.
+    """
+    num_values = len(values)
+    if num_values == 1:
+        return values[0]
+
+    # Calculate how many steps each value covers
+    steps_per_value = num_steps / num_values
+    # Determine which value index to use, clamping to valid range
+    value_idx = min(int(step_idx // steps_per_value), num_values - 1)
+    return values[value_idx]
+
+
 # --- Configure logging with a default level that can be changed ---
 logging.basicConfig(
     level=logging.INFO,
@@ -200,12 +234,18 @@ def generate_loki_data(
                 f"groupby must be a dictionary for {log_type_name}"
             )
 
+        # Ensure qty and price are lists for step-based distribution
+        qty_val = log_type_config["qty"]
+        price_val = log_type_config["price"]
+        qty_list = qty_val if isinstance(qty_val, list) else [qty_val]
+        price_list = price_val if isinstance(price_val, list) else [price_val]
+
         log_types[log_type_name] = {
             "type": log_type_config["type"],
             "unit": log_type_config["unit"],
             "description": log_type_config.get("description"),
-            "qty": log_type_config["qty"],
-            "price": log_type_config["price"],
+            "qty": qty_list,
+            "price": price_list,
             "groupby": groupby.copy(),
             "metadata": log_type_config.get("metadata", {})
         }
@@ -231,15 +271,15 @@ def tojson_preserve_order(obj):
     # --- Render the template in one pass with all the data ---
     logger.info("Rendering final output...")
 
+    # Calculate total number of steps for value distribution
+    num_steps = len(log_data_list)
+    logger.debug(f"Total number of time steps: {num_steps}")
+
     # Pre-calculate log types with date fields for each time step
     log_types_list = []
     for idx, item in enumerate(log_data_list):
-        # For the last entry, use end_time to ensure it shows today's date
-        if idx == len(log_data_list) - 1:
-            dt = end_time
-        else:
-            epoch_seconds = item["nanoseconds"] / 1_000_000_000
-            dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
+        epoch_seconds = item["nanoseconds"] / 1_000_000_000
+        dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
 
         iso_year, iso_week, _ = dt.isocalendar()
         day_of_year = dt.timetuple().tm_yday
@@ -267,6 +307,13 @@ def tojson_preserve_order(obj):
             log_type_with_dates = log_type_data.copy()
             log_type_with_dates["groupby"] = log_type_data["groupby"].copy()
             log_type_with_dates["groupby"].update(date_fields)
+            # Select qty and price based on step index distribution
+            log_type_with_dates["qty"] = _get_value_for_step(
+                log_type_data["qty"], idx, num_steps
+            )
+            log_type_with_dates["price"] = _get_value_for_step(
+                log_type_data["price"], idx, num_steps
+            )
             log_types_with_dates[log_type_name] = log_type_with_dates
 
         log_types_list.append(log_types_with_dates)
```
roles/telemetry_chargeback/files/synth_loki_metrics_totals.py
Lines changed: 105 additions & 0 deletions

```diff
@@ -0,0 +1,105 @@
+#!/usr/bin/env python3
+"""
+Calculate metric totals and aggregate total from a Loki JSON file.
+
+Output is in YAML format.
+"""
+import json
+import argparse
+import sys
+import yaml
+from pathlib import Path
+
+
+def calculate_totals(json_path: Path, output_path: Path):
+    """
+    Read Loki JSON, calculate step totals (qty * price), and sum them up.
+
+    Args:
+        json_path: Path to the input JSON file.
+        output_path: Path to the output YAML file.
+    """
+    try:
+        with json_path.open('r') as f:
+            data = json.load(f)
+    except Exception as e:
+        print(f"Error reading JSON file {json_path}: {e}")
+        sys.exit(1)
+
+    metric_totals = {}
+    aggregate_total = 0.0
+    time_steps_set = set()
+
+    # Extract values from the Loki JSON structure
+    for stream in data.get('streams', []):
+        for val_pair in stream.get('values', []):
+            try:
+                # The first element is the timestamp (nanoseconds)
+                timestamp = val_pair[0]
+                time_steps_set.add(timestamp)
+
+                # The second element is a JSON string containing the log entry
+                entry = json.loads(val_pair[1])
+
+                m_type = entry.get('type')
+                if m_type is None:
+                    m_type = 'unknown'
+
+                qty = float(entry.get('qty', 0))
+                price = float(entry.get('price', 0))
+
+                step_total = qty * price
+
+                if m_type not in metric_totals:
+                    metric_totals[m_type] = 0.0
+
+                metric_totals[m_type] += step_total
+                aggregate_total += step_total
+            except (json.JSONDecodeError, ValueError, IndexError) as e:
+                print(f"Warning: Skipping malformed entry: {e}")
+                continue
+
+    # Prepare data for YAML output following vars/main.yml pattern
+    output_data = {
+        "total_time_steps": len(time_steps_set),
+        "syth_rate": {
+            m: round(t, 4) for m, t in sorted(metric_totals.items())
+        },
+        "total_rate": round(aggregate_total, 4)
+    }
+
+    # Write to output file in YAML format
+    try:
+        with output_path.open('w') as f_out:
+            f_out.write("---\n")
+            yaml.dump(
+                output_data, f_out, default_flow_style=False, sort_keys=False
+            )
+        print(
+            f"Successfully calculated totals and wrote YAML to {output_path}"
+        )
+    except Exception as e:
+        print(f"Error writing to output file {output_path}: {e}")
+        sys.exit(1)
+
+
+def main():
+    """Main entry point for the script."""
+    parser = argparse.ArgumentParser(
+        description="Calculate totals from Loki JSON data"
+    )
+    parser.add_argument(
+        "-j", "--json", required=True, type=Path,
+        help="Path to the input JSON file."
+    )
+    parser.add_argument(
+        "-o", "--output", required=True, type=Path,
+        help="Path to the output YAML file."
+    )
+
+    args = parser.parse_args()
+    calculate_totals(args.json, args.output)
+
+
+if __name__ == "__main__":
+    main()
```
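The aggregation that `calculate_totals` performs (qty * price per entry, grouped by `type`, with timestamps counted as distinct steps) can be demonstrated on an in-memory payload. This sketch reproduces only the core loop, not the script's file I/O or YAML output:

```python
import json

# Minimal Loki-style payload: each value pair is
# [timestamp_ns, json-encoded log entry with type/qty/price].
data = {
    "streams": [
        {"values": [
            ["1700000000000000000",
             json.dumps({"type": "cpu", "qty": 2, "price": 0.5})],
            ["1700003600000000000",
             json.dumps({"type": "cpu", "qty": 4, "price": 0.5})],
        ]},
        {"values": [
            ["1700000000000000000",
             json.dumps({"type": "mem", "qty": 10, "price": 0.1})],
        ]},
    ]
}

metric_totals, aggregate_total, time_steps = {}, 0.0, set()
for stream in data["streams"]:
    for ts, raw in stream["values"]:
        time_steps.add(ts)
        entry = json.loads(raw)
        step_total = float(entry["qty"]) * float(entry["price"])
        metric_totals[entry["type"]] = (
            metric_totals.get(entry["type"], 0.0) + step_total
        )
        aggregate_total += step_total

print(metric_totals)    # {'cpu': 3.0, 'mem': 1.0}
print(aggregate_total)  # 4.0
print(len(time_steps))  # 2 distinct timestamps
```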
