This guide covers how to set up test data for Cost On-Prem validation using the setup-test-data.sh script.
For detailed information on specific test types, see:
- E2E Scenarios - YAML-driven scenario definitions
- Performance Profiles - Production-based sizing profiles
- NISE Templates - Available data templates
# List available scenarios
./scripts/setup-test-data.sh --list
# Set up data for E2E testing
./scripts/setup-test-data.sh --scenario baseline
# Set up data for performance testing
./scripts/setup-test-data.sh --scenario perf-small
# Clean up test data
./scripts/setup-test-data.sh --clean

| Scenario | Clusters | Nodes | Days | ROS | Upload | Processing | Use Case |
|---|---|---|---|---|---|---|---|
| minimal | 1 | 1 | 1 | No | <30s | <2min | Smoke tests |
| baseline | 1 | 2 | 7 | Yes | <2min | <10min | E2E tests |
| perf-small | 1 | 15 | 30 | Yes | <5min | <30min | Perf baseline |
| perf-medium | 2 | 49 | 30 | Yes | <15min | <60min | Scale testing |
| perf-large | 7 | 133 | 30 | Yes | <45min | <3hr | Stress testing |
| ros | 1 | 3 | 7 | Yes | <2min | <15min | ROS testing |
The perf-* scenarios align with the Performance Profiles based on production customer data.
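When a scenario is wrapped in automation, the Processing column can double as a time budget. The sketch below maps each scenario name to the upper bound from the table, converted to seconds. `scenario_timeout_seconds` is a hypothetical helper for illustration and is not part of setup-test-data.sh.

```shell
# Hypothetical helper: print the processing budget (in seconds) for a scenario,
# using the upper bounds from the "Processing" column of the scenario table.
scenario_timeout_seconds() {
  case "$1" in
    minimal)     echo 120 ;;    # <2min
    baseline)    echo 600 ;;    # <10min
    perf-small)  echo 1800 ;;   # <30min
    perf-medium) echo 3600 ;;   # <60min
    perf-large)  echo 10800 ;;  # <3hr
    ros)         echo 900 ;;    # <15min
    *)           echo "unknown scenario: $1" >&2; return 1 ;;
  esac
}
```

For example, `timeout "$(scenario_timeout_seconds baseline)" ./scripts/setup-test-data.sh --scenario baseline` would abort a run that exceeds the baseline processing budget, assuming GNU `timeout` is available. The budget covers processing only, so a wrapper that also times the upload may want to add the Upload column on top.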
- Cost On-Prem deployed and healthy:
  ./scripts/run-pytest.sh --smoke
- Environment variables (script auto-detects most):
  export NAMESPACE=cost-onprem # Default
  export HELM_RELEASE_NAME=cost-onprem # Default
  export KAFKA_NAMESPACE=kafka # If separate namespace
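If these variables need to be pinned explicitly, for instance in a wrapper script, one option is to apply the documented defaults only where nothing is set yet. This is a minimal sketch; `apply_test_env_defaults` is a hypothetical name, and falling back to the main namespace for Kafka is an assumption, not documented script behavior.

```shell
# Hypothetical helper: fill in defaults without overriding existing exports.
apply_test_env_defaults() {
  : "${NAMESPACE:=cost-onprem}"          # documented default
  : "${HELM_RELEASE_NAME:=cost-onprem}"  # documented default
  # Assumption: Kafka shares the main namespace unless KAFKA_NAMESPACE is set.
  : "${KAFKA_NAMESPACE:=$NAMESPACE}"
  export NAMESPACE HELM_RELEASE_NAME KAFKA_NAMESPACE
}
```

Calling `apply_test_env_defaults` after `export KAFKA_NAMESPACE=kafka` keeps the explicit value, because `:=` only assigns when the variable is unset or empty.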
./scripts/setup-test-data.sh [OPTIONS]
OPTIONS:
  --scenario <name>     Scenario to set up (required unless --clean)
  --list                List available scenarios
  --days <n>            Override days of data (default: scenario-specific)
  --clusters <n>        Override cluster count (default: scenario-specific)
  --source-prefix <s>   Prefix for source names (default: e2e)
  --no-wait             Don't wait for processing to complete
  --no-cleanup          Keep data after script exits
  --dry-run             Show what would be done
  --clean               Clean test data only
  --clean-prefix <s>    Clean sources matching prefix
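The `--dry-run` option lends itself to a preview-then-run pattern. The sketch below is one way to wire that up; `preview_then_run` and the `SETUP_SCRIPT` variable are hypothetical, and the sketch assumes `--dry-run` can be appended after other options and that the script exits non-zero when the preview fails.

```shell
# Path to the setup script; overridable for testing (hypothetical variable).
SETUP_SCRIPT="${SETUP_SCRIPT:-./scripts/setup-test-data.sh}"

# Preview the run with --dry-run first; execute for real only if that succeeds.
preview_then_run() {
  "$SETUP_SCRIPT" "$@" --dry-run || return 1
  "$SETUP_SCRIPT" "$@"
}
```

A call such as `preview_then_run --scenario baseline --days 3` would first show what would be done and then perform the actual setup with the same arguments.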
Before running tests that depend on specific data states:
# 1. Clean existing test data
./scripts/setup-test-data.sh --clean
# This removes:
# - Sources with e2e-pytest- prefix
# - Database records for test clusters
# - Manifests and reports from test uploads

# Check for existing test sources
oc exec -n cost-onprem deploy/cost-onprem-koku-api -- \
psql -h localhost -U koku -d koku -c \
"SELECT name FROM api_sources WHERE name LIKE 'e2e-%' OR name LIKE 'perf-%';"
# Should return: (0 rows)

For complete environment reset:
# WARNING: This removes ALL data, not just test data
./scripts/setup-test-data.sh --reset-all
# Alternatively, redeploy
helm uninstall cost-onprem -n cost-onprem
# ... redeploy ...

# 1. Setup data for scenario
./scripts/setup-test-data.sh --scenario baseline
# 2. Run tests that need the data
pytest tests/suites/e2e/test_complete_flow.py -v
# 3. Cleanup (optional, tests should self-cleanup)
./scripts/setup-test-data.sh --clean

# Setup data and keep it
./scripts/setup-test-data.sh --scenario perf-small --no-cleanup
# Data will persist across test runs
# Source names are printed for reference:
# Source: perf-source-abc123
# Cluster: perf-cluster-abc123
# Manually cleanup when done
./scripts/setup-test-data.sh --clean --source perf-source-abc123

# Setup multiple scenarios for UI exploration
./scripts/setup-test-data.sh --scenario baseline --source-prefix demo-baseline
./scripts/setup-test-data.sh --scenario ros --source-prefix demo-ros
# Environment now has:
# - demo-baseline-* source with standard cost data
# - demo-ros-* source with ROS recommendations
# Access UI to explore data
# Cleanup when done
./scripts/setup-test-data.sh --clean --source-prefix demo-

# Check source was created
oc exec -n cost-onprem deploy/cost-onprem-koku-api -- \
psql -h localhost -U koku -d koku -c \
"SELECT id, name, source_type FROM api_sources WHERE name LIKE '%your-source%';"
# Check manifests were processed
oc exec -n cost-onprem deploy/cost-onprem-koku-api -- \
psql -h localhost -U koku -d koku -c \
"SELECT cluster_id, manifest_id, state FROM reporting_ocpusagereportmanifest ORDER BY creation_datetime DESC LIMIT 5;"
# Check summary tables have data
oc exec -n cost-onprem deploy/cost-onprem-koku-api -- \
psql -h localhost -U koku -d koku -c \
"SELECT COUNT(*) FROM reporting_ocpusagelineitem_daily_summary WHERE cluster_id = 'your-cluster-id';"

# Check Kruize experiments
oc exec -n cost-onprem deploy/cost-onprem-koku-api -- \
psql -h localhost -U kruize -d costonprem_kruize -c \
"SELECT experiment_name, cluster_name FROM public.kruize_experiments WHERE cluster_name LIKE '%your-cluster%';"
# Check recommendations exist
oc exec -n cost-onprem deploy/cost-onprem-koku-api -- \
psql -h localhost -U kruize -d costonprem_kruize -c \
"SELECT COUNT(*) FROM public.kruize_recommendations WHERE experiment_name LIKE '%your-cluster%';"

- Check manifest processing:
  # Look for processing errors
  oc logs -n cost-onprem -l app.kubernetes.io/component=koku-ocp-worker --tail=100 | grep -i error
- Check summary job ran:
  oc exec -n cost-onprem deploy/cost-onprem-koku-api -- \
  psql -h localhost -U koku -d koku -c \
  "SELECT * FROM api_dataexportstatus ORDER BY updated_timestamp DESC LIMIT 5;"
- Flush cache (if API returns stale data):
  oc exec -n cost-onprem deploy/cost-onprem-valkey -- valkey-cli FLUSHALL
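When data is slow to appear, polling the summary table can help tell a slow pipeline from a failed one. The following is a minimal sketch built on the summary-table query shown earlier; `count_summary_rows` and `wait_for_summary_rows` are hypothetical helpers, and the attempt count and sleep interval are arbitrary defaults.

```shell
# Print the summary row count for one cluster.
# psql -t -A prints only the bare value, without headers or alignment.
# The cluster ID is interpolated into SQL, so pass trusted values only.
count_summary_rows() {
  oc exec -n "${NAMESPACE:-cost-onprem}" deploy/cost-onprem-koku-api -- \
    psql -h localhost -U koku -d koku -t -A -c \
    "SELECT COUNT(*) FROM reporting_ocpusagelineitem_daily_summary WHERE cluster_id = '$1';"
}

# Poll until the count is non-zero or the attempts run out.
# Usage: wait_for_summary_rows <cluster-id> [attempts] [sleep-seconds]
wait_for_summary_rows() {
  wfs_attempts="${2:-30}"
  wfs_sleep="${3:-10}"
  while [ "$wfs_attempts" -gt 0 ]; do
    wfs_count="$(count_summary_rows "$1" 2>/dev/null || echo 0)"
    case "$wfs_count" in
      ''|*[!0-9]*) wfs_count=0 ;;  # treat non-numeric output as "no data yet"
    esac
    if [ "$wfs_count" -gt 0 ]; then
      echo "$wfs_count"
      return 0
    fi
    wfs_attempts=$((wfs_attempts - 1))
    sleep "$wfs_sleep"
  done
  return 1
}
```

If `wait_for_summary_rows your-cluster-id` returns non-zero after the full wait, the log and status checks listed above are the next place to look.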
Gateway timeouts occur with large files (>20MB). Options:
- Reduce data size:
  ./scripts/setup-test-data.sh --scenario baseline --days 3 # Instead of 7
- Increase timeout (temporary, requires chart config):
  # values.yaml
  koku:
    ingress:
      annotations:
        haproxy.router.openshift.io/timeout: 10m
Kruize needs sufficient data history (typically 7+ days):
- Ensure 7 days of data:
  ./scripts/setup-test-data.sh --scenario ros --days 7
- Check ROS queue:
  oc exec -n kafka kafka-cluster-kafka-0 -- \
  bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --describe --group ros-processor
- Check Kruize logs:
  oc logs -n cost-onprem -l app.kubernetes.io/component=kruize --tail=100
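The consumer-group output from the ROS queue check can be reduced to a single lag figure, which is easier to watch over time. This is a sketch only: `sum_consumer_lag` is a hypothetical helper, and it assumes the `--describe` output contains a header row with a `LAG` column, which it locates by name instead of by position.

```shell
# Sum the LAG column of `kafka-consumer-groups.sh --describe` output read from stdin.
sum_consumer_lag() {
  awk '
    # Until the header row is seen, look for the LAG column and skip the line.
    lag_col == 0 { for (i = 1; i <= NF; i++) if ($i == "LAG") lag_col = i; next }
    # Add numeric lag values; partitions reporting "-" are skipped.
    lag_col > 0 && $lag_col ~ /^[0-9]+$/ { total += $lag_col }
    END { print total + 0 }
  '
}
```

Piping the `--describe` command into `sum_consumer_lag` prints the total lag for the `ros-processor` group. A total that stays above zero across repeated checks may indicate that the ROS consumer is not keeping up or is not running.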
# .github/workflows/e2e.yml
- name: Setup test data
  run: |
    ./scripts/setup-test-data.sh --scenario baseline --wait
- name: Run E2E tests
  run: |
    ./scripts/run-pytest.sh --e2e

The deploy-test-cost-onprem.sh script can set up data automatically:
# Include data setup in deployment
./scripts/deploy-test-cost-onprem.sh --setup-test-data baseline
# Or separately after deployment
./scripts/deploy-test-cost-onprem.sh
./scripts/setup-test-data.sh --scenario baseline --wait
./scripts/run-pytest.sh --e2e

- Test Suite README - Test framework overview
- Performance Testing - Performance test details
- E2E Testing - E2E scenario tests
- NISE Templates - Available data templates
- Sizing Guide - Resource requirements per profile