This document explains how to configure the Cost Management Metrics Operator to work with self-signed certificates in airgapped or development environments.
The Cost Management Metrics Operator fails with self-signed certificates due to TLS verification failures. Here's what happens:
Out of the box, the Cost Management Metrics Operator is configured with:
SSL_CERT_FILE=/etc/ssl/certs/combined-ca-bundle.crtREQUESTS_CA_BUNDLE=/etc/ssl/certs/combined-ca-bundle.crt- A
combined-ca-bundleConfigMap that contains only standard CA certificates
When the operator attempts to communicate with services using self-signed certificates, you'll see errors like:
# JWT Token Generation (Keycloak communication)
ERROR: Get "https://keycloak-keycloak.apps.cluster.local/realms/kubernetes/protocol/openid-connect/token":
x509: certificate signed by unknown authority
# Upload to Ingress
ERROR: Post "https://cost-onprem-ingress-cost-onprem.apps.cluster.local/api/ingress/v1/upload":
tls: failed to verify certificate: x509: certificate signed by unknown authority
# Prometheus Metrics Collection
ERROR: Get "https://thanos-querier.openshift-monitoring.svc:9091":
x509: certificate signed by unknown authorityThe operator's combined-ca-bundle.crt doesn't include:
- Custom Keycloak CA certificates - needed for JWT token requests
- OpenShift ingress CA certificates - needed for upload requests
- Custom service CA certificates - needed for internal service communication
- Self-signed root CAs - used in airgapped environments
CRITICAL: The Cost Management Metrics Operator is deployed and managed by OpenShift's Operator Lifecycle Manager (OLM) via a ClusterServiceVersion (CSV). This creates additional challenges:
- Deployment Reconciliation: Any direct changes to the operator deployment are automatically reverted by OLM
- CSV Specifications: The deployment configuration is controlled by the CSV, not user modifications
- Limited Customization: Standard Kubernetes deployment modification approaches don't work
- CA Bundle Constraints: The CA bundle mounting is defined in the CSV and cannot be easily modified
In our JWT authentication implementation, this affects:
- Token Generation: Operator β Keycloak (fails with certificate error)
- Upload Process: Operator β Ingress (fails with certificate error)
- Metrics Collection: Operator β Prometheus (fails with certificate error)
The operator logs will show continuous authentication and upload failures, preventing cost management data from being collected and uploaded.
The Cost Management Metrics Operator communicates with several services that may use self-signed certificates:
- Red Hat Build of Keycloak (RHBK) - for JWT token generation
- Ingress - for uploading metrics data
- Prometheus/Thanos - for collecting metrics
- OpenShift API - for cluster metadata
To make the operator work with self-signed certificates within OLM constraints, we need to:
- Work within OLM limitations - cannot modify deployment directly
- Update the existing combined-ca-bundle ConfigMap (the one OLM already mounts)
- Extract all relevant CA certificates from the cluster
- Ensure ConfigMap updates persist across operator restarts
- Test and validate TLS connectivity to all services
Since we cannot modify the deployment due to OLM management, our approach is:
- Preserve the existing CA bundle mount point defined in the CSV
- Update the contents of the
combined-ca-bundleConfigMap that OLM already mounts - Ensure comprehensive CA coverage for all required services
- Automate the update process for maintainability
The operator is deployed with the following TLS settings:
env:
- name: SSL_CERT_FILE
value: /etc/ssl/certs/combined-ca-bundle.crt
- name: REQUESTS_CA_BUNDLE
value: /etc/ssl/certs/combined-ca-bundle.crt
volumeMounts:
- mountPath: /etc/ssl/certs/combined-ca-bundle.crt
name: combined-ca-bundle
readOnly: true
subPath: ca-bundle.crt
volumes:
- configMap:
name: combined-ca-bundle
name: combined-ca-bundleBefore making changes, understand that:
- The operator deployment is controlled by OLM via ClusterServiceVersion
- Direct deployment modifications will be reverted during reconciliation
- We must work within the existing CA bundle mounting structure
- The
combined-ca-bundleConfigMap is already mounted by the CSV - we update its contents
The combined-ca-bundle ConfigMap needs to include all custom CA certificates.
IMPORTANT: We update the existing ConfigMap that OLM already mounts, not the deployment configuration:
# Extract current CA bundle
oc get configmap combined-ca-bundle -n costmanagement-metrics-operator -o jsonpath='{.data.ca-bundle\.crt}' > current-ca-bundle.crt
# Get Keycloak CA certificate
oc get secret keycloak-tls-secret -n keycloak -o jsonpath='{.data.tls\.crt}' | base64 -d > keycloak-ca.crt
# Get OpenShift ingress CA certificate
oc get secret router-ca -n openshift-ingress-operator -o jsonpath='{.data.tls\.crt}' | base64 -d > ingress-ca.crt
# Get OpenShift service CA
oc get configmap service-ca-bundle -n openshift-config -o jsonpath='{.data.service-ca\.crt}' > service-ca.crt
# Combine all CA certificates
cat current-ca-bundle.crt keycloak-ca.crt ingress-ca.crt service-ca.crt > combined-ca-bundle.crt
# Update the ConfigMap
oc create configmap combined-ca-bundle \
--from-file=ca-bundle.crt=combined-ca-bundle.crt \
--dry-run=client -o yaml | oc replace -f - -n costmanagement-metrics-operatorNote: Unlike deployment changes, CostManagementMetricsConfig modifications are safe because they represent user configuration, not OLM-managed resources.
Modify the CostManagementMetricsConfig to handle self-signed certificates:
apiVersion: costmanagement-metrics-cfg.openshift.io/v1beta1
kind: CostManagementMetricsConfig
metadata:
name: costmanagementmetricscfg-sample-v1beta1
namespace: costmanagement-metrics-operator
spec:
# Authentication configuration for Keycloak
authentication:
type: "token"
token_url: "https://keycloak-keycloak.apps.cluster.local/realms/kubernetes/protocol/openid-connect/token"
# Custom client configuration
client_id: "cost-management-operator"
# Secret containing client credentials
secret_name: "cost-management-auth-secret"
# Upload configuration
upload:
upload_toggle: true
upload_cycle: 360 # 6 hours
validate_cert: true # Keep certificate validation enabled
ingress_path: "/api/ingress/v1/upload"
# Use internal service URL to avoid external certificate issues
ingress_url: "http://cost-onprem-ingress.cost-onprem.svc.cluster.local:8080"
# Prometheus configuration
prometheus_config:
service_address: "https://thanos-querier.openshift-monitoring.svc:9091"
skip_tls_verification: false # Keep TLS verification enabled
collect_previous_data: true
context_timeout: 120
# Source configuration (Sources API is now part of Koku at /api/cost-management/v1/sources/)
source:
create_source: false
check_cycle: 1440
sources_path: "/api/cost-management/v1/"Create an automated script to maintain the CA bundle:
#!/bin/bash
# File: scripts/setup-cost-mgmt-tls.sh
NAMESPACE="costmanagement-metrics-operator"
TEMP_DIR="/tmp/cost-mgmt-ca"
echo "π§ Updating Cost Management Metrics Operator CA Bundle"
echo "=============================================="
mkdir -p "$TEMP_DIR"
cd "$TEMP_DIR"
# Get OpenShift system CA bundle
echo "π₯ Extracting OpenShift system CA bundle..."
oc get configmap kube-root-ca.crt -n "$NAMESPACE" -o jsonpath='{.data.ca\.crt}' > system-ca.crt
# Get service CA bundle
echo "π₯ Extracting service CA bundle..."
oc get configmap service-ca-bundle -n openshift-config -o jsonpath='{.data.service-ca\.crt}' > service-ca.crt 2>/dev/null || echo "# No service CA found" > service-ca.crt
# Get Keycloak CA (if available)
echo "π₯ Extracting Keycloak CA certificate..."
KEYCLOAK_NS=$(oc get keycloak -A -o jsonpath='{.items[0].metadata.namespace}' 2>/dev/null || echo "keycloak")
if oc get secret -n "$KEYCLOAK_NS" | grep -q tls; then
KEYCLOAK_SECRET=$(oc get secret -n "$KEYCLOAK_NS" -o name | grep tls | head -1)
oc get "$KEYCLOAK_SECRET" -n "$KEYCLOAK_NS" -o jsonpath='{.data.tls\.crt}' | base64 -d > keycloak-ca.crt
else
echo "# No Keycloak TLS secret found" > keycloak-ca.crt
fi
# Get ingress CA certificate
echo "π₯ Extracting ingress CA certificate..."
if oc get secret router-ca -n openshift-ingress-operator &>/dev/null; then
oc get secret router-ca -n openshift-ingress-operator -o jsonpath='{.data.tls\.crt}' | base64 -d > ingress-ca.crt
else
# Alternative: extract from route
oc get route -A -o jsonpath='{.items[0].spec.host}' | head -1 | xargs -I {} sh -c 'echo | openssl s_client -connect {}:443 -servername {} 2>/dev/null | openssl x509 > ingress-ca.crt' || echo "# No ingress CA found" > ingress-ca.crt
fi
# Create combined CA bundle
echo "π Creating combined CA bundle..."
cat > combined-ca-bundle.crt << EOF
# Combined CA Bundle for Cost Management Metrics Operator
# Generated on $(date)
# Includes: System CA, Service CA, Keycloak CA, Ingress CA
EOF
cat system-ca.crt >> combined-ca-bundle.crt
echo "" >> combined-ca-bundle.crt
cat service-ca.crt >> combined-ca-bundle.crt
echo "" >> combined-ca-bundle.crt
cat keycloak-ca.crt >> combined-ca-bundle.crt
echo "" >> combined-ca-bundle.crt
cat ingress-ca.crt >> combined-ca-bundle.crt
# Update ConfigMap
echo "π€ Updating combined-ca-bundle ConfigMap..."
oc create configmap combined-ca-bundle \
--from-file=ca-bundle.crt=combined-ca-bundle.crt \
--dry-run=client -o yaml | oc apply -f - -n "$NAMESPACE"
echo "β
CA bundle updated successfully"
# Restart operator to pick up new CA bundle
echo "π Restarting Cost Management Metrics Operator..."
oc rollout restart deployment costmanagement-metrics-operator -n "$NAMESPACE"
echo "β
Cost Management Metrics Operator CA bundle update complete!"
# Cleanup
cd /
rm -rf "$TEMP_DIR"# Verify CA bundle is mounted correctly
oc exec -n costmanagement-metrics-operator deployment/costmanagement-metrics-operator -- ls -la /etc/ssl/certs/
# Check environment variables
oc exec -n costmanagement-metrics-operator deployment/costmanagement-metrics-operator -- printenv | grep -E "(SSL_CERT_FILE|REQUESTS_CA_BUNDLE)"
# Test Keycloak connectivity
oc exec -n costmanagement-metrics-operator deployment/costmanagement-metrics-operator -- curl -v https://keycloak-keycloak.apps.cluster.local/realms/kubernetes/.well-known/openid_configurationBased on our real-world testing, here are the exact errors you'll encounter and their solutions:
| Error | When It Occurs | Root Cause | Solution |
|---|---|---|---|
x509: certificate signed by unknown authority |
JWT token generation from Keycloak | Keycloak CA not in bundle | Add Keycloak CA to combined-ca-bundle |
tls: failed to verify certificate: x509: certificate signed by unknown authority |
Upload to ingress | Ingress route CA not in bundle | Add OpenShift ingress CA to bundle |
SSL: CERTIFICATE_VERIFY_FAILED |
Any HTTPS communication | Outdated or incomplete CA bundle | Run CA bundle update script |
dial tcp: lookup keycloak-keycloak.apps.cluster.local: no such host |
DNS resolution failure | Network/DNS issue | Verify service names and DNS |
connection refused |
Service connectivity | Service not available | Check pod status: oc get pods -n keycloak |
context deadline exceeded |
Request timeout | Network or TLS handshake issues | Check network connectivity and certificates |
During our JWT authentication implementation, we encountered this exact sequence:
- Initial Error:
x509: certificate signed by unknown authoritywhen generating JWT tokens - Symptom: Cost Management Metrics Operator logs showed continuous authentication failures
- Impact: No data uploads, JWT authentication completely broken
- First Attempt: Tried to modify deployment directly β Failed due to OLM reconciliation
- Working Solution: Updated existing
combined-ca-bundleConfigMap with Keycloak's CA certificate - Operator Restart: Required restart to pick up new CA bundle
- Result: JWT token generation successful, uploads resumed
Key Lessons:
- The operator's default CA bundle only includes standard public CAs, not custom OpenShift CAs
- OLM prevents direct deployment modifications - work within existing structure
- ConfigMap updates + operator restart is the only viable approach
- Always verify changes persist after OLM reconciliation cycles
# Cost Management Metrics Operator logs
oc logs -n costmanagement-metrics-operator deployment/costmanagement-metrics-operator -f
# Look for specific TLS errors
oc logs -n costmanagement-metrics-operator deployment/costmanagement-metrics-operator | grep -i "tls\|ssl\|certificate\|x509"IMPORTANT: This TLS configuration is essential for the JWT authentication flow documented in native-jwt-authentication.md.
Our JWT authentication implementation requires the Cost Management Metrics Operator to:
- Generate JWT tokens by calling Keycloak's token endpoint
- Upload data to the ingress with JWT authentication
- Validate certificates for all HTTPS communications
Without proper TLS configuration, the JWT authentication flow fails at the token generation step.
When using JWT authentication (as documented in native-jwt-authentication.md), ensure:
- Keycloak Communication: Cost Management Metrics Operator must trust Keycloak's certificate
- Token Validation: Tokens are properly signed and validated
- Ingress Communication: Upload requests include proper CA certificates
# CostManagementMetricsConfig for JWT + Self-Signed Certs
apiVersion: costmanagement-metrics-cfg.openshift.io/v1beta1
kind: CostManagementMetricsConfig
metadata:
name: costmanagementmetricscfg-jwt-tls
namespace: costmanagement-metrics-operator
spec:
authentication:
type: "token"
token_url: "https://keycloak-keycloak.apps.cluster.local/realms/kubernetes/protocol/openid-connect/token"
client_id: "cost-management-operator"
secret_name: "cost-management-auth-secret"
upload:
upload_toggle: true
upload_cycle: 360
validate_cert: true # Certificate validation enabled with custom CA bundle
ingress_path: "/api/ingress/v1/upload"
# Use external route for JWT authentication flow
ingress_url: "https://cost-onprem-ingress-cost-onprem.apps.cluster.local"
prometheus_config:
service_address: "https://thanos-querier.openshift-monitoring.svc:9091"
skip_tls_verification: false
collect_previous_data: true
context_timeout: 120β Direct Deployment Modification:
# This will be reverted by OLM reconciliation
oc patch deployment costmanagement-metrics-operator -n costmanagement-metrics-operator --patch '...'β Adding New Volume Mounts:
# OLM will revert any changes to volumes/volumeMounts
oc set volume deployment/costmanagement-metrics-operator --add --name=custom-caβ Environment Variable Modifications:
# Environment variables defined in CSV cannot be modified
oc set env deployment/costmanagement-metrics-operator SSL_CERT_FILE=/custom/pathβ ConfigMap Content Updates:
# Update the contents of existing ConfigMaps that are already mounted
oc create configmap combined-ca-bundle --from-file=ca-bundle.crt=new-bundle.crt --dry-run=client -o yaml | oc apply -f -β CostManagementMetricsConfig Modifications:
# User configuration CRs are not managed by CSV
oc apply -f custom-metricsconfig.yamlβ Secret Content Updates:
# Update existing secrets (like authentication secrets)
oc create secret generic cost-management-auth-secret --from-literal=client_secret=new-secret --dry-run=client -o yaml | oc apply -f -Always verify your changes survive OLM reconciliation:
# Trigger OLM reconciliation by updating operator
oc patch csv costmanagement-metrics-operator.4.1.0 -n costmanagement-metrics-operator --type='json' -p='[{"op": "add", "path": "/metadata/annotations/test", "value": "trigger-reconcile"}]'
# Wait and verify your ConfigMap changes are still present
oc get configmap combined-ca-bundle -n costmanagement-metrics-operator -o yaml | grep "your-custom-ca"- Certificate Rotation: Monitor certificate expiration and update CA bundle accordingly
- OLM Reconciliation: Ensure CA bundle updates persist across operator upgrades
- Automation: Use the update script in CI/CD pipelines, accounting for OLM behavior
- Monitoring: Set up alerts for TLS-related failures and OLM reconciliation events
- Backup: Keep backup of working CA bundle configuration before OLM operations
- Testing: Validate TLS configuration after any certificate changes and OLM updates
# 1. Deploy Cost Management Metrics Operator with CA bundle support
./scripts/setup-cost-mgmt-tls.sh
# 2. Apply TLS-compatible CostManagementMetricsConfig
oc apply -f examples/costmanagementmetricscfg-tls.yaml
# 3. Wait for operator restart
oc rollout status deployment costmanagement-metrics-operator -n costmanagement-metrics-operator
# 4. Validate TLS connectivity
oc logs -n costmanagement-metrics-operator deployment/costmanagement-metrics-operator --tail=50This configuration ensures the Cost Management Metrics Operator can successfully communicate with all services using self-signed certificates while maintaining security best practices.