GovOps: KPIs and Dashboards

KPIs

Policy Store KPIs

  1. Policy Store Health Score

    • Description: Composite metric of validation pass rate, conflict detection results, and Cedar analysis outcomes.
    • Calculation: Weighted average computed as (validation pass rate × 0.4) + (conflict-free rate × 0.3) + (Cedar analysis pass rate × 0.3) (see the sketch after this list)
    • Target: ≥ 95% health score
    • Measurement: Real-time calculation across all policy stores, updated every 5 minutes
    • Alert Threshold: < 90% triggers warning, < 85% triggers critical alert
  2. Policy Evaluation Latency (P99)

    • Description: Measures sub-second enforcement guarantees across all Cedarling instances.
    • Calculation: 99th percentile of policy evaluation response times across all Cedarling instances
    • Target: P99 < 100ms for standard policies, P99 < 500ms for complex policies with external data
    • Measurement: Continuous monitoring via Hub System log collection, aggregated hourly
    • Alert Threshold: P99 > 200ms triggers performance investigation
  3. Policy Drift Incidents per Day

    • Description: Counts divergence between deployed stores and GitHub Releases, including stale versions still in use.
    • Calculation: Daily count of Cedarling instances running policy versions older than latest GitHub Release
    • Target: 0 drift incidents per day
    • Measurement: Comparison of deployed versions (from Hub System) vs. latest GitHub Release tags
    • Alert Threshold: > 0 incidents triggers immediate notification
  4. Policy Conflict Detection Rate

    • Description: Percentage of newly authored policies that introduce conflicts before merge.
    • Calculation: (Number of policies with detected conflicts / Total policies authored) × 100
    • Target: < 5% conflict rate
    • Measurement: Pre-merge analysis via Cedar Analysis Tools integration
    • Alert Threshold: > 10% triggers review of policy authoring process
  5. Policy Regression Failure Rate

    • Description: Percentage of policy changes that break historical decisions in regression testing.
    • Calculation: (Number of policy changes causing regression failures / Total policy changes) × 100
    • Target: < 2% regression failure rate
    • Measurement: Automated regression testing against historical decision log before deployment
    • Alert Threshold: > 5% triggers mandatory policy review workflow
  6. Policy Store Adoption Rate

    • Description: Percentage of AI agents and infrastructure components actively using policy stores.
    • Calculation: (Agents with active policy enforcement / Total registered agents) × 100
    • Target: ≥ 98% adoption rate
    • Measurement: Hub System tracking of Cedarling instance registration and activity
    • Alert Threshold: < 95% triggers investigation of non-adopting agents
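
A minimal sketch of the Policy Store Health Score composite and its alert tiers is shown below. The function names and sample inputs are illustrative assumptions; the three input rates would come from validation, conflict detection, and Cedar analysis results.

```python
def policy_store_health_score(validation_pass_rate: float,
                              conflict_free_rate: float,
                              cedar_analysis_pass_rate: float) -> float:
    """Weighted composite; each input is a percentage in the range 0-100."""
    return (validation_pass_rate * 0.4
            + conflict_free_rate * 0.3
            + cedar_analysis_pass_rate * 0.3)

def health_alert_level(score: float) -> str:
    """Map the composite score onto the alert thresholds defined above."""
    if score < 85:
        return "critical"
    if score < 90:
        return "warning"
    return "ok"

score = policy_store_health_score(97.0, 94.0, 92.0)
# 94.6: below the 95% target but above the 90% warning threshold.
print(score, health_alert_level(score))
```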

OSCAL Compliance KPIs

  1. Control Coverage Percentage

    • Description: Percentage of controls (from profiles/baselines) mapped to component definitions, policies, and evidence.
    • Calculation: (Controls with complete mappings / Total controls in active profiles) × 100 (see the sketch after this list)
    • Target: ≥ 90% coverage for critical controls, ≥ 75% for all controls
    • Measurement: OSCAL artifact analysis across catalogs, profiles, and component definitions
    • Alert Threshold: < 80% triggers coverage gap analysis
  2. Compliance Drift Frequency

    • Description: Frequency of infrastructure components losing compliance over time.
    • Calculation: (Components transitioning from compliant to non-compliant per week / Total components) × 100
    • Target: < 5% of components per week
    • Measurement: Continuous compliance assessment engine comparing current state to baseline
    • Alert Threshold: > 10% triggers automated remediation workflow
  3. Automated Evidence Collection Completeness

    • Description: Percentage of evidence items generated automatically rather than manually.
    • Calculation: (Automated evidence items / Total evidence items) × 100
    • Target: ≥ 85% automated evidence collection
    • Measurement: Evidence collection engine tracking source (automated vs. manual) per control
    • Alert Threshold: < 75% triggers review of automation opportunities
  4. Assessment Pass Rate (per Framework)

    • Description: Percentage of OSCAL profile controls passing automated evaluation (NIST, ISO, GDPR, custom frameworks).
    • Calculation: (Passing controls / Total controls in framework profile) × 100, calculated per framework
    • Target: ≥ 95% pass rate per framework
    • Measurement: Continuous compliance assessment engine evaluating controls against infrastructure
    • Alert Threshold: < 90% triggers compliance remediation workflow
  5. Mean Time to Compliance Remediation (MTCR)

    • Description: Average time from compliance violation detection to remediation completion.
    • Calculation: Sum of remediation times / Number of remediated violations
    • Target: < 24 hours for critical violations, < 72 hours for standard violations
    • Measurement: Compliance engine tracking violation timestamps and remediation completion
    • Alert Threshold: > 48 hours triggers escalation to governance officers
  6. OSCAL Artifact Freshness

    • Description: Percentage of OSCAL artifacts (catalogs, profiles, component definitions) updated within last 90 days.
    • Calculation: (Artifacts updated in last 90 days / Total active artifacts) × 100
    • Target: ≥ 80% freshness rate
    • Measurement: Version control tracking of OSCAL artifact modification dates
    • Alert Threshold: < 70% triggers artifact review workflow
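
As a concrete illustration of the Control Coverage Percentage, the sketch below compares the controls in an active profile against those with complete mappings. The control IDs and set-based inputs are hypothetical stand-ins for parsed OSCAL artifacts.

```python
def control_coverage(profile_controls: set[str], mapped_controls: set[str]) -> float:
    """Percentage of controls in the active profile with complete mappings."""
    if not profile_controls:
        return 100.0
    covered = profile_controls & mapped_controls
    return len(covered) / len(profile_controls) * 100

profile = {"AC-2", "AC-6", "AU-2", "AU-12", "SC-7"}
mapped = {"AC-2", "AU-2", "AU-12", "SC-7"}

coverage = control_coverage(profile, mapped)  # 80.0
if coverage < 80:
    print("Coverage gap analysis triggered")
else:
    print(f"Coverage: {coverage:.1f}%")
```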

Protobuf Schema Registry KPIs

  1. Schema Compatibility Score

    • Description: Measures backward/forward compatibility across schema versions, including number of breaking changes avoided.
    • Calculation: (Compatible schema transitions / Total schema transitions) × 100, weighted by breaking change severity (see the sketch after this list)
    • Target: ≥ 95% compatibility score
    • Measurement: Schema registry compatibility analysis during version transitions
    • Alert Threshold: < 90% triggers schema review and migration planning
  2. Schema Validation Error Rate

    • Description: Percentage of policy data validation failures due to schema mismatches.
    • Calculation: (Schema validation errors / Total validation attempts) × 100
    • Target: < 1% error rate
    • Measurement: Protobuf validator tracking validation outcomes during policy evaluation
    • Alert Threshold: > 2% triggers schema documentation and migration support
  3. Schema Version Adoption Rate

    • Description: Percentage of policies using latest schema versions within 30 days of release.
    • Calculation: (Policies on the latest schema version within 30 days of its release / Total policies) × 100
    • Target: ≥ 85% adoption within 30 days
    • Measurement: Schema registry tracking policy-to-schema version mappings
    • Alert Threshold: < 75% triggers migration assistance workflow
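
The severity weighting of the Schema Compatibility Score can be implemented in more than one way; one plausible reading is sketched below, where incompatible transitions reduce the score in proportion to a severity weight. The weights and the transition record shape are assumptions.

```python
SEVERITY_WEIGHT = {"none": 0.0, "minor": 0.5, "major": 1.0}

def compatibility_score(transitions: list[dict]) -> float:
    """Each transition: {'compatible': bool, 'breaking_severity': str}."""
    if not transitions:
        return 100.0
    penalty = sum(SEVERITY_WEIGHT[t["breaking_severity"]]
                  for t in transitions if not t["compatible"])
    return max(0.0, (1 - penalty / len(transitions)) * 100)

transitions = [
    {"compatible": True,  "breaking_severity": "none"},
    {"compatible": True,  "breaking_severity": "none"},
    {"compatible": False, "breaking_severity": "minor"},
]
print(f"{compatibility_score(transitions):.1f}%")  # 83.3%
```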

Hub System Distribution KPIs

  1. Policy Distribution Success Rate

    • Description: Percentage of successful policy store deployments to Cedarling instances.
    • Calculation: (Successful deployments / Total deployment attempts) × 100
    • Target: ≥ 99% success rate
    • Measurement: Hub System tracking deployment outcomes per Cedarling instance
    • Alert Threshold: < 95% triggers distribution failure investigation
  2. Log Ingestion Throughput

    • Description: Number of decision logs successfully ingested per second from all Cedarling instances.
    • Calculation: Total logs ingested / Time period (logs per second)
    • Target: ≥ 10,000 logs/second sustained throughput
    • Measurement: Hub System log ingestion metrics
    • Alert Threshold: < 8,000 logs/second triggers capacity scaling
  3. Log Ingestion Completeness

    • Description: Percentage of expected logs successfully collected from Cedarling instances.
    • Calculation: (Logs received / Expected logs based on agent activity) × 100
    • Target: ≥ 99.9% completeness
    • Measurement: Comparison of expected log volume (from agent activity) vs. actual ingestion
    • Alert Threshold: < 99% triggers connectivity and buffering investigation
  4. Policy Store Release Adoption Time

    • Description: Average time from GitHub Release to 95% of Cedarling instances adopting the new version.
    • Calculation: Time difference between release timestamp and 95% adoption timestamp (see the sketch after this list)
    • Target: < 4 hours for standard releases, < 1 hour for critical security updates
    • Measurement: Hub System tracking version adoption across Cedarling instances
    • Alert Threshold: > 8 hours triggers distribution optimization review
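
The Policy Store Release Adoption Time can be derived from per-instance adoption timestamps. The sketch below assumes the Hub System exposes such timestamps; the function name and data shape are illustrative.

```python
import math
from datetime import datetime, timedelta

def adoption_time_to_95pct(release_time: datetime,
                           adoption_times: list[datetime],
                           total_instances: int) -> timedelta | None:
    """Delay from the GitHub Release until 95% of registered Cedarling
    instances have adopted it, or None if 95% adoption is not yet reached."""
    needed = math.ceil(total_instances * 0.95)
    if len(adoption_times) < needed:
        return None
    return sorted(adoption_times)[needed - 1] - release_time

release = datetime(2025, 11, 20, 12, 0)
adoptions = [release + timedelta(minutes=m) for m in (10, 25, 40, 70, 180)]
print(adoption_time_to_95pct(release, adoptions, total_instances=5))  # 3:00:00
```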

Audit & Analytics KPIs

  1. Audit Trail Completeness

    • Description: Percentage of AI agent actions and policy decisions captured in audit logs.
    • Calculation: (Logged actions / Total agent actions) × 100
    • Target: 100% completeness (all actions logged)
    • Measurement: Audit engine comparison of agent activity vs. log entries
    • Alert Threshold: < 99.9% triggers critical audit integrity investigation
  2. Anomaly Detection Accuracy

    • Description: Percentage of detected anomalies that are confirmed as actual violations or risks.
    • Calculation: (Confirmed anomalies / Total detected anomalies) × 100
    • Target: ≥ 80% accuracy (precision)
    • Measurement: Audit engine tracking anomaly confirmation by governance officers
    • Alert Threshold: < 70% triggers anomaly detection model tuning
  3. Mean Time to Detection (MTTD)

    • Description: Average time from policy violation or compliance failure to detection.
    • Calculation: Sum of detection times / Number of incidents (see the sketch after this list)
    • Target: < 5 minutes for critical violations, < 1 hour for standard violations
    • Measurement: Audit engine tracking violation timestamps vs. detection timestamps
    • Alert Threshold: > 15 minutes triggers real-time monitoring optimization
  4. Audit Log Query Performance (P95)

    • Description: 95th percentile response time for audit log queries and investigations.
    • Calculation: 95th percentile of query response times across all audit log queries
    • Target: P95 < 2 seconds for standard queries, P95 < 10 seconds for complex investigations
    • Measurement: Analytics engine tracking query performance metrics
    • Alert Threshold: P95 > 5 seconds triggers query optimization
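
Mean Time to Detection is a simple average over incident records. A minimal sketch follows, assuming each incident carries an occurrence and a detection timestamp from the audit engine; the record shape is hypothetical.

```python
from datetime import datetime, timedelta

def mean_time_to_detection(incidents: list[dict]) -> timedelta | None:
    """Each incident: {'occurred_at': datetime, 'detected_at': datetime}."""
    if not incidents:
        return None
    total = sum((i["detected_at"] - i["occurred_at"] for i in incidents),
                timedelta())
    return total / len(incidents)

incidents = [
    {"occurred_at": datetime(2025, 11, 20, 9, 0),
     "detected_at": datetime(2025, 11, 20, 9, 3)},
    {"occurred_at": datetime(2025, 11, 20, 10, 0),
     "detected_at": datetime(2025, 11, 20, 10, 11)},
]
mttd = mean_time_to_detection(incidents)            # 0:07:00
print("escalate" if mttd > timedelta(minutes=15) else "within threshold")
```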

AI Agent Behavior Governance KPIs

  1. Agent Risk Level Distribution

    • Description: Percentage of AI agents operating at each risk level (low, medium, high, critical).
    • Calculation: (Agents at risk level / Total agents) × 100, calculated per risk level (see the sketch after this list)
    • Target: ≥ 80% at low risk, < 5% at high/critical risk
    • Measurement: Governance analysis engine classifying agents based on behavior patterns
    • Alert Threshold: > 10% at high/critical triggers immediate governance review
  2. Policy Violation Rate per Agent

    • Description: Average number of policy violations per agent per day.
    • Calculation: Total violations / (Number of agents × Days)
    • Target: < 0.1 violations per agent per day
    • Measurement: Audit engine counting violations from decision logs
    • Alert Threshold: > 1 violation per agent per day triggers agent-specific review
  3. Sensitive Action Anomaly Detection Rate

    • Description: Percentage of sensitive actions (high-risk operations) flagged as anomalies.
    • Calculation: (Anomalous sensitive actions / Total sensitive actions) × 100
    • Target: < 5% anomaly rate (most sensitive actions should be expected)
    • Measurement: Anomaly detection engine analyzing temporal patterns of sensitive actions
    • Alert Threshold: > 10% triggers review of agent behavior patterns and policies
  4. Agent-to-Resource Interaction Coverage

    • Description: Percentage of agent-resource interactions monitored and evaluated.
    • Calculation: (Monitored interactions / Total agent-resource interactions) × 100
    • Target: 100% coverage (all interactions monitored)
    • Measurement: Governance analysis engine tracking interaction graph completeness
    • Alert Threshold: < 99% triggers monitoring gap investigation
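
The first two KPIs in this group reduce to straightforward aggregations over agent records, as sketched below. Agent identifiers, risk labels, and counts are illustrative inputs; real values would come from the governance analysis engine and decision logs.

```python
from collections import Counter

def risk_level_distribution(agent_risk_levels: dict[str, str]) -> dict[str, float]:
    """Percentage of agents at each risk level."""
    counts = Counter(agent_risk_levels.values())
    total = len(agent_risk_levels)
    return {level: counts.get(level, 0) / total * 100
            for level in ("low", "medium", "high", "critical")}

def violations_per_agent_per_day(total_violations: int,
                                 num_agents: int, days: int) -> float:
    return total_violations / (num_agents * days)

agents = {"agent-1": "low", "agent-2": "low", "agent-3": "medium", "agent-4": "high"}
print(risk_level_distribution(agents))
# {'low': 50.0, 'medium': 25.0, 'high': 25.0, 'critical': 0.0}
print(violations_per_agent_per_day(3, 4, 7))  # ~0.107, just above the 0.1 target
```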

Dashboard & Analytics KPIs

  1. Dashboard Load Time (P95)

    • Description: 95th percentile time to load and render dashboard visualizations.
    • Calculation: 95th percentile of dashboard load times across all dashboard types (see the sketch after this list)
    • Target: P95 < 3 seconds for standard dashboards, P95 < 5 seconds for complex executive dashboards
    • Measurement: Frontend performance monitoring of dashboard rendering
    • Alert Threshold: P95 > 5 seconds triggers dashboard optimization
  2. Real-Time Update Latency

    • Description: Average delay between data change and dashboard update.
    • Calculation: Average time difference between data event and dashboard refresh
    • Target: < 5 seconds for real-time dashboards
    • Measurement: WebSocket connection monitoring and data freshness tracking
    • Alert Threshold: > 10 seconds triggers real-time update optimization
  3. KPI Calculation Accuracy

    • Description: Percentage of KPI calculations verified as correct through manual audit.
    • Calculation: (Verified correct KPIs / Total KPIs audited) × 100
    • Target: 100% accuracy (all KPIs mathematically correct)
    • Measurement: Periodic manual verification of KPI calculations against source data
    • Alert Threshold: < 99% triggers KPI calculation engine review
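
Percentile-based KPIs such as the P95 dashboard load time can be computed with a nearest-rank percentile, as sketched below. The sample timings are illustrative; real values would come from frontend performance monitoring.

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile (p in 0-100) of a list of measurements."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[max(rank - 1, 0)]

load_times_s = [1.2, 1.4, 1.1, 2.8, 1.9, 1.3, 4.9, 1.6, 2.1, 1.5]
p95 = percentile(load_times_s, 95)
# 4.9 s: misses the 3 s target but stays under the 5 s alert threshold.
print("optimize" if p95 > 5 else "within alert threshold")
```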

Unified Governance Score (Executive KPI)

  1. Unified Governance Score

    • Description: Composite score combining policy health, compliance posture, and schema integrity.
    • Calculation: Weighted average computed as (Policy Store Health Score × 0.4) + (Compliance Pass Rate × 0.4) + (Schema Compatibility Score × 0.2) (see the sketch after this list)
    • Target: ≥ 90% unified score
    • Measurement: Dashboard KPI Engine calculating composite metric hourly
    • Alert Threshold: < 85% triggers executive governance review, < 80% triggers critical governance incident
  2. Top Risk Resolution Time

    • Description: Average time to resolve top 5 identified governance risks.
    • Calculation: Sum of resolution times for top 5 risks / 5
    • Target: < 48 hours for critical risks, < 1 week for standard risks
    • Measurement: Governance command center tracking risk lifecycle from identification to resolution
    • Alert Threshold: > 72 hours triggers escalation workflow
  3. Audit Integrity Score

    • Description: Composite metric of log completeness, immutability verification, and cryptographic integrity.
    • Calculation: Weighted average computed as (Log Completeness × 0.5) + (Immutability Verification × 0.3) + (Cryptographic Integrity × 0.2)
    • Target: 100% integrity score
    • Measurement: Audit engine performing continuous integrity checks
    • Alert Threshold: < 99.9% triggers critical audit integrity investigation
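
A minimal sketch of the Unified Governance Score and its alert tiers, reusing the weights defined above; function names and sample inputs are illustrative.

```python
def unified_governance_score(policy_health: float,
                             compliance_pass_rate: float,
                             schema_compatibility: float) -> float:
    """Weighted composite of the three governance pillars (each 0-100)."""
    return (policy_health * 0.4
            + compliance_pass_rate * 0.4
            + schema_compatibility * 0.2)

def governance_alert(score: float) -> str:
    if score < 80:
        return "critical governance incident"
    if score < 85:
        return "executive governance review"
    return "ok"

score = unified_governance_score(94.6, 96.0, 95.0)    # 95.24
print(round(score, 2), governance_alert(score))       # 95.24 ok
```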

Dashboards

1. Policy Store Integrity Dashboard

Purpose: Show the correctness, deployability, and health of all policy stores. Visuals:

  • Validation pass/fail trendline
  • Conflict detection heatmap
  • Cedar Analysis results (always-allow / always-deny / unreachable rules)
  • Version distribution across agents
  • GitHub Release version adoption timeline

2. Real-Time Policy Enforcement Dashboard

Purpose: Show live state of Cedarling agents and enforcement. Visuals:

  • Policy decision latency (P50/P90/P99)
  • Policy evaluation counts per agent
  • Decision breakdown (permit/deny/error)
  • Network partition map + “offline but enforcing cached policies”

3. OSCAL Control Coverage Dashboard

Purpose: Demonstrate traceability from controls → policies → components. Visuals:

  • Controls mapped / unmapped
  • Baseline coverage by control severity
  • Components missing mappings
  • Drift detection over time

4. Continuous Compliance Monitoring Dashboard

Purpose: Real-time compliance posture. Visuals:

  • Compliance score (per framework)
  • Per-component global compliance state
  • Failed controls trend
  • Remediation workflow status

5. Evidence & Reporting Dashboard

Purpose: Visualize automated evidence generation. Visuals:

  • Auto vs manual evidence volume
  • Evidence freshness (last update timestamps)
  • Evidence coverage per control
  • Downloadable OSCAL reports

6. Schema Registry Dashboard

Purpose: Health of Protobuf schemas governing typed policy data. Visuals:

  • Schema version lineage graph
  • Compatibility matrix (color-coded breaking vs non-breaking)
  • Schema usage across policies
  • Schema validation errors over time

7. AI Agent Behavior Governance Dashboard

Purpose: Continuous oversight of AI agent decisioning. Visuals:

  • Per-agent risk level distribution
  • Violations triggered by agents
  • Temporal anomaly detection (spikes in sensitive actions)
  • Agent-to-resource interaction graph

```mermaid
flowchart TB

    subgraph Agents["AI Agents"]
        A1["AI Agent 1"]
        A2["AI Agent 2"]
        A3["AI Agent N"]
    end

    subgraph Cedarling["Cedarling Enforcement Points"]
        C1["Policy Evaluation\n(permit/deny/error)"]
        C2["Risk Attribute Extraction"]
        C3["Action & Resource Logging"]
    end

    subgraph Hub["Hub System"]
        H1["Decision Logs"]
        H2["Contextual Metadata"]
        H3["Behavior Telemetry"]
    end

    subgraph Analysis["Governance Analysis Engine"]
        G1["Risk Level Classification"]
        G2["Anomaly Detection"]
        G3["Violation Detection"]
        G4["Agent-to-Resource Graph Builder"]
    end

    subgraph Dashboard["AI Agent Behavior Dashboard"]
        D1["Real-Time Agent Risk Levels"]
        D2["Policy Violations Timeline"]
        D3["Anomaly Spike Chart"]
        D4["Agent ↔ Resource Interaction Graph"]
    end

    %% Connections
    A1 --> C1
    A1 --> C2
    A1 --> C3
    A2 --> C1
    A2 --> C2
    A2 --> C3
    A3 --> C1
    A3 --> C2
    A3 --> C3

    C1 --> H1
    C2 --> H2
    C3 --> H3

    H1 --> G3
    H2 --> G1
    H3 --> G2
    H3 --> G4

    G1 --> D1
    G3 --> D2
    G2 --> D3
    G4 --> D4
```

8. Hub System Distribution Dashboard

Purpose: Track policy distribution and log ingestion health. Visuals:

  • Policy store release adoption map
  • Cedarling upgrade/status grid
  • Log ingestion throughput and backlog
  • Missed log packets or retries

9. Governance Drift & Incident Dashboard

Purpose: Identify risks before they become incidents. Visuals:

  • Drift score per component or agent
  • Incidents by severity (policy violations, compliance failures)
  • Root-cause analysis highlights
  • MTTR (mean time to remediation)

10. Governance Command Center (Executive Dashboard)

Purpose: One-page view for the CISO and governance officers. Visuals:

  • Unified Governance Score (policy + compliance + schema)
  • Top 5 risks
  • Top non-compliant components
  • Policy conflicts/violations trend
  • Compliance framework posture matrix
  • Audit integrity and log completeness
