
Feat: Add CEL Expression Metrics for Improved Observability #342

Open · wants to merge 3 commits into main
Conversation

harshithsaiv

Description

This PR addresses issue #195 by introducing a metrics system for CEL expression execution. With Prometheus-based metrics, operators can observe CEL compilation and evaluation performance, identify failures, and monitor resource consumption, closing the observability gaps highlighted in that issue. The new metrics provide the following (a sketch of possible metric definitions follows the list):

  • Detailed latency histograms: Monitor the performance of CEL expressions.
  • Counters: Track success and failure rates.
  • Error identification: Identify specific error types during evaluation.
  • Gauges: Observe the concurrent load of evaluations.
  • Filtering: Metrics can be filtered by resource type, operation type, and expression purpose.
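As a rough illustration of how such metrics could be defined with the Kubernetes component-base metrics library, here is a minimal sketch. The `kro` namespace, metric names, bucket values, and label sets are assumptions for illustration, not the exact definitions in this PR:

```go
package metrics

import (
	"k8s.io/component-base/metrics"
	"k8s.io/component-base/metrics/legacyregistry"
)

var (
	// Latency histogram for CEL evaluation, labeled for filtering.
	celEvaluationDuration = metrics.NewHistogramVec(
		&metrics.HistogramOpts{
			Namespace:      "kro",
			Subsystem:      "cel",
			Name:           "evaluation_duration_seconds",
			Help:           "CEL expression evaluation latency in seconds.",
			Buckets:        []float64{0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1},
			StabilityLevel: metrics.ALPHA,
		},
		[]string{"resource_type", "operation", "expression_type"},
	)

	// Counter for evaluation outcomes (success vs. error).
	celEvaluationsTotal = metrics.NewCounterVec(
		&metrics.CounterOpts{
			Namespace:      "kro",
			Subsystem:      "cel",
			Name:           "evaluations_total",
			Help:           "Total CEL evaluations, partitioned by result.",
			StabilityLevel: metrics.ALPHA,
		},
		[]string{"resource_type", "operation", "expression_type", "result"},
	)

	// Gauge for evaluations currently in flight.
	celActiveEvaluations = metrics.NewGaugeVec(
		&metrics.GaugeOpts{
			Namespace:      "kro",
			Subsystem:      "cel",
			Name:           "active_evaluations",
			Help:           "Number of CEL evaluations currently in progress.",
			StabilityLevel: metrics.ALPHA,
		},
		[]string{"resource_type"},
	)
)

func init() {
	// Register with the component-base registry so the metrics appear on the
	// standard /metrics endpoint.
	legacyregistry.MustRegister(celEvaluationDuration, celEvaluationsTotal, celActiveEvaluations)
}
```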

Implementation Details

1. Core Metrics Package

  • Location: pkg/cel/metrics/metrics.go
  • Functionality:
    • Define and register Prometheus metrics.
    • Provide helper functions for recording metric values.
    • Implement utility functions (e.g., WithCELEvaluation()) for streamlined metrics recording; a sketch of such a helper follows below.
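Building on the metric variables sketched earlier, a helper along the lines of WithCELEvaluation() could look like this. The signature, label values, and result strings are assumptions for illustration, not the PR's exact code:

```go
import "time" // added alongside the metrics imports in metrics.go

// WithCELEvaluation wraps a single CEL evaluation: it tracks the in-flight
// gauge, times the call, and records the outcome counter. This is a sketch of
// what such a helper could look like, not the PR's exact implementation.
func WithCELEvaluation(resourceType, operation, expressionType string, eval func() error) error {
	celActiveEvaluations.WithLabelValues(resourceType).Inc()
	defer celActiveEvaluations.WithLabelValues(resourceType).Dec()

	start := time.Now()
	err := eval()
	celEvaluationDuration.
		WithLabelValues(resourceType, operation, expressionType).
		Observe(time.Since(start).Seconds())

	result := "success"
	if err != nil {
		result = "error"
	}
	celEvaluationsTotal.WithLabelValues(resourceType, operation, expressionType, result).Inc()
	return err
}
```

A caller in the reconciler could then wrap its evaluation call in this helper and get timing, result, and in-flight metrics recorded consistently in one place.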

2. Timing Utilities

  • Location: pkg/cel/utils/timers.go
  • Functionality:
    • Provide timer objects for measuring CEL compilation metrics.
    • Ensure consistent recording of both timing and result metrics (a sketch of such a timer follows below).
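A timer utility in pkg/cel/utils/timers.go could look roughly like the following; the type and function names are illustrative, not necessarily those used in the PR:

```go
package utils

import "time"

// Timer measures the duration of a CEL compilation or evaluation so the
// caller can feed the elapsed time into the corresponding histogram.
type Timer struct {
	start time.Time
}

// NewTimer starts a new measurement.
func NewTimer() *Timer {
	return &Timer{start: time.Now()}
}

// ObserveSeconds returns the elapsed time in seconds since the timer started.
func (t *Timer) ObserveSeconds() float64 {
	return time.Since(t.start).Seconds()
}
```

Keeping the timer separate from the metrics package lets compilation and evaluation call sites record timing and result metrics the same way without duplicating time-handling code.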

3. Documentation

  • Location: pkg/cel/metrics/doc.go
  • Functionality:
    • Common usage patterns.
    • Explanation of available labels and their meanings.
    • Examples for different usage scenarios (a sketch of the package documentation follows below).
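For orientation, a doc.go along these lines could capture the usage pattern and label meanings; the label values shown are hypothetical examples:

```go
// Package metrics exposes Prometheus metrics for CEL expression compilation
// and evaluation.
//
// Typical usage wraps an evaluation in the helper so duration, result, and
// in-flight metrics are recorded together:
//
//	err := metrics.WithCELEvaluation("Deployment", "update", "readiness", func() error {
//		_, _, err := program.Eval(activation)
//		return err
//	})
//
// Labels:
//   - resource_type:   the resource kind the expression runs against
//   - operation:       the operation being reconciled (e.g. create, update)
//   - expression_type: the purpose of the expression within the resource graph
package metrics
```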

Introduces Prometheus metrics for tracking CEL (Common Expression Language) compilation and evaluation performance, including:
- Latency histograms for compilation and evaluation
- Counters for tracking success and failure rates
- Metrics registered with Kubernetes component-base metrics registry
Add placeholder files for CEL metrics and utils packages to support future implementation of performance tracking and utility functions

Expand CEL metrics to provide more granular performance and error tracking:
- Add labels for resource type, operation type, and expression type
- Introduce new metrics for active evaluations and specific error types
- Create a helper function `WithCELEvaluation` to simplify metric recording
- Update metric recording functions to support more detailed labeling
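The "specific error types" metric mentioned in the commits could be modeled as a counter keyed by an error category. The following sketch reuses the component-base imports from the earlier snippet; the metric name, classifyCELError, and its label values are purely illustrative:

```go
// celEvaluationErrors would be registered in the same init() as the other metrics.
var celEvaluationErrors = metrics.NewCounterVec(
	&metrics.CounterOpts{
		Namespace:      "kro",
		Subsystem:      "cel",
		Name:           "evaluation_errors_total",
		Help:           "CEL evaluation failures partitioned by error type.",
		StabilityLevel: metrics.ALPHA,
	},
	[]string{"resource_type", "error_type"},
)

// recordEvaluationError increments the error counter with a coarse category
// derived from the evaluation error.
func recordEvaluationError(resourceType string, err error) {
	celEvaluationErrors.WithLabelValues(resourceType, classifyCELError(err)).Inc()
}

// classifyCELError maps an evaluation error to a stable, low-cardinality
// label value such as "no_such_field" or "type_mismatch" (illustrative).
func classifyCELError(err error) string {
	if err == nil {
		return "none"
	}
	// Real classification would inspect the CEL error here; fall back to a
	// generic bucket to keep label cardinality bounded.
	return "evaluation_error"
}
```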
@harshithsaiv (Author)

Hi @a-hilaly,

I want to clarify that I've closely followed the structure from the Kubernetes apiserver CEL metrics implementation (source) as specified in the issue.

The implementation includes:

  • Core metrics for both CEL compilation and evaluation, capturing duration and result counters.
  • Enhanced observability through additional metrics for tracking active evaluations and capturing error types.
  • Contextual labels that facilitate better filtering and analysis.
  • Helper functions that simplify integration with the existing codebase.

While I've added some extensions beyond the basic Kubernetes implementation (such as the utils package with timer helpers), the core structure adheres to the established patterns for consistency and familiarity.

Could you please review and approve this implementation? I believe it effectively addresses the observability gap while maintaining alignment with Kubernetes metrics best practices.
