Commit 6c17abc

feat: Update Grafana dashboard for Kuadrant metrics and improve observability
Move Grafana dashboard to docs/samples/dashboards/ for better organization and update dashboard metrics from token_usage to authorized_hits/authorized_calls/limited_calls. Replace 'group' labels with 'tier' labels for better tier-based analytics and add support for model-specific usage tracking and namespace filtering. Enhance dashboard with cost analysis, rate limiting visualization, and user activity monitoring.

Update deployment README and token-metrics.md to reference new dashboard location and add comprehensive dashboard documentation with setup instructions. Remove dependency on Grafana Operator by storing dashboard as sample file for easier deployment and customization.

Signed-off-by: Wen Liang <liangwen12year@gmail.com>
1 parent e593d2c commit 6c17abc

3 files changed: 1303 additions, 0 deletions
deployment/README.md

Lines changed: 11 additions & 0 deletions
@@ -45,6 +45,17 @@ For OpenShift clusters, use the automated deployment script:

 This script handles all steps including feature gates, dependencies, and OpenShift-specific configurations.

+### 📊 Monitoring Dashboard
+
+After deployment, you can import the Grafana dashboard for monitoring:
+
+1. **Dashboard Location:** `docs/samples/dashboards/maas-token-metrics-dashboard.json`
+2. **Import into Grafana:** Upload the JSON file to your Grafana instance
+3. **Configure Prometheus:** Ensure your Prometheus datasource is configured
+4. **View Metrics:** Monitor token usage, rate limiting, and tier-based analytics
+
+See [Dashboard Documentation](../../docs/samples/dashboards/README.md) for detailed setup instructions.
+
 ### Manual Deployment Steps

 ### Step 0: Enable Gateway API Features (OpenShift Only)

docs/samples/dashboards/README.md

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
+# 📊 MaaS Grafana Dashboards
+
+This directory contains Grafana dashboard samples for the MaaS platform.
+
+## 🚀 MaaS Token Metrics Dashboard
+
+**File:** `maas-token-metrics-dashboard.json`
+
+### 📋 Overview
+
+This dashboard provides comprehensive monitoring of token usage, rate limiting, and tier-based analytics for the MaaS platform using Kuadrant/Limitador metrics.
+
+### 🎯 Key Metrics
+
+- **`authorized_hits`** - Total successful API calls with tier/user/model information
+- **`authorized_calls`** - Rate limiting success metrics
+- **`limited_calls`** - Rate limiting block metrics
+- **`tier` labels** - Free/Premium/Enterprise tier information
+- **`model` labels** - Model-specific usage tracking
+- **`limitador_namespace`** - Namespace filtering
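
Editor's sketch (not part of the commit): the metrics and labels listed above can be combined into PromQL expressions. The helper below is a minimal illustration of building such queries as strings. The function names, the 5-minute window, and the example label values are assumptions, and label values are not escaped.

```python
from typing import Optional


def build_selector(metric: str, **labels: Optional[str]) -> str:
    """Return a PromQL selector such as authorized_hits{tier="free"}.

    Labels whose value is None are skipped; the rest are sorted by name
    so the output is deterministic.
    """
    matchers = [
        f'{name}="{value}"'
        for name, value in sorted(labels.items())
        if value is not None
    ]
    return f"{metric}{{{','.join(matchers)}}}" if matchers else metric


def rate_by(metric: str, group_by: str, window: str = "5m",
            **labels: Optional[str]) -> str:
    """Build: sum by (<group_by>) (rate(<selector>[<window>]))."""
    selector = build_selector(metric, **labels)
    return f"sum by ({group_by}) (rate({selector}[{window}]))"


if __name__ == "__main__":
    # Hypothetical namespace value, for illustration only.
    print(rate_by("authorized_hits", "tier", limitador_namespace="llm"))
```

Sorting the matchers keeps generated queries stable, which makes them easier to compare against the expressions stored in the dashboard JSON.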
+
+### 📊 Dashboard Panels
+
+1. **🎯 Total Authorized Hits** - Main success metric
+2. **📈 Authorized Hits Rate by Tier** - Tier-based performance
+3. **👥 Authorized Hits by User & Tier** - User activity
+4. **🏆 Top 10 Users by Hits** - Usage leaders
+5. **⏰ Hourly Authorized Hits by User** - Time-based analysis
+6. **📊 Total Authorized Hits by Tier** - Tier comparison
+7. **👥 Active Users** - User count
+8. **💰 Top 5 Users by Cost** - Cost analysis with tier pricing
+9. **📋 Detailed Metrics Table** - Complete data view
+
+### 🔧 How to Use
+
+1. **Import into Grafana:**
+   - Go to Grafana → Dashboards → Import
+   - Upload `maas-token-metrics-dashboard.json`
+   - Configure Prometheus datasource (DS_PROMETHEUS)
+
+2. **Prerequisites:**
+   - Prometheus configured to scrape Limitador metrics
+   - ServiceMonitor for `limitador-limitador` service deployed
+   - Kuadrant policies generating metrics
+
+3. **Features:**
+   - **Tier-based filtering** - Free vs Premium vs Enterprise
+   - **User activity tracking** - Individual user usage patterns
+   - **Model usage analytics** - Which models are used most
+   - **Rate limiting monitoring** - Success vs blocked requests
+   - **Cost analysis** - Revenue tracking by tier and user
+   - **Namespace filtering** - Multi-tenant scenarios
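
Editor's sketch (not part of the commit): the import step above can also be scripted instead of going through the UI. The example assumes the exported JSON refers to its datasource through a `${DS_PROMETHEUS}` placeholder, which is the usual shape of a Grafana export but is not confirmed by this diff. The function name is hypothetical. It only prepares the request body; sending it (for example as a POST to Grafana's `/api/dashboards/db` endpoint) is left out.

```python
import json


def build_import_payload(dashboard_json: str, datasource: str) -> dict:
    """Resolve the datasource placeholder and wrap the dashboard for upload.

    Assumes the export uses the literal placeholder ${DS_PROMETHEUS};
    `datasource` is whatever identifier your Grafana instance expects.
    """
    resolved = dashboard_json.replace("${DS_PROMETHEUS}", datasource)
    dashboard = json.loads(resolved)
    dashboard["id"] = None           # let Grafana assign its own internal id
    dashboard.pop("__inputs", None)  # export-only metadata, not needed on upload
    return {"dashboard": dashboard, "overwrite": True}


if __name__ == "__main__":
    # Tiny stand-in document; the real file is maas-token-metrics-dashboard.json.
    sample = '{"id": 7, "title": "demo", "panels": [{"datasource": "${DS_PROMETHEUS}"}]}'
    print(json.dumps(build_import_payload(sample, "prometheus"), indent=2))
```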
+
+### 💰 Cost Analysis
+
+The dashboard includes cost calculations:
+- **Free Tier:** $0.005 per authorized hit
+- **Premium Tier:** $0.008 per authorized hit
+- **Enterprise Tier:** Custom pricing (configurable)
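
Editor's sketch (not part of the commit): the per-hit prices above amount to a simple multiplication per tier. The example below reproduces that arithmetic outside Grafana and ranks users by cost, similar in spirit to the "Top 5 Users by Cost" panel. The lowercase tier names, the sample tuples, and the function name are assumptions. The enterprise rate is a parameter because the README only describes it as configurable.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Per-hit prices stated in the README; enterprise pricing is supplied by the caller.
TIER_PRICE_USD: Dict[str, float] = {"free": 0.005, "premium": 0.008}


def top_users_by_cost(
    samples: Iterable[Tuple[str, str, float]],
    enterprise_price: float = 0.0,
    n: int = 5,
) -> List[Tuple[str, float]]:
    """Rank users by total cost.

    Each sample is (user, tier, authorized_hits). A user's hits across
    several tiers are summed into one total.
    """
    prices = {**TIER_PRICE_USD, "enterprise": enterprise_price}
    totals: Dict[str, float] = defaultdict(float)
    for user, tier, hits in samples:
        totals[user] += hits * prices[tier]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__":
    # Hypothetical users and hit counts.
    demo = [("alice", "free", 1000), ("bob", "premium", 1000)]
    for user, cost in top_users_by_cost(demo):
        print(f"{user}: ${cost:.2f}")
```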
+
+### 🎨 Visual Features
+
+- **Emojis** for better visual appeal
+- **Color coding** by tier (Free=Green, Premium=Blue, Enterprise=Gold)
+- **Modern layout** with better spacing
+- **Interactive filtering** and grouping
+- **Real-time updates** (30s refresh)
+
+### 📈 Key Insights
+
+Monitor these important metrics:
+- **Success Rate:** `authorized_calls / (authorized_calls + limited_calls)`
+- **Tier Distribution:** Usage by Free/Premium/Enterprise
+- **Model Popularity:** Which models are used most
+- **User Activity:** Top users and their tier usage
+- **Cost Analysis:** Revenue by tier and user
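
Editor's sketch (not part of the commit): the success-rate formula above is undefined when both counters are zero, which is worth handling if the ratio is computed outside Grafana. The function below is a minimal illustration. The PromQL string is one plausible way to express the same ratio over rates, and the 5-minute window is an assumption.

```python
from typing import Optional

# One possible PromQL form of the same ratio (window length is an assumption).
SUCCESS_RATE_PROMQL = (
    "sum(rate(authorized_calls[5m])) / "
    "(sum(rate(authorized_calls[5m])) + sum(rate(limited_calls[5m])))"
)


def success_rate(authorized_calls: float, limited_calls: float) -> Optional[float]:
    """Return authorized / (authorized + limited), or None when there is no traffic."""
    total = authorized_calls + limited_calls
    if total == 0:
        return None  # no requests observed, so the ratio is undefined
    return authorized_calls / total


if __name__ == "__main__":
    print(success_rate(90, 10))
    print(success_rate(0, 0))
```

Returning `None` rather than 0 or 1 for the no-traffic case avoids reporting a misleading rate for an idle tier or model.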
+
+### 🔗 Related Documentation
+
+- [Deployment Guide](../../README.md)
+- [Token Metrics Guide](../../token-metrics.md)
+- [Observability Setup](../../deployment/base/observability/)
+
+### 🛠️ Customization
+
+To customize the dashboard:
+1. Import into Grafana
+2. Edit panels as needed
+3. Export updated JSON
+4. Replace this file with your custom version
+
+### 📝 Notes
+
+- Dashboard is compatible with Kuadrant v1.3.0-rc2+
+- Requires Prometheus Operator for ServiceMonitor support
+- Metrics are generated by Limitador with TelemetryPolicy
+- Dashboard auto-refreshes every 30 seconds
