5 changes: 3 additions & 2 deletions AGENTS.md
@@ -21,7 +21,7 @@ cmd/exporter/exporter.go # Entrypoint: flags, provider selection, HT
 pkg/provider/provider.go # Provider, Collector, Registry interfaces
 pkg/aws/aws.go # AWS: S3, EC2, RDS, NATGATEWAY, ELB, VPC
 pkg/google/gcp.go # GCP: GCS, GKE, CLB, SQL, VPC
-pkg/azure/azure.go # Azure: AKS
+pkg/azure/azure.go # Azure: AKS, blob
 pkg/gatherer/gatherer.go # Wraps Collect(): duration, errors, metadata metrics
 pkg/utils/consts.go # Shared metric suffixes, HoursInMonth, GenerateDesc()
 cmd/dashboards/main.go # Dashboard generation (grafana-foundation-sdk)
@@ -58,7 +58,8 @@ Rule: Never push to `main`.
 ```bash
 go run cmd/exporter/exporter.go -provider gcp -project-id=$GCP_PROJECT_ID -gcp.services GKE,GCS
 go run cmd/exporter/exporter.go -provider aws -aws.profile $AWS_PROFILE -aws.services EC2,S3
-go run cmd/exporter/exporter.go -provider azure -azure.subscription-id $AZ_SUBSCRIPTION_ID
+go run cmd/exporter/exporter.go -provider azure -azure.subscription-id $AZ_SUBSCRIPTION_ID -azure.services AKS
+go run cmd/exporter/exporter.go -provider azure -azure.subscription-id $AZ_SUBSCRIPTION_ID -azure.services blob
 ```
 
 ### Adding a collector
5 changes: 3 additions & 2 deletions cmd/exporter/exporter.go
@@ -75,7 +75,7 @@ func providerFlags(fs *flag.FlagSet, cfg *config.Config) {
 	fs.Var(config.NewDeprecatedStringSliceFlag(&cfg.Providers.GCP.Projects, &cfg.Providers.GCP.BucketProjectsDeprecated), "gcp.bucket-projects", "GCP project(s). (deprecated: use --gcp.projects instead)")
 	fs.Var(&cfg.Providers.AWS.Services, "aws.services", "AWS service(s).")
 	fs.Var(&cfg.Providers.AWS.ExcludeRegions, "aws.exclude-regions", "AWS region(s) to exclude from cost collection.")
-	fs.Var(&cfg.Providers.Azure.Services, "azure.services", "Azure service(s).")
+	fs.Var(&cfg.Providers.Azure.Services, "azure.services", "Azure service(s): AKS, blob (comma-separated and/or repeat flag; case-insensitive).")
 	fs.Var(&cfg.Providers.GCP.Services, "gcp.services", "GCP service(s).")
 	flag.StringVar(&cfg.Providers.AWS.Region, "aws.region", "", "AWS region")
 	flag.StringVar(&cfg.Providers.AWS.RoleARN, "aws.roleARN", "", "Optional AWS role ARN to assume for cross-account access.")
@@ -242,7 +242,8 @@ func selectProviderWith(
 		return newAzure(ctx, &azure.Config{
 			Logger:           cfg.Logger,
 			SubscriptionId:   cfg.Providers.Azure.SubscriptionId,
-			Services:         cfg.Providers.Azure.Services,
+			ScrapeInterval:   cfg.Collector.ScrapeInterval,
+			Services:         strings.Split(cfg.Providers.Azure.Services.String(), ","),
 			CollectorTimeout: collectorTimeout,
 		})
 	case "aws":
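A note on the `strings.Split(...)` change in this hunk: in Go, splitting an empty string on `,` yields a one-element slice holding `""`, not an empty slice. That is presumably why the registration loop in `pkg/azure/azure.go` trims each entry and skips blanks. The sketch below illustrates the behavior with a hypothetical `normalizeServices` helper that is not part of this PR; it also assumes `Services.String()` returns a comma-joined list, which the diff does not show.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeServices is a hypothetical helper (not in the PR): it splits a
// comma-separated flag value, trims whitespace, and drops blank entries.
func normalizeServices(raw string) []string {
	out := []string{}
	for _, svc := range strings.Split(raw, ",") {
		svc = strings.TrimSpace(svc)
		if svc == "" {
			continue
		}
		out = append(out, svc)
	}
	return out
}

func main() {
	// Splitting "" still produces one (empty) element.
	fmt.Println(len(strings.Split("", ","))) // 1
	fmt.Println(len(normalizeServices("")))  // 0
	fmt.Println(normalizeServices("AKS, blob,"))
}
```

Normalizing at the call site would keep blank entries out of `azure.Config.Services` entirely; the PR instead filters them inside the loop, which works equally well for this case.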
2 changes: 1 addition & 1 deletion docs/README.md
@@ -11,6 +11,6 @@
 - [Providers](metrics/providers.md)
 - **AWS:** [EC2](metrics/aws/ec2.md), [S3](metrics/aws/s3.md), [RDS](metrics/aws/rds.md), [MSK](metrics/aws/msk.md), [ELB](metrics/aws/elb.md), [NAT Gateway](metrics/aws/natgateway.md), [VPC](metrics/aws/vpc.md)
 - **GCP:** [GKE](metrics/gcp/gke.md), [GCS](metrics/gcp/gcs.md), [Cloud SQL](metrics/gcp/cloudsql.md), [Managed Kafka](metrics/gcp/managedkafka.md), [CLB](metrics/gcp/clb.md), [VPC](metrics/gcp/vpc.md)
-- **Azure:** [AKS](metrics/azure/aks.md)
+- **Azure:** [AKS](metrics/azure/aks.md), [blob](metrics/azure/blob.md)
 - [Deploying](deploying/aws/README.md) - Run the exporter
 - [AWS](deploying/aws/README.md) — IRSA, Helm, cross-account access
1 change: 1 addition & 0 deletions docs/metrics/README.md
@@ -22,3 +22,4 @@
 ## Azure Services
 
 - **[AKS](./azure/aks.md)** - Azure Kubernetes Service VM instances and managed disks
+- **[Blob](./azure/blob.md)** - Azure Blob Storage (cost metrics registered; no series until Cost Management is integrated)
11 changes: 11 additions & 0 deletions docs/metrics/azure/blob.md
@@ -0,0 +1,11 @@
+# Azure Blob Storage metrics
+
+Pass `blob` in `--azure.services` to enable this collector. Matching is case-insensitive.
+
+The collector defines a storage cost `GaugeVec` that the Azure provider includes in its `Describe` and `Collect` fan-out (the same gatherer pattern as `azure_aks`). `Collect` calls `StorageCostQuerier.QueryBlobStorage` once `ScrapeInterval` has elapsed since the last successful query (a billing refresh cadence similar to `pkg/aws/s3`). Each query uses a **30-day** lookback (`defaultQueryLookback` in `pkg/azure/blob/cost_query.go`), and the cached rows are applied to the gauge on every scrape. `Config.CostQuerier` supplies the querier; when it is nil, the collector falls back to a no-op querier that returns no rows. `Collect` ends by calling `StorageGauge.Collect(ch)` on the channel passed down by the parent Azure collector, so blob cost metrics share one registration path with the rest of the Azure exporter. Scrape instrumentation publishes `cloudcost_exporter_collector_*` with the label `collector="azure_blob"`.
+
+## Cost metrics
+
+| Metric name | Metric type | Description | Labels |
+|-------------|-------------|-------------|--------|
+| cloudcost_azure_blob_storage_by_location_usd_per_gibyte_hour | Gauge | Storage cost rate for Blob Storage by region and class. Cost represented in USD/(GiB*h) | `region`=&lt;Azure region&gt;<br/> `class`=&lt;Blob access tier or storage class&gt; |
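To make the USD/(GiB*h) unit concrete, the sketch below converts a billed amount into that rate. It is only an illustration: the PR does not show how `StorageCostRow.Rate` is derived, so the `ratePerGiBHour` helper, its inputs, and the sample figures are all assumptions rather than the collector's actual arithmetic.

```go
package main

import (
	"fmt"
	"time"
)

// ratePerGiBHour is a hypothetical helper: it spreads a total cost over the
// stored volume and the length of the billing window.
func ratePerGiBHour(costUSD, storedGiB float64, window time.Duration) float64 {
	hours := window.Hours()
	if storedGiB <= 0 || hours <= 0 {
		return 0 // avoid dividing by zero for empty inputs
	}
	return costUSD / (storedGiB * hours)
}

func main() {
	// Assumed window: 30 days, matching the lookback described above.
	window := 30 * 24 * time.Hour
	// Made-up sample: 1000 GiB costing 14.40 USD over 720 hours.
	fmt.Printf("%.6f\n", ratePerGiBHour(14.40, 1000, window))
}
```

With these sample figures the rate comes out at roughly 0.00002 USD per GiB per hour, which is the order of magnitude a dashboard would multiply by stored GiB and hours to recover a monthly cost.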
3 changes: 2 additions & 1 deletion go.mod
@@ -10,6 +10,7 @@ require (
 	github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.13.1
 	github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/compute/armcompute/v7 v7.3.0
 	github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/containerservice/armcontainerservice/v8 v8.2.0
+	github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/costmanagement/armcostmanagement v1.1.1
 	github.com/Azure/go-autorest/autorest/to v0.4.1
 	github.com/aws/aws-sdk-go-v2 v1.41.4
 	github.com/aws/aws-sdk-go-v2/config v1.32.12
@@ -39,6 +40,7 @@ require (
 
 require (
 	cloud.google.com/go/managedkafka v0.8.1
+	github.com/Azure/azure-sdk-for-go/sdk/azcore v1.20.0
 	github.com/aws/aws-sdk-go-v2/service/kafka v1.49.0
 )
 
@@ -50,7 +52,6 @@ require (
 	cloud.google.com/go/compute/metadata v0.9.0 // indirect
 	cloud.google.com/go/iam v1.6.0 // indirect
 	cloud.google.com/go/longrunning v0.8.0 // indirect
-	github.com/Azure/azure-sdk-for-go/sdk/azcore v1.20.0 // indirect
 	github.com/Azure/azure-sdk-for-go/sdk/internal v1.11.2 // indirect
 	github.com/Azure/go-autorest v14.2.0+incompatible // indirect
 	github.com/AzureAD/microsoft-authentication-library-for-go v1.6.0 // indirect
2 changes: 2 additions & 0 deletions go.sum
@@ -38,6 +38,8 @@ github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/compute/armcompute/v7 v7.3
 github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/compute/armcompute/v7 v7.3.0/go.mod h1:e4RAYykLIz73CF52KhSooo4whZGXvXrD09m0jkgnWiU=
 github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/containerservice/armcontainerservice/v8 v8.2.0 h1:aXzpyYcHexm3eSlvy6g7r3cshXtGcEg6VJpOdrN0Us0=
 github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/containerservice/armcontainerservice/v8 v8.2.0/go.mod h1:vs/o7so4c3csg/CM0LDrqxSKDxcKgeYbgI3zaL6vu7U=
+github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/costmanagement/armcostmanagement v1.1.1 h1:ehSLdbLah6kk6HTVc6e/lrbmbz7MMbpNxkOd3OYlhB0=
+github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/costmanagement/armcostmanagement v1.1.1/go.mod h1:Am1cUioOk0HdZIsjpXJkQ4RIeQbwYsW6LkNIc5z/5XY=
 github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/internal/v3 v3.1.0 h1:2qsIIvxVT+uE6yrNldntJKlLRgxGbZ85kgtz5SNBhMw=
 github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/internal/v3 v3.1.0/go.mod h1:AW8VEadnhw9xox+VaVd9sP7NjzOAnaZBLRH6Tq3cJ38=
 github.com/Azure/azure-sdk-for-go/sdk/resourcemanager/resources/armresources v1.2.0 h1:Dd+RhdJn0OTtVGaeDLZpcumkIVCtA/3/Fo42+eoYvVM=
30 changes: 26 additions & 4 deletions pkg/azure/azure.go
@@ -12,6 +12,7 @@ import (
 	"github.com/prometheus/client_golang/prometheus"
 
 	"github.com/grafana/cloudcost-exporter/pkg/azure/aks"
+	"github.com/grafana/cloudcost-exporter/pkg/azure/blob"
 	"github.com/grafana/cloudcost-exporter/pkg/azure/client"
 	"github.com/grafana/cloudcost-exporter/pkg/collectormetrics"
 	"github.com/grafana/cloudcost-exporter/pkg/provider"
@@ -65,6 +66,7 @@ type Config struct {
 	Region string
 
 	SubscriptionId string
+	ScrapeInterval time.Duration
 
 	CollectorTimeout time.Duration
 	Services         []string
@@ -90,15 +92,35 @@ func New(ctx context.Context, config *Config) (*Azure, error) {
 		return nil, err
 	}
 
-	// Collector Registration
+	// Collector Registration (--azure.services matching is case-insensitive).
 	for _, svc := range config.Services {
-		switch strings.ToUpper(svc) {
-		case "AKS":
+		svc = strings.TrimSpace(svc)
+		if svc == "" {
+			continue
+		}
+		switch {
+		case strings.EqualFold(svc, "AKS"):
 			collector, err := aks.New(ctx, &aks.Config{
 				Logger: logger,
 			}, azClientWrapper)
 			if err != nil {
-				return nil, err
+				logger.LogAttrs(ctx, slog.LevelError, "Error creating collector",
+					slog.String("service", svc),
+					slog.String("message", err.Error()))
+				continue
 			}
 			collectors = append(collectors, collector)
+		case strings.EqualFold(svc, "blob"):
+			collector, err := blob.New(&blob.Config{
+				Logger:         logger,
+				SubscriptionId: config.SubscriptionId,
+				ScrapeInterval: config.ScrapeInterval,
+			})
+			if err != nil {
+				logger.LogAttrs(ctx, slog.LevelError, "Error creating collector",
+					slog.String("service", svc),
+					slog.String("message", err.Error()))
+				continue
+			}
+			collectors = append(collectors, collector)
 		default:
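The case-insensitive dispatch in this hunk can be sketched in isolation. `resolveService` is a hypothetical helper, not code from the PR; it only mirrors the `TrimSpace` plus `EqualFold` matching so the accepted inputs are easy to see.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveService maps user input to a canonical service name.
// The second return value is false for blank or unknown input.
func resolveService(svc string) (string, bool) {
	svc = strings.TrimSpace(svc)
	switch {
	case svc == "":
		return "", false
	case strings.EqualFold(svc, "AKS"):
		return "AKS", true
	case strings.EqualFold(svc, "blob"):
		return "blob", true
	default:
		return "", false
	}
}

func main() {
	for _, in := range []string{"aks", " Blob ", "BLOB", "", "vm"} {
		name, ok := resolveService(in)
		fmt.Printf("%q -> %q %v\n", in, name, ok)
	}
}
```

`strings.EqualFold` compares under Unicode case folding without allocating an upper-cased copy, which is the practical difference from the removed `strings.ToUpper` switch.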
131 changes: 131 additions & 0 deletions pkg/azure/blob/blob.go
@@ -0,0 +1,131 @@
+package blob
+
+import (
+	"context"
+	"log/slog"
+	"sync"
+	"time"
+
+	"github.com/grafana/cloudcost-exporter/pkg/provider"
+	"github.com/prometheus/client_golang/prometheus"
+
+	cloudcost_exporter "github.com/grafana/cloudcost-exporter"
+)
+
+const subsystem = "azure_blob"
+
+// metrics holds Prometheus collectors for blob cost rates. Vectors are not registered on the root registry;
+// Azure's top-level Collector gathers them via Collect → GaugeVec.Collect (same pattern as pkg/azure/aks).
+type metrics struct {
+	StorageGauge *prometheus.GaugeVec
+	// Planned future work: operation request rate (parity with S3/GCS cloudcost_*_operation_by_location_usd_per_krequest).
+	// OperationsGauge *prometheus.GaugeVec
+}
+
+func newMetrics() metrics {
+	m := metrics{
+		StorageGauge: prometheus.NewGaugeVec(prometheus.GaugeOpts{
+			Name: prometheus.BuildFQName(cloudcost_exporter.MetricPrefix, subsystem, "storage_by_location_usd_per_gibyte_hour"),
+			Help: "Storage cost of blob objects by region and class. Cost represented in USD/(GiB*h). Populated when CostQuerier returns data.",
+		},
+			[]string{"region", "class"},
+		),
+	}
+
+	// Planned future work: register operation cost per 1k requests (labels region, class, tier) when Cost Management dimensions support it.
+	// m.OperationsGauge = prometheus.NewGaugeVec(prometheus.GaugeOpts{
+	// Name: prometheus.BuildFQName(cloudcost_exporter.MetricPrefix, subsystem, "operation_by_location_usd_per_krequest"),
+	// Help: "Operation cost of blob objects by region, class, and tier. Cost represented in USD/(1k req). No samples until Cost Management is integrated.",
+	// },
+	// []string{"region", "class", "tier"},
+	// )
+
+	return m
+}
+
+// Collector implements provider.Collector for Azure Blob Storage cost metrics.
+type Collector struct {
+	logger         *slog.Logger
+	metrics        metrics
+	querier        StorageCostQuerier
+	subscriptionID string
+	scrapeInterval time.Duration
+
+	mu          sync.Mutex
+	cachedRows  []StorageCostRow
+	nextRefresh time.Time // QueryBlobStorage when time.Now is on or after this (S3 billing refresh pattern).
+}
+
+// Config holds settings for the blob collector.
+type Config struct {
+	Logger         *slog.Logger
+	SubscriptionId string
+	ScrapeInterval time.Duration
+	// CostQuerier optional; when nil a no-op is used until Azure Cost Management is wired (e.g. from pkg/azure).
+	CostQuerier StorageCostQuerier
+}
+
+// New builds a blob collector. Subscription and scrape interval are stored for refresh logic; cost data comes from CostQuerier (default no-op).
+func New(cfg *Config) (*Collector, error) {
+	interval := cfg.ScrapeInterval
+	if interval <= 0 {
+		interval = time.Hour
+	}
+	q := cfg.CostQuerier
+	if q == nil {
+		q = noopStorageCostQuerier{}
+	}
+	return &Collector{
+		logger:         cfg.Logger.With("collector", "blob"),
+		metrics:        newMetrics(),
+		querier:        q,
+		subscriptionID: cfg.SubscriptionId,
+		scrapeInterval: interval,
+		// First Collect runs a query immediately (same idea as pkg/aws/s3 nextScrape).
+		nextRefresh: time.Now().Add(-interval),
+	}, nil
+}
+
+// Collect queries cost rows, updates the storage vec, then forwards metrics on ch for the parent gatherer.
+func (c *Collector) Collect(ctx context.Context, ch chan<- prometheus.Metric) error {
+	c.logger.LogAttrs(ctx, slog.LevelInfo, "collecting metrics")
+	c.mu.Lock()
+	defer c.mu.Unlock()
+	now := time.Now()
+	if !now.Before(c.nextRefresh) {
+		rows, err := c.querier.QueryBlobStorage(ctx, c.subscriptionID, defaultQueryLookback)
+		if err != nil {
+			return err
+		}
+		c.cachedRows = rows
+		c.nextRefresh = now.Add(c.scrapeInterval)
+	}
+	c.applyRowsToGauge(c.cachedRows)
+	c.metrics.StorageGauge.Collect(ch)
+	return nil
+}
+
+func (c *Collector) applyRowsToGauge(rows []StorageCostRow) {
+	for _, row := range rows {
+		c.metrics.StorageGauge.WithLabelValues(row.Region, row.Class).Set(row.Rate)
+	}
+}
+
+// Describe satisfies provider.Collector.
+func (c *Collector) Describe(ch chan<- *prometheus.Desc) error {
+	c.metrics.StorageGauge.Describe(ch)
+	// Planned future work: c.metrics.OperationsGauge.Describe(ch)
+	return nil
+}
+
+// Name returns the collector subsystem name for operational metrics.
+func (c *Collector) Name() string {
+	return subsystem
+}
+
+// Register satisfies provider.Collector. Does not register cost metrics on the registry (avoids duplicate Desc
+// with Azure's Describe fan-out; metrics are collected via Collect → StorageGauge.Collect).
+func (c *Collector) Register(_ provider.Registry) error {
+	c.logger.LogAttrs(context.Background(), slog.LevelInfo, "registering collector")
+	return nil
+}