Azure: Add blob collector stub #893
**New file (+31 lines):**

# Azure Blob Storage metrics

## Status

The Blob Collector is registered when you pass `blob` in `--azure.services` (matching is case-insensitive). The collector **does not query Azure Cost Management yet**. The storage cost `GaugeVec` appears in `Describe` but is not `MustRegister`'d on the root registry (the same pattern as `azure_aks`); `Collect` forwards `StorageGauge.Collect(ch)` to the parent Azure gatherer. There are no samples until `Collect` calls `Set` on label values, which is not implemented yet.
## Planned behavior (see issue #54)

### Metric shape (AWS S3 and GCP GCS in this repo)

- **Storage rate:** Same naming pattern as S3 and GCS: `storage_by_location_usd_per_gibyte_hour` under the service subsystem (here `azure_blob` → `cloudcost_azure_blob_storage_by_location_usd_per_gibyte_hour`). Implemented today as `cloudcost_aws_s3_*` in `pkg/aws/s3/s3.go` and `cloudcost_gcp_gcs_*` in `pkg/google/metrics/metrics.go`.
- **Operations rate (optional):** S3 and GCS also define `operation_by_location_usd_per_krequest` (`pkg/aws/s3/s3.go`, `pkg/google/metrics/metrics.go`). Not registered in the exporter yet (see Planned future work under [Cost metrics](#cost-metrics)). A Blob Collector may expose an analogous gauge when the chosen Azure dataset supports it.
- **Labels:** S3 uses `region` and `class` for storage (and `tier` for operations); GCS uses `location` and `storage_class` (and operation-class labels for ops). Azure label sets would follow whatever dimensions the Cost Management query returns; the collectors above show only the cross-cloud "by location + storage class/tier" pattern.
### Azure Cost Management

- **Query API:** Azure documentation mentions `POST https://management.azure.com/{scope}/providers/Microsoft.CostManagement/query` and states that `{scope}` includes `/subscriptions/{subscriptionId}/` for subscription scope. Sample responses in that snapshot include cost columns such as `PreTaxCost` alongside dimensions like `ResourceLocation` in example filters.
- **Permissions:** Azure documentation states that you must have Cost Management permissions at the appropriate scope to use Cost Management APIs (the captured article links to Microsoft's assign-access documentation for details). No specific Azure role name is asserted here.
### Provider operational metrics

- **`cloudcost_exporter_collector_*` with `collector="azure_blob"`:** same mechanism as other Azure collectors (e.g. `azure_aks`) via `pkg/azure/azure.go` and the shared gatherer pattern.
- **Registry:** Blob cost `GaugeVec`s are not registered separately on the Prometheus registry; `Describe` lists them for the parent Azure collector, and `Collect` forwards `StorageGauge.Collect(ch)` so metric descriptors are not duplicated.
## Cost metrics

| Metric name | Type | Labels | Samples |
|-------------|------|--------|---------|
| `cloudcost_azure_blob_storage_by_location_usd_per_gibyte_hour` | Gauge | `region`, `class` | None; the exporter does not populate this metric yet (no billing API calls in `Collect`) |

**Planned future work:** `cloudcost_azure_blob_operation_by_location_usd_per_krequest` (Gauge; `region`, `class`, `tier`) for parity with S3/GCS operation metrics; commented out in `pkg/azure/blob/blob.go` until operation pricing is implemented. How we group cost rows into labels is TBD when we add the API integration.
**New file: `pkg/azure/blob/blob.go` (+99 lines):**

```go
package blob

import (
	"context"
	"log/slog"
	"time"

	"github.com/grafana/cloudcost-exporter/pkg/provider"
	"github.com/prometheus/client_golang/prometheus"

	cloudcost_exporter "github.com/grafana/cloudcost-exporter"
)

const subsystem = "azure_blob"

// metrics holds Prometheus collectors for blob cost rates. Vectors are not registered on the root registry;
// Azure's top-level Collector gathers them via Collect → GaugeVec.Collect (same pattern as pkg/azure/aks).
type metrics struct {
	StorageGauge *prometheus.GaugeVec
	// Planned future work: operation request rate (parity with S3/GCS cloudcost_*_operation_by_location_usd_per_krequest).
	// OperationsGauge *prometheus.GaugeVec
}

func newMetrics() metrics {
	m := metrics{
		StorageGauge: prometheus.NewGaugeVec(prometheus.GaugeOpts{
			Name: prometheus.BuildFQName(cloudcost_exporter.MetricPrefix, subsystem, "storage_by_location_usd_per_gibyte_hour"),
			Help: "Storage cost of blob objects by region and class. Cost represented in USD/(GiB*h). No samples until Cost Management is integrated.",
		},
			[]string{"region", "class"},
		),
	}

	// Planned future work: register operation cost per 1k requests (labels region, class, tier)
	// when Cost Management dimensions support it.
	// m.OperationsGauge = prometheus.NewGaugeVec(prometheus.GaugeOpts{
	//	Name: prometheus.BuildFQName(cloudcost_exporter.MetricPrefix, subsystem, "operation_by_location_usd_per_krequest"),
	//	Help: "Operation cost of blob objects by region, class, and tier. Cost represented in USD/(1k req). No samples until Cost Management is integrated.",
	// },
	//	[]string{"region", "class", "tier"},
	// )

	return m
}

// Collector implements provider.Collector for Azure Blob Storage cost metrics.
// Cost Management integration is not implemented yet; there are no labeled series until Collect calls Set on the vec.
type Collector struct {
	logger         *slog.Logger
	metrics        metrics
	subscriptionID string
	scrapeInterval time.Duration
}

// Config holds settings for the blob collector.
type Config struct {
	Logger         *slog.Logger
	SubscriptionId string
	ScrapeInterval time.Duration
}

// New builds a blob collector. It does not call Azure APIs yet; subscription and interval are stored for Cost Management integration.
func New(cfg *Config) (*Collector, error) {
	interval := cfg.ScrapeInterval
	if interval <= 0 {
		interval = time.Hour
	}
	return &Collector{
		logger:         cfg.Logger.With("collector", "blob"),
		metrics:        newMetrics(),
		subscriptionID: cfg.SubscriptionId,
		scrapeInterval: interval,
	}, nil
}

// Collect satisfies provider.Collector. Does not call Set on cost vectors yet; still forwards the vec on ch for the parent gatherer.
func (c *Collector) Collect(ctx context.Context, ch chan<- prometheus.Metric) error {
	c.logger.LogAttrs(ctx, slog.LevelInfo, "collecting metrics")
	c.metrics.StorageGauge.Collect(ch)
	return nil
}

// Describe satisfies provider.Collector.
func (c *Collector) Describe(ch chan<- *prometheus.Desc) error {
	c.metrics.StorageGauge.Describe(ch)
	// Planned future work: c.metrics.OperationsGauge.Describe(ch)
	return nil
}

// Name returns the collector subsystem name for operational metrics.
func (c *Collector) Name() string {
	return subsystem
}

// Register satisfies provider.Collector. Does not register cost metrics on the registry (avoids duplicate Desc
// with Azure's Describe fan-out; metrics are collected via Collect → StorageGauge.Collect).
func (c *Collector) Register(_ provider.Registry) error {
	c.logger.LogAttrs(context.Background(), slog.LevelInfo, "registering collector")
	return nil
}
```

Review comments on the commented-out `OperationsGauge` code:

> **Contributor:** we might already have that with existing metrics, see https://ops.grafana-ops.net/goto/ffio5ap805csgf?orgId=stacks-27821 but it depends what kind of operation rate we are talking about :)

> **Author:** Good to know. I'll keep that in mind when digging into the next metric. Thanks!

> **Contributor** (suggested change, also on lines +34 to +41): the styleguide specifically prohibits commented-out code. Since this repo is open-source, let's keep only actual functionality committed.
**New file (+84 lines, tests):**

```go
package blob

import (
	"context"
	"log/slog"
	"os"
	"strings"
	"testing"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"

	cloudcost_exporter "github.com/grafana/cloudcost-exporter"
)

var testLogger = slog.New(slog.NewTextHandler(os.Stdout, nil))

func testCollectSink() chan prometheus.Metric {
	return make(chan prometheus.Metric, 8)
}

func TestCollector_Describe(t *testing.T) {
	c, err := New(&Config{Logger: testLogger})
	require.NoError(t, err)
	ch := make(chan *prometheus.Desc, 4)
	require.NoError(t, c.Describe(ch))
	close(ch)
	var got []string
	for d := range ch {
		got = append(got, d.String())
	}
	require.Len(t, got, 1)
	joined := strings.Join(got, " ")
	prefix := cloudcost_exporter.MetricPrefix + "_azure_blob_"
	assert.Contains(t, joined, prefix+"storage_by_location_usd_per_gibyte_hour")
}

func TestCollector_Register(t *testing.T) {
	c, err := New(&Config{Logger: testLogger})
	require.NoError(t, err)
	// Register does not call registry.MustRegister on cost metrics (AKS pattern).
	require.NoError(t, c.Register(nil))
}

func TestCollector_Collect_forwardsStorageGauge(t *testing.T) {
	c, err := New(&Config{Logger: testLogger})
	require.NoError(t, err)
	ctx := context.Background()
	require.NoError(t, c.Collect(ctx, testCollectSink()))
}

func TestNew_configPlumbing(t *testing.T) {
	const subUUID = "11111111-1111-1111-1111-111111111111"
	tests := map[string]struct {
		subscriptionID string
		scrapeInterval time.Duration
		wantInterval   time.Duration
	}{
		"zero scrape interval defaults to one hour": {
			subscriptionID: "sub-1",
			scrapeInterval: 0,
			wantInterval:   time.Hour,
		},
		"explicit subscription and interval": {
			subscriptionID: subUUID,
			scrapeInterval: 30 * time.Minute,
			wantInterval:   30 * time.Minute,
		},
	}
	for name, tt := range tests {
		t.Run(name, func(t *testing.T) {
			c, err := New(&Config{
				Logger:         testLogger,
				SubscriptionId: tt.subscriptionID,
				ScrapeInterval: tt.scrapeInterval,
			})
			require.NoError(t, err)
			assert.Equal(t, tt.subscriptionID, c.subscriptionID)
			assert.Equal(t, tt.wantInterval, c.scrapeInterval)
		})
	}
}
```