@@ -0,0 +1,140 @@
---
canonical: https://grafana.com/docs/alloy/latest/reference/components/prometheus/prometheus.exporter.databricks/
aliases:
- ../prometheus.exporter.databricks/ # /docs/alloy/latest/reference/components/prometheus.exporter.databricks/
description: Learn about prometheus.exporter.databricks
labels:
stage: general-availability
products:
- oss
title: prometheus.exporter.databricks
---

# `prometheus.exporter.databricks`

The `prometheus.exporter.databricks` component embeds the [`databricks_exporter`](https://github.com/grafana/databricks-prometheus-exporter) to collect billing, job, pipeline, and SQL warehouse metrics from Databricks System Tables and expose them over HTTP for Prometheus to scrape.

## Usage

```alloy
prometheus.exporter.databricks "LABEL" {
  server_hostname     = "<DATABRICKS_SERVER_HOSTNAME>"
  warehouse_http_path = "<DATABRICKS_WAREHOUSE_HTTP_PATH>"
  client_id           = "<DATABRICKS_CLIENT_ID>"
  client_secret       = "<DATABRICKS_CLIENT_SECRET>"
}
```

## Arguments

You can use the following arguments with `prometheus.exporter.databricks`:

| Name | Type | Description | Default | Required |
|-------------------------|------------|---------------------------------------------------------------------------------|---------|----------|
| `server_hostname` | `string` | The Databricks workspace hostname (e.g., `dbc-xxx.cloud.databricks.com`). | | yes |
| `warehouse_http_path` | `string` | The HTTP path of the SQL Warehouse (e.g., `/sql/1.0/warehouses/abc123`). | | yes |
| `client_id` | `string` | The OAuth2 Application ID (Client ID) of your Service Principal. | | yes |
| `client_secret` | `secret` | The OAuth2 Client Secret of your Service Principal. | | yes |
| `query_timeout` | `duration` | Timeout for individual SQL queries. | `"5m"` | no |
| `billing_lookback` | `duration` | How far back to look for billing data. | `"24h"` | no |
| `jobs_lookback` | `duration` | How far back to look for job runs. | `"2h"` | no |
| `pipelines_lookback` | `duration` | How far back to look for pipeline runs. | `"2h"` | no |
| `queries_lookback` | `duration` | How far back to look for SQL warehouse queries. | `"1h"` | no |
| `sla_threshold_seconds` | `int` | Duration threshold (seconds) for job SLA miss detection. | `3600` | no |
| `collect_task_retries` | `bool` | Collect task retry metrics (high cardinality due to `task_key` label). | `false` | no |
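
For example, the following configuration widens the billing and jobs lookback windows and raises the SLA threshold to two hours. The hostname, warehouse path, and credentials are placeholders:

```alloy
prometheus.exporter.databricks "tuned" {
  server_hostname     = "<DATABRICKS_SERVER_HOSTNAME>"
  warehouse_http_path = "<DATABRICKS_WAREHOUSE_HTTP_PATH>"
  client_id           = "<DATABRICKS_CLIENT_ID>"
  client_secret       = "<DATABRICKS_CLIENT_SECRET>"

  // Wider lookback windows return more rows per scrape and make queries heavier.
  billing_lookback = "48h"
  jobs_lookback    = "4h"

  // Count a job run as an SLA miss once it exceeds two hours.
  sla_threshold_seconds = 7200
}
```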

## Blocks

The `prometheus.exporter.databricks` component doesn't support any blocks. You can configure this component with arguments.

## Exported fields

{{< docs/shared lookup="reference/components/exporter-component-exports.md" source="alloy" version="<ALLOY_VERSION>" >}}

## Component health

`prometheus.exporter.databricks` is only reported as unhealthy if given an invalid configuration.
In those cases, exported fields retain their last healthy values.

## Debug information

`prometheus.exporter.databricks` doesn't expose any component-specific debug information.

## Debug metrics

`prometheus.exporter.databricks` doesn't expose any component-specific debug metrics.

## Prerequisites

Before using this component, you need:

1. **Databricks Workspace** with Unity Catalog and System Tables enabled
2. **Service Principal** with OAuth2 M2M authentication configured
3. **SQL Warehouse** for querying System Tables (a serverless warehouse is recommended for cost efficiency)

See the [Databricks documentation](https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html) for detailed OAuth2 M2M setup instructions.
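
If you prefer not to hard-code the client secret in the configuration file, you can read it from an environment variable with the `sys.env` standard library function. A minimal sketch, assuming the secret is exported as `DATABRICKS_CLIENT_SECRET` (an arbitrary variable name):

```alloy
prometheus.exporter.databricks "from_env" {
  server_hostname     = "<DATABRICKS_SERVER_HOSTNAME>"
  warehouse_http_path = "<DATABRICKS_WAREHOUSE_HTTP_PATH>"
  client_id           = "<DATABRICKS_CLIENT_ID>"

  // Read the OAuth2 client secret from the environment at load time.
  client_secret = sys.env("DATABRICKS_CLIENT_SECRET")
}
```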

## Example

The following example uses a [`prometheus.scrape`][scrape] component to collect metrics from `prometheus.exporter.databricks`:

```alloy
prometheus.exporter.databricks "example" {
  server_hostname     = "dbc-abc123-def456.cloud.databricks.com"
  warehouse_http_path = "/sql/1.0/warehouses/xyz789"
  client_id           = "my-service-principal-id"
  client_secret       = "my-service-principal-secret"
}

// Configure a prometheus.scrape component to collect databricks metrics.
prometheus.scrape "demo" {
  targets         = prometheus.exporter.databricks.example.targets
  forward_to      = [prometheus.remote_write.demo.receiver]
  scrape_interval = "5m"
  scrape_timeout  = "4m"
}

prometheus.remote_write "demo" {
  endpoint {
    url = "<PROMETHEUS_REMOTE_WRITE_URL>"

    basic_auth {
      username = "<USERNAME>"
      password = "<PASSWORD>"
    }
  }
}
```

Replace the following:

- _`<PROMETHEUS_REMOTE_WRITE_URL>`_: The URL of the Prometheus `remote_write` compatible server to send metrics to.
- _`<USERNAME>`_: The username to use for authentication to the `remote_write` API.
- _`<PASSWORD>`_: The password to use for authentication to the `remote_write` API.

[scrape]: ../prometheus.scrape/

## Tuning recommendations

- **`scrape_interval`**: The example above uses 5 minutes. Each scrape queries Databricks System Tables, which can be slow and incurs SQL Warehouse cost, so increase the interval to reduce spend; see the sketch after this list.
- **`scrape_timeout`**: The example above uses 4 minutes. A scrape typically takes 90 to 120 seconds, depending on data volume, so leave generous headroom.
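
As a sketch, scraping every 15 minutes instead of every 5 roughly cuts the System Tables query volume to a third. This reuses the `prometheus.exporter.databricks.example` and `prometheus.remote_write.demo` components from the example above:

```alloy
prometheus.scrape "databricks_slow" {
  targets    = prometheus.exporter.databricks.example.targets
  forward_to = [prometheus.remote_write.demo.receiver]

  // Scraping 3x less often issues 3x fewer SQL Warehouse queries.
  scrape_interval = "15m"
  scrape_timeout  = "4m"
}
```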

## High cardinality warning

The `collect_task_retries` argument adds task-level retry metrics. Because each series carries a `task_key` label, this can significantly increase cardinality in workspaces with many jobs. Enable it only if you need task-level retry visibility; the sketch below shows one way to keep the resulting series in check.
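
If you do enable it, you can cap cardinality downstream by dropping series you don't need with a `prometheus.relabel` component. The metric name in the rule below is a hypothetical illustration, not a documented name from the exporter; check the exporter's metric list before copying it. The sketch reuses the `prometheus.remote_write.demo` component from the example above:

```alloy
prometheus.exporter.databricks "jobs" {
  server_hostname     = "<DATABRICKS_SERVER_HOSTNAME>"
  warehouse_http_path = "<DATABRICKS_WAREHOUSE_HTTP_PATH>"
  client_id           = "<DATABRICKS_CLIENT_ID>"
  client_secret       = "<DATABRICKS_CLIENT_SECRET>"

  collect_task_retries = true
}

prometheus.scrape "jobs" {
  targets    = prometheus.exporter.databricks.jobs.targets
  forward_to = [prometheus.relabel.drop_noisy_tasks.receiver]
}

prometheus.relabel "drop_noisy_tasks" {
  forward_to = [prometheus.remote_write.demo.receiver]

  // Drop retry series for ephemeral task keys. The metric name is hypothetical.
  rule {
    source_labels = ["__name__", "task_key"]
    regex         = "databricks_job_task_retries_total;tmp_.*"
    action        = "drop"
  }
}
```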

<!-- START GENERATED COMPATIBLE COMPONENTS -->

## Compatible components

`prometheus.exporter.databricks` has exports that can be consumed by the following components:

- Components that consume [Targets](../../../compatibility/#targets-consumers)

{{< admonition type="note" >}}
Connecting some components may not be sensible or components may require further configuration to make the connection work correctly.
Refer to the linked documentation for more details.
{{< /admonition >}}

<!-- END GENERATED COMPATIBLE COMPONENTS -->

1 change: 1 addition & 0 deletions internal/component/all/all.go
@@ -135,6 +135,7 @@ import (
	_ "github.com/grafana/alloy/internal/component/prometheus/exporter/catchpoint" // Import prometheus.exporter.catchpoint
	_ "github.com/grafana/alloy/internal/component/prometheus/exporter/cloudwatch" // Import prometheus.exporter.cloudwatch
	_ "github.com/grafana/alloy/internal/component/prometheus/exporter/consul" // Import prometheus.exporter.consul
	_ "github.com/grafana/alloy/internal/component/prometheus/exporter/databricks" // Import prometheus.exporter.databricks
	_ "github.com/grafana/alloy/internal/component/prometheus/exporter/dnsmasq" // Import prometheus.exporter.dnsmasq
	_ "github.com/grafana/alloy/internal/component/prometheus/exporter/elasticsearch" // Import prometheus.exporter.elasticsearch
	_ "github.com/grafana/alloy/internal/component/prometheus/exporter/gcp" // Import prometheus.exporter.gcp
78 changes: 78 additions & 0 deletions internal/component/prometheus/exporter/databricks/databricks.go
@@ -0,0 +1,78 @@
package databricks

import (
	"time"

	"github.com/grafana/alloy/internal/component"
	"github.com/grafana/alloy/internal/component/prometheus/exporter"
	"github.com/grafana/alloy/internal/featuregate"
	"github.com/grafana/alloy/internal/static/integrations"
	"github.com/grafana/alloy/internal/static/integrations/databricks_exporter"
	"github.com/grafana/alloy/syntax/alloytypes"
	config_util "github.com/prometheus/common/config"
)

func init() {
	component.Register(component.Registration{
		Name:      "prometheus.exporter.databricks",
		Stability: featuregate.StabilityGenerallyAvailable,
		Args:      Arguments{},
		Exports:   exporter.Exports{},

		Build: exporter.New(createExporter, "databricks"),
	})
}

func createExporter(opts component.Options, args component.Arguments) (integrations.Integration, string, error) {
	a := args.(Arguments)
	defaultInstanceKey := opts.ID // if cannot resolve instance key, use the component ID
	return integrations.NewIntegrationWithInstanceKey(opts.Logger, a.Convert(), defaultInstanceKey)
}

// DefaultArguments holds the default settings for the databricks exporter.
var DefaultArguments = Arguments{
	QueryTimeout:        5 * time.Minute,
	BillingLookback:     24 * time.Hour,
	JobsLookback:        2 * time.Hour,
	PipelinesLookback:   2 * time.Hour,
	QueriesLookback:     1 * time.Hour,
	SLAThresholdSeconds: 3600,
	CollectTaskRetries:  false,
}

// Arguments controls the databricks exporter.
type Arguments struct {
	ServerHostname      string            `alloy:"server_hostname,attr"`
	WarehouseHTTPPath   string            `alloy:"warehouse_http_path,attr"`
	ClientID            string            `alloy:"client_id,attr"`
	ClientSecret        alloytypes.Secret `alloy:"client_secret,attr"`
	QueryTimeout        time.Duration     `alloy:"query_timeout,attr,optional"`
	BillingLookback     time.Duration     `alloy:"billing_lookback,attr,optional"`
	JobsLookback        time.Duration     `alloy:"jobs_lookback,attr,optional"`
	PipelinesLookback   time.Duration     `alloy:"pipelines_lookback,attr,optional"`
	QueriesLookback     time.Duration     `alloy:"queries_lookback,attr,optional"`
	SLAThresholdSeconds int               `alloy:"sla_threshold_seconds,attr,optional"`
	CollectTaskRetries  bool              `alloy:"collect_task_retries,attr,optional"`
}

// SetToDefault implements syntax.Defaulter.
func (a *Arguments) SetToDefault() {
	*a = DefaultArguments
}

// Convert maps the Alloy arguments onto the static integration's config.
func (a *Arguments) Convert() *databricks_exporter.Config {
	return &databricks_exporter.Config{
		ServerHostname:      a.ServerHostname,
		WarehouseHTTPPath:   a.WarehouseHTTPPath,
		ClientID:            a.ClientID,
		ClientSecret:        config_util.Secret(a.ClientSecret),
		QueryTimeout:        a.QueryTimeout,
		BillingLookback:     a.BillingLookback,
		JobsLookback:        a.JobsLookback,
		PipelinesLookback:   a.PipelinesLookback,
		QueriesLookback:     a.QueriesLookback,
		SLAThresholdSeconds: a.SLAThresholdSeconds,
		CollectTaskRetries:  a.CollectTaskRetries,
	}
}

107 changes: 107 additions & 0 deletions internal/component/prometheus/exporter/databricks/databricks_test.go
@@ -0,0 +1,107 @@
package databricks

import (
	"testing"
	"time"

	"github.com/grafana/alloy/internal/static/integrations/databricks_exporter"
	"github.com/grafana/alloy/syntax"
	"github.com/grafana/alloy/syntax/alloytypes"
	config_util "github.com/prometheus/common/config"
	"github.com/stretchr/testify/require"
)

func TestAlloyUnmarshal(t *testing.T) {
	alloyConfig := `
	server_hostname = "dbc-abc123.cloud.databricks.com"
	warehouse_http_path = "/sql/1.0/warehouses/xyz789"
	client_id = "my-client-id"
	client_secret = "my-client-secret"
	query_timeout = "10m"
	billing_lookback = "48h"
	jobs_lookback = "4h"
	pipelines_lookback = "4h"
	queries_lookback = "2h"
	sla_threshold_seconds = 7200
	collect_task_retries = true
	`

	var args Arguments
	err := syntax.Unmarshal([]byte(alloyConfig), &args)
	require.NoError(t, err)

	expected := Arguments{
		ServerHostname:      "dbc-abc123.cloud.databricks.com",
		WarehouseHTTPPath:   "/sql/1.0/warehouses/xyz789",
		ClientID:            "my-client-id",
		ClientSecret:        alloytypes.Secret("my-client-secret"),
		QueryTimeout:        10 * time.Minute,
		BillingLookback:     48 * time.Hour,
		JobsLookback:        4 * time.Hour,
		PipelinesLookback:   4 * time.Hour,
		QueriesLookback:     2 * time.Hour,
		SLAThresholdSeconds: 7200,
		CollectTaskRetries:  true,
	}

	require.Equal(t, expected, args)
}

func TestAlloyUnmarshal_Defaults(t *testing.T) {
	alloyConfig := `
	server_hostname = "dbc-abc123.cloud.databricks.com"
	warehouse_http_path = "/sql/1.0/warehouses/xyz789"
	client_id = "my-client-id"
	client_secret = "my-client-secret"
	`

	var args Arguments
	err := syntax.Unmarshal([]byte(alloyConfig), &args)
	require.NoError(t, err)

	// Check that defaults are applied
	require.Equal(t, 5*time.Minute, args.QueryTimeout)
	require.Equal(t, 24*time.Hour, args.BillingLookback)
	require.Equal(t, 2*time.Hour, args.JobsLookback)
	require.Equal(t, 2*time.Hour, args.PipelinesLookback)
	require.Equal(t, 1*time.Hour, args.QueriesLookback)
	require.Equal(t, 3600, args.SLAThresholdSeconds)
	require.False(t, args.CollectTaskRetries)
}

func TestConvert(t *testing.T) {
	alloyConfig := `
	server_hostname = "dbc-abc123.cloud.databricks.com"
	warehouse_http_path = "/sql/1.0/warehouses/xyz789"
	client_id = "my-client-id"
	client_secret = "my-client-secret"
	query_timeout = "10m"
	billing_lookback = "48h"
	jobs_lookback = "4h"
	pipelines_lookback = "4h"
	queries_lookback = "2h"
	sla_threshold_seconds = 7200
	collect_task_retries = true
	`
	var args Arguments
	err := syntax.Unmarshal([]byte(alloyConfig), &args)
	require.NoError(t, err)

	res := args.Convert()

	expected := databricks_exporter.Config{
		ServerHostname:      "dbc-abc123.cloud.databricks.com",
		WarehouseHTTPPath:   "/sql/1.0/warehouses/xyz789",
		ClientID:            "my-client-id",
		ClientSecret:        config_util.Secret("my-client-secret"),
		QueryTimeout:        10 * time.Minute,
		BillingLookback:     48 * time.Hour,
		JobsLookback:        4 * time.Hour,
		PipelinesLookback:   4 * time.Hour,
		QueriesLookback:     2 * time.Hour,
		SLAThresholdSeconds: 7200,
		CollectTaskRetries:  true,
	}
	require.Equal(t, expected, *res)
}
