Commit 59e1e10

dustin-temporal, claude, and lennessyy
authored
Move SDK metrics setup to dedicated page (#4327)
* Move SDK metrics setup instructions to dedicated page

  Extracts SDK metrics content from the Prometheus Grafana page into a new standalone page at /cloud/metrics/sdk-metrics-setup, peer to OpenMetrics and PromQL in the sidebar. The existing page now covers only Temporal Cloud metrics. The new page links to SDK dev guides, metrics samples for all 5 languages, and the SDK metrics reference.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Cleanup language
* Clarify SDK metrics set up
* tighten up SDK metrics setup
* docs: edit heading
* docs: edit title

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Lenny Chen <lenny.chen@temporal.io>
Co-authored-by: Lenny Chen <55669665+lennessyy@users.noreply.github.com>
1 parent e3cd4d3 commit 59e1e10

4 files changed

Lines changed: 171 additions & 208 deletions

docs/cloud/metrics/index.mdx

Lines changed: 2 additions & 0 deletions
@@ -45,3 +45,5 @@ Cloud Metrics for all Namespaces in your account are available from two sources:
4545
OpenMetrics is the recommended option for most users.
4646

4747
:::
48+
49+
For setting up SDK metrics emitted by your Workers and Clients, see [SDK metrics setup](/cloud/metrics/sdk-metrics-setup).

docs/cloud/metrics/prometheus-grafana.mdx

Lines changed: 21 additions & 208 deletions
@@ -2,25 +2,21 @@
22
id: prometheus-grafana
33
title: Prometheus Grafana setup
44
sidebar_label: Prometheus Grafana
5-
description: Set up Grafana with Temporal Cloud observability to monitor performance and troubleshoot errors. Use Prometheus API endpoints and SDK metrics for efficient, real-time insights.
5+
description: Set up Grafana with Temporal Cloud observability to monitor performance and troubleshoot errors using the Prometheus HTTP API endpoint.
66
slug: /cloud/metrics/prometheus-grafana
77
toc_max_heading_level: 4
88
keywords:
99
- grafana temporal integration
1010
- temporal cloud observability
1111
- prometheus temporal cloud
12-
- temporal sdk metrics
1312
- grafana prometheus setup
1413
- temporal cloud grafana dashboard
15-
- prometheus scrape endpoint
1614
- grafana data source setup
17-
- temporal sdk monitoring
1815
- grafana monitoring workflows
1916
- prometheus metrics visualization
2017
- observability with grafana
2118
- workflow metrics grafana
2219
- grafana prometheus integration
23-
- temporal sdk metrics setup
2420
tags:
2521
- Metrics
2622
- Observability
@@ -29,23 +25,21 @@ tags:
2925

3026
import { ZoomingImage } from '@site/src/components';
3127

32-
**How to set up Grafana with Temporal Cloud observability to view metrics.**
33-
34-
Temporal Cloud and SDKs generate metrics for monitoring performance and troubleshooting errors.
28+
**How to set up Grafana with Temporal Cloud PromQL endpoint to view Cloud metrics.**
3529

3630
Temporal Cloud emits metrics through a [Prometheus HTTP API endpoint](https://prometheus.io/docs/prometheus/latest/querying/api/), which can be directly used as a Prometheus data source in Grafana or to query and export Cloud metrics to any observability platform.
3731

38-
The open-source SDKs require you to set up a Prometheus scrape endpoint for Prometheus to collect and aggregate the Worker and Client metrics.
32+
:::note
33+
34+
For setting up SDK metrics (emitted by your Workers and Clients), see [SDK metrics setup](/cloud/metrics/sdk-metrics-setup).
3935

40-
This section describes how to set up your Temporal Cloud and SDK metrics and use them as data sources in Grafana.
36+
:::
4137

42-
The process for setting up observability includes the following steps:
38+
The process for setting up Temporal Cloud PromQL to work with Grafana includes the following steps:
4339

44-
1. Create or get your Prometheus endpoint for Temporal Cloud metrics and enable SDK metrics.
45-
- For Temporal Cloud, [generate a Prometheus HTTP API endpoint](/cloud/metrics/general-setup) on Temporal Cloud using valid certificates.
46-
- For SDKs, [expose a metrics endpoint](#sdk-metrics-setup) where Prometheus can scrape SDK metrics and [run Prometheus](#prometheus-configuration) on your host. The examples in this article describe running Prometheus on your local machine where you run your application code.
47-
2. Run Grafana and [set up data sources for Temporal Cloud and SDK metrics](#grafana-data-sources-configuration) in Grafana. The examples in this article describe running Grafana on your local host where you run your application code.
48-
3. [Create dashboards](#grafana-dashboards-setup) in Grafana to view Temporal Cloud metrics and SDK metrics. Temporal provides [sample community-driven Grafana dashboards](https://github.com/temporalio/dashboards) for Cloud and SDK metrics that you can use and customize according to your requirements.
40+
1. [Generate a Prometheus HTTP API endpoint](/cloud/metrics/general-setup) on Temporal Cloud using valid certificates.
41+
2. Run Grafana and [set up a data source for Temporal Cloud metrics](#grafana-data-source-configuration) in Grafana.
42+
3. [Create dashboards](#grafana-dashboards-setup) in Grafana to view Temporal Cloud metrics. Temporal provides [sample community-driven Grafana dashboards](https://github.com/temporalio/dashboards) for Cloud metrics that you can use and customize according to your requirements.
4943

5044
If you're following through with the examples provided here, ensure that you have the following:
5145

@@ -59,7 +53,7 @@ If you're following through with the examples provided here, ensure that you hav
5953
- [TypeScript](/develop/typescript/core-application#connect-to-temporal-cloud)
6054
- [.NET](/develop/dotnet/temporal-client#connect-to-temporal-cloud)
6155

62-
- Prometheus and Grafana installed.
56+
- Grafana installed.
6357

6458
## Temporal Cloud metrics setup
6559

@@ -84,172 +78,16 @@ The following steps describe how to set up Observability on Temporal Cloud to ge
8478
6. Copy the HTTP API endpoint that is generated (it is shown in the UI).
8579

8680
This endpoint should be configured as a data source for Temporal Cloud metrics in Grafana.
87-
See [Data sources configuration for Temporal Cloud and SDK metrics in Grafana](#grafana-data-sources-configuration) for details.
88-
89-
## SDK metrics setup
90-
91-
SDK metrics are emitted by SDK Clients used to start your Workers and to start, signal, or query your Workflow Executions.
92-
You must configure a Prometheus scrape endpoint for Prometheus to collect and aggregate your SDK metrics.
93-
Each language development guide has details on how to set this up.
94-
95-
- [Go SDK](/develop/go/observability#metrics)
96-
- [Java SDK](/develop/java/observability#metrics)
97-
- [TypeScript SDK](/develop/typescript/observability#metrics)
98-
- [Python](/develop/python/observability#metrics)
99-
- [.NET](/develop/dotnet/observability#metrics)
100-
101-
The following example uses the Java SDK to set the Prometheus registry and Micrometer stats reporter, set the scope, and expose an endpoint from which Prometheus can scrape the SDK metrics.
102-
103-
```java
104-
//You need the following packages to set up metrics in Java.
105-
//See the Developer's guide for packages required for other SDKs.
106-
107-
//
108-
import com.sun.net.httpserver.HttpServer;
109-
import com.uber.m3.tally.RootScopeBuilder;
110-
import com.uber.m3.tally.Scope;
111-
import com.uber.m3.util.Duration;
112-
import com.uber.m3.util.ImmutableMap;
113-
114-
import io.micrometer.prometheus.PrometheusConfig;
115-
import io.micrometer.prometheus.PrometheusMeterRegistry;
116-
import io.temporal.common.reporter.MicrometerClientStatsReporter;
117-
118-
import java.io.IOException;
119-
import java.io.OutputStream;
120-
import java.net.InetSocketAddress;
121-
122-
import io.temporal.serviceclient.SimpleSslContextBuilder;
123-
import io.temporal.serviceclient.WorkflowServiceStubs;
124-
import io.temporal.serviceclient.WorkflowServiceStubsOptions;
125-
126-
import java.io.FileInputStream;
127-
import java.io.InputStream;
128-
//
129-
{
130-
// See the Micrometer documentation for configuration details on other supported monitoring systems.
131-
// Set up the Prometheus registry.
132-
PrometheusMeterRegistry yourRegistry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
133-
134-
public static Scope yourScope(){
135-
//Set up a scope, report every 10 seconds
136-
Scope yourScope = new RootScopeBuilder()
137-
.tags(ImmutableMap.of(
138-
"customtag1",
139-
"customvalue1",
140-
"customtag2",
141-
"customvalue2"))
142-
.reporter(new MicrometerClientStatsReporter(yourRegistry))
143-
.reportEvery(Duration.ofSeconds(10));
144-
145-
//Start Prometheus scrape endpoint at port 8077 on your local host
146-
HttpServer scrapeEndpoint = startPrometheusScrapeEndpoint(yourRegistry, 8077);
147-
return yourScope;
148-
}
149-
150-
/**
151-
* Starts HttpServer to expose a scrape endpoint. See
152-
* https://micrometer.io/docs/registry/prometheus for more info.
153-
*/
154-
155-
public static HttpServer startPrometheusScrapeEndpoint(
156-
PrometheusMeterRegistry yourRegistry, int port) {
157-
try {
158-
HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
159-
server.createContext(
160-
"/metrics",
161-
httpExchange -> {
162-
String response = registry.scrape();
163-
httpExchange.sendResponseHeaders(200, response.getBytes(UTF_8).length);
164-
try (OutputStream os = httpExchange.getResponseBody()) {
165-
os.write(response.getBytes(UTF_8));
166-
}
167-
});
168-
server.start();
169-
return server;
170-
} catch (IOException e) {
171-
throw new RuntimeException(e);
172-
}
173-
}
174-
}
175-
176-
//
177-
178-
// With your scrape endpoint configured, set the metrics scope in your Workflow service stub and
179-
// use it to create a Client to start your Workers and Workflow Executions.
180-
181-
//
182-
{
183-
//Create Workflow service stubs to connect to the Frontend Service.
184-
WorkflowServiceStubs service = WorkflowServiceStubs.newServiceStubs(
185-
WorkflowServiceStubsOptions.newBuilder()
186-
.setMetricsScope(yourScope()) //set the metrics scope for the WorkflowServiceStubs
187-
.build());
188-
189-
//Create a Workflow service client, which can be used to start, signal, and query Workflow Executions.
190-
WorkflowClient yourClient = WorkflowClient.newInstance(service,
191-
WorkflowClientOptions.newBuilder().build());
192-
}
193-
194-
//
195-
```
196-
197-
To check whether your scrape endpoints are emitting metrics, run your code and go to [http://localhost:8077/metrics](http://localhost:8077/metrics) to verify that you see the SDK metrics.
198-
199-
You can set up separate scrape endpoints in your Clients that you use to start your Workers and Workflow Executions.
200-
201-
For more examples on setting metrics endpoints in other SDKs, see the metrics samples:
202-
203-
- [Java SDK Samples](https://github.com/temporalio/samples-java/tree/main/core/src/main/java/io/temporal/samples/metrics)
204-
- [Go SDK Samples](https://github.com/temporalio/samples-go/tree/main/metrics)
205-
206-
## SDK metrics Prometheus Configuration {#prometheus-configuration}
207-
208-
**How to configure Prometheus to ingest Temporal SDK metrics.**
209-
210-
For Temporal SDKs, you must have Prometheus running and configured to listen on the scrape endpoints exposed in your application code.
211-
212-
For this example, you can run Prometheus locally or as a Docker container.
213-
In either case, ensure that you set the listen targets to the ports where you expose your scrape endpoints.
214-
When you run Prometheus locally, set your target address to port 8077 in your Prometheus configuration YAML file. (We set the scrape endpoint to port 8077 in the [SDK metrics setup](#sdk-metrics-setup) example.)
215-
216-
Example:
217-
218-
```yaml
219-
global:
220-
scrape_interval: 10s # Set the scrape interval to every 10 seconds. Default is every 1 minute.
221-
#...
222-
223-
# Set your scrape configuration targets to the ports exposed on your endpoints in the SDK.
224-
scrape_configs:
225-
- job_name: 'temporalsdkmetrics'
226-
metrics_path: /metrics
227-
scheme: http
228-
static_configs:
229-
- targets:
230-
# This is the scrape endpoint where Prometheus listens for SDK metrics.
231-
- localhost:8077
232-
# You can have multiple targets here, provided they are set up in your application code.
233-
```
234-
235-
See the [Prometheus documentation](https://prometheus.io/docs/introduction/first_steps/) for more details on how you can run Prometheus locally or using Docker.
236-
237-
Note that Temporal Cloud exposes metrics through a [Prometheus HTTP API endpoint](https://prometheus.io/docs/prometheus/latest/querying/api/) (not a scrape endpoint) that can be configured as a data source in Grafana.
238-
The Prometheus configuration described here is for scraping metrics data on endpoints for SDK metrics only.
239-
240-
To check whether Prometheus is receiving metrics from your SDK target, go to [http://localhost:9090](http://localhost:9090) and navigate to **Status&nbsp;> Targets**.
241-
The status of your target endpoint defined in your configuration appears here.
242-
243-
## Grafana data sources configuration {#grafana-data-sources-configuration}
81+
See [Grafana data source configuration](#grafana-data-source-configuration) for details.
24482

245-
**How to configure data sources for Temporal Cloud and SDK metrics in Grafana.**
83+
## Grafana data source configuration {#grafana-data-source-configuration}
84+
85+
**How to configure the Temporal Cloud metrics data source in Grafana.**
24686

24787
Depending on how you use Grafana, you can either install and run it locally, run it as a Docker container, or log in to Grafana Cloud to set up your data sources.
24888

24989
If you have installed and are running Grafana locally, go to [http://localhost:3000](http://localhost:3000) and sign in.
25090

251-
You must configure your Temporal Cloud and SDK metrics data sources separately in Grafana.
252-
25391
To add the Temporal Cloud Prometheus HTTP API endpoint that we generated in the [Temporal Cloud metrics setup](/cloud/metrics/general-setup) section, do the following:
25492

25593
1. Go to **Configuration&nbsp;> Data sources**.
@@ -266,52 +104,27 @@ To add the Temporal Cloud Prometheus HTTP API endpoint that we generated in the
266104

267105
If you see issues in setting this data source, verify your CA certificate chain and ensure that you are setting the correct certificates in your Temporal Cloud observability setup and in the TLS authentication in Grafana.
268106

269-
To add the SDK metrics Prometheus endpoint that we configured in the [SDK metrics setup](#sdk-metrics-setup) and [Prometheus configuration for SDK metrics](#prometheus-configuration) sections, do the following:
270-
271-
1. Go to **Configuration&nbsp;> Data sources**.
272-
2. Select **Add data source&nbsp;> Prometheus**.
273-
3. Enter a name for your Temporal Cloud metrics data source, such as _Temporal SDK metrics_.
274-
4. In the **HTTP** section, enter your Prometheus endpoint in the URL field.
275-
If running Prometheus locally as described in the examples in this article, enter `http://localhost:9090`.
276-
5. For this example, enable **Skip TLS Verify** in the **Auth** section.
277-
6. Click **Save and test** to verify that the data source is working.
278-
279-
If you see issues in setting this data source, check whether the endpoints set in your SDKs are showing metrics.
280-
If you don't see your SDK metrics at the scrape endpoints defined, check whether your Workers and Workflow Executions are running.
281-
If you see metrics on the scrape endpoints, but Prometheus shows your targets are down, then there is an issue with connecting to the targets set in your SDKs.
282-
Verify your Prometheus configuration and restart Prometheus.
283-
284-
If you're running Grafana as a container, you can set your SDK metrics Prometheus data source in your Grafana configuration.
285-
See the example Grafana configuration described in the [Prometheus and Grafana setup for open-source Temporal Service](/self-hosted-guide/monitoring#grafana) article.
286-
287107
### Grafana dashboards setup
288108

289109
To set up dashboards in Grafana, you can use the UI or configure them directly in your Grafana deployment.
290110

291111
:::tip
292112

293-
Temporal provides community-driven example dashboards for [Temporal Cloud](https://github.com/temporalio/dashboards/tree/master/cloud) and [Temporal SDKs](https://github.com/temporalio/dashboards/tree/master/sdk) that you can customize to meet your needs.
113+
Temporal provides community-driven [example dashboards for Temporal Cloud](https://github.com/temporalio/dashboards/tree/master/cloud) that you can customize to meet your needs.
294114

295115
:::
296116

297117
To import a dashboard in Grafana:
298118

299119
1. In the left-hand navigation bar, select **Dashboards** > **Import dashboard**.
300-
2. You can either copy and paste the JSON from the [Temporal Cloud](https://github.com/temporalio/dashboards/tree/master/cloud) and [Temporal SDK](https://github.com/temporalio/dashboards/tree/master/sdk) sample dashboards, or import the JSON files into Grafana.
120+
2. You can either copy and paste the JSON from the [Temporal Cloud sample dashboards](https://github.com/temporalio/dashboards/tree/master/cloud), or import the JSON files into Grafana.
301121
3. Save the dashboard and review the metrics data in the graphs.
302122

303123
To configure dashboards with the UI:
304124

305125
1. Go to **Create > Dashboard** and add an empty panel.
306-
2. On the **Panel configuration** page, in the **Query** tab, select the "Temporal Cloud metrics" or "Temporal SDK metrics" data source that you configured earlier.
307-
If you need to add multiple queries from both data sources, choose `–Mixed–`.
308-
3. Add your metrics queries:
309-
- For Temporal Cloud metrics, expand the **Metrics browser** and select the metrics you want.
310-
You can also select associated labels and values to sort the query data.
311-
The [Cloud metrics documentation](/cloud/metrics/reference) lists all metrics emitted from Temporal Cloud.
312-
- For Temporal SDK metrics, expand the **Metrics browser** and select the metrics you want.
313-
A list of Worker performance metrics is described in the [Developer's Guide - Worker performance](/develop/worker-performance).
314-
All SDK-related metrics are listed in the [SDK metrics](/references/sdk-metrics) reference.
126+
2. On the **Panel configuration** page, in the **Query** tab, select the "Temporal Cloud metrics" data source that you configured earlier.
127+
3. Expand the **Metrics browser** and select the metrics you want.
128+
You can also select associated labels and values to sort the query data.
129+
The [PromQL documentation](/cloud/metrics/reference) lists all metrics emitted from PromQL in Temporal Cloud.
315130
4. The graph should now display data based on your selected queries.
316-
Note that SDK metrics will only show if you have Workflow Execution data and running Workers.
317-
If you don't see SDK metrics, run your Worker and Workflow Executions, then monitor the dashboard.
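
The retained page states that the Prometheus HTTP API endpoint can be queried directly as well as through Grafana. The following is a minimal Python sketch of such a query, not a definitive client. It assumes the endpoint accepts the standard Prometheus instant-query path (`/api/v1/query`) and authenticates with a client certificate and key. The endpoint URL and the helper names `build_instant_query_url` and `fetch_instant_query` are hypothetical placeholders.

```python
import json
import ssl
import urllib.request
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_instant_query_url(endpoint: str, promql: str) -> str:
    """Build a Prometheus HTTP API instant-query URL from a base endpoint.

    `endpoint` is the HTTP API endpoint copied from the Temporal Cloud UI.
    """
    parts = urlsplit(endpoint)
    # Append the standard instant-query path to whatever base path exists.
    path = parts.path.rstrip("/") + "/api/v1/query"
    query = urlencode({"query": promql})
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))


def fetch_instant_query(endpoint: str, promql: str, cert_file: str, key_file: str) -> dict:
    """Run one instant query, presenting a client certificate (assumed mTLS)."""
    context = ssl.create_default_context()
    # Assumption: the same certificate pair configured for observability
    # in Temporal Cloud is available locally as PEM files.
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    url = build_instant_query_url(endpoint, promql)
    with urllib.request.urlopen(url, context=context, timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical endpoint; substitute the value shown in your account.
    print(build_instant_query_url("https://example-account.tmprl.cloud/prometheus", "up"))
```

To run a real query, call `fetch_instant_query` with your own endpoint, a PromQL expression built from a metric name listed in the Cloud metrics reference, and the paths to your certificate and key files.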
