-
Notifications
You must be signed in to change notification settings - Fork 1.3k
[DOC-1350][2025.2.4.0] Distributed tracing #32111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
ddhodge
wants to merge
6
commits into
yugabyte:master
Choose a base branch
from
ddhodge:doc_1350_distributedtracing
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+178
−0
Open
Changes from all commits
Commits
Show all changes
6 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
172 changes: 172 additions & 0 deletions
172
.../content/stable/launch-and-manage/monitor-and-alert/ysql-distributed-tracing.md
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,172 @@ | ||
| --- | ||
| title: YSQL Distributed Tracing | ||
| linkTitle: YSQL Distributed Tracing | ||
| headerTitle: YSQL Distributed Tracing | ||
| description: Export OpenTelemetry traces for YSQL query execution. | ||
| headcontent: Trace YSQL queries with OpenTelemetry | ||
| menu: | ||
| stable: | ||
| parent: monitor-and-alert | ||
| identifier: ysql-distributed-tracing | ||
| weight: 130 | ||
| type: docs | ||
| tags: | ||
| feature: tech-preview | ||
| rightNav: | ||
| hideH4: true | ||
| --- | ||
|
|
||
| YSQL queries spend time in many places: parsing, planning, execution, transaction commit, and RPC calls to tablet servers. YugabyteDB can export timing data for these stages as OpenTelemetry (OTel) traces so you can inspect a waterfall view of query execution in tools such as [Jaeger](/stable/integrations/jaeger/), [Grafana Tempo](https://grafana.com/oss/tempo/), or [Honeycomb](https://www.honeycomb.io). | ||
|
|
||
| Distributed Tracing is {{<tags/feature/tp>}} and available in v2025.2.4.0 and later, and is currently only available for YSQL. Tracing inside tablet servers is not included in this release. | ||
|
|
||
| ## How it works | ||
|
|
||
| YSQL Distributed Tracing follows the [W3C Trace Context](https://www.w3.org/TR/trace-context/) standard. Your application supplies a `traceparent` value. YugabyteDB creates spans for the query lifecycle and exports them to an OTel collector over OTLP/HTTP. | ||
|
|
||
| Each traced query produces a _trace_ made up of _spans_. Spans are nested to show where time is spent. For example, planning, execution, commit, and individual RPC calls to `PgClientService`. | ||
|
|
||
| When tracing is disabled (the default), there is no measurable performance impact. When tracing is enabled for a query, other queries and other YSQL backends are not affected. | ||
|
|
||
| ## Configure Distributed Tracing | ||
|
|
||
| To configure Distributed Tracing, set the following YB-TServer flags on each node in the cluster. Changing these flags requires a YB-TServer restart. | ||
|
|
||
| | Flag | Description | Default | | ||
| | :--- | :---------- | :------ | | ||
| | otel_collector_traces_endpoint | OTLP/HTTP URL where spans are exported. For example, `http://<collector-host>:4318/v1/traces`.<br>Setting this flag enables tracing infrastructure in each YSQL backend process. | Empty | | ||
| | otel_batch_max_queue_size | Maximum number of spans buffered in the queue. Spans beyond this limit are dropped. Must be greater than 0 and at least as large as `otel_batch_max_export_batch_size`. | `2048` | | ||
| | otel_batch_schedule_delay_ms | Milliseconds between batch exports. Lower values reduce export latency but increase export frequency. | `5000` | | ||
| | otel_batch_max_export_batch_size | Maximum spans per export batch. Must be greater than 0 and no larger than `otel_batch_max_queue_size`. | `512` | | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| - YSQL must be enabled on the cluster. | ||
| - The OTLP/HTTP endpoint must be reachable from each YB-TServer node. | ||
| - Because Distributed Tracing is a preview feature, add `otel_collector_traces_endpoint` to the [allowed_preview_flags_csv](../../../reference/configuration/yb-tserver/#allowed-preview-flags-csv) list before setting it. | ||
|
|
||
| ## Set up tracing | ||
|
|
||
| The following example uses [Jaeger](https://www.jaegertracing.io/) as the trace backend. Jaeger accepts OTLP over HTTP on port 4318. | ||
|
|
||
| ### Start Jaeger | ||
|
|
||
| Run Jaeger all-in-one with OTLP enabled: | ||
|
|
||
| ```sh | ||
| docker run --rm --name jaeger \ | ||
| -e COLLECTOR_OTLP_ENABLED=true \ | ||
| -p 16686:16686 \ | ||
| -p 4318:4318 \ | ||
| jaegertracing/all-in-one:1.61 | ||
| ``` | ||
|
|
||
| Open the Jaeger UI at [http://localhost:16686](http://localhost:16686). | ||
|
|
||
| For a local single-node cluster started with yugabyted: | ||
|
|
||
| ```sh | ||
| ./bin/yugabyted start \ | ||
| --tserver_flags "allowed_preview_flags_csv=otel_collector_traces_endpoint,otel_collector_traces_endpoint=http://127.0.0.1:4318/v1/traces" | ||
| ``` | ||
|
|
||
| If you use YugabyteDB Anywhere, set the flags using [Edit configuration flags](../../../yugabyte-platform/manage-deployments/edit-config-flags/#modify-configuration-flags). | ||
|
ddhodge marked this conversation as resolved.
|
||
|
|
||
| {{< note title="Note" >}} | ||
|
|
||
| If `otel_collector_traces_endpoint` is not set, attempting to use the [yb_dist_tracecontext](#configuration-parameter-per-session-or-transaction) configuration parameter returns an error indicating that Distributed Tracing is not enabled. | ||
|
|
||
| {{</note >}} | ||
|
|
||
| ### Trace a query | ||
|
|
||
| After the cluster is running with tracing configured, enable tracing for individual queries using one of the following methods. | ||
|
|
||
| #### SQL comment (per query) | ||
|
|
||
| Prepend or append a block comment that contains a W3C `traceparent` value. The comment must be the _first_ block comment at the start of the query, or the _last_ block comment at the end of the query. If the comment is not the first or last, the `traceparent` is not parsed. | ||
|
|
||
| ```sql | ||
| /*traceparent='00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01'*/ | ||
| SELECT * FROM orders WHERE customer_id = 500; | ||
| ``` | ||
|
|
||
| You can also place the comment at the end of the statement: | ||
|
|
||
| ```sql | ||
| SELECT * FROM orders WHERE customer_id = 500 | ||
| /*traceparent='00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01'*/; | ||
| ``` | ||
|
|
||
| #### Configuration parameter (per session or transaction) | ||
|
ddhodge marked this conversation as resolved.
|
||
|
|
||
| Set the `yb_dist_tracecontext` YSQL [configuration parameter](../../../reference/configuration/yb-tserver/#postgresql-configuration-parameters). | ||
|
|
||
| For example, to trace every query in the current transaction: | ||
|
|
||
| ```sql | ||
| BEGIN; | ||
| SET LOCAL yb_dist_tracecontext = 'traceparent=''00-00000000000000000000000000000001-0000000000000005-01'''; | ||
|
ddhodge marked this conversation as resolved.
|
||
| SELECT * FROM users WHERE id = 10; | ||
| UPDATE accounts SET balance = balance - 100 WHERE id = 10; | ||
| COMMIT; | ||
| ``` | ||
|
|
||
| To stop tracing for the session: | ||
|
|
||
| ```sql | ||
| RESET yb_dist_tracecontext; | ||
| ``` | ||
|
|
||
| {{< note title="Parameter takes precedence" >}} | ||
|
|
||
| If you provide both a `traceparent` parameter _and_ SQL comment, the parameter takes priority and a warning is emitted. | ||
|
|
||
| {{< /note >}} | ||
|
|
||
| ### View traces | ||
|
|
||
| Run a traced query, wait a few seconds for spans to be batched and exported (controlled by `otel_batch_schedule_delay_ms`), then open the Jaeger UI. | ||
|
|
||
| 1. Select **Service** `ysql`. | ||
| 2. Click **Find Traces**. | ||
| 3. Open a trace to see spans such as `query`, `parse`, `plan`, `execute`, `commit`, and RPC spans. | ||
|
|
||
| ## Trace data | ||
|
|
||
| ### Process attributes | ||
|
|
||
| Each YSQL backend exports process metadata with every trace: | ||
|
|
||
| | Attribute | Description | | ||
| | :-------- | :---------- | | ||
| | service.name | Always `ysql`. | | ||
| | service.instance.id | UUID of the YB-TServer node running the YSQL backend. | | ||
| | process.pid | Operating-system PID of the YSQL backend process. | | ||
|
|
||
| ### Span attributes | ||
|
|
||
| Every span includes standard OpenTelemetry fields such as operation name, span ID, parent span ID, trace ID, start time, duration, span kind, and status. | ||
|
|
||
| Additional attributes depend on the span type: | ||
|
|
||
| | Span | Additional attributes | | ||
| | :--- | :-------------------- | | ||
| | Root (`query`) | `query.text` (truncated to 256 characters), `user.id` (PostgreSQL role OID) | | ||
| | RPC (for example, `rpc yb.tserver.PgClientService.Perform`) | `rpc.table_names` — tables accessed by the RPC | | ||
|
|
||
| Typical span names include: | ||
|
|
||
| - Query lifecycle: `query`, `parse`, `rewrite`, `plan`, `execute`, `commit`, `abort` | ||
| - Extended query protocol: `ext.parse`, `ext.bind`, `ext.execute`, `ext.sync`, `ext.describe`, `ext.flush` | ||
| - RPC calls: `rpc yb.tserver.PgClientService.Perform` and related RPC operation names | ||
|
|
||
| ## Join application traces | ||
|
|
||
| Because YugabyteDB accepts W3C `traceparent` values, you can continue a trace started in application code. Pass the same `traceparent` in a SQL comment or `yb_dist_tracecontext` parameter so database spans appear as children of your application's root span in your observability backend. | ||
|
|
||
| ## Related topics | ||
|
|
||
| - [Monitor with Active Session History](../active-session-history-monitor/) - sample-based view of database wait events | ||
| - [Query tuning](../query-tuning/) - optimize query performance with EXPLAIN, pg_stat_statements, and related tools | ||
| - [Jaeger integration](../../../integrations/jaeger/) - use YCQL as Jaeger trace storage (separate from YSQL query tracing) | ||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.