diff --git a/.chloggen/rpc-duration-seconds.yaml b/.chloggen/rpc-duration-seconds.yaml new file mode 100644 index 0000000000..2fd9e6fd2b --- /dev/null +++ b/.chloggen/rpc-duration-seconds.yaml @@ -0,0 +1,6 @@ +change_type: breaking +component: rpc +note: Rename `rpc.client|server.duration` to `rpc.client|server.call.duration`, + change RPC duration metrics from milliseconds to seconds, + and clarify metric and span duration semantics for streaming. +issues: [383, 2961] diff --git a/docs/rpc/connect-rpc.md b/docs/rpc/connect-rpc.md index 350b56bead..aec9cfdf6f 100644 --- a/docs/rpc/connect-rpc.md +++ b/docs/rpc/connect-rpc.md @@ -12,7 +12,7 @@ described on this page. ## Client Span - + @@ -163,7 +163,7 @@ the `rpc.connect_rpc.response.metadata.my-custom-key` attribute with value `["at ## Server Span - + diff --git a/docs/rpc/grpc.md b/docs/rpc/grpc.md index 248033dbb9..17d3abaf26 100644 --- a/docs/rpc/grpc.md +++ b/docs/rpc/grpc.md @@ -12,7 +12,7 @@ described on this page. ## Client Span - + @@ -164,7 +164,7 @@ the `rpc.grpc.response.metadata.my-custom-key` attribute with value `["attribute ## Server Span - + diff --git a/docs/rpc/json-rpc.md b/docs/rpc/json-rpc.md index 82e81bde3f..b54f8ac852 100644 --- a/docs/rpc/json-rpc.md +++ b/docs/rpc/json-rpc.md @@ -12,7 +12,7 @@ described on this page. ## Client Span - + @@ -126,7 +126,7 @@ different processes could be listening on TCP port 12345 and UDP port 12345. ## Server Span - + diff --git a/docs/rpc/rpc-metrics.md b/docs/rpc/rpc-metrics.md index a65b7dc0e9..8e6009b05a 100644 --- a/docs/rpc/rpc-metrics.md +++ b/docs/rpc/rpc-metrics.md @@ -16,11 +16,11 @@ metrics can be filtered for finer grain analysis. 
- [Metric instruments](#metric-instruments) - [RPC server](#rpc-server) - - [Metric: `rpc.server.duration`](#metric-rpcserverduration) + - [Metric: `rpc.server.call.duration`](#metric-rpcservercallduration) - [Metric: `rpc.server.request.size`](#metric-rpcserverrequestsize) - [Metric: `rpc.server.response.size`](#metric-rpcserverresponsesize) - [RPC client](#rpc-client) - - [Metric: `rpc.client.duration`](#metric-rpcclientduration) + - [Metric: `rpc.client.call.duration`](#metric-rpcclientcallduration) - [Metric: `rpc.client.request.size`](#metric-rpcclientrequestsize) - [Metric: `rpc.client.response.size`](#metric-rpcclientresponsesize) - [Semantic Conventions for specific RPC technologies](#semantic-conventions-for-specific-rpc-technologies) @@ -63,11 +63,15 @@ MUST be of the specified type and units. Below is a list of RPC server metric instruments. -#### Metric: `rpc.server.duration` +#### Metric: `rpc.server.call.duration` This metric is [recommended][MetricRecommended]. - +This metric SHOULD be specified with +[`ExplicitBucketBoundaries`](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.50.0/specification/metrics/api.md#instrument-advisory-parameters) +of `[ 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10 ]`. + + @@ -76,12 +80,10 @@ This metric is [recommended][MetricRecommended]. | Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations | | -------- | --------------- | ----------- | -------------- | --------- | ------ | -| `rpc.server.duration` | Histogram | `ms` | Measures the duration of inbound RPC. [1] | ![Development](https://img.shields.io/badge/-development-blue) | | - -**[1]:** While streaming RPCs may record this metric as start-of-batch -to end-of-batch, it's hard to interpret in practice. +| `rpc.server.call.duration` | Histogram | `s` | Measures the duration of inbound remote procedure calls (RPC). 
[1] | ![Development](https://img.shields.io/badge/-development-blue) | | -**Streaming**: N/A. +**[1]:** When this metric is reported alongside an RPC server span, the metric value +SHOULD be the same as the RPC server span duration. **Attributes:** @@ -369,13 +371,16 @@ different processes could be listening on TCP port 12345 and UDP port 12345. ### RPC client Below is a list of RPC client metric instruments. -These apply to traditional RPC usage, not streaming RPCs. -#### Metric: `rpc.client.duration` +#### Metric: `rpc.client.call.duration` This metric is [recommended][MetricRecommended]. - +This metric SHOULD be specified with +[`ExplicitBucketBoundaries`](https://github.com/open-telemetry/opentelemetry-specification/blob/v1.50.0/specification/metrics/api.md#instrument-advisory-parameters) +of `[ 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10 ]`. + + @@ -384,12 +389,10 @@ This metric is [recommended][MetricRecommended]. | Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations | | -------- | --------------- | ----------- | -------------- | --------- | ------ | -| `rpc.client.duration` | Histogram | `ms` | Measures the duration of outbound RPC. [1] | ![Development](https://img.shields.io/badge/-development-blue) | | - -**[1]:** While streaming RPCs may record this metric as start-of-batch -to end-of-batch, it's hard to interpret in practice. +| `rpc.client.call.duration` | Histogram | `s` | Measures the duration of outbound remote procedure calls (RPC). [1] | ![Development](https://img.shields.io/badge/-development-blue) | | -**Streaming**: N/A. +**[1]:** When this metric is reported alongside an RPC client span, the metric value +SHOULD be the same as the RPC client span duration. 
**Attributes:** diff --git a/docs/rpc/rpc-spans.md b/docs/rpc/rpc-spans.md index c66bd2eaf1..c67565f219 100644 --- a/docs/rpc/rpc-spans.md +++ b/docs/rpc/rpc-spans.md @@ -86,7 +86,7 @@ Generally, a user SHOULD NOT set `peer.service` to a fully qualified RPC service ### RPC client span - + @@ -97,6 +97,16 @@ Generally, a user SHOULD NOT set `peer.service` to a fully qualified RPC service This span represents an outgoing Remote Procedure Call (RPC). +RPC client spans SHOULD cover the entire client-side lifecycle of an RPC, +starting when the RPC is initiated and ending when the response is received +or the RPC is terminated due to an error or cancellation. + +For streaming RPCs, the span covers the full lifetime of the request and/or +response streams until they are closed or terminated. + +If a transient issue happened and was retried within this RPC, the corresponding +span SHOULD cover the duration of the logical call with all retries. + **Span name:** refer to the [Span Name](/docs/rpc/rpc-spans.md#span-name) section. **Span kind** MUST be `CLIENT`. @@ -212,7 +222,7 @@ different processes could be listening on TCP port 12345 and UDP port 12345. ### RPC server span - + @@ -223,6 +233,13 @@ different processes could be listening on TCP port 12345 and UDP port 12345. This span represents an incoming Remote Procedure Call (RPC). +RPC server spans SHOULD cover the entire server-side lifecycle of an RPC, +starting when the request is received and ending when the response is sent +or the RPC is terminated due to an error or cancellation. + +For streaming RPCs, the span SHOULD cover the full lifetime of the request +and/or response streams until they are closed or terminated. + **Span name:** refer to the [Span Name](/docs/rpc/rpc-spans.md#span-name) section. **Span kind** MUST be `SERVER`. 
diff --git a/model/rpc/deprecated/metrics-deprecated.yaml b/model/rpc/deprecated/metrics-deprecated.yaml
index 8716965835..062aad5a6f 100644
--- a/model/rpc/deprecated/metrics-deprecated.yaml
+++ b/model/rpc/deprecated/metrics-deprecated.yaml
@@ -74,3 +74,43 @@ groups:
     deprecated:
       reason: obsoleted
       note: Removed, no replacement at this time.
+
+  - id: metric.rpc.server.duration
+    type: metric
+    metric_name: rpc.server.duration
+    annotations:
+      code_generation:
+        metric_value_type: double
+    stability: development
+    instrument: histogram
+    unit: "ms"
+    note: |
+      While streaming RPCs may record this metric as start-of-batch
+      to end-of-batch, it's hard to interpret in practice.
+
+      **Streaming**: N/A.
+    extends: attributes.metrics.rpc.server
+    deprecated:
+      reason: uncategorized
+      note: Replaced by `rpc.server.call.duration` with unit `s`.
+    brief: "Deprecated, use `rpc.server.call.duration` instead. Note: the unit also changed from `ms` to `s`."
+
+  - id: metric.rpc.client.duration
+    type: metric
+    metric_name: rpc.client.duration
+    annotations:
+      code_generation:
+        metric_value_type: double
+    stability: development
+    instrument: histogram
+    unit: "ms"
+    note: |
+      While streaming RPCs may record this metric as start-of-batch
+      to end-of-batch, it's hard to interpret in practice.
+
+      **Streaming**: N/A.
+    extends: attributes.metrics.rpc.client
+    deprecated:
+      reason: uncategorized
+      note: Replaced by `rpc.client.call.duration` with unit `s`.
+    brief: "Deprecated, use `rpc.client.call.duration` instead. Note: the unit also changed from `ms` to `s`."
diff --git a/model/rpc/metrics.yaml b/model/rpc/metrics.yaml index 589afcb123..ba4f0ce603 100644 --- a/model/rpc/metrics.yaml +++ b/model/rpc/metrics.yaml @@ -1,20 +1,19 @@ groups: # RPC Server metrics - - id: metric.rpc.server.duration + - id: metric.rpc.server.call.duration type: metric - metric_name: rpc.server.duration + metric_name: rpc.server.call.duration annotations: code_generation: metric_value_type: double stability: development - brief: Measures the duration of inbound RPC. + brief: Measures the duration of inbound remote procedure calls (RPC). instrument: histogram - unit: "ms" + unit: "s" note: | - While streaming RPCs may record this metric as start-of-batch - to end-of-batch, it's hard to interpret in practice. + When this metric is reported alongside an RPC server span, the metric value + SHOULD be the same as the RPC server span duration. - **Streaming**: N/A. extends: attributes.metrics.rpc.server - id: metric.rpc.server.request.size @@ -47,21 +46,19 @@ groups: # RPC Client metrics - - id: metric.rpc.client.duration + - id: metric.rpc.client.call.duration type: metric - metric_name: rpc.client.duration + metric_name: rpc.client.call.duration annotations: code_generation: metric_value_type: double stability: development - brief: Measures the duration of outbound RPC. + brief: Measures the duration of outbound remote procedure calls (RPC). instrument: histogram - unit: "ms" + unit: "s" note: | - While streaming RPCs may record this metric as start-of-batch - to end-of-batch, it's hard to interpret in practice. - - **Streaming**: N/A. + When this metric is reported alongside an RPC client span, the metric value + SHOULD be the same as the RPC client span duration. 
extends: attributes.metrics.rpc.client - id: metric.rpc.client.request.size diff --git a/model/rpc/spans.yaml b/model/rpc/spans.yaml index 312383cab7..e69e697f72 100644 --- a/model/rpc/spans.yaml +++ b/model/rpc/spans.yaml @@ -1,9 +1,19 @@ groups: - - id: span.rpc.client + - id: span.rpc.call.client type: span stability: development brief: This span represents an outgoing Remote Procedure Call (RPC). note: | + RPC client spans SHOULD cover the entire client-side lifecycle of an RPC, + starting when the RPC is initiated and ending when the response is received + or the RPC is terminated due to an error or cancellation. + + For streaming RPCs, the span covers the full lifetime of the request and/or + response streams until they are closed or terminated. + + If a transient issue happened and was retried within this RPC, the corresponding + span SHOULD cover the duration of the logical call with all retries. + **Span name:** refer to the [Span Name](/docs/rpc/rpc-spans.md#span-name) section. **Span kind** MUST be `CLIENT`. @@ -14,7 +24,7 @@ groups: - ref: rpc.system requirement_level: required - - id: span.rpc.server + - id: span.rpc.call.server type: span stability: development extends: rpc_service.server @@ -22,6 +32,13 @@ groups: brief: This span represents an incoming Remote Procedure Call (RPC). events: [rpc.message] note: | + RPC server spans SHOULD cover the entire server-side lifecycle of an RPC, + starting when the request is received and ending when the response is sent + or the RPC is terminated due to an error or cancellation. + + For streaming RPCs, the span SHOULD cover the full lifetime of the request + and/or response streams until they are closed or terminated. + **Span name:** refer to the [Span Name](/docs/rpc/rpc-spans.md#span-name) section. **Span kind** MUST be `SERVER`. 
@@ -29,7 +46,7 @@ groups: - ref: rpc.system requirement_level: required - - id: span.rpc.connect_rpc.client + - id: span.rpc.connect_rpc.call.client type: span stability: development brief: This span represents an outgoing Remote Procedure Call (RPC). @@ -50,7 +67,7 @@ groups: - ref: rpc.connect_rpc.response.metadata requirement_level: opt_in - - id: span.rpc.connect_rpc.server + - id: span.rpc.connect_rpc.call.server type: span stability: development extends: rpc_service.server @@ -71,7 +88,7 @@ groups: - ref: rpc.connect_rpc.response.metadata requirement_level: opt_in - - id: span.rpc.grpc.client + - id: span.rpc.grpc.call.client type: span stability: development brief: This span represents an outgoing Remote Procedure Call (RPC). @@ -97,7 +114,7 @@ groups: - ref: rpc.grpc.response.metadata requirement_level: opt_in - - id: span.rpc.grpc.server + - id: span.rpc.grpc.call.server type: span stability: development extends: rpc_service.server @@ -119,7 +136,7 @@ groups: - ref: rpc.grpc.response.metadata requirement_level: opt_in - - id: span.rpc.jsonrpc.client + - id: span.rpc.jsonrpc.call.client type: span stability: development brief: This span represents an outgoing Remote Procedure Call (RPC). @@ -145,7 +162,7 @@ groups: - ref: rpc.jsonrpc.error_message requirement_level: recommended - - id: span.rpc.jsonrpc.server + - id: span.rpc.jsonrpc.call.server type: span stability: development extends: rpc.server