docs/en/administration/management/monitoring/metrics.md
Lines changed: 192 additions & 0 deletions
@@ -1678,3 +1678,195 @@ For more information on how to build a monitoring service for your StarRocks clu
1678
1678
1679
1679
- Unit: Count
1680
1680
- Description: The number of times blacklisted SQL statements have been intercepted.
1681
+
<<<<<<< HEAD
1682
+
=======
1683
+
1684
+
### starrocks_fe_scheduled_pending_tablet_num
1685
+
1686
+
- Unit: Count
1687
+
- Type: Instantaneous
1688
+
- Description: The number of Clone tasks scheduled by the FE that are in the Pending state, including both BALANCE and REPAIR types.
1689
+
1690
+
### starrocks_fe_scheduled_running_tablet_num
1691
+
1692
+
- Unit: Count
1693
+
- Type: Instantaneous
1694
+
- Description: The number of Clone tasks scheduled by the FE that are in the Running state, including both BALANCE and REPAIR types.
1695
+
1696
+
### starrocks_fe_clone_task_total
1697
+
1698
+
- Unit: Count
1699
+
- Type: Cumulative
1700
+
- Description: The total number of Clone tasks in the cluster.
1701
+
1702
+
### starrocks_fe_clone_task_success
1703
+
1704
+
- Unit: Count
1705
+
- Type: Cumulative
1706
+
- Description: The number of successfully executed Clone tasks in the cluster.
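As a rough illustration of how the two cumulative counters above might be combined, the sketch below estimates a Clone success ratio between two scrapes. The function name and the sample values are hypothetical. It assumes both counters only grow between FE restarts, and it does not account for tasks that were counted in one scrape window but finished in another.

```python
def clone_success_ratio(total_before, success_before, total_after, success_after):
    """Estimate the share of Clone tasks that succeeded between two scrapes.

    Inputs are raw values of starrocks_fe_clone_task_total and
    starrocks_fe_clone_task_success at the start and end of the window.
    """
    new_tasks = total_after - total_before
    new_successes = success_after - success_before
    if new_tasks <= 0:
        # No new tasks in the window, or the counters were reset (assumed
        # to happen on FE restart); the ratio is undefined in either case.
        return None
    return new_successes / new_tasks


# Hypothetical scrape values, for illustration only.
ratio = clone_success_ratio(100, 90, 150, 135)
print(ratio)
```

A ratio that stays well below 1 over several windows would suggest that Clone tasks are failing or timing out and is worth investigating.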
1707
+
1708
+
### starrocks_fe_clone_task_copy_bytes
1709
+
1710
+
- Unit: Bytes
1711
+
- Type: Cumulative
1712
+
- Description: The total size of files copied by Clone tasks in the cluster, including both INTER_NODE and INTRA_NODE types.
1713
+
1714
+
### starrocks_fe_clone_task_copy_duration_ms
1715
+
1716
+
- Unit: ms
1717
+
- Type: Cumulative
1718
+
- Description: The total time spent copying files by Clone tasks in the cluster, including both INTER_NODE and INTRA_NODE types.
1719
+
1720
+
### starrocks_be_clone_task_copy_bytes
1721
+
1722
+
- Unit: Bytes
1723
+
- Type: Cumulative
1724
+
- Description: The total size of files copied by Clone tasks on the BE node, including both INTER_NODE and INTRA_NODE types.
1725
+
1726
+
### starrocks_be_clone_task_copy_duration_ms
1727
+
1728
+
- Unit: ms
1729
+
- Type: Cumulative
1730
+
- Description: The total time spent copying files by Clone tasks on the BE node, including both INTER_NODE and INTRA_NODE types.
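The `copy_bytes` and `copy_duration_ms` counters (FE or BE) can be paired to approximate the average copy throughput of Clone tasks. The following is a minimal sketch under the assumption that you already have the increase of each counter over the same time window; the function name and the sample numbers are hypothetical.

```python
def copy_throughput_bytes_per_sec(bytes_delta, duration_ms_delta):
    """Average Clone copy throughput over a window, in bytes per second.

    bytes_delta: increase of *_clone_task_copy_bytes over the window.
    duration_ms_delta: increase of *_clone_task_copy_duration_ms over the
    same window (milliseconds).
    """
    if duration_ms_delta <= 0:
        # No copy time was recorded in the window, so throughput is undefined.
        return None
    return bytes_delta / (duration_ms_delta / 1000.0)


# Hypothetical deltas: 1,000,000 bytes copied in 500 ms of copy time.
print(copy_throughput_bytes_per_sec(1_000_000, 500))
```

Because the duration counter sums the copy time of all tasks, concurrent tasks add up; the result is an average per-task rate, not the aggregate bandwidth of the node or cluster.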
1731
+
1732
+
### Transaction Latency Metrics
1733
+
1734
+
The following metrics are `summary`-type metrics that provide latency distributions for different phases of a transaction. These metrics are reported exclusively by the Leader FE node.
1735
+
1736
+
Each metric includes the following outputs:
1737
+
- **Quantiles**: Latency values at different percentile boundaries. These are exposed via the `quantile` label, which can have values of `0.75`, `0.95`, `0.98`, `0.99`, and `0.999`.
1738
+
- **`<metric_name>_sum`**: The total cumulative time spent in this phase, for example, `starrocks_fe_txn_total_latency_ms_sum`.
1739
+
- **`<metric_name>_count`**: The total number of transactions recorded for this phase, for example, `starrocks_fe_txn_total_latency_ms_count`.
1740
+
1741
+
All transaction metrics share the following labels:
1742
+
- `type`: Categorizes transactions by their load job source type (for example, `all`, `stream_load`, `routine_load`). This allows for monitoring both overall transaction performance and the performance of specific load types. The reported groups can be configured via the FE parameter [`txn_latency_metric_report_groups`](../FE_configuration.md#txn_latency_metric_report_groups).
1743
+
- `is_leader`: Indicates whether the reporting FE node is the Leader. Only the Leader FE (`is_leader="true"`) reports actual metric values. Followers will have `is_leader="false"` and report no data.
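To illustrate how the `_sum` and `_count` outputs relate, the sketch below parses a few lines in the Prometheus text exposition format and derives a mean latency as sum divided by count. The sample lines and their values are made up for illustration and may not match the exact label order or formatting that the FE emits; `parse_samples` and `mean_latency_ms` are hypothetical helper names.

```python
import re

# Made-up sample in Prometheus text exposition format (values are illustrative).
SAMPLE = '''\
starrocks_fe_txn_total_latency_ms{type="all",is_leader="true",quantile="0.99"} 480.0
starrocks_fe_txn_total_latency_ms_sum{type="all",is_leader="true"} 125000.0
starrocks_fe_txn_total_latency_ms_count{type="all",is_leader="true"} 1000
'''

LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines this simple parser does not understand
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def mean_latency_ms(samples, metric, load_type="all"):
    """Mean latency = <metric>_sum / <metric>_count, using Leader FE samples only."""
    def pick(suffix):
        for name, labels, value in samples:
            if (name == metric + suffix
                    and labels.get("type") == load_type
                    and labels.get("is_leader") == "true"):
                return value
        return None

    total, count = pick("_sum"), pick("_count")
    if total is None or not count:
        return None  # metric missing (for example, scraped from a Follower) or no transactions yet
    return total / count


print(mean_latency_ms(parse_samples(SAMPLE), "starrocks_fe_txn_total_latency_ms"))
```

Note that this mean covers everything recorded since the counters started. For a mean over a recent window, take the difference of `_sum` and `_count` between two scrapes before dividing.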
1744
+
1745
+
#### starrocks_fe_txn_total_latency_ms
1746
+
1747
+
- Unit: ms
1748
+
- Type: Summary
1749
+
- Description: The total latency for a transaction to complete, measured from the `prepare` time to the `finish` time. This metric represents the full end-to-end duration of a transaction.
1750
+
1751
+
#### starrocks_fe_txn_write_latency_ms
1752
+
1753
+
- Unit: ms
1754
+
- Type: Summary
1755
+
- Description: The latency of the `write` phase of a transaction, from `prepare` time to `commit` time. This metric isolates the performance of the data writing and preparation stage before the transaction is ready to be published.
1756
+
1757
+
#### starrocks_fe_txn_publish_latency_ms
1758
+
1759
+
- Unit: ms
1760
+
- Type: Summary
1761
+
- Description: The latency of the `publish` phase, from `commit` time to `finish` time. This is the duration it takes for a committed transaction to become visible to queries. It is the sum of the `schedule`, `execute`, and `ack` sub-phases.
1762
+
1763
+
#### starrocks_fe_txn_publish_schedule_latency_ms
1764
+
1765
+
- Unit: ms
1766
+
- Type: Summary
1767
+
- Description: The time a transaction spends waiting to be published after it has been committed, measured from `commit` time to when the publish task is picked up. This metric reflects scheduling delays or queueing time in the `publish` pipeline.
1768
+
1769
+
#### starrocks_fe_txn_publish_execute_latency_ms
1770
+
1771
+
- Unit: ms
1772
+
- Type: Summary
1773
+
- Description: The active execution time of the `publish` task, from when the task is picked up to when it finishes. This metric represents the actual time spent making the transaction's changes visible.
1774
+
1775
+
#### starrocks_fe_txn_publish_ack_latency_ms
1776
+
1777
+
- Unit: ms
1778
+
- Type: Summary
1779
+
- Description: The final acknowledgment latency, from when the `publish` task finishes to the final `finish` time when the transaction is marked as `VISIBLE`. This metric includes any final steps or acknowledgments required.
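The phase boundaries described above imply that the total latency is the write latency plus the publish latency, and that the publish latency is the sum of its three sub-phases. The sketch below works this out for a single transaction from five timestamps. The `TxnTimestamps` class and its field names are hypothetical and only mirror the boundaries named in the descriptions; they are not taken from the StarRocks code.

```python
from dataclasses import dataclass


@dataclass
class TxnTimestamps:
    """Hypothetical per-transaction timestamps, in milliseconds."""
    prepare_ms: int
    commit_ms: int
    publish_start_ms: int  # publish task picked up
    publish_done_ms: int   # publish task finished
    finish_ms: int         # transaction marked as VISIBLE


def phase_latencies(t):
    """Derive each latency phase from the timestamps of one transaction."""
    return {
        "total": t.finish_ms - t.prepare_ms,
        "write": t.commit_ms - t.prepare_ms,
        "publish": t.finish_ms - t.commit_ms,
        "publish_schedule": t.publish_start_ms - t.commit_ms,
        "publish_execute": t.publish_done_ms - t.publish_start_ms,
        "publish_ack": t.finish_ms - t.publish_done_ms,
    }


# Illustrative timestamps only.
lat = phase_latencies(TxnTimestamps(0, 120, 150, 400, 410))
print(lat)
```

These identities hold per transaction. Quantiles of different phases cannot be added in the same way, because, for example, the 99th percentile of `publish` is generally not the sum of the 99th percentiles of its sub-phases.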
1780
+
1781
+
### Merge Commit BE Metrics
1782
+
1783
+
#### merge_commit_request_total
1784
+
1785
+
- Unit: Count
1786
+
- Type: Cumulative
1787
+
- Description: Total number of merge commit requests received by BE.
1788
+
1789
+
#### merge_commit_request_bytes
1790
+
1791
+
- Unit: Bytes
1792
+
- Type: Cumulative
1793
+
- Description: Total bytes of data received across merge commit requests.
1794
+
1795
+
#### merge_commit_success_total
1796
+
1797
+
- Unit: Count
1798
+
- Type: Cumulative
1799
+
- Description: Merge commit requests that finished successfully.
1800
+
1801
+
#### merge_commit_fail_total
1802
+
1803
+
- Unit: Count
1804
+
- Type: Cumulative
1805
+
- Description: Merge commit requests that failed.
1806
+
1807
+
#### merge_commit_pending_total
1808
+
1809
+
- Unit: Count
1810
+
- Type: Instantaneous
1811
+
- Description: Merge commit tasks currently waiting in the execution queue.
1812
+
1813
+
#### merge_commit_pending_bytes
1814
+
1815
+
- Unit: Bytes
1816
+
- Type: Instantaneous
1817
+
- Description: Total bytes of data held by pending merge commit tasks.
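The two instantaneous metrics above can be read together to gauge how large the queued merge commit tasks are on average. This is a minimal sketch with a hypothetical function name and made-up values; it assumes both gauges were sampled in the same scrape.

```python
def avg_pending_task_bytes(pending_total, pending_bytes):
    """Average bytes per queued task, from merge_commit_pending_total
    and merge_commit_pending_bytes sampled at the same time."""
    if pending_total <= 0:
        return None  # nothing is queued, so the average is undefined
    return pending_bytes / pending_total


# Hypothetical gauge values: 4 pending tasks holding 8192 bytes in total.
print(avg_pending_task_bytes(4, 8192))
```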
1818
+
1819
+
#### merge_commit_send_rpc_total
1820
+
1821
+
- Unit: Count
1822
+
- Type: Cumulative
1823
+
- Description: RPC requests sent to FE for starting merge commit operations.
1824
+
1825
+
#### merge_commit_register_pipe_total
1826
+
1827
+
- Unit: Count
1828
+
- Type: Cumulative
1829
+
- Description: Stream load pipes registered for merge commit operations.
1830
+
1831
+
#### merge_commit_unregister_pipe_total
1832
+
1833
+
- Unit: Count
1834
+
- Type: Cumulative
1835
+
- Description: Stream load pipes unregistered from merge commit operations.
1836
+
1837
+
Latency metrics expose percentile series such as `merge_commit_request_latency_99` and `merge_commit_request_latency_90`, reported in microseconds. The end-to-end latency obeys: