|
| 1 | +--- |
| 2 | +title: "Notes on Various Errors with respect to replication and distributed connections" |
| 3 | +linkTitle: "Notes on Various Errors with respect to replication and distributed connections" |
| 4 | +description: > |
| 5 | + Notes on errors related to replication and distributed connections |
| 6 | +keywords: |
| 7 | + - replication |
| 8 | + - distributed connections |
| 9 | +--- |
| 10 | +# Notes on Various Errors with respect to replication and distributed connections |
| 11 | + |
| 12 | +## `ClickHouseDistributedConnectionExceptions` |
| 13 | + |
| 14 | +This alert usually indicates that one of the nodes isn’t responding or that there’s an interconnectivity issue. Debug steps: |
| 15 | + |
| 16 | +### 1. Check Cluster Connectivity |
| 17 | +Verify connectivity inside the cluster by running: |
| 18 | +``` |
| 19 | +SELECT count() FROM clusterAllReplicas('{cluster}', cluster('{cluster}', system.one)) |
| 20 | +``` |
| 21 | + |
| 22 | +### 2. Check for Errors |
| 23 | +Run the following queries to see if any nodes report errors: |
| 24 | + |
| 25 | +``` |
| 26 | +SELECT hostName(), * FROM clusterAllReplicas('{cluster}', system.clusters) WHERE errors_count > 0; |
| 27 | +SELECT hostName(), * FROM clusterAllReplicas('{cluster}', system.errors) WHERE last_error_time > now() - 3600 ORDER BY value; |
| 28 | +``` |
| 29 | + |
| 30 | +Depending on the results, make sure the affected node is up and responding to queries. Also verify network connectivity between the nodes (DNS resolution, routing, latency). |
| 31 | + |
| 32 | +## `ClickHouseReplicatedPartChecksFailed` & `ClickHouseReplicatedPartFailedFetches` |
| 33 | + |
| 34 | +Unless you’re seeing huge numbers, these alerts can generally be ignored. They usually indicate temporary replication issues that ClickHouse resolves on its own. However, if the issue persists or the counts increase rapidly, follow these steps to debug replication: |
| 35 | + |
| 36 | +* Check the replication status using tables such as `system.replicas` and `system.replication_queue`. |
| 37 | +* Examine server logs, `system.errors`, and system load for any clues. |
| 38 | +* Try to restart the replica (`SYSTEM RESTART REPLICA db_name.table_name` command) and, if necessary, contact Altinity support. |
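| 39 | + |
| 40 | +The first two steps above can be sketched with queries like the following. This is a minimal starting point, not a definitive diagnostic: the thresholds (`300` seconds of delay, `100` queue entries, `10` retries) are illustrative assumptions and should be tuned for your workload. |
| 41 | + |
| 42 | +```sql |
| 43 | +-- Replicas that are read-only, have an expired Keeper session, or are lagging. |
| 44 | +-- Thresholds below are assumptions for illustration only. |
| 45 | +SELECT database, table, is_readonly, is_session_expired, absolute_delay, queue_size, inserts_in_queue, merges_in_queue |
| 46 | +FROM system.replicas |
| 47 | +WHERE is_readonly OR is_session_expired OR absolute_delay > 300 OR queue_size > 100 |
| 48 | +ORDER BY absolute_delay DESC; |
| 49 | + |
| 50 | +-- Replication queue entries that keep failing or being retried. |
| 51 | +SELECT database, table, type, create_time, num_tries, last_exception, postpone_reason |
| 52 | +FROM system.replication_queue |
| 53 | +WHERE num_tries > 10 OR last_exception != '' |
| 54 | +ORDER BY num_tries DESC |
| 55 | +LIMIT 20; |
| 56 | +``` |
| 57 | + |
| 58 | +If both queries return no rows, the alert was most likely caused by a transient issue that has already resolved. |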