'Service to optimize stale GraphiteMergeTree tables

This software looks for tables with the GraphiteMergeTree engine and evaluates whether any of their partitions should be optimized. It can work both as a one-shot script and as a background daemon.'
README.md: 56 additions & 4 deletions
@@ -1,20 +1,35 @@
1
1
# Service to optimize stale GraphiteMergeTree tables
2
+
When you use [GraphiteMergeTree](https://clickhouse.yandex/docs/en/operations/table_engines/graphitemergetree) in the ClickHouse DBMS, it applies the retention policies from the `system.graphite_retentions` configuration during merges. Unfortunately, ClickHouse doesn't launch merges for partitions a) without active inserts or b) with only one part. As a result, the current retention scheme is never applied to such partitions.
2
3
This software looks for tables with the GraphiteMergeTree engine and evaluates whether any of their partitions should be optimized. It can work both as a one-shot script and as a background daemon.
3
4
5
+
## Build
6
+
To build a binary just run `make`.
7
+
8
+
You need make, golang and [fpm](https://github.com/jordansissel/fpm) installed to build packages. To build them, run one of the following:
9
+
10
+
```
11
+
make packages
12
+
make deb
13
+
make rpm
14
+
```
15
+
4
16
## FAQ
5
17
* The `go` version 1.13 or newer is required
6
-
* Deamon mode is preferable over one-shot script for the normal work
18
+
* Daemon mode is preferable over one-shot script for the normal work
7
19
* It's safe to run it on the cluster hosts
8
20
* You can run it either on one of the replicated hosts or on all of the hosts
9
21
* If you have big partitions (a month or so) and get timeout exceptions, you need to adjust the `read_timeout` parameter in the DSN
10
22
* `optimize_throw_if_noop=1` is not mandatory, but good to have.
23
+
* The following picture shows the result of running the daemon for the first time on a ~3-year-old GraphiteMergeTree table:
24
+
<img src="./docs/result.jpg" alt="example" />
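
The `read_timeout` adjustment mentioned in the FAQ can be sketched in Go as follows. This is only an illustration, not code from the project: the helper name `withReadTimeout` and the value `7200` are hypothetical, and the DSN is a simplified form of the default one.

```go
package main

import (
	"fmt"
	"net/url"
)

// withReadTimeout (hypothetical helper) returns a copy of dsn with the
// read_timeout query parameter set to the given number of seconds.
func withReadTimeout(dsn string, seconds int) (string, error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("read_timeout", fmt.Sprint(seconds))
	// Encode sorts the parameters by key, so the order may differ from the input.
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// 7200 is an arbitrary example value, not a recommendation.
	dsn, err := withReadTimeout("tcp://localhost:9000?optimize_throw_if_noop=1&read_timeout=3600", 7200)
	if err != nil {
		panic(err)
	}
	fmt.Println(dsn)
}
```

The resulting string could then be passed via `--server-dsn`.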
11
25
12
26
### Details
13
-
The next query is executed as search for the partitions to optimize:
27
+
The following query, with some additional conditions, is executed to search for the partitions to optimize:
14
28
15
29
```sql
16
30
SELECT
17
31
concat(p.database, '.', p.table) AS table,
32
+
p.partition_id AS partition_id,
18
33
p.partition AS partition,
19
34
max(g.age) AS age,
20
35
countDistinct(p.name) AS parts,
@@ -46,7 +61,9 @@ ORDER BY
46
61
age ASC
47
62
```
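
As a rough illustration of how a row returned by the search query could be turned into a merge request, the Go sketch below builds an `OPTIMIZE TABLE ... PARTITION ID ... FINAL` statement from the `table` and `partition_id` columns. It assumes this statement form; the exact statement the optimizer issues is not shown here. The function name `optimizeQuery` and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// optimizeQuery (hypothetical helper) builds an OPTIMIZE statement for one
// partition. table is expected as "database.table", matching the `table`
// column of the search query; partitionID matches its `partition_id` column.
func optimizeQuery(table, partitionID string) string {
	// Escape backslashes and single quotes so the ID stays a valid string literal.
	escaped := strings.NewReplacer(`\`, `\\`, `'`, `\'`).Replace(partitionID)
	return fmt.Sprintf("OPTIMIZE TABLE %s PARTITION ID '%s' FINAL", table, escaped)
}

func main() {
	// "graphite.data" and "202001" are made-up sample values.
	fmt.Println(optimizeQuery("graphite.data", "202001"))
}
```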
48
63
49
-
Before and after running you could run the next query:
64
+
#### The following queries can be executed before and after running the daemon
65
+
66
+
* Detailed info about each partition of GraphiteMergeTree tables:
50
67
51
68
```sql
52
69
SELECT
@@ -83,7 +100,41 @@ ORDER BY
83
100
active ASC
84
101
```
85
102
86
-
It will show general info about every GraphiteMergeTree table on the server.
103
+
* Summary about each GraphiteMergeTree table:
104
+
105
+
```sql
106
+
SELECT
107
+
database,
108
+
table,
109
+
count() AS parts,
110
+
active,
111
+
min(min_date) AS min_date,
112
+
max(max_date) AS max_date,
113
+
formatReadableSize(sum(bytes_on_disk)) AS size,
114
+
sum(rows) AS rows
115
+
FROM system.parts
116
+
INNER JOIN
117
+
(
118
+
SELECT
119
+
Tables.database AS database,
120
+
Tables.table AS table
121
+
FROM system.graphite_retentions
122
+
ARRAY JOIN Tables
123
+
GROUP BY
124
+
database,
125
+
table
126
+
) USING (database, table)
127
+
GROUP BY
128
+
database,
129
+
table,
130
+
active
131
+
ORDER BY
132
+
database ASC,
133
+
table ASC,
134
+
active ASC
135
+
```
136
+
137
+
They will show general info about every GraphiteMergeTree table on the server.
87
138
88
139
## Run the graphite-ch-optimizer
89
140
If you run ClickHouse locally, you can just run `graphite-ch-optimizer -n --log-level debug` to see how many partitions on the instance can be merged automatically.
@@ -111,6 +162,7 @@ Possible command line arguments:
111
162
Usage of graphite-ch-optimizer:
112
163
-c, --config string Filename of the custom config. CLI arguments override it
113
164
--print-defaults Print default config values and exit
165
+
-v, --version Print version and exit
114
166
--optimize-interval duration The active partitions won't be optimized more than once per this interval, seconds (default 72h0m0s)
115
167
-s, --server-dsn string DSN to connect to ClickHouse server (default "tcp://localhost:9000?&optimize_throw_if_noop=1&read_timeout=3600&debug=true")
116
168
-n, --dry-run Will print how many partitions would be merged without actions