This repository was archived by the owner on Aug 23, 2023. It is now read-only.
docker/docker-cluster/metrictank.ini (+11 -7)
@@ -9,12 +9,14 @@ accounting-period = 5min
 ## data ##
 
 # see https://github.com/raintank/metrictank/blob/master/docs/memory-server.md for more details
+
 # duration of raw chunks. e.g. 10min, 30min, 1h, 90min...
+# must be valid value as described here https://github.com/raintank/metrictank/blob/master/docs/memory-server.md#valid-chunk-spans
 chunkspan = 10min
 # number of raw chunks to keep in in-memory ring buffer
-#note that the chunk-cache (settings further down) is a more effective method to cache data and alleviate workload for cassandra.
-#but this allows secondary nodes to keep serving data in case the primary is not able to save data upto chunkspan*numchunks")
-numchunks = 5
+#See https://github.com/raintank/metrictank/blob/master/docs/memory-server.md for details and trade-offs, especially when compared to chunk-cache
+#(settings further down) which may be a more effective method to cache data and alleviate workload for cassandra.
+numchunks = 7
 # minimum wait before raw metrics are removed from storage
 ttl = 35d
 
@@ -33,11 +35,13 @@ warm-up-period = 1h
 # settings for rollups (aggregation for archives)
 # comma-separated list of archive specifications.
 # archive specification is of the form: aggSpan:chunkSpan:numChunks:TTL[:ready as bool. default true]
-# with these aggregation rules: 5min:1h:2:3mon,1h:6h:2:1y:false
-# 5 min of data, store in a chunk that lasts 1hour, keep 2 chunks in in-memory ring buffer, keep for 3months in cassandra
-# 1hr worth of data, in chunks of 6 hours, 2 chunks in in-memory ring buffer, keep for 1 year, but this series is not ready yet for querying.
+# with these aggregation rules: 5min:1h:2:3mon,1h:6h:2:1y:false you get:
+#- 5 min of data, store in a chunk that lasts 1hour, keep 2 chunks in in-memory ring buffer, keep for 3months in cassandra
+#- 1hr worth of data, in chunks of 6 hours, 2 chunks in in-memory ring buffer, keep for 1 year, but this series is not ready yet for querying.
 # When running a cluster of metrictank instances, all instances should have the same agg-settings.
-# chunk spans must be valid values as described here https://github.com/raintank/metrictank/blob/master/docs/memory-server.md#valid-chunk-spans
+# Note:
+# * chunk spans must be valid values as described here https://github.com/raintank/metrictank/blob/master/docs/memory-server.md#valid-chunk-spans
+# * numchunks -like the global setting- has nuanced use compared to chunk cache. see https://github.com/raintank/metrictank/blob/master/docs/memory-server.md
docker/docker-dev-custom-cfg-kafka/metrictank.ini (+10 -6)
@@ -9,11 +9,13 @@ accounting-period = 5min
 ## data ##
 
 # see https://github.com/raintank/metrictank/blob/master/docs/memory-server.md for more details
+
 # duration of raw chunks. e.g. 10min, 30min, 1h, 90min...
+# must be valid value as described here https://github.com/raintank/metrictank/blob/master/docs/memory-server.md#valid-chunk-spans
 chunkspan = 2min
 # number of raw chunks to keep in in-memory ring buffer
-#note that the chunk-cache (settings further down) is a more effective method to cache data and alleviate workload for cassandra.
-#but this allows secondary nodes to keep serving data in case the primary is not able to save data upto chunkspan*numchunks")
+#See https://github.com/raintank/metrictank/blob/master/docs/memory-server.md for details and trade-offs, especially when compared to chunk-cache
+#(settings further down) which may be a more effective method to cache data and alleviate workload for cassandra.
 numchunks = 2
 # minimum wait before raw metrics are removed from storage
 ttl = 35d
@@ -33,11 +35,13 @@ warm-up-period = 1h
 # settings for rollups (aggregation for archives)
 # comma-separated list of archive specifications.
 # archive specification is of the form: aggSpan:chunkSpan:numChunks:TTL[:ready as bool. default true]
-# with these aggregation rules: 5min:1h:2:3mon,1h:6h:2:1y:false
-# 5 min of data, store in a chunk that lasts 1hour, keep 2 chunks in in-memory ring buffer, keep for 3months in cassandra
-# 1hr worth of data, in chunks of 6 hours, 2 chunks in in-memory ring buffer, keep for 1 year, but this series is not ready yet for querying.
+# with these aggregation rules: 5min:1h:2:3mon,1h:6h:2:1y:false you get:
+#- 5 min of data, store in a chunk that lasts 1hour, keep 2 chunks in in-memory ring buffer, keep for 3months in cassandra
+#- 1hr worth of data, in chunks of 6 hours, 2 chunks in in-memory ring buffer, keep for 1 year, but this series is not ready yet for querying.
 # When running a cluster of metrictank instances, all instances should have the same agg-settings.
-# chunk spans must be valid values as described here https://github.com/raintank/metrictank/blob/master/docs/memory-server.md#valid-chunk-spans
+# Note:
+# * chunk spans must be valid values as described here https://github.com/raintank/metrictank/blob/master/docs/memory-server.md#valid-chunk-spans
+# * numchunks -like the global setting- has nuanced use compared to chunk cache. see https://github.com/raintank/metrictank/blob/master/docs/memory-server.md
docs/config.md (+10 -7)
@@ -36,11 +36,12 @@ accounting-period = 5min
 ```
 # see https://github.com/raintank/metrictank/blob/master/docs/memory-server.md for more details
 # duration of raw chunks. e.g. 10min, 30min, 1h, 90min...
+# must be valid value as described here https://github.com/raintank/metrictank/blob/master/docs/memory-server.md#valid-chunk-spans
 chunkspan = 10min
 # number of raw chunks to keep in in-memory ring buffer
-# note that the chunk-cache (settings further down) is a more effective method to cache data and alleviate workload for cassandra.
-# but this allows secondary nodes to keep serving data in case the primary is not able to save data upto chunkspan*numchunks")
-numchunks = 5
+# See https://github.com/raintank/metrictank/blob/master/docs/memory-server.md for details and trade-offs, especially when compared to chunk-cache
+# (settings further down) which may be a more effective method to cache data and alleviate workload for cassandra.
+numchunks = 7
 # minimum wait before raw metrics are removed from storage
 ttl = 35d
 # max age for a chunk before to be considered stale and to be persisted to Cassandra
@@ -56,11 +57,13 @@ warm-up-period = 1h
 # settings for rollups (aggregation for archives)
 # comma-separated list of archive specifications.
 # archive specification is of the form: aggSpan:chunkSpan:numChunks:TTL[:ready as bool. default true]
-# with these aggregation rules: 5min:1h:2:3mon,1h:6h:2:1y:false
-# 5 min of data, store in a chunk that lasts 1hour, keep 2 chunks in in-memory ring buffer, keep for 3months in cassandra
-# 1hr worth of data, in chunks of 6 hours, 2 chunks in in-memory ring buffer, keep for 1 year, but this series is not ready yet for querying.
+# with these aggregation rules: 5min:1h:2:3mon,1h:6h:2:1y:false you get:
+# - 5 min of data, store in a chunk that lasts 1hour, keep 2 chunks in in-memory ring buffer, keep for 3months in cassandra
+# - 1hr worth of data, in chunks of 6 hours, 2 chunks in in-memory ring buffer, keep for 1 year, but this series is not ready yet for querying.
 # When running a cluster of metrictank instances, all instances should have the same agg-settings.
-# chunk spans must be valid values as described here https://github.com/raintank/metrictank/blob/master/docs/memory-server.md#valid-chunk-spans
+# Note:
+# * chunk spans must be valid values as described here https://github.com/raintank/metrictank/blob/master/docs/memory-server.md#valid-chunk-spans
+# * numchunks -like the global setting- has nuanced use compared to chunk cache. see https://github.com/raintank/metrictank/blob/master/docs/memory-server.md
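The archive specification format documented above (`aggSpan:chunkSpan:numChunks:TTL[:ready]`, comma-separated) can be illustrated with a small Go sketch. This is not metrictank's own parser: the `ArchiveSpec` type and `ParseArchiveSpecs` function are hypothetical names chosen for illustration, and the span and TTL fields are kept as plain strings because the sketch does not attempt to validate them against the valid chunk spans.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ArchiveSpec is a hypothetical holder for one
// aggSpan:chunkSpan:numChunks:TTL[:ready] entry.
type ArchiveSpec struct {
	AggSpan   string // e.g. "5min"; left unparsed in this sketch
	ChunkSpan string // e.g. "1h"; left unparsed in this sketch
	NumChunks int    // chunks kept in the in-memory ring buffer
	TTL       string // e.g. "3mon"; left unparsed in this sketch
	Ready     bool   // defaults to true when the fifth field is absent
}

// ParseArchiveSpecs splits a comma-separated list of archive
// specifications into its fields (illustrative only).
func ParseArchiveSpecs(s string) ([]ArchiveSpec, error) {
	var specs []ArchiveSpec
	for _, raw := range strings.Split(s, ",") {
		parts := strings.Split(strings.TrimSpace(raw), ":")
		if len(parts) != 4 && len(parts) != 5 {
			return nil, fmt.Errorf("archive spec %q: want 4 or 5 fields, got %d", raw, len(parts))
		}
		n, err := strconv.Atoi(parts[2])
		if err != nil {
			return nil, fmt.Errorf("archive spec %q: bad numChunks: %v", raw, err)
		}
		spec := ArchiveSpec{
			AggSpan:   parts[0],
			ChunkSpan: parts[1],
			NumChunks: n,
			TTL:       parts[3],
			Ready:     true,
		}
		if len(parts) == 5 {
			ready, err := strconv.ParseBool(parts[4])
			if err != nil {
				return nil, fmt.Errorf("archive spec %q: bad ready flag: %v", raw, err)
			}
			spec.Ready = ready
		}
		specs = append(specs, spec)
	}
	return specs, nil
}

func main() {
	specs, err := ParseArchiveSpecs("5min:1h:2:3mon,1h:6h:2:1y:false")
	if err != nil {
		panic(err)
	}
	for _, s := range specs {
		fmt.Printf("%+v\n", s)
	}
}
```

With the example rule from the comments, the first entry comes out ready for querying (the default) and the second one does not.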
docs/memory-server.md (+11 -7)
@@ -9,11 +9,9 @@ It has two mechanisms to support this: the ring buffers, and the chunk-cache. T
 
 The ring buffer is simply a list of chunks - one for each series - that holds the latest data for each series that has been ingested (or generated, for rollup series).
 You can configure how many chunks to retain (`numchunks`).
-* The main function of the ring buffers is to keep secondaries able to satisfy queries from RAM, even if the primary is not able to save its chunks instantly, or if the primary
-crashed and needs to be restarted. Effectively, the more data in your ring buffer, the longer outages of a primary you can sustain. (up to `(numchunks-1) * chunkspan` in duration)
-* For keeping a "hot cache" of frequently accessed data, this is not necessarily an effective solution, since the same `numchunks` is applied to all raw series
-(and aggregation settings are applied to all series in the same fashion, so a given rollup frequency will have the same `numchunks` for all series)
-So unless you're confident your metrics are all subject to queries of the same timeranges, and that they are predictable, you should look at the chunk cache below.
+The ring buffer can be useful to assure data that may be needed is in memory, in these cases:
+* you know a majority of your queries hits the most recent data of a given time window (e.g. last 2 hours, last day), you know this is unlikely to change and true for the vast majority of your metrics.
+* keep secondaries able to satisfy queries from RAM for the most recent data of cold (infrequently queried) series, even if the primary is not able to save its chunks instantly, if it crashed and needs to be restarted or if you're having a cassandra outage so that chunks can't be loaded or saved. Note that this does not apply for hot data: data queried frequently enough (at least as frequent as their chunkspan) will be added to the chunk cache automatically (see below) and not require cassandra lookups.
 
 Note:
 * the last (current) chunk is always a "work in progress", so depending on what time it is, it may be anywhere between empty and full.
@@ -22,10 +20,16 @@ Note:
 
 Both of these make it tricky to articulate how much data is in the ringbuffer for a given series. But `(numchunks-1) * chunkspan` is the conservative approximation which is valid in the typical case (a warmed up metrictank that's ingesting fresh data).
 
+For keeping a "hot cache" of frequently accessed data in a more flexible way, this is not an effective solution, since the same `numchunks` is applied to all raw series
+(and aggregation settings are applied to all series in the same fashion, so a given rollup frequency will have the same `numchunks` for all series)
+So unless you're confident your metrics are all subject to queries of the same timeranges, and that they are predictable, you should look at the chunk cache below.
+
 ### Chunk Cache
 
 The goal of the chunk cache is to offload as much read workload from cassandra as possible.
 Any data chunks fetched from Cassandra are added to the chunk cache.
+But also, more interestingly, chunks expired out of the ring buffers will automatically be added to the chunk cache if the chunk before it is also in the cache.
+In other words, for series we know to be "hot" (queried frequently enough so that their data is kept in the chunk cache) we will try to avoid a roundtrip to Cassandra before adding the chunks to the cache. This can be especially useful when it takes long for the primary to save data to cassandra, or when there is a cassandra outage.
 The chunk cache has a configurable [maximum size](https://github.com/raintank/metrictank/blob/master/docs/config.md#chunk-cache),
 within that size it tries to always keep the most often queried data by using an LRU mechanism that evicts the Least Recently Used chunks.
 
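The rule added to the Chunk Cache section in this hunk (a chunk expired from the ring buffer goes into the chunk cache only if the chunk before it is already cached) can be sketched in Go as follows. This is a simplified model, not metrictank's implementation: the `chunkCache` type, its methods, and the use of a start timestamp `t0` as chunk identity are assumptions made for illustration, and the size limit and LRU eviction are deliberately left out.

```go
package main

import "fmt"

// chunkKey identifies a chunk by series name and start timestamp
// (hypothetical representation for this sketch).
type chunkKey struct {
	series string
	t0     uint32
}

// chunkCache is a toy cache: it only tracks which chunks are present.
// Size limits and LRU eviction are omitted.
type chunkCache struct {
	span   uint32 // chunk span in seconds, assumed equal for all chunks
	chunks map[chunkKey]bool
}

func newChunkCache(span uint32) *chunkCache {
	return &chunkCache{span: span, chunks: make(map[chunkKey]bool)}
}

// AddFromStore models "any data chunks fetched from Cassandra are
// added to the chunk cache".
func (c *chunkCache) AddFromStore(series string, t0 uint32) {
	c.chunks[chunkKey{series, t0}] = true
}

// AddIfHot models a chunk being expired out of the ring buffer: it is
// cached only when the chunk directly before it is already cached.
// It reports whether the chunk was added.
func (c *chunkCache) AddIfHot(series string, t0 uint32) bool {
	if t0 < c.span {
		return false // no previous chunk can exist
	}
	if !c.chunks[chunkKey{series, t0 - c.span}] {
		return false // series is cold: previous chunk not cached
	}
	c.chunks[chunkKey{series, t0}] = true
	return true
}

func main() {
	cache := newChunkCache(600) // 10min chunks

	// Series "a" was queried, so one of its chunks came from the store.
	cache.AddFromStore("a", 1200)

	fmt.Println(cache.AddIfHot("a", 1800)) // true: chunk at 1200 is cached
	fmt.Println(cache.AddIfHot("a", 2400)) // true: chunk at 1800 is now cached
	fmt.Println(cache.AddIfHot("b", 1800)) // false: "b" has nothing cached
}
```

In this model a hot series keeps extending its cached range without any roundtrip to the store, while a cold series stays out of the cache until a query fetches one of its chunks.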
@@ -92,8 +96,8 @@ We plan to keep working on performance and memory management and hope to make th
 
 In principle, you need just 1 chunk for each series.
 However:
-* when the data stream moves into a new chunk, secondary nodes would drop the previous chunk and query Cassandra. But the primary needs some time to save the chunk to Cassandra. Based on your deployment this could take anywhere between milliseconds or many minutes. As you don't want to slam Cassandra with requests at each chunk clear, you should probably use a numchunks of 2, or a numchunks that lets you retain data in memory for however long it takes to flush data to cassandra.
-* The ringbuffers are a great tool to let you deal with crashes or outages of your primary node. If your primary went down, or for whatever reason cannot save data to Cassandra, then you won't even feel it if the ringbuffers can "clear the gap" between in memory data and older data in cassandra. So we advise to think about how fast your organisation could resolve a potential primary outage, and then set your parameters such that `(numchunks-1) * chunkspan` is more than that.
+* when the data stream moves into a new chunk, secondary nodes would drop the previous chunk and query Cassandra. But the primary needs some time to save the chunk to Cassandra. Based on your deployment this could take anywhere between milliseconds or many minutes. Possibly even an hour or more. As you don't want to slam Cassandra with requests at each chunk clear, you should probably use a numchunks of 2, or a numchunks that lets you retain data in memory for however long it takes to flush data to cassandra. (though the chunk cache alleviates this concern for hot data, see above).
+* The ringbuffers can be useful to let you deal with crashes or outages of your primary node. If your primary went down, or for whatever reason cannot save data to Cassandra, then you won't even feel it if the ringbuffers can "clear the gap" between in memory data and older data in cassandra. So we advise to think about how fast your organisation could resolve a potential primary outage, and then set your parameters such that `(numchunks-1) * chunkspan` is more than that. (again, with a sufficiently large cache, this is only a concern for cold data)
 
 #### Rollups remove the need to keep large number of higher resolution chunks
metrictank-sample.ini (+11 -7)
@@ -12,12 +12,14 @@ accounting-period = 5min
 ## data ##
 
 # see https://github.com/raintank/metrictank/blob/master/docs/memory-server.md for more details
+
 # duration of raw chunks. e.g. 10min, 30min, 1h, 90min...
+# must be valid value as described here https://github.com/raintank/metrictank/blob/master/docs/memory-server.md#valid-chunk-spans
 chunkspan = 10min
 # number of raw chunks to keep in in-memory ring buffer
-#note that the chunk-cache (settings further down) is a more effective method to cache data and alleviate workload for cassandra.
-#but this allows secondary nodes to keep serving data in case the primary is not able to save data upto chunkspan*numchunks")
-numchunks = 5
+#See https://github.com/raintank/metrictank/blob/master/docs/memory-server.md for details and trade-offs, especially when compared to chunk-cache
+#(settings further down) which may be a more effective method to cache data and alleviate workload for cassandra.
+numchunks = 7
 # minimum wait before raw metrics are removed from storage
 ttl = 35d
 
@@ -36,11 +38,13 @@ warm-up-period = 1h
 # settings for rollups (aggregation for archives)
 # comma-separated list of archive specifications.
 # archive specification is of the form: aggSpan:chunkSpan:numChunks:TTL[:ready as bool. default true]
-# with these aggregation rules: 5min:1h:2:3mon,1h:6h:2:1y:false
-# 5 min of data, store in a chunk that lasts 1hour, keep 2 chunks in in-memory ring buffer, keep for 3months in cassandra
-# 1hr worth of data, in chunks of 6 hours, 2 chunks in in-memory ring buffer, keep for 1 year, but this series is not ready yet for querying.
+# with these aggregation rules: 5min:1h:2:3mon,1h:6h:2:1y:false you get:
+#- 5 min of data, store in a chunk that lasts 1hour, keep 2 chunks in in-memory ring buffer, keep for 3months in cassandra
+#- 1hr worth of data, in chunks of 6 hours, 2 chunks in in-memory ring buffer, keep for 1 year, but this series is not ready yet for querying.
 # When running a cluster of metrictank instances, all instances should have the same agg-settings.
-# chunk spans must be valid values as described here https://github.com/raintank/metrictank/blob/master/docs/memory-server.md#valid-chunk-spans
+# Note:
+# * chunk spans must be valid values as described here https://github.com/raintank/metrictank/blob/master/docs/memory-server.md#valid-chunk-spans
+# * numchunks -like the global setting- has nuanced use compared to chunk cache. see https://github.com/raintank/metrictank/blob/master/docs/memory-server.md
metrictank.go (+1 -1)
@@ -58,7 +58,7 @@ var (
 
 	// Data:
 	chunkSpanStr = flag.String("chunkspan", "10min", "duration of raw chunks")
-	numChunksInt = flag.Int("numchunks", 5, "number of raw chunks to keep in in-memory ring buffer. note that the chunk-cache is a more effective method to cache data and alleviate workload for cassandra. but this allows secondary nodes to keep serving data in case the primary is not able to save data upto chunkspan*numchunks")
+	numChunksInt = flag.Int("numchunks", 7, "number of raw chunks to keep in in-memory ring buffer. See https://github.com/raintank/metrictank/blob/master/docs/memory-server.md for details and trade-offs, especially when compared to chunk-cache")
 	ttlStr = flag.String("ttl", "35d", "minimum wait before metrics are removed from storage")
 
 	chunkMaxStaleStr = flag.String("chunk-max-stale", "1h", "max age for a chunk before to be considered stale and to be persisted to Cassandra.")
scripts/config/metrictank-docker.ini (+11 -7)
@@ -9,12 +9,14 @@ accounting-period = 5min
 ## data ##
 
 # see https://github.com/raintank/metrictank/blob/master/docs/memory-server.md for more details
+
 # duration of raw chunks. e.g. 10min, 30min, 1h, 90min...
+# must be valid value as described here https://github.com/raintank/metrictank/blob/master/docs/memory-server.md#valid-chunk-spans
 chunkspan = 10min
 # number of raw chunks to keep in in-memory ring buffer
-#note that the chunk-cache (settings further down) is a more effective method to cache data and alleviate workload for cassandra.
-#but this allows secondary nodes to keep serving data in case the primary is not able to save data upto chunkspan*numchunks")
-numchunks = 5
+#See https://github.com/raintank/metrictank/blob/master/docs/memory-server.md for details and trade-offs, especially when compared to chunk-cache
+#(settings further down) which may be a more effective method to cache data and alleviate workload for cassandra.
+numchunks = 7
 # minimum wait before raw metrics are removed from storage
 ttl = 35d
 
@@ -33,11 +35,13 @@ warm-up-period = 1h
 # settings for rollups (aggregation for archives)
 # comma-separated list of archive specifications.
 # archive specification is of the form: aggSpan:chunkSpan:numChunks:TTL[:ready as bool. default true]
-# with these aggregation rules: 5min:1h:2:3mon,1h:6h:2:1y:false
-# 5 min of data, store in a chunk that lasts 1hour, keep 2 chunks in in-memory ring buffer, keep for 3months in cassandra
-# 1hr worth of data, in chunks of 6 hours, 2 chunks in in-memory ring buffer, keep for 1 year, but this series is not ready yet for querying.
+# with these aggregation rules: 5min:1h:2:3mon,1h:6h:2:1y:false you get:
+#- 5 min of data, store in a chunk that lasts 1hour, keep 2 chunks in in-memory ring buffer, keep for 3months in cassandra
+#- 1hr worth of data, in chunks of 6 hours, 2 chunks in in-memory ring buffer, keep for 1 year, but this series is not ready yet for querying.
 # When running a cluster of metrictank instances, all instances should have the same agg-settings.
-# chunk spans must be valid values as described here https://github.com/raintank/metrictank/blob/master/docs/memory-server.md#valid-chunk-spans
+# Note:
+# * chunk spans must be valid values as described here https://github.com/raintank/metrictank/blob/master/docs/memory-server.md#valid-chunk-spans
+# * numchunks -like the global setting- has nuanced use compared to chunk cache. see https://github.com/raintank/metrictank/blob/master/docs/memory-server.md