This workspace contains experimental tools that attempt to reduce the number of
rows in the `state_groups_state` table inside of a Synapse PostgreSQL database.
# Automated tool: synapse_auto_compressor
## Introduction
This tool is significantly simpler to use than the manual tool (described below).
It scans through all of the rows in the `state_groups` database table from the start. When
it finds a group that hasn't been compressed, it runs the compressor for a while on that
group's room, saving where it got up to. After compressing a number of these chunks it stops,
saving where it got up to for the next run of the `synapse_auto_compressor`.

It creates three extra tables in the database, including `state_compressor_state`, which stores the
information needed to stop and start the compressor for each room, and `state_compressor_progress`,
which stores how far through the `state_groups` table the compressor has scanned.

The tool can be run manually when you are running out of space, or be scheduled to run
periodically.
## Building

This tool requires `cargo` to be installed. See https://www.rust-lang.org/tools/install
for instructions on how to do this.

To build `synapse_auto_compressor`, clone this repository and navigate to the
`synapse_auto_compressor/` subdirectory. Then execute `cargo build`.

This will create an executable and store it in
`synapse_auto_compressor/target/debug/synapse_auto_compressor`.
## Example usage

```
$ synapse_auto_compressor -p postgresql://user:pass@localhost/synapse -c 500 -n 100
```
## Running Options

- -p [POSTGRES_LOCATION] **Required**

  The configuration for connecting to the Postgres database. This should be of the form
  `"postgresql://username:password@mydomain.com/database"` or a key-value pair
  string: `"user=username password=password dbname=database host=mydomain.com"`.
  See https://docs.rs/tokio-postgres/0.7.2/tokio_postgres/config/struct.Config.html
  for the full details.

- -c [CHUNK_SIZE] **Required**

  The number of state groups to work on at once. All of the entries from state_groups_state are
  requested from the database for state groups that are worked on. Therefore small chunk
  sizes may be needed on machines with low memory. Note: if the compressor fails to find
  space savings on the chunk as a whole (which may well happen in rooms with lots of
  backfilled state) then the entire chunk is skipped.

- -n [CHUNKS_TO_COMPRESS] **Required**

  *CHUNKS_TO_COMPRESS* chunks of size *CHUNK_SIZE* will be compressed. The higher this
  number is set to, the longer the compressor will run for.

- -d [LEVELS]

  Sizes of each new level in the compression algorithm, as a comma-separated list.
  The first entry in the list is for the lowest, most granular level, with each
  subsequent entry being for the next highest level. The number of entries in the
  list determines the number of levels that will be used. The sum of the sizes of
  the levels affects the performance of fetching the state from the database, as the
  sum of the sizes is the upper bound on the number of iterations needed to fetch a
  given set of state. [defaults to "100,50,25"]
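As an illustration, the following run passes the default level sizes explicitly via `-d` (the connection string is illustrative):

```
$ synapse_auto_compressor -p postgresql://user:pass@localhost/synapse -c 500 -n 100 -d 100,50,25
```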
## Scheduling the compressor

The automatic tool may put some strain on the database, so it might be best to schedule
it to run at a quiet time for the server. This could be done by creating an executable
script and scheduling it with something like
[cron](https://www.man7.org/linux/man-pages/man1/crontab.1.html).
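For example, a crontab entry along the following lines would run the compressor every night at 3am (the install path, connection string, and log location are all illustrative):

```
0 3 * * * /usr/local/bin/synapse_auto_compressor -p postgresql://user:pass@localhost/synapse -c 500 -n 100 >> /var/log/synapse_auto_compressor.log 2>&1
```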
# Manual tool: synapse_compress_state

## Introduction
A manual tool that reads in the rows from the `state_groups_state` and `state_group_edges`
tables for a specified room and calculates the changes that could be made that
(hopefully) will significantly reduce the number of rows.

Note that if `-t` is given then each change to a particular state group is wrapped
in a transaction. If you do wish to send the changes to the database automatically
then the `-c` flag can be set.

The SQL generated is safe to apply against the database with Synapse running.
This is because the `state_groups` and `state_groups_state` tables are append-only:
once written to the database, they are never modified. There is therefore no danger
of a modification racing against a running Synapse. Further, this script makes its
changes in a way that keeps them hidden from any of the queries that Synapse performs.

The tool will also ensure that the generated state deltas do give the same state
as the existing state deltas before generating any SQL.
## Building

This tool requires `cargo` to be installed. See https://www.rust-lang.org/tools/install
for instructions on how to do this.

## Example usage

The SQL written to the output file can be applied to the database with:

```
$ psql synapse < out.data
```

## Running Options
- -p [POSTGRES_LOCATION] **Required**

  The configuration for connecting to the Postgres database. This should be of the form
  `"postgresql://username:password@mydomain.com/database"` or a key-value pair
  string: `"user=username password=password dbname=database host=mydomain.com"`.
  See https://docs.rs/tokio-postgres/0.7.2/tokio_postgres/config/struct.Config.html
  for the full details.

- -r [ROOM_ID] **Required**

  The room to process (this is the value found in the `rooms` table of the database,
  not the common name for the room - it should look like: "!wOlkWNmgkAZFxbTaqj:matrix.org").

- -b [MIN_STATE_GROUP]

  The state group to start processing from (non-inclusive).

- -n [GROUPS_TO_COMPRESS]

  How many groups to load into memory to compress (starting
  from the 1st group in the room or the group specified by -b).

- -l [LEVELS]

  Sizes of each new level in the compression algorithm, as a comma-separated list.
  The first entry in the list is for the lowest, most granular level, with each
  subsequent entry being for the next highest level. The number of entries in the
  list determines the number of levels that will be used. The sum of the sizes of
  the levels affects the performance of fetching the state from the database, as the
  sum of the sizes is the upper bound on the number of iterations needed to fetch a
  given set of state. [defaults to "100,50,25"]
- -m [COUNT]

  If the compressor cannot save this many rows from the database then it will stop early.

- -s [MAX_STATE_GROUP]

  If a max_state_group is specified then only state groups with ids lower than this
  number can be compressed.

- -o [FILE]

  File to output the SQL transactions to (for later running on the database).

- -t

  If this flag is set then each change to a particular state group is wrapped in a
  transaction. This should be done if you wish to apply the changes while synapse is
  still running.

- -c

  If this flag is set then the changes the compressor makes will be committed to the
  database. This should be safe to use while synapse is running as it wraps the changes
  to every state group in its own transaction (as if the transaction flag was set).

- -g

  If this flag is set then output the node and edge information for the state_group
  directed graph built up from the predecessor state_group links. These can be looked
  at in something like [Gephi](https://gephi.org).
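Putting the options together, a complete manual run might look like the following sketch (the connection string, room id, and output file name are illustrative), using `-t` so the generated SQL is safe to apply while Synapse is running:

```
$ synapse_compress_state -p postgresql://user:pass@localhost/synapse -r '!wOlkWNmgkAZFxbTaqj:matrix.org' -o out.data -t
$ psql synapse < out.data
```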
# Using the synapse_compress_state library

If you want to use the compressor in another project, it is recommended that you
use jemalloc (https://github.com/gnzlbg/jemallocator).

To prevent the progress bars from being shown, use the `no-progress-bars` feature.
(See `synapse_auto_compressor/Cargo.toml` for an example.)
# Troubleshooting

## Connecting to the database

If you are connecting from the machine where Postgres is running, the url will be the following:

```
postgresql://user:pass@localhost/synapse
```

### From remote machine

If you wish to connect from a different machine, you'll need to edit your Postgres settings to allow
remote connections. This requires updating the
[`pg_hba.conf`](https://www.postgresql.org/docs/current/auth-pg-hba-conf.html) and the `listen_addresses`
setting in [`postgresql.conf`](https://www.postgresql.org/docs/current/runtime-config-connection.html).
## Printing debugging logs

The amount of output the tools produce can be altered by setting the `RUST_LOG`
environment variable.

To get more logs when running the synapse_auto_compressor tool try the following:

```
$ RUST_LOG=debug synapse_auto_compressor -p postgresql://user:pass@localhost/synapse -c 50 -n 100
```

If you want to suppress all the debugging info you are getting from the
Postgres client then try:

```
$ RUST_LOG=synapse_auto_compressor=debug,synapse_compress_state=debug synapse_auto_compressor [etc.]
```

This will only print the debugging information from those two packages. For more info see
https://docs.rs/env_logger/0.9.0/env_logger/.
## Building difficulties

Building requires the OpenSSL development libraries, and building on Linux will also
require `pkg-config`.

This can be done on Ubuntu with: `$ apt-get install libssl-dev pkg-config`

Note that building requires quite a lot of memory and out-of-memory errors might not be
obvious. It's recommended you only build these tools on machines with at least 2GB of RAM.
## Auto Compressor skips chunks when running on already compressed room

This shouldn't be a large problem.

## Compressor is trying to increase the number of rows

Backfilling can lead to issues with compression. The synapse_auto_compressor will
skip chunks it can't reduce the size of, and so this should help jump over the backfilled
state_groups. Lots of state resolution might also impact the ability to use the compressor.

To examine the state_group hierarchy, run the manual tool on a room with the `-g` option