Skip to content

push_rules_stream table getting huge #19313

@dawagner

Description

@dawagner

Description

Hello,

I'm managing a (federated) synapse server with 40 users and my push_rules_stream table weighs 16GB. It also seems to be increasing but I'm not sure. That table doesn't have any date column so I can't tell how old are most rows but I noticed this:

> SELECT 'redacted' AS user_id, count(*) FROM push_rules_stream GROUP BY user_id;

"redacted"	5
"redacted"	634312
"redacted"	178
"redacted"	83
"redacted"	19
"redacted"	1187516
"redacted"	19
"redacted"	99
"redacted"	114
"redacted"	32
"redacted"	22
"redacted"	47
"redacted"	28
"redacted"	3
"redacted"	42
"redacted"	67
"redacted"	63
"redacted"	8
"redacted"	4
"redacted"	6
"redacted"	76
"redacted"	62
"redacted"	27
"redacted"	265
"redacted"	131
"redacted"	40
"redacted"	1
"redacted"	11752101
"redacted"	24
"redacted"	29792667
"redacted"	12805107
"redacted"	66
"redacted"	74
"redacted"	8
"redacted"	124
"redacted"	46
"redacted"	20795959
"redacted"	51
"redacted"	96
"redacted"	77

In other words, only a handful of users have produced almost 100% of the rows. Maybe this is triggered by an unexpected behaviour in the clients but it also really looks like a bug in the server, in that they should probably be cleared out after some time. Note that even the events were cleared out, because of how postgresql's autovacuuming works, and depending on the pattern with which the events occur over time, autovacuuming might not even help or might need to be tuned. That requires some non-trivial DBA knowledge.

I've noticed these tickets: #13456 and #5888 but they haven't move in a while.

Until this is fixed, is there some way I can identify and remove unneeded rows from that table ?

BTW, my state_groups_state table is also huge (11GB) and I have a few more big ones. There are some open issues already about some of them but they don't seem to have much activity.

Steps to reproduce

N/A

Homeserver

my own homeserver

Synapse Version

1.138.0

Installation Method

Debian packages from packages.matrix.org

Database

Using PostgreSQL. Single server.

Workers

Single process

Platform

Debian 12

Configuration

No response

Relevant log output

N/A

Anything else that would be useful to know?

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions