Skip to content

Conversation

hpk42
Copy link
Contributor

@hpk42 hpk42 commented Sep 14, 2025

Before this PR, in https://github.com/chatmail/relay/blob/1.7.0/cmdeploy/src/cmdeploy/dovecot/expunge.cron.j2 we execute a number of "find" commands, which at least on nine.testrun.org take overall 45+ minutes to complete, and access the full directory structure 9 times (kernel caching helps a little). The new chatmaild.expiry traverses all mailboxes only once and has tests.

On the server side, you can run "python -m chatmaild.expire /usr/local/lib/chatmail.ini /home/vmail/mail/{domain}" to see what it would remove. Only if you specify "--remove" will it actually remove it. See the invocation also in the new cron job with a single line.

The PR adds also "python -m chatmaild.fsreport" which gives various statistics of file usages, last login times etc. This is not run in any cron job and purely for curiosity purposes (and to help determine what we might do if storage becomes full again).

@hpk42 hpk42 force-pushed the hpk/expiry branch 3 times, most recently from 50bd43d to 081642c Compare September 16, 2025 10:06
…them immediately, provide a few command line options to select
echobot = "chatmaild.echo:main"
chatmail-metrics = "chatmaild.metrics:main"
delete_inactive_users = "chatmaild.delete_inactive_users:main"
expire = "chatmaild.expire:main"
Copy link
Contributor

@adbenitez adbenitez Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meta: would be nicer if all chatmail-added commands started with chatmail- like the chatmail-metrics command, this makes it easier to discover by typing chatmail+TAB and get autocompletion of available chatmail commands while avoiding conflicts with other tools with same name ex "expire" seems quite generic

Copy link
Contributor Author

@hpk42 hpk42 Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree about some common prefix, maybe "cm-" is enough. not sure this PR here should change it though.

def main(args=None):
"""Report about filesystem storage usage of all mailboxes and messages"""
parser = ArgumentParser(description=main.__doc__)
# parser.add_argument(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented out code.

self.mdir = mdir

self.num_ci_logins = self.num_all_logins = 0
self.login_buckets = dict((x, 0) for x in (1, 10, 30, 40, 80, 100, 150))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.login_buckets = dict((x, 0) for x in (1, 10, 30, 40, 80, 100, 150))
self.login_buckets = {(x, 0) for x in (1, 10, 30, 40, 80, 100, 150)}

help="maximum number of mailbxoes to iterate on",
)

args = parser.parse_args([str(x) for x in args] if args else args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why arguments need to be converted to str? args is even None outside the tests, and seems to be a list of strings already inside the tests.

example invocation:
python -m chatmaild.fsreport /home/vmail/mail/nine.testrun.org
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nine.testrun.org is not an example domain.

"chatmail_ini", action="store", help="path pointing to chatmail.ini file"
)
parser.add_argument(
"mailboxes_dir", action="store", help="path to directory of mailboxes"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chatmail.ini is a required argument already, mailboxes directory can be determined from there, no need for users to specify it manually.



def K(size):
if size < 1000:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

M always returns megabytes, but K somehow sometimes returns bytes? Should this logic go into H?

return f"{size:6.0f}"
elif size < 10000:
return f"{size / 1000:3.2f}K"
return f"{int(size / 1000):5.0f}K"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you format an integer using an f (float) formatter? I guess int is not needed and the same effect can be achieved with a formatter, but no idea what the expected result is.

return f"{int(size / 1000000):5.0f}M"


def H(size):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expected this to be used everywhere to format numbers, but now it seems that K and M are used somewhat arbitrarily directly, likely based on the numbers expected. But then for small servers where the number of users is below 100 it does not make sense to format them with K (except that, well, K currently has some logic that should have been in H, so it sort of works).


self.num_ci_logins = self.num_all_logins = 0
self.login_buckets = dict((x, 0) for x in (1, 10, 30, 40, 80, 100, 150))
self.message_buckets = dict((x, 0) for x in (0, 160000, 500000, 2000000))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do these numbers come from? Also for login_buckets?

exp = Expiry(config, dry=not args.remove, now=now, verbose=args.verbose)
for mailbox in iter_mailboxes(os.path.abspath(args.mailboxes_dir), maxnum=maxnum):
exp.process_mailbox_stat(mailbox)
print(exp.get_summary())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is meant to with cron, it should not print anything. cron assumes that any job that printed something failed and sends the message to the user (vmail I guess).

I think the best way to handle it is to turn expire into a long-running service so its logs can be inspected with journalctl. This only needs a long-running loop which sleeps for a day in-between.

This way we will also be able to collect metrics by parsing its logs with mtail and e.g. display how many accounts or messages expired during the day.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants