Skip to content

Conversation

@Goddesen
Copy link

@Goddesen Goddesen commented Apr 19, 2025

This PR adds optional interval handling for all retention filter flags of the prune command, previously only available on --keep-within. E.g. prune --keep-hourly=7d will keep hourly archives for the last 7 days regardless of count. Adds --keep which acts as a combination of --keep-last and --keep-within.

Implements #8637.

I opted to make the existing flags handle both ints and intervals instead of adding new flags as there are already so many flags for this command. Simplified some prune filtering: With the default filter function now also handling intervals, prune_within is no longer needed as a special case.

I added a library to freeze time in testing, let me know if that's not wanted and I'll figure out something else. It's such a hassle to deal with timestamps relative to now() in test. The tests should be fairly comprehensive in checking both their timely filter (hourly/yearly, etc.) and the new inclusive timestamp check. I did not add new helper tests for prune_split as this function is not used anywhere other than prune_cmd and isn't really a helper.

TODOs:

  • New complicated example using intervals, showing how they overlap and interact with simple ints. Test that confirms this example (currently no test at all for overlapping intervals).
  • Docs once exact implementation is decided.

TODOs from comments:

  • Figure out default values and their interaction with the must pass at least one flag check.

Notes:

  • New retention flag --keep: Merges functionality of --keep-within and --keep-last with new int-or-interval handling.
  • --keep-last is no longer an alias of --keep-secondly and thus keeps archives made on the same second. It now fits together with --keep-within and new flag --keep.
  • Intervals now support 0 seconds. While working with int_or_interval and creating timedelta objects it seemed weird to restrict input like this. Not extremely useful unless you want to prune on archives in the same second as they were created, but it seemed logical when setting up tests to verify new --keep-last behavior.. Technically breaking, but will likely not meaningfully affect any real scenarios.
  • Timestamp comparisons for retention intervals are now inclusive on seconds. Matches (my personal) human intuition: If it is currently xx:xx:10 and there's an archive at xx:xx:05 then --keep-within 5S should cover that archive. Easy change to make when already altering the filtering logic. Technically breaking, but will likely not meaningfully affect any real scenarios.

This has been a pet peeve of mine in the pruning command for a long time. In my mind the most clean backup regime keeps all backups for a short time (allows catching small errors quickly), then hourly backups for a reasonable time (say, 7 days), then daily backups for a little longer and then finally weekly/monthly as storage permits. This was not previously possible, requiring for example --keep-within 7d --keep-hourly 168 --keep-daily 14 --keep-weekly -1 for an approximation. But keeping around 168 archives for a machine that's only running a few hours a day seems mighty excessive. So here we are :)
With this implemented my ideal retention for my primary laptop with archives every 15m looks something like --keep 3d --keep-hourly 7d --keep-daily 30d --keep-weekly -1.

@PhrozenByte
Copy link
Contributor

This is loosely related to #8715 as well.

I like the idea very much (I didn't and can't review the code, but I'm happy to help with the docs if desired). Borg's current retention policy IMHO is rather hard to understand for beginners (but very safe, because it usually keeps more than what users expect) and an interval-based approach feels more intuitive to me. However, special care is necessary to not cause unexpected behaviour, especially with frequent backups (e.g. daily), combined with overlapping rules (e.g. including all of within, daily, weekly, monthly, and yearly), and some missing backups (and thus the need to sometimes keep the "next best" to later fulfil less frequent rules). As far as I remember there were some issues with this in the early days of Borg that at least caused multiple major docs overhauls (not sure whether the behaviour actually changed substantially, but the docs definitely did because many users understood things wrong). Maybe Thomas can give some insights about the challenges back then?

@Goddesen Goddesen force-pushed the prune-timely-by-interval branch from 82c9e4e to da77550 Compare April 20, 2025 08:56
Copy link
Member

@ThomasWaldmann ThomasWaldmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

Some minor stuff I found...

@Goddesen Goddesen force-pushed the prune-timely-by-interval branch from 64c48ed to 38fbd43 Compare April 21, 2025 13:15
@Goddesen
Copy link
Author

It seems the GHA test runner did not like the previous tzinfo=None change in helpers_test.py. Replaced with an implementation in MockArchive more alike to the one for real archives, let's see if that plays nicely.

@PhrozenByte
Copy link
Contributor

New retention flag --keep: Merges functionality of --keep-within and --keep-last with new int-or-interval handling.

I'd vote for removing --keep-within and --keep-last in favour of the new --keep option with Borg 2.0 for consistency with other --keep-* options. If this gets backported to Borg 1.x, they should be deprecated. WDYT?

Timestamp comparisons for retention intervals are now inclusive on seconds. Matches (my personal) human intuition: If it is currently xx:xx:10 and there's an archive at xx:xx:05 then --keep-within 5S should cover that archive. Easy change to make when already altering the filtering logic.

IMHO this must be consistent with #8776. Question is what we agree on. I honestly believe differently: If it is xx:xx:10 now and what to cover the last five seconds, that's exclusive the xx:xx:05 second, because I also count archives that are created at xx:xx:10 (i.e. seconds 10, 9, 8, 7, and 6). WDYT?

@ThomasWaldmann
Copy link
Member

If you want to be able to backport this to 1.4-maint, you must not break compatibility in this PR, but deprecating some options that can be replaced by a better new option can be done here.

@codecov
Copy link

codecov bot commented Apr 21, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 75.91%. Comparing base (d602cf2) to head (c57cecd).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8775      +/-   ##
==========================================
+ Coverage   75.84%   75.91%   +0.07%     
==========================================
  Files          86       86              
  Lines       14742    14758      +16     
  Branches     2194     2192       -2     
==========================================
+ Hits        11181    11204      +23     
+ Misses       2888     2885       -3     
+ Partials      673      669       -4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Goddesen
Copy link
Author

If you want to be able to backport this to 1.4-maint, you must not break compatibility in this PR, but deprecating some options that can be replaced by a better new option can be done here.

How do you do separation of functionality like this when a breaking 2.0 change is to be backported with deprecations instead? The code might turn out very different, especially with changes introduced on master.

@Goddesen
Copy link
Author

IMHO this must be consistent with #8776. Question is what we agree on. I honestly believe differently: If it is xx:xx:10 now and what to cover the last five seconds, that's exclusive the xx:xx:05 second, because I also count archives that are created at xx:xx:10 (i.e. seconds 10, 9, 8, 7, and 6). WDYT?

I don't think #8776 is relevant, as it matches absolute timestamps based on a pattern (like how the prune period grouping functions work) and does not deal with relative intervals where this consideration is needed. I am not familiar with the code being touched in that PR but I don't think there is overlap with the work here.

I had a whole comment here written out defending the inclusive check based on comparing timestamps only using seconds-granularity. In that scenario this makes sense. Having taken a closer look I see that archive.save explicitly saves timestamps with microseconds and the second-precision timestamps I have been seeing have all been test values. Good thing I checked. In that case the change from > to >= only makes for one microsecond of difference. I'm comfortable reverting this.

@Goddesen
Copy link
Author

Comments fixed, inclusive timestamp change reverted with tests made slightly more robust. If that's it for implementation comments I'll extend the documentation with an involved example like the retention policy I have described in the PR description and write a test for it.

I am still not sure how to go about doing breaking change on 2.0 and backporting change with deprecations to 1.*. @ThomasWaldmann ?

@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented Apr 24, 2025

Easiest is to backport a non-breaking change and then do the breaking change in a separate PR afterwards (and not backport that).

There are some code structure differences between master and 1.4-maint though, so even backporting the first PR might not be trivial. As 1.4.x is a stable release, we need to be very careful to not break anything there. That sometimes can mean "no backport" if the change is too big or too risky.

@Goddesen
Copy link
Author

Last explicit default removed.

There are some code structure differences between master and 1.4-maint though, so even backporting the first PR might not be trivial. As 1.4.x is a stable release, we need to be very careful to not break anything there. That sometimes can mean "no backport" if the change is too big or too risky.

The changiest change introduced here is allowing intervals to be of length 0. Since this is an extension of functionality I think it's a fairly safe backport.

Am I correct in assuming I just put entries for --keep-within and --keep-last in deprecations in preprocess_args in archiver/__init__.py?

@PhrozenByte
Copy link
Contributor

In regards to backporting this to 1.4-maint and breaking changes:

I kinda feel like I accidentally caused some confusion, so: Do we even have breaking changes right now?

The only "breaking" change is that --keep-within 0 works now. Since that is undocumented anyway, I personally wouldn't consider that a breaking change. The only breaking change would be to drop --keep-within and --keep-last in favour of the new --keep. I still strongly vote for doing that. However, just for 2.0: We can safely keep them for 1.4 and mark them as deprecated. They're just aliases for --keep with this PR, but don't really change functionality, do they?

@Goddesen
Copy link
Author

Goddesen commented Apr 24, 2025

The only "breaking" change is that --keep-within 0 works now.

Ah I forgot there is one more thing: --keep-last is no longer an alias for --keep-secondly. A little more breaky than the interval change I think, but I struggle to believe this behavior is heavily relied upon in the wild? That is, unless you read the docs about secondly pruning and then decide to just use --keep-last for some reason.

But if you want, it's pretty easy to just not introduce that change now, and do it in the breaking PR instead.

@PhrozenByte
Copy link
Contributor

Ah I forgot there is one more thing: --keep-last is no longer an alias for --keep-secondly.

There's only a difference if someone actually managed to create multiple archives within the same second, right? I'd say that's similar to 0 intervals: Sure, it's a breaking change in theory, but in practice?

Anyway, I don't think we really have a problem here. If we decide on dropping --keep-last (and --keep-within for this matter) with Borg 2.x anyway, we can simply let --keep-last stay an alias for --keep-secondly (whose behaviour hasn't changed, right?) with Borg 1.4. Users that want to try the "true" and new "last" with Borg 1.x can use the new --keep. Same for 0 intervals: Adding a check and explicitly rejecting 0 intervals with just Borg 1.4 isn't that much of a big deal, is it?

Not for me to decide, but I suggest dealing with that stuff in the backport PR.

I don't think #8776 is relevant, as it matches absolute timestamps based on a pattern (like how the prune period grouping functions work) and does not deal with relative intervals where this consideration is needed.

It does: It can also match all archives within e.g. a full month relative to "now" (and added just recently, also relative to arbitrary other timestamps). However, I agree that there's no practical difference due to the microseconds scale. I didn't consider that. Yet I'd still vote for consistency. I just skimmed through @c-herz's work in #8776 and noticed that matching exclusively causes some problems there as well (minor, yet there are some). IMHO the consistency question still somewhat matters, so how about we simply decide on always matching inclusively? I'm thus also tagging @c-herz, WDYT?

@Goddesen
Copy link
Author

After some consideration I think this new retention interval should also account for the oldest rule that was implemented for the retention count flags. I'll get back to this.

@ThomasWaldmann
Copy link
Member

before doing further changes, please rebase on current master branch to get the cython workaround. otherwise, builds will fail.

@ThomasWaldmann
Copy link
Member

Guess I'ld like to merge this for next beta, can we finish this until then?

@Goddesen
Copy link
Author

I'll endeavor to get something done soon then. Factorio has been a distraction 😄

@ThomasWaldmann
Copy link
Member

btw, i dissected the borg.testsuite.helpers_test monster module into a borg.testsuite.helpers package with modules corresponding to modules in borg.helpers.

@ThomasWaldmann
Copy link
Member

ping?

shall I help here a bit by rebasing this on current master branch and resolve the conflicts?

@Goddesen Goddesen force-pushed the prune-timely-by-interval branch from a87a0d9 to dc6f878 Compare June 3, 2025 21:59
@Goddesen
Copy link
Author

Goddesen commented Jun 3, 2025

I'm still here! I pondered myself into a corner with how to best write a similarly comprehensive test and example for the interval code path. I've got to admit I've found it quite hard to iterate on tests with this test suite setup -- but I am not very familiar with Python and its ecosystem so that might be on me.

While thinking about this new large example and how it would relate to my own preferred backup scheme it has additionally become clear to me that it is a bit confusing if you specify --keep-daily 3d and you run daily backups immediately followed by a prune, the backup that occurred exactly three days ago will sometimes be included depending on the time taken to do that backup. I would like to experiment with some kind of fuzziness, maybe each --keep-* rule defines the interval to last for that entire time period (so my 3d example will consider every backup made on that day; if --keep-hourly was used, consider every archive in the hour that covers 3d ago). But maybe the current implementation is good enough for a feature merge and beta release and can be further refined in another PR.

I have rebased on master and moved my changes to the split helper tests. I have squashed all my commits and made some alterations for the rebase to make sense as best I could. I will try to keep thinking about that comprehensive example + tests.

@Goddesen Goddesen force-pushed the prune-timely-by-interval branch from dc6f878 to dfcee1d Compare June 4, 2025 07:24
@Goddesen
Copy link
Author

Goddesen commented Jun 4, 2025

Fixed lint.

Accepts either int or interval, first tries parsing int then tries
parsing as interval if that fails. Returns a timedelta for easy date
math later. Now allows intervals of length 0 as a 0-length timedelta is
perfectly fine to work with.
@Goddesen Goddesen force-pushed the prune-timely-by-interval branch from 20c8431 to 3756301 Compare December 28, 2025 17:30
Support is added for setting prune retention with either an int (keep n
archives) or an interval (keep within). This works much like
--keep-within currently does, but extends support to all retention
filters.

Additionally adds a generic --keep flag to take over (or live alongside)
both --keep-last and --keep-within. --keep-last is no longer an alias of
--keep-secondly, now keeps archives made on the same second.

Comparisons against archive timestamp are made to use local timezone
instead of UTC. Should be equal result in practice, but allows for
easier testing with frozen local time.
Default with tzinfo=None is local timezone anyway, no need to set it
manually.
@Goddesen Goddesen force-pushed the prune-timely-by-interval branch from 3756301 to 073f3e3 Compare December 28, 2025 17:48
@Goddesen
Copy link
Author

@ThomasWaldmann Sorry for the delay!

First of all, branch is rebased on latest master.

I went ahead and implemented my earlier suggestion for fuzzy relative time deltas. If you run prune on a Wednesday with --keep-daily=1wz, you will retain archives as far back as the Monday of the week that time delta lands in (~9 days ago). This matches my intuition when talking about retention policies: When I think to myself "I would like to keep two weeks of archives" what I really want is the last two full work weeks. It would feel weird to be throwing away the Monday and Tuesday backups when running prune on a Wednesday. Same with fuzzy days or fuzzy months, maybe not so much fuzzy hours or minutes but that is of course also implemented for completeness. This fuzzy delta is optional and not default -- but I would like to make the case for it to be the default since this functionality is wholly new anyway. In any case, we can discuss the exact suffix to apply for this ruleset to apply, I just went with "z".

I finally got around to creating that comprehensive test for this new functionality with prune arguments --keep-daily=2w, --keep-weekly=6m, --keep-monthly=1y, --keep-yearly=3 where the last one shows that intervals and exact counts can mix. Please take a look to see if the pruning makes sense. This took by far the longest time and is probably the reason I took so long to get back to this PR.

The fuzzy rounding-like functionality is implemented using dateutil.relativedelta (new dependency) to do calendar-wise timestamp arithmetic in a new FlexibleDelta class (allowing either exact or fuzzy delta math) in the time helpers. This new implementation simplifies another code path where calendar-wise arithemetic was implemented only for years and months, no tests needed fixing.

I am aware this project has very few dependencies so I expect some pushback on introducing a new one, I just found that expressing this logic with relativedelta was extremely elegant and dateutil strikes me as a trusted and stable project. If you insist I will find another way (though a custom implementation would open up for more bugs requiring more testing).

I am hopeful that this proves easy to backport if we still want to this close (?) to 2.0 release.

@PhrozenByte
Copy link
Contributor

@Goddesen, great work ❤️

I'm happy to help with the docs if you want the help. I think best would be a comparative explanation (i.e., comparing the results of otherwise similar policies, e.g., --keep-daily=365 vs. --keep-daily=365d vs. --keep-daily=365dz vs. --keep-daily=12m vs. --keep-daily=12mz) - next to a theoretical explanation and the big calendar example of course.

What I understand

Let me recap how I understand it, please correct me if I'm wrong:

By using --keep-daily=2wz (--keep-daily to keep daily archives, 2 is an arbitrary integer, w for weeks, and z for fuzzy), the fuzzy suffix effectively tells Borg to keep daily archives within the previous 2 weeks plus (for "fuzzy" or something) the current week. Correct?

Another view on the matter could be that Borg first "rounds down" to the start of the current week, then subtracts 2 weeks, and keeps one archive per day (i.e., daily archives) since then. So, if running on 2025-12-30 12:00:00, Borg "rounds down" to 2025-12-29 00:00:00 first, then subtracts two weeks (yielding 2025-12-15 00:00:00) and keeps daily archives from there. Assuming no new archives are created, Borg would keep the exact same archives when running on 2026-01-04 23:59:59, too. Right?

Another example is running --keep-daily=1yz on 2025-12-30 12:00:00: Borg would then keep daily archives since 2024-01-01 00:00:00. Running --keep-daily=12mz would keep daily archives since 2024-12-01 00:00:00. --keep-daily=52wz since 2024-12-30, and --keep-daily=365dz since 2024-12-30. All correct?

On the other hand, the non-fuzzy matcher is exact and also takes the time including µs into account. So, --keep-daily=2w keeps daily archives for the last 2 weeks exactly, i.e., since 2025-12-16 12:00:00+1µs Running the same on 2026-01-04 23:59:59 and Borg would keep only daily archives since 2025-12-21 23:59:59+1µs. Right?

There's no support for quarters yet? For the sake of completeness, how about q for 3-monthly, and Q for 13-weekly?

What I think

I honestly don't see much benefit in the exact matcher. It's unintuitive and behaves weirdly in practice, as you've explained earlier. I very much doubt that many users have consistent backup timing down to the µs - and if they don't have that, the results can be unexpected at best. So, personally, I'd drop the exact matcher altogether and use fuzzy matching only.

I very much like the idea of the "user-chosen precision", especially as an advanced user. It's very powerful to be able to choose the precision of the cut-off time ("1 year" could be represented as 1yz, 4qz (if added), 4Qz (if added), 12mz, 52wz, 365dz, 8760Hz, 525600Mz, or 31536000Sz).

However, I have some doubts in regards to whether this could be a little bit too complicated for people to understand. I mean, on one hand the "1 year" example sounds great because it shows how powerful the solution is, but on the other hand it's kinda confusing that 1yz, 12mz, … all yield different results (some just slightly different, others wildly different). It can also yield rather weird results, like running --keep-daily=1yz on New Year's Eve and keeping daily archives of two full years, just for Borg to prune (up to) 365 daily archives a day later… Also, to be honest, I wouldn't expect --keep-daily=1wz to keep archives of e.g. 9 days either (like your example). It's explainable, because it's without a doubt consistent and pretty neat. But it definitely needs explaining.

I'm torn…

On one hand your solution is incredibly powerful. On the other hand we can definitely make it a little easier for users, without resorting to the exact matcher. The solution you've outlined in #8775 (comment) seems reasonable to me: For --keep-daily Borg always behaves like dz (even for e.g. --keep-daily=1y), for --keep-weekly like wz, etc. This is rather easy to explain and makes a lot of sense IMHO. Let's call this the "static fuzzy matcher".

However, I certainly don't want to speak ill of your solution! I like it very much. I'm just looking at it from an user's perspective and prune already is among Borg's more complex topics (like check, and choosing the right encryption mode for repo-create). So, maybe we could add both? Using the "static fuzzy matcher" as default and without a suffix, and the "dynamic fuzzy matcher" with a suffix (maybe changing the suffix from z to +?).

Just my two cents on the topic. WDYT?

months=0 if earlier else 1, day=1, hour=0, minute=0, second=0, microsecond=0
),
"w": lambda earlier: relativedelta(
weekday=MO(-1), weeks=(0 if earlier else 1), hour=0, minute=0, second=0, microsecond=0
Copy link
Author

@Goddesen Goddesen Dec 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: Not everyone starts their weeks on a Monday. Maybe make configurable (in a follow-up PR). Logic would remain unchanged regardless of which day is chosen. (MO(-1) in this case means setting the day to be the preceding Monday, or same day if datetime is already a Monday.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants