Method for user performed compaction? #329
javafanboy
started this conversation in
Ideas
Replies: 3 comments 5 replies
-
I'm working on a super rough proof of concept (https://github.com/Alex-Monahan/ducklake/tree/ordered-compaction) for a way to order files during compaction. The initial plan would be to use the existing logic to determine how many files to compact and then sort within those pre-selected files. A few open questions / thoughts:
- I was thinking of calling this default_sort since this is approximate (and that is similar to Iceberg's approach: https://iceberg.apache.org/spec/#sorting).
- I was planning to add an optional parameter to ducklake_merge_adjacent_files, as well as a setting (so that CHECKPOINT can pick it up as well). I am super flexible on naming. Maybe default_sort? (A rough sketch of possible syntax is included below.)
- Should the setting be settable only at the table level? Or should it follow the typical global/schema/table hierarchy?
- In the future, I could see another optimization for picking which files to combine for each specific compaction operation (each destination file) using the file-level statistics.
  - If we are sorting by column A, we could first sort by the min (or max, or (min+max)/2) of column A according to the file stats. Then we could at least approximately sort the files prior to selecting the right ones to compact according to the file size cutoff.
  - My thought is that it would not take too much computation/overhead to do this since the stats would already be in the metadata DB and not at the file level.

cc @guillesd - I would love your input!
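For concreteness, here is a minimal sketch of how the proposed knob might be wired up. Both the default_sort parameter on ducklake_merge_adjacent_files and the set_option-style table setting are proposals from this thread / assumptions for illustration, not existing DuckLake API:

```sql
-- Hypothetical: pass the sort expression directly to the merge call ...
CALL ducklake_merge_adjacent_files('my_table', default_sort => 'event_time');

-- ... or configure it as a table-level setting so that CHECKPOINT-triggered
-- compaction picks it up as well (spelling of the option call is illustrative).
CALL my_lake.set_option('default_sort', 'event_time', table_name => 'my_table');
CHECKPOINT;
```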
4 replies
-
Not sure what you mean by "approximate" sorting, but IMHO the sorting of the merged data should be "exact", as even a few misplaced values have a quite negative effect on row group selectivity...
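To make the selectivity point concrete: a single misplaced value widens a row group's min/max range, so a filter that could otherwise skip the row group can no longer do so. A minimal way to observe this in DuckDB (file name and column name are illustrative; parquet_metadata() column names are as of recent DuckDB versions):

```sql
-- Per-row-group statistics for the sort column of a (hypothetical) data file.
-- One late timestamp landing in an otherwise early row group stretches its
-- stats_max_value, so a predicate such as ts >= '2025-06-01' can no longer
-- prune that row group.
SELECT row_group_id, stats_min_value, stats_max_value
FROM parquet_metadata('events.parquet')
WHERE path_in_schema = 'ts'
ORDER BY row_group_id;
```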
1 reply
-
Assuming we are ingesting new data by date/time (and possibly even partitioning on, say, hours), then reading as many "adjacent" objects (by date/time) as we can safely hold in memory, sorting them, and writing a new, more optimally sized and internally well "indexed" object will work well...
…On Sun, Oct 5, 2025, 14:36 Alex Monahan wrote:

Good question! I am a bit worried about trying to sort all uncompacted data in a single sort operation. It may be a very large operation that is not feasible to complete. So, instead I was planning to sort each destination file, not the entire set of uncompacted files. That would make sorting exact within a file, but approximate at the table level. Here is some of my thought process from this thread (#389).

1. Sort all data in files eligible for compaction
   - Pros: Gets the best performance for reads
   - Cons: Takes potentially quite a lot of memory and CPU during compaction, especially if compacting for the first time or not regularly
2. Sort data within the files selected to be compacted into a single file
   - Ex: 25 total files being compacted, but every 5 files get compacted into 1 file since that aligns with the target size. This would just sort each set of 5 files independently.
   - Pros: Relatively bounded memory use (could be configured with the target file size)
   - Cons: Less optimal performance for reads. Could drive towards a larger file size to get more sorting benefits.

Iceberg uses approach 2, based on my reading of the Iceberg spec (https://iceberg.apache.org/spec/#sorting).
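To illustrate approach 2 together with the earlier idea of pre-ordering candidate files by their column statistics, here is a minimal sketch against the DuckLake metadata catalog. The table and column names (ducklake_data_file, ducklake_file_column_statistics, min_value, file_size_bytes) and the literal IDs are assumptions about the metadata layout and may differ by version; the bucketing logic is purely illustrative:

```sql
-- Approximate plan: order candidate data files by the min of the sort column,
-- taken from the metadata DB only (no file I/O), then group them into
-- compaction buckets so each destination file lands near the target size.
-- Each bucket would then be rewritten with an exact ORDER BY, giving exact
-- order within a file and approximate order across the table.
WITH candidates AS (
    SELECT f.data_file_id,
           f.file_size_bytes,
           s.min_value AS sort_key_min
    FROM ducklake_data_file f
    JOIN ducklake_file_column_statistics s
      ON s.data_file_id = f.data_file_id
     AND s.column_id = 42                        -- id of the sort column (illustrative)
    WHERE f.table_id = 7                         -- table being compacted (illustrative)
)
SELECT data_file_id,
       sort_key_min,
       CAST(floor(
            (SUM(file_size_bytes) OVER (ORDER BY sort_key_min ROWS UNBOUNDED PRECEDING)
             - file_size_bytes) / (100 * 1024 * 1024)    -- ~100 MB target file size
       ) AS INTEGER) AS compaction_bucket
FROM candidates
ORDER BY sort_key_min;
```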
0 replies
-
I think it would be really neat if, similarly to adding a file/object, there existed a method for compacting one or more files/objects belonging to the same table/partition into a new file/object (as with the built-in support, both the original and the new file would be retained until snapshots are removed).
This would provide users with fine-grained control over which files/objects to compact and, in particular, allow custom sorting of the compacted file.
Sorting of compacted files is desirable for optimal performance in cases where ordering is based on a composite key - just appending several small, individually sorted files may in this case produce a compacted file whose row group metrics will not provide an efficient performance improvement...
The concept of rowid used in DuckLake may, depending on how it is implemented, be problematic for implementing sorting of compacted files...
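A minimal sketch of what such a user-performed compaction could look like in DuckDB SQL. The file paths, the composite key columns, and the registration call (written here as ducklake_add_data_files) are assumptions for illustration; whether and how the catalog would atomically retire the source files is exactly the open question of this discussion:

```sql
-- Read the small source files, sort exactly by the composite key, and write
-- one larger, internally well-ordered Parquet file with bounded row groups.
COPY (
    SELECT *
    FROM read_parquet([
        's3://bucket/my_table/part-0001.parquet',   -- illustrative paths
        's3://bucket/my_table/part-0002.parquet',
        's3://bucket/my_table/part-0003.parquet'
    ])
    ORDER BY tenant_id, event_time                  -- hypothetical composite key
) TO 's3://bucket/my_table/compacted-0001.parquet'
  (FORMAT parquet, ROW_GROUP_SIZE 122880);

-- Registering the result and retiring the source files would need catalog
-- support; the call below is hypothetical and only indicates the shape of it.
-- CALL ducklake_add_data_files('my_lake', 'my_table',
--                              's3://bucket/my_table/compacted-0001.parquet');
```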