Add support for splitting a chunk #7946

erimatnor · 2025-04-10T13:21:55Z

A chunk can be split with a new procedure called split_chunk(). In this initial version, a chunk can only be split in two given a split point:

call split_chunk('chunk_1', split_at => '2025-03-01 00:00');

If no split point is given, the chunk is split into two equal-size partition ranges. The partitioning dimension/column to split along can also be specified, but only the primary dimension (time) is supported. Future updates to split_chunk() can add the ability to also split along other dimensions, e.g., space partitions.

Currently, splitting a chunk takes an AccessExclusiveLock on the chunk being split. A lock is also taken on the hypertable root to prevent new chunks from being created while the split is ongoing.
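To illustrate the split-point semantics described above, here is a minimal Python sketch. It is a model only, not the C implementation in tsl/src/chunk.c. The function names `default_split_point` and `split_range` are hypothetical, and the rule that the split point must fall strictly inside the chunk's range is an assumption made for the sketch. Chunk ranges are treated as half-open, `[range_start, range_end)`.

```python
from datetime import datetime


def default_split_point(range_start: datetime, range_end: datetime) -> datetime:
    """Midpoint of a half-open chunk range [range_start, range_end)."""
    if range_end <= range_start:
        raise ValueError("empty or inverted chunk range")
    # timedelta supports division by an int, giving two equal-size halves
    return range_start + (range_end - range_start) / 2


def split_range(range_start: datetime, range_end: datetime, split_at=None):
    """Return the two ranges that result from splitting at split_at.

    Assumption (not confirmed by the PR text): the split point must lie
    strictly inside the range so that neither resulting range is empty.
    """
    if split_at is None:
        # No split point given: split into two equal-size ranges
        split_at = default_split_point(range_start, range_end)
    if not (range_start < split_at < range_end):
        raise ValueError("split point must lie strictly inside the chunk range")
    return (range_start, split_at), (split_at, range_end)


# A hypothetical chunk covering February and March 2025, split as in the
# example call above.
left, right = split_range(
    datetime(2025, 2, 1), datetime(2025, 4, 1), datetime(2025, 3, 1)
)
```

With an explicit split point, `left` ends and `right` starts at `2025-03-01 00:00`. Omitting `split_at` falls back to the midpoint of the range.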

erimatnor · 2025-04-10T13:39:09Z

I want to add some concurrency/isolation tests, but I might do that in another PR.

codecov · 2025-04-10T14:08:50Z

Codecov Report

Attention: Patch coverage is 84.54545% with 34 lines in your changes missing coverage. Please review.

Project coverage is 82.15%. Comparing base (59f50f2) to head (2807d8a).
Report is 908 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| tsl/src/chunk.c | 84.47% | 9 Missing and 25 partials ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #7946      +/-   ##
==========================================
+ Coverage   80.06%   82.15%   +2.08%     
==========================================
  Files         190      250      +60     
  Lines       37181    46504    +9323     
  Branches     9450    11670    +2220     
==========================================
+ Hits        29770    38205    +8435     
- Misses       2997     3671     +674     
- Partials     4414     4628     +214

erimatnor · 2025-04-10T15:15:03Z

Going to work on expanding tests to improve the code coverage.

mkindahl

There seem to be a few minor things that might be good to take a look at to make sure that they are not causing issues.

Looking for ways this could break, you might want to test these cases in the part of the test where you actually check the contents of the tuple (because there is potential for corrupting the data and printing out the tuples is more likely to reveal issues):

- No tuples in table and splitting
- Deleting and/or updating tuples in the table and splitting (you do vacuum as part of the split).
- Odd number of tuples in table and splitting
- Splitting table so that one part is empty and the other contains all of them (two cases, original is empty and new chunk is empty).
- Using split chunk with the start timestamp of the chunk
- Using split chunk with the end timestamp of the chunk
- Using both byval types (int, for example) and byref types (text, for example). You are using byval types, but I see no byref types.
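Several of the cases above can be expressed as a small invariant check. The following is a toy Python model, not the regression test in tsl/test. It assumes that rows with a time value below the split point stay in the original chunk and all other rows move to the new chunk. The names `route_rows` and `check_split` are hypothetical. Each row pairs an integer key with a text payload to stand in for byval and byref columns.

```python
from typing import List, Tuple

# (time, payload): an int key plus a text value
Row = Tuple[int, str]


def route_rows(rows: List[Row], split_at: int) -> Tuple[List[Row], List[Row]]:
    """Partition rows: times below split_at are kept, the rest are moved.

    Assumption: the original chunk keeps the lower half of the range.
    """
    kept = [r for r in rows if r[0] < split_at]
    moved = [r for r in rows if r[0] >= split_at]
    return kept, moved


def check_split(rows: List[Row], split_at: int) -> Tuple[int, int]:
    """Verify that a split neither loses, duplicates, nor misroutes rows."""
    kept, moved = route_rows(rows, split_at)
    assert sorted(kept + moved) == sorted(rows)
    assert all(t < split_at for t, _ in kept)
    assert all(t >= split_at for t, _ in moved)
    return len(kept), len(moved)


cases = {
    "no tuples": ([], 50),
    "odd number of tuples": ([(10, "a"), (50, "b"), (90, "c")], 50),
    "original chunk left empty": ([(60, "x"), (70, "y")], 50),
    "new chunk left empty": ([(10, "x"), (20, "y")], 50),
}

# Map each case name to (rows kept, rows moved)
results = {name: check_split(rows, at) for name, (rows, at) in cases.items()}
```

The model does not cover updated or deleted tuples, or the start/end timestamp cases, since those depend on visibility rules and boundary validation in the actual implementation.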

fetchezar · 2025-04-11T20:27:19Z

Hi! I wanted to ask how this works regarding compression. For example, I have a database whose chunk size, 180 days, is too large for the resources it has. This chunk ends by the end of the year.

Ideally I wouldn't want to wait until that chunk is done to compress it, and it seems this function would let me split it in two chunks of, say, 90 days, compress the first, adapt the chunk policy so it compresses and segments at 90 days, and all should work fine, right?

This is our use case for splitting and merging: fixing bad estimates for hypertable chunk sizes in either direction (or, sometimes, cases where compression works really well and we could merge several old compressed chunks for performance reasons).

Does this splitting support both compressed and uncompressed chunks?

erimatnor · 2025-04-18T06:40:25Z

> Does this splitting support both compressed and uncompressed chunks?

Thanks for asking. The first version won't support compressed chunks, but this will be added in a follow-up PR.

tsl/test/expected/split_chunk.out

mkindahl

Might be good to add a test that splits into an empty new or empty old chunk. I did not see any such test.

erimatnor · 2025-04-23T10:10:31Z

> Might be good to add a test that splits into an empty new or empty old chunk. I did not see any such test.

Added.

fetchezar · 2025-04-23T13:56:02Z

> Does this splitting support both compressed and uncompressed chunks?
>
> Thanks for asking. The first version won't support compressed chunks, but this will be added in a follow-up PR.

I see, thank you!

github-actions bot assigned erimatnor Apr 10, 2025

erimatnor force-pushed the split-chunk branch 2 times, most recently from 780e89e to a58b0c3 Compare April 10, 2025 13:26

erimatnor requested review from fabriziomello, mkindahl and melihmutlu April 10, 2025 13:27

erimatnor force-pushed the split-chunk branch 3 times, most recently from 5aefdae to db2c221 Compare April 10, 2025 13:37

erimatnor changed the title ~~Add support for splitting chunks~~ Add support for splitting a chunk Apr 10, 2025

erimatnor force-pushed the split-chunk branch from db2c221 to 5e710a8 Compare April 10, 2025 13:38

erimatnor force-pushed the split-chunk branch 2 times, most recently from a9d1f38 to a2a0f84 Compare April 10, 2025 13:56

erimatnor force-pushed the split-chunk branch from a2a0f84 to 4eb7c34 Compare April 10, 2025 14:16

erimatnor marked this pull request as ready for review April 10, 2025 15:14

mkindahl reviewed Apr 11, 2025

philkra added this to the v2.20.0 milestone Apr 12, 2025

erimatnor force-pushed the split-chunk branch from 4eb7c34 to 41b49da Compare April 18, 2025 05:35

erimatnor force-pushed the split-chunk branch 6 times, most recently from a644a88 to 4087068 Compare April 20, 2025 05:08

erimatnor requested a review from mkindahl April 21, 2025 15:06

mkindahl reviewed Apr 22, 2025

mkindahl approved these changes Apr 22, 2025

erimatnor added 10 commits April 23, 2025 16:53

Add isolation test for splitting chunk

f9c6182

Add update/delete/visibility tests

7e0358a

Add more tests to improve test coverage

6694d0c

Check constraint failure

94fe6bb

Support only single-dimensional split

d6becbe

Can't support multi-dimensional splits because of a limitation in the subspace store data structure.

More cleanups and tests

3864369

Add test for insert in progress

ac71ad9

Add tests for integer boundary splits

66e6813

Move check for access method

9acd034

erimatnor force-pushed the split-chunk branch from 4087068 to 9acd034 Compare April 23, 2025 09:56

Add test with empty chunk after split

2807d8a
