Commit 1b9a24b

v5.3.0 initial commit
1 parent da0d708 commit 1b9a24b

35 files changed

+4099
-247
lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,3 +9,4 @@ ignore/*
99
sql/*.sql
1010
test/not_working_yet/*
1111
*.zip
12+
*.fuse*

.pre-commit-config.yaml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
repos:
22
- repo: https://github.com/pre-commit/pre-commit-hooks
3-
rev: v4.6.0
3+
rev: v6.0.0
44
hooks:
55
- id: check-merge-conflict
66
- id: check-symlinks
@@ -12,6 +12,6 @@ repos:
1212
- id: trailing-whitespace
1313
args: ['--markdown-linebreak-ext=md']
1414
- repo: https://github.com/codespell-project/codespell
15-
rev: v2.3.0
15+
rev: v2.4.1
1616
hooks:
1717
- id: codespell

CHANGELOG.md

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,32 @@
1+
5.3.0
2+
=====
3+
4+
NEW FEATURES
5+
------------
6+
- Added new procedure `partition_data_async()` to allow smaller batching of data per transaction when moving data out of the default partition. (Github Issue #353)
7+
- Note this function currently only works with time-based partitioning. ID/integer partitioning is in development.
8+
- WARNING: While data is in transition between the default and the destination child table using this procedure, it is NOT visible to users of the partition table. See documentation for this function for additional details.
9+
- Better support filtering out any columns with `p_ignored_columns` while partitioning data using the `partition_data_time()`, `partition_data_id()`, or `partition_data_proc()` utilities. (Github PR#723)
10+
- Allows for filtering out GENERATED columns while moving data so that newly generated values will be entered for moved rows.
11+
- Non-GENERATED columns that are filtered out will either have NULL values or use the default value when rows are moved.
12+
- TODO update partition_data_id
13+
- Added support for uuid-based partition sets to partition_data_time()/partition_data_proc() functions (Github #789)
14+
- Allow a starting offset to id/integer based partitioning. Added a new parameter to create_parent: p_offset_id. Note that the offset will carry through to all subsequent child tables. Ex: offset of 5 with interval 10 will make lower boundaries 5, 15, 25, etc. (Github Issue #339)
15+
- Reduced the log level of the dynamic background worker runs to DEBUG1. Changed existing DEBUG1 logging messages in the BGW to DEBUG2.
16+
- Unlogged tables are still supported in pg_partman as of PostgreSQL 18 and newer, but the parent table can no longer be flagged unlogged. This only works through the template table system in pg_partman.
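
The `p_offset_id` entry above gives an example of how the offset carries through to later child tables. The arithmetic can be sketched as follows. This is only an illustration of the boundary calculation described in the changelog, not pg_partman's own code, and the helper name `lower_boundaries` is hypothetical.

```python
def lower_boundaries(offset: int, interval: int, count: int) -> list[int]:
    """Return the first `count` lower boundaries of an id-partitioned set.

    Hypothetical helper: each child table starts one interval after the
    previous one, and the first one starts at the offset.
    """
    return [offset + i * interval for i in range(count)]


# The changelog example: offset of 5 with an interval of 10.
print(lower_boundaries(offset=5, interval=10, count=3))  # [5, 15, 25]
```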
17+
18+
BUGFIXES
19+
--------
20+
- Allow `partition_data_*()` utilities to properly work when a PK/Unique key is set to GENERATED ALWAYS.
21+
- Handle if the given default table already exists when calling `create_parent()`. Helps to better handle migrating an existing partition set to pg_partman.
22+
- Added check to ensure that the default table cannot be manually set as the value of p_source_table in partitioning functions and procedures. This would previously cause an unhandled edge case endless loop since the data moved out of the default was getting moved right back into the default again instead of a new child partition. (Github Issue #353)
23+
- Always ensure transaction is committed at proper time when using reapply_constraints_proc(). (Github PR#780)
24+
- Added plpgsql as a required dependency in the extension control file. (Github PR#808)
25+
26+
DOCUMENTATION
27+
-------------
28+
- Updated documentation for the time decoder function to note that it must take a TEXT value as its parameter at this time.
29+
130
5.2.4
231
=====
332
BUG FIXES

META.json

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
{
22
"name": "pg_partman",
33
"abstract": "Extension to manage partitioned tables by time or ID",
4-
"version": "5.2.4",
4+
"version": "5.3.0",
55
"maintainer": [
66
"Keith Fiske <[email protected]>"
77
],
@@ -20,9 +20,9 @@
2020
},
2121
"provides": {
2222
"pg_partman": {
23-
"file": "sql/pg_partman--5.2.4.sql",
23+
"file": "sql/pg_partman--5.3.0.sql",
2424
"docfile": "doc/pg_partman.md",
25-
"version": "5.2.4",
25+
"version": "5.3.0",
2626
"abstract": "Extension to manage partitioned tables by time or ID"
2727
}
2828
},

bin/common/check_unique_constraint.py

100755 → 100644
Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -4,14 +4,14 @@
44

55
partman_version = "2.0.0"
66

7-
parser = argparse.ArgumentParser(description="This script is used to check that all rows in a partition set are unique for the given columns. Since unique constraints are not applied across partition sets, this cannot be enforced within the database. This script can be used as a monitor to ensure uniquness. If any unique violations are found, the values, along with a count of each, are output.")
7+
parser = argparse.ArgumentParser(description="This script is used to check that all rows in a partition set are unique for the given columns. Since unique constraints are not applied across partition sets, this cannot be enforced within the database. This script can be used as a monitor to ensure uniqueness. If any unique violations are found, the values, along with a count of each, are output.")
88
parser.add_argument('-p', '--parent', help="Parent table of the partition set to be checked")
99
parser.add_argument('-l', '--column_list', help="Comma separated list of columns that make up the unique constraint to be checked")
1010
parser.add_argument('-c','--connection', default="host=", help="""Connection string for use by psycopg. Defaults to "host=" (local socket).""")
1111
parser.add_argument('-t', '--temp', help="Path to a writable folder that can be used for temp working files. Defaults system temp folder.")
1212
parser.add_argument('--psql', help="Full path to psql binary if not in current PATH")
1313
parser.add_argument('--simple', action="store_true", help="Output a single integer value with the total duplicate count. Use this for monitoring software that requires a simple value to be checked for.")
14-
parser.add_argument('--index_scan', action="store_true", help="By default index scans are disabled to force the script to check the actual table data with sequential scans. Set this option if you want the script to allow index scans to be used (does not guarentee that they will be used).")
14+
parser.add_argument('--index_scan', action="store_true", help="By default index scans are disabled to force the script to check the actual table data with sequential scans. Set this option if you want the script to allow index scans to be used (does not guarantee that they will be used).")
1515
parser.add_argument('-q', '--quiet', action="store_true", help="Suppress all output unless there is a constraint violation found.")
1616
parser.add_argument('--version', action="store_true", help="Print out the minimum version of pg_partman this script is meant to work with. The version of pg_partman installed may be greater than this.")
1717
args = parser.parse_args()

bin/common/dump_partition.py

100755 → 100644
File mode changed.

bin/common/vacuum_maintenance.py

100755 → 100644
Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -204,7 +204,7 @@ def vacuum_table(conn, schemaname, tablename):
204204
sys.exit(2)
205205

206206
if args.interval != None and args.type == None:
207-
print("--interval argment requires setting --type argument as well")
207+
print("--interval argument requires setting --type argument as well")
208208
sys.exit(2)
209209

210210
main_conn = create_conn()

doc/pg_partman.md

Lines changed: 52 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -271,6 +271,7 @@ partition_data_time(
271271
, p_analyze boolean DEFAULT true
272272
, p_source_table text DEFAULT NULL
273273
, p_ignored_columns text[] DEFAULT NULL
274+
, p_override_system_value boolean DEFAULT false
274275
)
275276
RETURNS bigint
276277
```
@@ -287,6 +288,7 @@ RETURNS bigint
287288
* `p_analyze` - optional argument, by default whenever a new child table is created, an analyze is run on the parent table of the partition set to ensure constraint exclusion works. This analyze can be skipped by setting this to false and help increase the speed of moving large amounts of data. If this is set to false, it is highly recommended that a manual analyze of the partition set be done upon completion to ensure statistics are updated properly.
288289
* `p_source_table` - This option can be used when you need to move data into a partitioned table. Pass a schema qualified tablename to this parameter and any data in that table will be MOVED to the partition set designated by p_parent_table, creating any child tables as needed.
289290
* `p_ignored_columns` - This option allows for filtering out specific columns when moving data from the default/source to the target child table(s). This is generally only required when using columns with a GENERATED ALWAYS value since directly inserting a value would fail when moving the data. Value is a text array of column names.
291+
* `p_override_system_value` - When moving data from the default or another source table to a partition set that has GENERATED ALWAYS column values, you may want to keep the values from the source rather than have new values generated. Setting this to true adds the `OVERRIDING SYSTEM VALUE` clause when inserting data. Note that you may need to reset the underlying sequence for the target generated columns after inserting overridden values.
290292
* Returns the number of rows that were moved from the parent table to partitions. Returns zero when source table is empty and partitioning is complete.
291293

292294

@@ -301,6 +303,7 @@ partition_data_id(p_parent_table text
301303
, p_analyze boolean DEFAULT true
302304
, p_source_table text DEFAULT NULL
303305
, p_ignored_columns text[] DEFAULT NULL
306+
, p_override_system_value boolean DEFAULT false
304307
)
305308
RETURNS bigint
306309
```
@@ -317,6 +320,7 @@ RETURNS bigint
317320
* `p_analyze` - optional argument, by default whenever a new child table is created, an analyze is run on the parent table of the partition set to ensure constraint exclusion works. This analyze can be skipped by setting this to false and help increase the speed of moving large amounts of data. If this is set to false, it is highly recommended that a manual analyze of the partition set be done upon completion to ensure statistics are updated properly.
318321
* `p_source_table` - This option can be used when you need to move data into a partitioned table. Pass a schema qualified tablename to this parameter and any data in that table will be MOVED to the partition set designated by p_parent_table, creating any child tables as needed.
319322
* `p_ignored_columns` - This option allows for filtering out specific columns when moving data from the default/source to the target child table(s). This is generally only required when using columns with a GENERATED ALWAYS value since directly inserting a value would fail when moving the data. Value is a text array of column names.
323+
* `p_override_system_value` - When moving data from the default or another source table to a partition set that has GENERATED ALWAYS column values, you may want to keep the values from the source rather than have new values generated. Setting this to true adds the `OVERRIDING SYSTEM VALUE` clause when inserting data. Note that you may need to reset the underlying sequence for the target generated columns after inserting overridden values.
320324
* Returns the number of rows that were moved from the parent table to partitions. Returns zero when source table is empty and partitioning is complete.
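
As a rough sketch of the `p_override_system_value` workflow described above, the snippet below builds the two SQL statements one might run: the data move with the override enabled, followed by a sequence reset. Several things here are assumptions and not taken from the pg_partman documentation: the extension being installed in a schema named `partman`, the helper name `override_statements`, the example table and column names, and the choice of `setval()` with `pg_get_serial_sequence()` as the way to reset the sequence. The statements are only built as strings, not executed.

```python
def override_statements(parent_table: str, identity_column: str) -> list[str]:
    """Build SQL for moving data while keeping source identity values.

    Hypothetical helper for illustration. Real code should pass values as
    bound parameters through the database driver rather than formatting
    them into the SQL text.
    """
    # Move the data, keeping the identity values from the source rows.
    # Assumes pg_partman is installed in the "partman" schema.
    move = (
        "SELECT partman.partition_data_time("
        f"p_parent_table := '{parent_table}', "
        "p_override_system_value := true);"
    )
    # One possible way to move the sequence past the highest preserved value.
    reset = (
        f"SELECT setval(pg_get_serial_sequence('{parent_table}', '{identity_column}'), "
        f"(SELECT max({identity_column}) FROM {parent_table}));"
    )
    return [move, reset]


# Hypothetical table and column names.
for statement in override_statements("public.events", "id"):
    print(statement)
```

Whether a sequence reset is needed at all depends on how the preserved values compare to the sequence's current position, so treat the second statement as optional.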
321325

322326

@@ -339,7 +343,7 @@ partition_data_proc (
339343
* A procedure that can partition data in distinct commit batches to avoid long running transactions and data contention issues.
340344
* Calls either partition_data_time() or partition_data_id() in a loop depending on partitioning type.
341345
* `p_parent_table` - Parent table of an already created partition set.
342-
* `p_loop_count` - How many times to loop through the value given for p_interval. If p_interval not set, will use default partition interval and make at most this many partition(s). Procedure commits at the end of each loop (NOT passed as p_batch_count to partitioning function). If not set, all data in the parent/source table will be partitioned in a single run of the procedure.
346+
* `p_loop_count` - How many times to loop through the value given for p_interval. If p_interval not set, will use default partition interval and make at most this many partition(s). Procedure commits at the end of each loop (NOT passed as p_batch_count to partitioning function). If not set, all data in the default/source table will be partitioned in a single run of the procedure.
343347
* `p_interval` - Parameter that is passed on to the partitioning function as p_batch_interval argument. See underlying functions for further explanation.
344348
* `p_lock_wait` - Parameter that is passed directly through to the underlying partition_data_*() function. Number of seconds to wait on rows that may be locked by another transaction. Default is to wait forever (0).
345349
* `p_lock_wait_tries` - Parameter to set how many times the procedure will attempt waiting the amount of time set for p_lock_wait. Default is 10 tries.
@@ -350,6 +354,50 @@ partition_data_proc (
350354
* `p_quiet` - Procedures cannot return values, so by default it emits NOTICE's to show progress. Set this option to silence these notices.
351355

352356

357+
<a id="partition_data_async"></a>
358+
```sql
359+
partition_data_async (
360+
p_parent_table text
361+
, p_loop_count int DEFAULT NULL
362+
, p_interval text DEFAULT NULL
363+
, p_lock_wait int DEFAULT 0
364+
, p_lock_wait_tries int DEFAULT 10
365+
, p_wait int DEFAULT 1
366+
, p_order text DEFAULT 'ASC'
367+
, p_ignored_columns text[] DEFAULT NULL
368+
, p_quiet boolean DEFAULT false
369+
)
370+
```
371+
* Note: This procedure currently only works with time-based partitioning as of pg_partman version 5.3.0. Integer/id support is in development.
372+
* A procedure designed to help move data out of the default partition in smaller batches of rows per committed transaction than the partition interval.
373+
* This procedure is ONLY for moving data out of the default. If you're moving data from another source table to the partitioned table, you can already use smaller batch sizes than the partition interval with the `partition_data_proc()` procedure (or standard `partition_data_time/id()` functions).
374+
* The `partition_data_proc()` procedure can still be used to migrate data out of the default, but the transaction interval size can never be smaller than the partition interval since the new child table cannot be made until all of the relevant data has been moved out of the default.
375+
* IMPORTANT NOTE: This procedure works by first moving all the data for a target child table to another real, working table. The smaller batches are committed, so the data that is in transit before being moved to the target child table is NOT VISIBLE to users of the table. If you cannot afford to have data disappearing for the users of the table, then this asynchronous method WILL NOT provide the desired result. In that case, you must use a batch size equal to the interval size of the partition set and you can use `partition_data_proc()`.
376+
* How this procedure works:
377+
* The interval size is the amount of data that is moved in each commit.
378+
* Commits are done when data is moved to the temporary storage location as well as the final child table.
379+
* So the loop count needed to move all the data for a single child table is the partition set's interval divided by the batch interval given to this procedure, multiplied by 2.
380+
* For example: A daily partition set (24 hrs) is given the interval of 6 hours to this asynchronous procedure so that it commits after each block of 6 hours is moved. That means there would be 4 batches of data that first get moved to the working table then moved to the final child table for a total of 8 commits. So p_loop_count would be 8 to move all the data for a single child table in this partition set ( (24 / 6) * 2).
381+
* Multiply that value for however many child tables you expect to be moved.
382+
* If no loop count is given, the entire default table will be emptied out using the batch interval given.
383+
* A real (not temporary) table is created as needed to hold intermediate data while it is moved. This table will be dropped whenever a child table has been created and all the data is moved to it.
384+
* The naming pattern of the working table is: `originalschema.partman_tmp_storage_originaltablename`
385+
* While data is being migrated, the `async_partitioning_in_progress` column in the `part_config` table will contain a value that relates to the most recent set of data that has been moved. While this column has a value, and during the running of this procedure, all maintenance for that partition set will be skipped (a warning is left in the PostgreSQL logs). To resume normal maintenance, this column must be NULL. This will automatically be set to NULL after completion of each child table.
386+
* Since a real table is used to migrate data, the state of a migration is preserved between multiple runs of this procedure. But as stated in the previous bullet, all normal partition maintenance for the partition set will be skipped while a partition set is left in a state where all the data for a given child table has not been fully moved to the target child table.
387+
* `p_parent_table` - Parent table of an already created partition set.
388+
* `p_loop_count` - How many times to loop through the value given for p_interval. See above bullet points for important information for what this loop count actually means when using this procedure. If not set, all data in the default table will be partitioned in a single run of the procedure.
389+
* `p_interval` - Parameter that sets the size of the interval of data that is moved in each committed transaction. See above bullet points for further explanations of how this parameter is used.
390+
* `p_lock_wait` - Parameter that is passed directly through to the underlying partition_data_*() function. Number of seconds to wait on rows that may be locked by another transaction. Default is to wait forever (0).
391+
* `p_lock_wait_tries` - Parameter to set how many times the procedure will attempt waiting the amount of time set for p_lock_wait. Default is 10 tries.
392+
* `p_wait` - Cause the procedure to pause for a given number of seconds between commits (batches) to reduce write load
393+
* `p_order` - Same as the p_order option in the called partitioning function
394+
* `p_source_table` - Same as the p_source_table option in the called partitioning function
395+
* `p_ignored_columns` - This option allows for filtering out specific columns when moving data from the default/parent to the proper child table(s). This is generally only required when using columns with a GENERATED ALWAYS value since directly inserting a value would fail when moving the data. Value is a text array of column names.
396+
* `p_quiet` - Procedures cannot return values, so by default it emits NOTICE's to show progress. Set this option to silence these notices.
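
The loop count arithmetic described in the bullets above can be sketched in a few lines. This is only an illustration of the formula in the documentation, not code from pg_partman. The helper name `async_loop_count` is hypothetical, and rounding up when the batch interval does not divide the partition interval evenly is an assumption that the documentation does not confirm.

```python
from datetime import timedelta


def async_loop_count(
    partition_interval: timedelta,
    batch_interval: timedelta,
    child_tables: int = 1,
) -> int:
    """Estimate p_loop_count for moving `child_tables` worth of data.

    Each batch is committed twice: once when moved to the working table
    and once when moved to the final child table.
    """
    if batch_interval <= timedelta(0):
        raise ValueError("batch_interval must be positive")
    batches, remainder = divmod(partition_interval, batch_interval)
    if remainder:
        # Assumption: a partial final batch still needs its own pass.
        batches += 1
    return batches * 2 * child_tables


# The documented example: daily partitions moved in 6 hour batches.
print(async_loop_count(timedelta(hours=24), timedelta(hours=6)))  # 8
# The same settings for three child tables.
print(async_loop_count(timedelta(hours=24), timedelta(hours=6), child_tables=3))  # 24
```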
397+
398+
399+
400+
353401
<a id="create_partition_time"></a>
354402
```sql
355403
create_partition_time(
@@ -760,6 +808,7 @@ Stores all configuration data for partition sets managed by the extension.
760808
, maintenance_order int DEFAULT NULL
761809
, retention_keep_publication boolean NOT NULL DEFAULT false
762810
, maintenance_last_run timestamptz
811+
, async_partitioning_in_progress text
763812

764813
- `parent_table`
765814
- Parent table of the partition set
@@ -826,6 +875,8 @@ Stores all configuration data for partition sets managed by the extension.
826875
- Default value is false
827876
- maintenance_last_run
828877
- Timestamp of the last successful run of maintenance for this partition set. Can be useful as a monitoring metric to ensure partition maintenance is running properly.
878+
- async_partitioning_in_progress
879+
- This column is used to track if an asynchronous partitioning process has been started. It is a text field that contains the value related to the last block of data that was processed. If NOT NULL, all regular maintenance for this table will be stopped until the async partitioning process has been completed successfully. See `partition_data_async()` for more information.
829880

830881

831882
<a id="part_config_sub"></a>

pg_partman.control

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
default_version = '5.2.4'
1+
default_version = '5.3.0'
22
comment = 'Extension to manage partitioned tables by time or ID'
33
relocatable = false
44
superuser = false
