You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Implement encode/decode for ID columns + tests, docs
This PR allows users to specify the special value of 'func' for p_epoch
to use custom functions to encode/decode time-ordered integers other than
the classic seconds, ms, us or ns since the UNIX epoch.
Resolves#729.
Copy file name to clipboardExpand all lines: doc/pg_partman.md
+4-4Lines changed: 4 additions & 4 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -207,12 +207,12 @@ RETURNS boolean
207
207
* An ACCESS EXCLUSIVE lock is taken on the parent table during the running of this function. No data is moved when running this function, so lock should be brief
208
208
* A default partition and template table are created by default unless otherwise configured
209
209
*`p_parent_table` - the existing parent table. MUST be schema qualified, even if in public schema
210
-
*`p_control` - the column that the partitioning will be based on. Must be a time, integer, text or uuid based column. When control is of type text/uuid, p_time_encoder and p_time_decoder must be set.
210
+
*`p_control` - the column that the partitioning will be based on. Must be a time, integer, text or UUID based column. When control is of type text/UUID, p_time_encoder and p_time_decoder must be set.
211
211
*`p_interval` - the time or integer range interval for each partition. No matter the partitioning type, value must be given as text.
212
212
+*\<interval\>* - Any valid value for the interval data type. Do not type cast the parameter value, just leave as text.
213
213
+*\<integer\>* - For ID based partitions, the integer value range of the ID that should be set per partition. Enter this as an integer in text format ('100' not 100). If the interval is >=2, then the `p_type` must be `range`. If the interval equals 1, then the `p_type` must be `list`. Also note that while numeric values are supported for id-based partitioning, the interval must still be a whole number integer.
214
214
*`p_type` - the type of partitioning to be done. Currently only **range** and **list** are supported. See `p_interval` parameter for special conditions concerning type.
215
-
*`p_epoch` - tells `pg_partman` that the control column is an integer type, but actually represents and epoch time value. Valid values for this option are: 'seconds', 'milliseconds', 'microseconds', 'nanoseconds', and 'none'. The default is 'none'. All table names will be time-based. In addition to a normal index on the control column, be sure you create a functional, time-based index on the control column (to_timestamp(controlcolumn)) as well so this works efficiently.
215
+
*`p_epoch` - tells `pg_partman` that the control column is an integer type, but actually represents an epoch time value or integer containing an encoded timestamp. Valid values for this option are: 'seconds', 'milliseconds', 'microseconds', 'nanoseconds', 'func', and 'none'. The default is 'none'. All table names will be time-based. For 'func', encode/decode functions between the integer type used and `timestamptz` are required. In addition to a normal index on the control column, be sure you create a functional, time-based index on the control column (to_timestamp(controlcolumn)) as well so this works efficiently.
216
216
*`p_premake` - is how many additional partitions to always stay ahead of the current partition. Default value is 4. This will keep at minimum 5 partitions made, including the current one. For example, if today was Sept 6th, and `premake` was set to 4 for a daily partition, then partitions would be made for the 6th as well as the 7th, 8th, 9th and 10th. Note some intervals may occasionally cause an extra partition to be premade or one to be missed due to leap years, differing month lengths, etc. This usually won't hurt anything and should self-correct (see **About** section concerning timezones and non-UTC). If partitioning ever falls behind the `premake` value, normal running of `run_maintenance()` and data insertion should automatically catch things up.
217
217
*`p_start_partition` - allows the first partition of a set to be specified instead of it being automatically determined. Must be a valid timestamp (for time-based) or positive integer (for id-based) value. Be aware, though, the actual parameter data type is text. For time-based partitioning, all partitions starting with the given timestamp up to CURRENT_TIMESTAMP (plus `premake`) will be created. For id-based partitioning, only the partition starting at the given value (plus `premake`) will be made. Note that for subpartitioning, this only applies during initial setup and not during ongoing maintenance.
218
218
*`p_default_table` - boolean flag to determine whether a default table is created. Defaults to true.
@@ -222,8 +222,8 @@ RETURNS boolean
222
222
*`p_jobmon` - allow `pg_partman` to use the `pg_jobmon` extension to monitor that partitioning is working correctly. Defaults to TRUE.
223
223
* `p_date_trunc_interval` - By default, pg_partman's time-based partitioning will truncate the child table starting values to line up at the beginning of typical boundaries (midnight for daily, day 1 for monthly, Jan 1 for yearly, etc). If a partitioning interval that does not fall on those boundaries is desired, this option may be required to ensure the child table has the expected boundaries (especially if you also set `p_start_partition`). The valid values allowed for this parameter are the interval values accepted by PostgreSQL's built-in `date_trunc()` function (day, week, month, etc). For example, if you set a 9-week interval, by default pg_partman would truncate the tables by month (since the interval is greater than one month but less than 1 year) and unexpectedly start on the first of the month in some cases. Set this parameter value to `week`, so that the child table start values are properly truncated on a weekly basis to line up with the 9-week interval. If you are using a custom time interval, please experiment with this option to get the expected set of child tables you desire or use a more typical partitioning interval to simplify partition management.
224
224
*`p_control_not_null` - By default, this value is true and the control column must be set to NOT NULL. Setting this to false allows the control column to be NULL. Allowing this is not advised without very careful review and an explicit use-case defined as it can cause excessive data in the DEFAULT child partition.
225
-
*`p_time_encoder` - name of function that encodes a timestamp into a string representing your partition bounds. Setting this implicitly enables time based partitioning and is mandatory for text/uuid control column types. This enables partitioning tables using time based identifiers like uuidv7, ulid, snowflake ids and others. The function must handle NULL input safely. See test-time-daily.sql and test-uuid-daily for usage examples.
226
-
*`p_time_decoder` - name of function that decodes a text/uuid control value into a timestamp. Setting this implicitly enables time based partitioning and is mandatory for text/uuid control column types. This enables partitioning tables using time based identifiers like uuidv7, ulid, snowflake ids and others. The function must handle NULL input safely. See test-time-daily.sql and test-uuid-daily for usage examples.
225
+
*`p_time_encoder` - name of function that encodes a `timestamp` into a string or integer representing your partition bounds. Setting this implicitly enables time based partitioning and is mandatory for text/UUID control column types, or integer control column with `p_epoch` = 'func'. This enables partitioning tables using time based identifiers like UUIDv7, ULID, snowflake ids and others. The function must handle NULL input safely. See test-time-daily.sql and test-UUID-daily for usage examples.
226
+
*`p_time_decoder` - name of function that decodes a text/UUID control value into a `timestamptz`. Setting this implicitly enables time based partitioning and is mandatory for text/UUID control column types, or integer control column with `p_epoch` = 'func'. This enables partitioning tables using time based identifiers like UUIDv7, ULID, snowflake ids and others. The function must handle NULL input safely. See test-time-daily.sql and test-uuid-daily for usage examples.
Snowflake IDs are used in some distributed systems to generate unique, time-ordered IDs without centralization or coordination between nodes. X, Discord, Mastodon and Instagram are known to use these identifiers, and this example will use [Discord's scheme](https://discord.com/developers/docs/reference#snowflakes). The timestamp is encoded in the top 42 bits of a 64-bit integer, and the rest is for worker data and a counter. Discord also measures time from 2015 UTC instead of the UNIX epoch of 1970 UTC, a gap of 1420070400 seconds. The BIGINT type is limited to 63 bits since it is a signed integer, but 63 bits is sufficient to hold Discord IDs until September 2084.
316
+
317
+
The following functions respectively encode and decode snowflake IDs from/to timestamps. Note that when encoding the timestamp, the worker/counter bits are zero, so the returned value is useful as a partition boundary, not as a real ID.
318
+
319
+
```sql
320
+
CREATEFUNCTIONpublic.timestamp_to_snowflake(p_timestamp timestamptz, OUT encoded bigint)
321
+
RETURNS bigint
322
+
LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE
323
+
AS $$
324
+
BEGIN
325
+
SELECT1000*(EXTRACT(epoch FROM p_timestamp) -1420070400)::BIGINT<<22 INTO encoded;
326
+
END
327
+
$$;
328
+
329
+
CREATEFUNCTIONpublic.snowflake_to_timestamp(p_snowflake bigint, OUT ts timestamptz)
330
+
RETURNS TIMESTAMPTZ
331
+
LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE
332
+
AS $$
333
+
BEGIN
334
+
SELECT TO_TIMESTAMP((p_snowflake >>22)/1000+1420070400) INTO ts;
335
+
END
336
+
$$;
337
+
```
338
+
339
+
Now we will instruct partman to use the snowflake encoder and decoder functions with the special value 'func' for `p_epoch`.
### Simple Serial ID: 1 Partition Per 10 ID Values
289
391
For this use-case, the template table is not created manually before calling `create_parent()`. So it shows that if a primary/unique key is added later, it does not apply to the currently existing child tables. That will have to be done manually.
0 commit comments