Support copy from and insert..select pushdown with identity and time partitioning by sfc-gh-mslot · Pull Request #322 · Snowflake-Labs/pg_lake

sfc-gh-mslot · 2026-04-20T20:56:29Z

This PR adds support for pushing down INSERT..SELECT and COPY..FROM when the target table is partitioned using identity or time functions. It uses the PARTITION_BY clause in the COPY TO command in DuckDB to generate paths, using a synthetic column that contains the partition value.

Example:

create table test (x int, y timestamptz default now()) using iceberg WITH (partition_by 'day(y)');
insert into test (x) select s from generate_series(1,100) s;

-- underneath
COPY (SELECT *, datediff('day', date '1970-01-01', y::date) AS __part_0 FROM (SELECT x, CASE WHEN y NOT BETWEEN TIMESTAMPTZ '0001-01-01 00:00:00+00' AND TIMESTAMPTZ '9999-12-31 23:59:59.999999+00' THEN CAST(error(printf('timestamptz out of range: %s', y::VARCHAR)) AS TIMESTAMPTZ) ELSE y END AS y FROM ( SELECT "*SELECT*".s AS x,
    '2026-04-20 20:52:31.007613+00'::timestamptz  AS y
   FROM ( SELECT s.s
           FROM generate_series_int(1, 100) s(s)) "*SELECT*"(s)) AS __iceberg_oor) __partitioned_source) TO 's3://marco-iceberg/iceberg/postgres/public/test/88731/data/7e/7e2038cf-a0cf-4047-b308-112c39622f47' WITH (format 'parquet', compression 'snappy', field_ids {'x': 1, 'y': 2}, row_group_size_bytes '512MB', parquet_version 'V1', return_stats, PARTITION_BY (__part_0))

writes files like:

s3://marco-iceberg/iceberg/postgres/public/test/88731/data/7e/7e2038cf-a0cf-4047-b308-112c39622f47/__part_0=20563/data_0.parquet

where the partition value is obtained by parsing the value after __part_0= (the synthetic column name) in the output path.
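As a rough illustration of that parsing step, here is a minimal Python sketch. It is not the PR's C implementation: the helper names parse_partition_values and day_value_to_date are hypothetical, and it assumes the partition segments always look like __part_N=value. For the day-granularity expression in the generated COPY above, the value is a count of days since 1970-01-01.

```python
import re
from datetime import date, timedelta

# Hypothetical helper (not from the PR): matches one "__part_N=value" path segment.
_PART_SEGMENT = re.compile(r"^(__part_\d+)=(.*)$")


def parse_partition_values(path: str) -> dict[str, str]:
    """Collect every __part_N=value segment found in a Hive-style path."""
    values = {}
    for segment in path.split("/"):
        match = _PART_SEGMENT.match(segment)
        if match:
            values[match.group(1)] = match.group(2)
    return values


def day_value_to_date(days_since_epoch: int) -> date:
    """Turn a days-since-1970-01-01 partition value back into a calendar date."""
    return date(1970, 1, 1) + timedelta(days=days_since_epoch)


path = (
    "s3://marco-iceberg/iceberg/postgres/public/test/88731/data/7e/"
    "7e2038cf-a0cf-4047-b308-112c39622f47/__part_0=20563/data_0.parquet"
)
parts = parse_partition_values(path)
print(parts)                                       # {'__part_0': '20563'}
print(day_value_to_date(int(parts["__part_0"])))   # 2026-04-20
```

The decoded date matches the timestamp literal in the generated query (2026-04-20), which is a quick sanity check that the synthetic column carries the day partition value.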

sfc-gh-okalaci

OK, I see that you are already pushing changes, and some of these are already fixed. Still sharing as an FYI.

sfc-gh-okalaci · 2026-04-22T13:33:46Z

 									   ICEBERG_OOR_NONE,
-									   false /* wrapNativeIntervals */ );
+									   false /* wrapNativeIntervals */ ,
+									   NIL /* partitionByExprs */ );


So, VACUUM will split data files written by partitioned writes, which skip target_file_size_mb? Say a partitioned COPY wrote a 100GB file: we'll split it into 200 files of 512MB each.

That's probably the right action, since we want VACUUM to fix it, but I wanted to raise it here:

To make sure you agree and are happy with it

Maybe add a comment for future readers
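For future readers, the arithmetic behind the 100GB example can be checked with a few lines of Python. This is only a sketch: it assumes binary units (1GB = 1024MB) and a 512MB target, and the function name files_after_split is made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def files_after_split(total_bytes: int, target_bytes: int) -> int:
    """Ceiling division: how many files are needed so none exceeds target_bytes."""
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")
    return -(-total_bytes // target_bytes)


print(files_after_split(100 * GIB, 512 * MIB))  # 200
```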

Hmz, that's a good point. This will affect large data dumps.

I was checking if we can implement partition_by + file_size_bytes on DuckDB, and it seems they landed a PR last week, which doesn't solve it but references it as Future Work: duckdb/duckdb#22225 (comment)

Well, that's both good and bad. We cannot implement it for the current DuckDB version (or only with great difficulty), but we can either (a) wait for the DuckDB maintainers to implement it, or (b) consider working on it as a patch here and then sending it upstream to DuckDB.

sfc-gh-okalaci · 2026-04-22T16:01:57Z

Maybe one final note: we don't seem to have any tests for partition evolution. I don't see any risks, but it would be good to, say, start with no partitioning, then partition by year(ts), then switch to month(ts), then drop partitioning again, and do pushdowns in between.

…tables

Partitioned writes previously went through the row-by-row PartitionedDestReceiver, which routes each row to a per-partition CSV file before converting to Parquet. This change enables DuckDB's PARTITION_BY in COPY TO for tables using pushdownable transforms (identity, year, month, day, hour), letting DuckDB split the data in a single pass. Bucket and truncate transforms continue to use the existing path.

Key changes:
- Add partition_pushdown.c with transform-to-DuckDB-SQL conversion, query wrapping with synthetic partition columns, and Hive-style path parsing for partition values
- Extend WriteQueryResultTo with partitionByExprs parameter that wraps the query with synthetic columns AFTER validation/interval wrappers and adds PARTITION_BY to the COPY command
- Modify AddQueryResultToTable to detect pushdownable partitions, pass expressions through, and parse per-file partition values from DuckDB output paths
- Lift blanket partition blocks in IsPushdownableInsertSelectQuery and IsCopyFromPushdownable to allow pushdown when all transforms are supported
- Disable FILE_SIZE_BYTES when PARTITION_BY is used (DuckDB limitation)

Signed-off-by: Marco Slot <marco.slot@snowflake.com>
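To make the time transforms concrete, the following Python sketch computes the partition values that the Iceberg spec defines for year, month, day and hour (years, months, days and hours since the 1970-01-01 epoch). It is not the DuckDB SQL that partition_pushdown.c generates; the function name partition_value is hypothetical, and it assumes values are evaluated in UTC, whereas the y::date cast in the generated query may depend on the session time zone. Identity is omitted because it passes the column value through unchanged.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def partition_value(transform: str, ts: datetime) -> int:
    """Iceberg-style time transform of a timezone-aware timestamp (UTC assumed)."""
    ts = ts.astimezone(timezone.utc)
    if transform == "year":
        return ts.year - 1970
    if transform == "month":
        return (ts.year - 1970) * 12 + (ts.month - 1)
    if transform == "day":
        # timedelta.days floors, so pre-epoch timestamps get negative values.
        return (ts - EPOCH).days
    if transform == "hour":
        return (ts - EPOCH) // timedelta(hours=1)
    # bucket and truncate are not pushed down in this PR, so they are rejected here.
    raise ValueError(f"transform not covered by this sketch: {transform}")


ts = datetime(2026, 4, 20, 20, 52, 31, 7613, tzinfo=timezone.utc)
for name in ("year", "month", "day", "hour"):
    print(name, partition_value(name, ts))
# year 56
# month 675
# day 20563
# hour 493532
```

The day value agrees with the __part_0=20563 directory shown in the PR description.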

sfc-gh-dachristensen · 2026-05-07T14:01:38Z

Are there any concerns about using data values as path keys (null bytes, slashes, etc.)? (I haven't read this PR or the previous feedback in detail, so I assume this may already be handled via some sort of escaping method, but it was the first thing I thought of when reviewing the PR description.)
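To illustrate the concern, here is one common way to make arbitrary values safe inside a key=value path segment: percent-encoding. This is a sketch only. The thread does not say whether DuckDB or this PR escapes values this way, and the helper names are hypothetical. For the time transforms the values are plain integers, so the question mainly applies to identity partitioning on text columns.

```python
from urllib.parse import quote, unquote


def encode_partition_segment(column: str, value: str) -> str:
    """Build a key=value segment with every reserved character percent-encoded."""
    return f"{column}={quote(value, safe='')}"


def decode_partition_segment(segment: str) -> tuple[str, str]:
    """Split at the first '=' and undo the percent-encoding of the value."""
    column, _, encoded = segment.partition("=")
    return column, unquote(encoded)


segment = encode_partition_segment("__part_0", "a/b=c")
print(segment)                            # __part_0=a%2Fb%3Dc
print(decode_partition_segment(segment))  # ('__part_0', 'a/b=c')
```

Because the encoded value no longer contains a literal slash or equals sign, splitting the path on "/" and the segment on the first "=" stays unambiguous.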

sfc-gh-mslot force-pushed the marcoslot/partitioning-copy-pushdown branch 4 times, most recently from 8ff6221 to 986731f · April 21, 2026 09:21