docs/usage/working-with-partitions.md (+11 -23)
@@ -116,17 +116,22 @@ If you have a more fine-grained predicate than a partition filter, you can use t
 
 ## Updating Partitioned Tables with Merge
 
-You can perform merge operations on partitioned tables in the same way you do on non-partitioned ones—simply provide a matching predicate that references partition columns if needed.
+You can perform merge operations on partitioned tables in the same way you do on non-partitioned ones. Simply provide a matching predicate that references partition columns if needed.
+
+You can match on both the partition column (country) and some other condition. This example shows a merge operation that checks both the partition column (“country”) and a numeric column (“num”) when merging:
+- The table is partitioned by “country,” so underlying data is physically split by each country value.
+- The merge condition (predicate) matches target rows where both “country” and “num” align with the source.
+- When a match occurs, it updates “letter”; otherwise, it inserts the new row.
+- This approach ensures that only rows in the relevant partition (“US”) are processed, keeping operations efficient.
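The matching rules in the bullets above can be sanity-checked without a Delta table. The following is a minimal pure-Python sketch that only mimics the semantics (match on "country" and "num", update "letter" on a match, insert otherwise). It does not call `deltalake`; `merge_partitioned`, the dict-of-lists layout, and the sample rows are all hypothetical and exist purely for illustration.

```python
from collections import defaultdict


def merge_partitioned(target, source):
    """Upsert source rows into target, matching on (country, num).

    `target` maps a partition value (country) to a list of row dicts,
    standing in for one folder per partition. Hypothetical helper,
    not part of the deltalake API. Returns the partitions touched.
    """
    touched = set()
    for row in source:
        country = row["country"]
        touched.add(country)
        partition = target[country]  # only this partition is scanned
        for existing in partition:
            if existing["num"] == row["num"]:
                existing["letter"] = row["letter"]  # matched: update "letter"
                break
        else:
            partition.append(dict(row))  # not matched: insert the new row
    return touched


# Illustrative data: a table partitioned by "country"
target = defaultdict(list, {
    "US": [{"num": 1, "letter": "a", "country": "US"}],
    "CA": [{"num": 2, "letter": "b", "country": "CA"}],
})
source = [
    {"num": 1, "letter": "A", "country": "US"},    # matches an existing row
    {"num": 101, "letter": "B", "country": "US"},  # no match, gets inserted
]

touched = merge_partitioned(target, source)
```

In this sketch only the "US" partition is scanned and modified, and the "CA" rows are left untouched. Using a `defaultdict` means a source row with an unseen country value would simply start a new partition list.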
 
-For example, you can match on both the partition column (country) and some other condition:
 ```python
 from deltalake import DeltaTable
 import pyarrow as pa
 
 dt = DeltaTable("tmp/partitioned-table")
 
-#Source data referencing an existing partition "US"
If the partition does not exist (say for a new country value), a new partition folder will be created automatically.
-
-(See more in the docs on merging tables.)
-
-## Query Optimizations with Partitions
-
-Partitions allow data skipping for queries that include the partition columns. For example, if your partition column is date, any query with a clause like WHERE date = '2023-01-01' or WHERE date >= '2023-01-01' AND date < '2023-01-10' can skip reading all files not in those partitions.
-This command logically deletes the data by creating a new transaction. (See docs on deleting rows for more.)
+This command logically deletes the data by creating a new transaction.
 
 ## Maintaining Partitioned Tables
 
 ### Optimize & Vacuum
 
-Partitioned tables can suffer from many small files if frequently appended to. If needed, you can run optimize compaction on a specific partition:
+Partitioned tables can accumulate many small files if a partition is frequently appended to. You can compact these into larger files on a specific partition: