test(iceberg): cover partition field transform conversion#7024
test(iceberg): cover partition field transform conversion#7024jackylee-ch wants to merge 1 commit into
Conversation
Greptile SummaryThis PR adds a parametrized test (
Confidence Score: 4/5Test-only change that adds correct parametrized coverage for four partition field transforms; safe to merge. All four parametrized cases assert the correct dtype, transform value, and source field name. The only gap is that DayTransform remains in a standalone function that never checks pf.transform, leaving that assertion untested even after this PR. tests/io/iceberg/test_iceberg_metadata.py — the pre-existing test_convert_partition_field_day could be folded into the new parametrized test to close a missing transform assertion. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["test_convert_partition_field_transforms (parametrized)"] --> B["convert_iceberg_partition_field(SIMPLE_SCHEMA, IcebergPartitionField)"]
B --> C{Transform}
C --> D["YearTransform → int32, PartitionTransform.year()"]
C --> E["MonthTransform → int32, PartitionTransform.month()"]
C --> F["HourTransform → int32, PartitionTransform.hour()"]
C --> G["TruncateTransform(8) → string, PartitionTransform.iceberg_truncate(8)"]
G2["test_convert_partition_field_day (standalone)"] --> B2["convert_iceberg_partition_field"]
B2 --> H["DayTransform → date (transform not asserted)"]
Reviews (1): Last reviewed commit: "test(iceberg): cover partition field tra..." | Re-trigger Greptile |
| @pytest.mark.parametrize( | ||
| "source_id, transform, field_name, expected_dtype, expected_transform, expected_source_name", | ||
| [ | ||
| (2, YearTransform(), "ts_year", DataType.int32(), PartitionTransform.year(), "ts"), | ||
| (2, MonthTransform(), "ts_month", DataType.int32(), PartitionTransform.month(), "ts"), | ||
| (2, HourTransform(), "ts_hour", DataType.int32(), PartitionTransform.hour(), "ts"), | ||
| (3, TruncateTransform(8), "name_trunc", DataType.string(), PartitionTransform.iceberg_truncate(8), "name"), | ||
| ], | ||
| ids=["year", "month", "hour", "truncate"], | ||
| ) | ||
| def test_convert_partition_field_transforms( | ||
| source_id, transform, field_name, expected_dtype, expected_transform, expected_source_name | ||
| ): | ||
| pf = convert_iceberg_partition_field( | ||
| SIMPLE_SCHEMA, | ||
| IcebergPartitionField(source_id=source_id, field_id=1003, transform=transform, name=field_name), | ||
| ) | ||
| assert pf.field.name == field_name | ||
| assert pf.field.dtype == expected_dtype | ||
| assert pf.source_field.name == expected_source_name | ||
| assert pf.transform == expected_transform |
There was a problem hiding this comment.
DayTransform left outside the parametrized test
DayTransform is conceptually in the same family as YearTransform, MonthTransform, and HourTransform (all are time-based transforms on ts), yet it lives in the standalone test_convert_partition_field_day function. That older test also never asserts pf.transform, so the transform equality for day is currently untested. Folding the day case into test_convert_partition_field_transforms (with PartitionTransform.day() and DataType.date() as expected values) would give it the same assertion depth as the other time-based transforms and keep all similar cases in one place.
Rule Used: Prefer single parametrized functions over multiple... (source)
Learned From
Eventual-Inc/Daft#5207
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Updated in 9493dce: folded the DayTransform case into the parametrized transform test and added the PartitionTransform.day() assertion.
294635b to
9493dce
Compare
Changes Made
Add coverage for Iceberg partition field conversion across day, year, month, hour, and truncate transforms.
Related Issues
None.