Add roundtrip serialization property tests for types in change sets file#1525

Merged
paraseba merged 38 commits into earth-mover:main from
ebarylko:add-roundtrip-serialization-property-tests-for-types-in-change-sets-file
Jan 16, 2026

@ebarylko ebarylko commented Dec 25, 2025

Summary

This PR adds property tests to icechunk/src/change_set.rs which check that serializing a ChangeSet and then deserializing it returns the original value, i.e. that the roundtrip is equivalent to the identity function.

Changes

icechunk/src/change_set.rs

  • Added arbitraries for the ChangeSet type
  • Added a roundtrip serialization test for the ChangeSet type

icechunk/src/format/manifest.rs

  • Changed the ManifestExtents strategy used for some of the tests

icechunk/src/strategies.rs

  • Added new strategies used for constructing the ChangeSet type in icechunk/src/change_set.rs

icechunk/src/storage/s3.rs

  • Upgraded the behaviour version for the AWS config
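
The roundtrip property described in the summary can be illustrated without Icechunk's types. The sketch below is not the PR's test: `ToyChangeSet`, its byte encoding, and the `Lcg` generator are hypothetical stand-ins (the PR uses proptest strategies and the real ChangeSet serialization), and it relies only on the standard library so that it compiles on its own. It generates many values and checks that deserializing the serialized bytes gives back the original.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a change set: chunk coordinates mapped to an optional payload.
#[derive(Debug, Clone, PartialEq)]
struct ToyChangeSet {
    chunks: BTreeMap<Vec<u32>, Option<u64>>,
}

/// Encodes, in little-endian: entry count, then per entry the key length,
/// the key elements, a presence tag, and the payload when present.
fn serialize(cs: &ToyChangeSet) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(cs.chunks.len() as u32).to_le_bytes());
    for (key, payload) in &cs.chunks {
        out.extend_from_slice(&(key.len() as u32).to_le_bytes());
        for k in key {
            out.extend_from_slice(&k.to_le_bytes());
        }
        match payload {
            Some(p) => {
                out.push(1);
                out.extend_from_slice(&p.to_le_bytes());
            }
            None => out.push(0),
        }
    }
    out
}

/// Reads exactly N bytes at `pos`, advancing it; returns None if the input is too short.
fn take<const N: usize>(bytes: &[u8], pos: &mut usize) -> Option<[u8; N]> {
    let end = pos.checked_add(N)?;
    let mut buf = [0u8; N];
    buf.copy_from_slice(bytes.get(*pos..end)?);
    *pos = end;
    Some(buf)
}

/// Inverse of `serialize`; rejects truncated input, unknown tags, and trailing bytes.
fn deserialize(bytes: &[u8]) -> Option<ToyChangeSet> {
    let mut pos = 0;
    let entries = u32::from_le_bytes(take::<4>(bytes, &mut pos)?);
    let mut chunks = BTreeMap::new();
    for _ in 0..entries {
        let len = u32::from_le_bytes(take::<4>(bytes, &mut pos)?);
        let mut key = Vec::new();
        for _ in 0..len {
            key.push(u32::from_le_bytes(take::<4>(bytes, &mut pos)?));
        }
        let payload = match take::<1>(bytes, &mut pos)?[0] {
            0 => None,
            1 => Some(u64::from_le_bytes(take::<8>(bytes, &mut pos)?)),
            _ => return None,
        };
        chunks.insert(key, payload);
    }
    if pos == bytes.len() { Some(ToyChangeSet { chunks }) } else { None }
}

/// Tiny deterministic generator (an LCG), standing in for a proptest strategy.
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
        self.0 >> 33
    }
}

/// Generates a small change set whose keys all share one dimension count.
fn arbitrary_change_set(rng: &mut Lcg) -> ToyChangeSet {
    let ndim = 1 + (rng.next_u64() % 4) as usize;
    let entries = rng.next_u64() % 8;
    let mut chunks = BTreeMap::new();
    for _ in 0..entries {
        let key: Vec<u32> = (0..ndim).map(|_| rng.next_u64() as u32).collect();
        let payload = if rng.next_u64() % 2 == 0 { None } else { Some(rng.next_u64()) };
        chunks.insert(key, payload);
    }
    ToyChangeSet { chunks }
}

/// The roundtrip property: deserialize(serialize(x)) == x for every generated x.
fn roundtrip_holds(cases: u32, seed: u64) -> bool {
    let mut rng = Lcg(seed);
    (0..cases).all(|_| {
        let original = arbitrary_change_set(&mut rng);
        deserialize(&serialize(&original)) == Some(original)
    })
}
```

Calling `roundtrip_holds(200, 42)` checks 200 generated cases starting from seed 42. A proptest version would additionally shrink failing inputs to a minimal counterexample, which this sketch does not attempt.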

Collaborator

@paraseba paraseba left a comment

Looks great. Just a couple of minor comments.


type SplitManifest = BTreeMap<ChunkIndices, Option<ChunkPayload>>;

pub fn chunk_indices2() -> impl Strategy<Value = ChunkIndices> {
Collaborator

Why not use the existing strategy?

Contributor Author

I thought the existing chunk_indices function would be easier to use if a new function generated the inputs it needs.

}

pub fn split_manifest() -> impl Strategy<Value = SplitManifest> {
btree_map(chunk_indices2(), option::of(chunk_payload()), 3..10)
Collaborator

One issue here: all chunk indices in a single split manifest should have the same number of dimensions.

Contributor Author

I see. I will change this so all generated chunk indices are of the same size.
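
The constraint discussed in this thread can be sketched as choosing the number of dimensions once per manifest and only then generating keys of that length. The code below is an illustration under assumptions, not the PR's strategy: the type aliases are simplified stand-ins for Icechunk's types, and `Lcg`, `split_manifest`, `has_uniform_ndim`, `max_ndim`, and `entries` are hypothetical names. It uses only the standard library; with proptest, the same shape is typically obtained by flat-mapping a dimension strategy into the map strategy.

```rust
use std::collections::BTreeMap;

// Simplified stand-ins for the types in the PR (assumptions, not Icechunk's definitions).
type ChunkIndices = Vec<u32>;
type ChunkPayload = u64;
type SplitManifest = BTreeMap<ChunkIndices, Option<ChunkPayload>>;

/// Tiny deterministic generator (an LCG), standing in for a proptest strategy.
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
        self.0 >> 33
    }
}

/// Builds a manifest whose keys all share one dimension count in 1..=max_ndim
/// (a max_ndim of 0 is treated as 1).
fn split_manifest(rng: &mut Lcg, max_ndim: usize, entries: usize) -> SplitManifest {
    // Choose the dimension count once, before generating any key.
    let ndim = 1 + (rng.next_u64() as usize) % max_ndim.max(1);
    let mut manifest = SplitManifest::new();
    // Keep inserting until the requested number of distinct keys is reached.
    while manifest.len() < entries {
        let key: ChunkIndices = (0..ndim).map(|_| rng.next_u64() as u32).collect();
        let payload = if rng.next_u64() % 2 == 0 { None } else { Some(rng.next_u64()) };
        manifest.insert(key, payload);
    }
    manifest
}

/// True when every key in the manifest has the same number of dimensions.
fn has_uniform_ndim(manifest: &SplitManifest) -> bool {
    let mut lens = manifest.keys().map(|k| k.len());
    match lens.next() {
        Some(first) => lens.all(|len| len == first),
        None => true,
    }
}
```

Generating each key's length independently, as the original `btree_map(chunk_indices2(), ...)` call did, would let keys of different lengths land in the same map; fixing `ndim` first rules that out by construction.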

}

pub fn manifest_extents(ndim: usize) -> impl Strategy<Value = ManifestExtents> {
pub fn manifest_extents2(ndim: usize) -> impl Strategy<Value = ManifestExtents> {
Collaborator

Let's delete this function if we believe the new one is superior

Contributor Author

Although the new function is simpler, it does not work when used in the test_property_extents_widths test in manifest.rs.

I have not yet figured out how to reproduce, in the new function, the property that the test expects. That is why I left both versions of the function.

Contributor Author

Do you happen to know why the delta values fall in [0, 99] for the test_property_extents_width test?

Collaborator

Feel free to modify that test as needed. There is no profound reason for the values selected there.

Contributor Author

I see. Thank you for clarifying that.

use proptest::prelude::*;

prop_compose! {
fn edit_changes()(num_of_dims in any::<u16>().prop_map(usize::from))(
Collaborator

To make it more realistic, let's keep the number of dimensions in [1, 20].

Contributor Author

I see. I will shrink the range of values for num_of_dims.


type SplitManifest = BTreeMap<ChunkIndices, Option<ChunkPayload>>;

// pub fn chunk_indices2() -> impl Strategy<Value = ChunkIndices> {
Collaborator

Let's delete this.

Contributor Author

Ok. I will do that.

Collaborator

Let's delete this please @ebarylko

Contributor Author

I am sorry. I did not realize I had forgotten to delete the commented-out code.

Contributor Author

Just deleted it.

// .prop_flat_map(|(dim, data)| chunk_indices(dim, data))
// }

pub fn chunk_indices2(dim: usize) -> impl Strategy<Value = ChunkIndices> {
Collaborator

Maybe a name like large_chunk_indices?

Contributor Author

That sounds good to me. I will apply that change.

@ebarylko ebarylko force-pushed the add-roundtrip-serialization-property-tests-for-types-in-change-sets-file branch from ca70a53 to 7da443d January 7, 2026 20:36
@ebarylko ebarylko force-pushed the add-roundtrip-serialization-property-tests-for-types-in-change-sets-file branch from bd3bfe8 to e7da217 January 13, 2026 22:13
})
}

pub fn manifest_extents(ndim: usize) -> impl Strategy<Value = ManifestExtents> {
Collaborator

Can we unify the two manifest_extents functions? Otherwise, let's use better names and document the difference between the two.

Contributor Author

At the moment, I have not found a way to do so. I will change the names and add a docstring for each function.


@paraseba paraseba enabled auto-merge (squash) January 16, 2026 02:17
@paraseba paraseba disabled auto-merge January 16, 2026 02:17
@paraseba paraseba merged commit 078280b into earth-mover:main Jan 16, 2026
15 of 17 checks passed
@ebarylko ebarylko deleted the add-roundtrip-serialization-property-tests-for-types-in-change-sets-file branch March 3, 2026 17:11