Feature: Select validation in evenly spaced blocks #949

Open
melisande-c wants to merge 13 commits into main from mc/feat/val-selection

@melisande-c (Member) commented May 12, 2026

Disclaimer

  • I am an AI agent.
  • I have used AI and I thoroughly reviewed every line.
  • I have not used AI extensively.

Description

Note

tldr: Randomly selecting validation patches could result in a configuration that reduces the likelihood of some regions being selected for training. Additionally, the validation set might not accurately represent the statistics of the data.

Background - why do we need this PR?

For smallish image sizes, a poor choice of validation patches can reduce the probability that some regions are selected for training. This happens when the validation patches are too close to each other or to the edge, making it unlikely that the region between them is selected for training.

Overview - what changed?

Added validation selection functionality that groups validation patches together in evenly spaced blocks, so that there is always a gap greater than two patch widths between validation blocks (and between the edges of the image and the blocks).

If this results in too many patches, validation patches are randomly removed until the desired number remains; a patch is removed only if its removal does not result in a gap of 1.
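
To make the spacing idea concrete, here is a minimal one-dimensional sketch. It is not the code from this PR: the function name `evenly_spaced_block_starts`, its parameters, and the `min_gap` default are all hypothetical, and positions are expressed in patch-grid units. It spreads the non-validation patches as evenly as possible over the gaps before, between, and after the blocks.

```python
def evenly_spaced_block_starts(n_patches, n_blocks, block_size, min_gap=1):
    """Return the start index of each validation block along one dimension.

    Hypothetical helper for illustration only; all arguments are counted
    in patch-grid units, and `min_gap` is an assumed parameter.
    """
    if n_blocks < 1 or block_size < 1:
        raise ValueError("n_blocks and block_size must be positive")
    # Patches left over once every block has been placed.
    total_gap = n_patches - n_blocks * block_size
    # One gap before each block plus one after the last block.
    n_gaps = n_blocks + 1
    base_gap, extra = divmod(total_gap, n_gaps)
    if total_gap < 0 or base_gap < min_gap:
        raise ValueError("not enough patches for the requested blocks and gaps")
    starts = []
    position = 0
    for i in range(n_blocks):
        # The first `extra` gaps absorb the remainder, one patch each.
        position += base_gap + (1 if i < extra else 0)
        starts.append(position)
        position += block_size
    return starts


# 16 patches along one axis, 2 blocks of 2 patches each.
starts = evenly_spaced_block_starts(16, 2, 2)
val_indices = [s + k for s in starts for k in range(2)]
print(starts, val_indices)
```

In this example the twelve remaining patches split into three gaps of four, so neither block touches an edge or the other block.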

Implementation - how did you implement the changes?

  • Find how many validation patches there should be in each dimension.
  • Find the best validation block size (and gap size) in each dimension.
  • Create the validation blocks.
  • Randomly remove patches until we have the desired number.
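
The last two steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the `select_validation` implementation: the function `trim_block_validation` and its arguments are hypothetical, the per-dimension coordinates are taken as given, and the check that a removal must not leave a gap of 1 is deliberately omitted for brevity.

```python
import itertools
import random


def trim_block_validation(val_coords_1d, n_val, seed=0):
    """Combine per-dimension block coordinates, then trim at random.

    Hypothetical helper for illustration only. `val_coords_1d` holds one
    list of validation grid indices per dimension. The gap-of-1 check
    described in the PR is NOT implemented in this sketch.
    """
    # Every combination of per-dimension indices is a candidate patch.
    candidates = list(itertools.product(*val_coords_1d))
    if len(candidates) < n_val:
        raise ValueError("blocks contain fewer patches than requested")
    rng = random.Random(seed)
    # Keeping a random subset is equivalent to randomly removing the excess.
    return sorted(rng.sample(candidates, n_val))


# Two blocks of two patches per axis give 16 candidates; keep 12 of them.
coords = trim_block_validation([[4, 5, 10, 11], [4, 5, 10, 11]], n_val=12)
print(len(coords))
```

Sampling the patches to keep, rather than deleting patches one at a time, is only valid here because the gap constraint is skipped; an implementation that enforces the constraint would need to test each candidate removal individually.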

Changes Made

New features or files

  • select_validation function + helper functions

Modified features or files

  • create_val_split function -> replaced random val selection with block val selection

How has this been tested?

  • Added tests for each new function.
  • Ran training with validation splitting and results still look good.
  • Included demo script.

Related Issues

Additional Notes and Examples

The included demo script produces the following results:

You can see how the random validation selection is effectively reducing the training data for the larger percentages of validation.

The new validation patch selection:
[image: block_validation_selection]

The previous random validation selection:
[image: random_validation_selection]


Please ensure your PR meets the following requirements:

  • Code builds and passes tests locally, including doctests
  • New tests have been added (for bug fixes/features)
  • Documentation has been updated
  • Pre-commit passes

Base automatically changed from dev/v0.2 to main May 13, 2026 12:38
@melisande-c melisande-c marked this pull request as ready for review May 13, 2026 18:48
@melisande-c melisande-c requested a review from a team May 13, 2026 18:48
@melisande-c (Member, Author)

Also the effect on the patch distribution for 1 epoch of patches:

Block:
[image: block_validation_selection_1_epoch]

Random:
[image: random_validation_selection_1_epoch]

@jdeschamps (Member) left a comment

I did not have time to delve too much into the details, but it looks good.

Let's improve the error message as mentioned in the chat, and see the benchmark results.

for j in range(ndims)
if i != j and val_coords_1D[j] is None
]
# remaining val patches for the dimensions not calculate yet

Suggested change
- # remaining val patches for the dimensions not calculate yet
+ # remaining val patches for the dimensions not calculated yet

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Default number of validation patches may cause issues for small number of patches

2 participants