@giswqs
Member

@giswqs giswqs commented Jan 30, 2026

Summary

Implements the enhancements from #335 to support training with discrete class landcover data.

Changes

New in train.py:

  • FocalLoss class — focal loss for class imbalance (FL = -α(1-pt)^γ log(pt)), with configurable alpha, gamma, ignore_index (int or False to disable), reduction, and per-class weights. Uses F.log_softmax + F.nll_loss internally.
  • get_loss_function() — factory returning configured CrossEntropyLoss or FocalLoss with flexible ignore_index and optional class weights.
  • compute_class_weights() — computes per-class weights from label tiles with inverse-frequency mode, custom multipliers, and weight capping.
  • train_segmentation_model() — wired with new parameters: loss_function, ignore_index, use_class_weights, class_weights, custom_class_multipliers, max_class_weight, use_inverse_frequency, focal_alpha, focal_gamma.
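
As a rough numeric illustration of the focal loss formula quoted above, the following pure-Python sketch evaluates FL = -α(1-pt)^γ log(pt) for a single pixel. It is not the PR's implementation, which is described as using F.log_softmax and F.nll_loss on tensors. The helper names and the default values for alpha and gamma are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of per-class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Plain cross-entropy for one pixel: -log(pt)."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, alpha=1.0, gamma=2.0):
    """Focal loss for one pixel: -alpha * (1 - pt)**gamma * log(pt).

    alpha and gamma defaults are illustrative assumptions, not the PR's.
    """
    pt = softmax(logits)[target]
    return -alpha * (1.0 - pt) ** gamma * math.log(pt)


# Target class is 0 in both cases.
easy = [4.0, 0.0, 0.0]  # confident, correct prediction
hard = [0.0, 1.0, 0.0]  # target class is under-predicted

for name, logits in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f}")
```

With gamma set to 0 and alpha set to 1, the focal term reduces to ordinary cross-entropy. Larger gamma values shrink the contribution of well-classified pixels much more than that of hard ones, which is why the loss suits imbalanced classes.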

New in utils.py:

  • min_feature_ratio parameter on export_geotiff_tiles() — filters out tiles where the ratio of non-background pixels is below a threshold (only applies when skip_empty_tiles=True and label data is provided). Summary stats are reported at the end.
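
The ratio filter described above can be sketched in plain Python as follows. This is only an illustration of the idea and not the code in export_geotiff_tiles(): the helper names are hypothetical, the background value of 0 is an assumption, and False is assumed to mean "filter disabled".

```python
def feature_ratio(label_tile, background=0):
    """Fraction of pixels in a 2D label tile that are not background."""
    total = sum(len(row) for row in label_tile)
    if total == 0:
        return 0.0
    features = sum(1 for row in label_tile for v in row if v != background)
    return features / total


def keep_tile(label_tile, min_feature_ratio=False, background=0):
    """Decide whether a tile would be exported.

    Empty tiles are always dropped (as with skip_empty_tiles=True);
    the ratio threshold is applied only when it is not False.
    """
    ratio = feature_ratio(label_tile, background)
    if ratio == 0.0:
        return False
    if min_feature_ratio is False:
        return True
    return ratio >= min_feature_ratio


sparse_tile = [[0, 0], [0, 1]]  # 1 of 4 pixels is a feature
empty_tile = [[0, 0], [0, 0]]

print(keep_tile(sparse_tile))                          # no threshold
print(keep_tile(sparse_tile, min_feature_ratio=0.5))   # below threshold
print(keep_tile(empty_tile))                           # empty tile
```

Checking the empty case before the threshold keeps the two skip reasons separate, so each can be counted on its own in the summary statistics.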

New in __init__.py:

  • Exports FocalLoss, get_loss_function, compute_class_weights from train.py
  • Exports LandcoverCrossEntropyLoss, landcover_iou, get_landcover_loss_function, train_segmentation_landcover from landcover_train.py
  • Exports export_landcover_tiles from landcover_utils.py

Backward Compatibility

All new parameters have sensible defaults — existing code continues to work without changes.

Closes #335



…cover training

- Add FocalLoss class to train.py with configurable alpha, gamma, ignore_index
  (supports int or False to disable), reduction, and per-class weights
- Add get_loss_function() helper supporting 'crossentropy' and 'focal' losses
  with flexible ignore_index (int or False) and optional class weights
- Add compute_class_weights() to compute per-class weights from label tiles
  with inverse-frequency mode, custom multipliers, and weight capping
- Wire new loss functions into train_segmentation_model() with parameters:
  loss_function, ignore_index, use_class_weights, class_weights,
  custom_class_multipliers, max_class_weight, use_inverse_frequency,
  focal_alpha, focal_gamma
- Add min_feature_ratio parameter to export_geotiff_tiles() for filtering
  tiles with insufficient non-background pixels during tile export
- Export new public functions from __init__.py: FocalLoss, get_loss_function,
  compute_class_weights, plus landcover module exports
- All new parameters have backward-compatible defaults

Closes #335
Copilot AI review requested due to automatic review settings January 30, 2026 05:34
@github-actions

github-actions bot commented Jan 30, 2026

@github-actions github-actions bot temporarily deployed to pull request January 30, 2026 05:37 Inactive
Contributor

Copilot AI left a comment


Pull request overview

This PR extends the training and tiling utilities to better support highly imbalanced, discrete landcover segmentation workflows, and exposes the new landcover-specific helpers at the package root.

Changes:

  • Adds a general-purpose FocalLoss, get_loss_function, and compute_class_weights in train.py, and wires them into train_segmentation_model to support focal loss, class weighting, and flexible ignore_index.
  • Extends export_geotiff_tiles in utils.py with a min_feature_ratio parameter to skip tiles with too few non-background pixels and report ratio-based skipping statistics.
  • Updates __init__.py to export the new training helpers and landcover-specific utilities (LandcoverCrossEntropyLoss, landcover_iou, get_landcover_loss_function, train_segmentation_landcover, export_landcover_tiles).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| geoai/train.py | Introduces FocalLoss, a configurable loss factory get_loss_function, class-weight computation via compute_class_weights, and extends train_segmentation_model to use these for class-imbalanced landcover training. |
| geoai/utils.py | Adds min_feature_ratio validation, per-tile ratio filtering, and summary stats reporting to export_geotiff_tiles to drop mostly-background tiles. |
| geoai/__init__.py | Re-exports the new loss/weight utilities from train.py and landcover-specific training and tiling helpers for easier package-level access. |


return weights


def train_segmentation_model(

Copilot AI Jan 30, 2026


The function signature declares a return type of torch.nn.Module, but the implementation does not return a model object (it relies on side effects and saving to output_dir), and the docstring states that it returns None. Please align the return type annotation and documentation with the actual behavior (either return the model or change the annotation/docstring to indicate that the function returns None).

geoai/utils.py Outdated
Comment on lines 3921 to 3923
print(
    f"Average feature pixels per tile: {stats['feature_pixels']/stats['tiles_with_features']:.1f}"
)

Copilot AI Jan 30, 2026


When in_class_data is a raster, stats['feature_pixels'] is incremented as soon as a tile has any features, but tiles can still be skipped later by the min_feature_ratio filter. As a result, the "Average feature pixels per tile" summary can include pixels from tiles that were not actually exported, which makes this statistic misleading; it would be more accurate to count feature pixels only for tiles that are written.

Suggested change
- print(
-     f"Average feature pixels per tile: {stats['feature_pixels']/stats['tiles_with_features']:.1f}"
- )
+ if min_feature_ratio is False:
+     print(
+         f"Average feature pixels per tile: {stats['feature_pixels']/stats['tiles_with_features']:.1f}"
+     )
+ else:
+     print(
+         "Average feature pixels per tile is not reported when a min_feature_ratio filter is applied."
+     )
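
An alternative to suppressing the statistic, along the lines of the review comment's remark that feature pixels should be counted only for tiles that are written, might look like the sketch below. It is a simplified stand-in for the export loop and not the code in utils.py. The stats keys feature_pixels and tiles_with_features come from the snippet above, while skipped_by_ratio, the function names, and the background value of 0 are assumptions.

```python
def collect_export_stats(label_tiles, min_feature_ratio=False, background=0):
    """Accumulate stats only for tiles that pass every filter."""
    stats = {"tiles_with_features": 0, "feature_pixels": 0, "skipped_by_ratio": 0}
    for tile in label_tiles:
        pixels = [v for row in tile for v in row]
        features = sum(1 for v in pixels if v != background)
        if features == 0:
            continue  # empty tile: never exported
        ratio = features / len(pixels)
        if min_feature_ratio is not False and ratio < min_feature_ratio:
            stats["skipped_by_ratio"] += 1
            continue  # filtered out: its pixels must not be counted
        # The tile would be written here, so count it now.
        stats["tiles_with_features"] += 1
        stats["feature_pixels"] += features
    return stats


def average_feature_pixels(stats):
    """Average over written tiles, guarding against division by zero."""
    if stats["tiles_with_features"] == 0:
        return 0.0
    return stats["feature_pixels"] / stats["tiles_with_features"]


tiles = [
    [[1, 1], [1, 1]],  # ratio 1.0
    [[0, 0], [0, 1]],  # ratio 0.25
    [[0, 0], [0, 0]],  # empty
]
filtered = collect_export_stats(tiles, min_feature_ratio=0.5)
unfiltered = collect_export_stats(tiles)
print(filtered, average_feature_pixels(filtered))
print(unfiltered, average_feature_pixels(unfiltered))
```

Because the counters are updated after the ratio check, the average stays meaningful whether or not a threshold is set.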

Comment on lines +2726 to +2727
class_counts: Counter = Counter()
total_pixels = 0

Copilot AI Jan 30, 2026


compute_class_weights uses Counter here but Counter is never imported in this module, so calling this function will raise a NameError at runtime. Please import Counter from the collections module (or remove the annotation and use a plain dict) so that this function works correctly.
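
For context, a minimal sketch of inverse-frequency class weighting with the Counter import in place could look like this. It is not the body of compute_class_weights: the function name, the normalization total / (num_classes * count), and the order in which multipliers and the cap are applied are all assumptions made for illustration.

```python
from collections import Counter  # the import the review comment asks for


def inverse_frequency_weights(label_tiles, num_classes, multipliers=None, max_weight=None):
    """Per-class weights from 2D label tiles (lists of rows of class ids)."""
    class_counts: Counter = Counter()
    total_pixels = 0
    for tile in label_tiles:
        for row in tile:
            class_counts.update(row)
            total_pixels += len(row)

    multipliers = multipliers or {}
    weights = []
    for cls in range(num_classes):
        count = class_counts[cls]
        # Classes that never occur get weight 0 instead of dividing by zero.
        w = total_pixels / (num_classes * count) if count else 0.0
        w *= multipliers.get(cls, 1.0)  # optional per-class multiplier
        if max_weight is not None:
            w = min(w, max_weight)      # cap very large weights
        weights.append(w)
    return weights


tiles = [[[0, 0, 0, 1], [0, 0, 0, 0]]]  # class 0: 7 pixels, class 1: 1 pixel
print(inverse_frequency_weights(tiles, num_classes=2))
print(inverse_frequency_weights(tiles, num_classes=2, max_weight=3.0))
```

The rare class receives the larger weight, and the cap keeps a single very rare class from dominating the loss.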

@gw0ods
Contributor

gw0ods commented Feb 1, 2026

Hello Professor Wu, I tried reinstalling geoai on another system and the modules I added didn't seem to work properly... I had to change a few things in them. Sorry about that, I can't push a working version next week, I think.
I also added a better IoU method for this discrete landcover classification. The issue was that when the data is sparse, false positives shouldn't be considered in the IoU calculation and model selection.

@giswqs
Member Author

giswqs commented Feb 1, 2026

@gw0ods Can you open a new issue and describe what is not working?


Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Geoai Improvement to Support Training Using Discrete Class Landcover Data

3 participants