Skip to content

Conversation

@mariusaurus
Copy link
Collaborator

@mariusaurus mariusaurus commented Sep 15, 2025

Earth2Studio Pull Request

Description

This PR adds a connector to TempestExtremes in the form of a diagnostic model.
Since TE natively only supports command-line interface, the model is implemented by spawning a sub-process call. The command for that call has to be provided by the user, providing flexibility in usage and installation of TE.

Due to the nature of track data (multiple orders of magnitude smaller than field data and unstructured), the diagnostic here does not return the tracks as tensor, but finishes with TE writing its file to disk. That leaves the question if this and potentially other trackers fit in this category. Maybe this PR can be basis for discussion.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • The CHANGELOG.md is up to date with these changes.
  • An issue is linked to this pull request.

Dependencies

@NickGeneva NickGeneva mentioned this pull request Sep 22, 2025
5 tasks
@mariusaurus mariusaurus marked this pull request as ready for review October 7, 2025 16:11
@mariusaurus
Copy link
Collaborator Author

hi @NickGeneva @dallasfoster ,
from my side the PR is ready to review.
testing and ensemble size>1 is now supported as well.


print(f"took {(time.time() - then):.1f}s to track cyclones")

return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until we make a decision on !494, this PR will require returning an nd array before merging.

@dallasfoster
Copy link
Collaborator

Are there any instructions for installation or checks if the software is not available?

@mariusaurus
Copy link
Collaborator Author

Are there any instructions for installation or checks if the software is not available?

yes, just added install instructions to the docs and checks if software is available is done at class initialisation. Since TE is not available as python package, users have to install it (locally) on thier system. AT class ini, they have the option to provide the path to the executable and until that path is known it is not possible to test install.

The default dependency failure does not work here for the same reason, and also cannot be hacked to write a message which is reasonable for the present case. So I provided a custom one. Hope that's fine

@mariusaurus
Copy link
Collaborator Author

would a small example of using TE help? if so, where should I put it? could do at the end of the tempest_extremes.py file, just not sure if it justifies a full example.

@mariusaurus
Copy link
Collaborator Author

Hi!
have you found a solution where to best put this? thanks!

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This PR adds TempestExtremes cyclone tracking functionality to Earth2Studio as a diagnostic model. The implementation includes both synchronous and asynchronous versions that integrate with TempestExtremes' command-line interface via subprocess calls. The PR also refactors coordinate utilities by moving the cat_coords function from recipe-specific code to the main earth2studio.utils.coords module and adds a new tile_xx_to_yy function for tensor shape matching. Both functions handle coordinate system operations alongside tensor manipulations, essential for combining different meteorological data types in tracking workflows. The implementation addresses previous review feedback by moving utility functions to appropriate locations and adding comprehensive unit tests.

Important Files Changed

Filename Score Overview
earth2studio/models/dx/init.py 2/5 Added imports for TempestExtremes classes but failed to include them in all list, breaking public API
earth2studio/models/dx/tempest_extremes.py 4/5 Added TempestExtremes diagnostic model with subprocess integration, contains typos and hardcoded paths
earth2studio/utils/coords.py 4/5 Added tile_xx_to_yy and cat_coords utility functions for tensor operations and coordinate system handling
test/models/dx/test_tempest_extremes.py 5/5 Comprehensive test coverage for TempestExtremes functionality with proper mocking and cleanup
test/utils/test_coords.py 4/5 Added unit tests for new coordinate utility functions with basic coverage
recipes/hens/src/hens_utilities.py 5/5 Cleanly removed cat_coords function as part of refactoring to centralize utilities
recipes/hens/src/hens_ensemble.py 5/5 Updated imports to use cat_coords from main utils module instead of local utilities
docs/userguide/about/install.md 4/5 Updated installation documentation to distinguish TempestExtremes from Python-based trackers

Confidence score: 2/5

  • This PR has critical API issues that make the new functionality inaccessible to users
  • Score lowered due to missing all exports in dx/init.py, multiple typos in docstrings, hardcoded system paths, and duplicate documentation
  • Pay close attention to earth2studio/models/dx/init.py and earth2studio/models/dx/tempest_extremes.py for the critical API and code quality issues

Sequence Diagram

sequenceDiagram
    participant User
    participant Workflow as "Ensemble Workflow"
    participant TempestExtremes as "TempestExtremes"
    participant KVStore as "KV Store"
    participant NetCDF as "NetCDF File"
    participant TEDetect as "TE DetectNodes"
    participant TEStitch as "TE StitchNodes"
    participant OutputFile as "Track Output File"

    User->>TempestExtremes: "Initialize with detect_cmd, stitch_cmd, input_vars"
    
    TempestExtremes->>TempestExtremes: "format_tempestextremes_commands()"
    TempestExtremes->>TempestExtremes: "check_tempest_extremes_availability()"
    TempestExtremes->>KVStore: "initialise_te_kvstore()"
    TempestExtremes->>TempestExtremes: "setup_output_directories()"
    
    loop "For each forecast step"
        Workflow->>TempestExtremes: "record_state(tensor, coords)"
        
        alt "static vars present"
            TempestExtremes->>TempestExtremes: "cat_coords() - concatenate static data"
        end
        
        TempestExtremes->>KVStore: "write() - store model output"
    end
    
    User->>TempestExtremes: "track_cyclones()"
    
    TempestExtremes->>TempestExtremes: "setup_files()"
    TempestExtremes->>TempestExtremes: "dump_raw_data()"
    TempestExtremes->>NetCDF: "Create NetCDF file with [time, lat, lon] structure"
    
    loop "For each ensemble member"
        TempestExtremes->>TEDetect: "subprocess call DetectNodes --in_data_list --out_file_list"
        TEDetect->>NetCDF: "Read atmospheric data"
        TEDetect->>TEDetect: "Detect cyclone centers"
        
        TempestExtremes->>TEStitch: "subprocess call StitchNodes --in --out"
        TEStitch->>TEStitch: "Stitch detected centers into tracks"
        TEStitch->>OutputFile: "Write cyclone tracks (CSV format)"
    end
    
    TempestExtremes->>TempestExtremes: "tidy_up() - cleanup temp files"
    
    alt "keep_raw_data = True"
        TempestExtremes->>NetCDF: "Move NetCDF to raw_data directory"
    else
        TempestExtremes->>NetCDF: "Delete NetCDF file"
    end
    
    TempestExtremes->>User: "Cyclone tracking complete"
Loading

8 files reviewed, 21 comments

Edit Code Review Agent Settings | Greptile


# Wait a bit and verify record hasn't completed
record_started.wait(timeout=1)
import time
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Name collision with imported module - local time variable shadows the imported time module

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This review covers only the changes made since the last review, not the entire PR. The most significant change is a critical syntax error in earth2studio/models/dx/__init__.py where a missing comma in the __all__ list causes string concatenation and breaks the module's public API. This makes the new TempestExtremes classes inaccessible through standard imports. Additionally, coordinate utility functions have been enhanced with proper validation logic to prevent coordinate system mismatches during tensor operations, and comprehensive test coverage has been added to ensure these utilities handle edge cases reliably.

Important Files Changed

Filename Score Overview
earth2studio/models/dx/init.py 1/5 Critical syntax error - missing comma in __all__ list breaks module imports
earth2studio/utils/coords.py 4/5 Added coordinate system validation to prevent mismatches during tiling operations
test/utils/test_coords.py 5/5 Added comprehensive test coverage for coordinate utility functions with edge cases
docs/userguide/about/install.md 5/5 Fixed typo: "executibles" corrected to "executables" in documentation
earth2studio/models/dx/tempest_extremes.py 3/5 Complex TempestExtremes implementation with subprocess management and threading

Confidence score: 1/5

  • This PR has a critical syntax error that will immediately break imports and make the new functionality inaccessible
  • Score is severely lowered due to the missing comma in __all__ list which causes string concatenation instead of proper list formatting
  • The TempestExtremes implementation has complex threading and subprocess management that needs careful review for production readiness

5 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile


__all__ = [
"ClimateNet",
"AsyncTempestExtremes" "ClimateNet",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

syntax: Missing comma between "AsyncTempestExtremes" and "ClimateNet" causes string concatenation instead of separate list items

Suggested change
"AsyncTempestExtremes" "ClimateNet",
"AsyncTempestExtremes",
"ClimateNet",

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This review covers only the changes made since the last review, not the entire PR. The primary change made is addressing a critical import issue in the test file for TempestExtremes functionality. The developer moved the time import from a local scope within a test function to the module level (line 5), resolving a variable shadowing issue where a local time variable was conflicting with the time module import. This fix ensures that time.sleep(0.1) calls (lines 497-498) correctly reference the imported time module rather than attempting to call a method on a local variable. The comprehensive test suite remains intact, providing thorough coverage for both synchronous and asynchronous TempestExtremes diagnostic model functionality including initialization, command formatting, coordinate systems, data operations, file management, async behavior, error handling, and cleanup operations.

Important Files Changed

Filename Score Overview
test/models/dx/test_tempest_extremes.py 5/5 Fixed critical import issue by moving time import to module level, resolving variable shadowing problem

Confidence score: 5/5

  • This PR is safe to merge with minimal risk as it only fixes a critical import structure issue
  • Score reflects a straightforward fix that resolves a previously identified variable shadowing problem with no negative side effects
  • No files require special attention as this is a simple import reorganization fix

1 file reviewed, no comments

Edit Code Review Agent Settings | Greptile

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This review covers only the changes made since the last review, not the entire PR. The recent changes focus on fixing a critical import bug in the diagnostic models module. The developer has corrected a syntax error in the __all__ list within earth2studio/models/dx/__init__.py where "AsyncTempestExtremes" and "ClimateNet" were improperly concatenated due to a missing comma separator. Additionally, the "TempestExtremes" class has been added to the public API exports to ensure both synchronous and asynchronous versions of the TempestExtremes diagnostic model are properly accessible. This change is part of integrating TempestExtremes tracking functionality into the earth2studio framework, allowing users to interface with the TempestExtremes command-line tool through subprocess calls for extreme weather event tracking.

Important Files Changed

Filename Score Overview
earth2studio/models/dx/__init__.py 5/5 Fixed critical syntax error in __all__ list by adding missing comma between "AsyncTempestExtremes" and "ClimateNet", and added "TempestExtremes" to public API exports

Confidence score: 5/5

  • This PR is safe to merge with minimal risk
  • Score reflects a simple but critical syntax fix that resolves module import accessibility without introducing complexity or breaking changes
  • No files require special attention - the change is straightforward and addresses a clear bug

1 file reviewed, no comments

Edit Code Review Agent Settings | Greptile

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants