Skip to content

Conversation

@coreyjadams
Copy link
Collaborator

PhysicsNeMo Pull Request

Fixes errors in the release candidate branch.

Description

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • The CHANGELOG.md is up to date with these changes.
  • An issue is linked to this pull request.

Dependencies

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

Removed scalar fields air_density and stream_velocity from the synthetic_domino_data() function. These scalar values were causing issues when saved to zarr format as they were being treated as zero-length arrays.

The fix is correct because:

  • The test suite already provides these values via the keys_to_read_if_available parameter (line 289-292) with hardcoded tensors [1.225, 10.0]
  • CAEDataset has a built-in mechanism to fill missing keys from keys_to_read_if_available (see cae_dataset.py:116-120)
  • The synthetic test data should only contain array data, not scalar parameters
  • This aligns with the proper separation of concerns: test data files contain mesh/field arrays, while global parameters come from configuration

Confidence Score: 5/5

  • This PR is safe to merge - it's a targeted bug fix that removes problematic scalar values from test data generation
  • The change correctly addresses the zero-length array issue by removing scalar values that were improperly included in the synthetic test dataset. The existing keys_to_read_if_available fallback mechanism already handles these values properly, making the removal safe and correct. The fix is minimal, well-targeted, and doesn't affect production code or real data pipelines.
  • No files require special attention

Important Files Changed

File Analysis

Filename Score Overview
test/datapipes/test_domino_datapipe.py 5/5 Removed scalar fields air_density and stream_velocity from synthetic test data generation - these values are already provided via keys_to_read_if_available fallback mechanism

Sequence Diagram

sequenceDiagram
    participant Test as Test Suite
    participant SynthData as synthetic_domino_data()
    participant Zarr as Zarr File
    participant CAEDataset as CAEDataset
    participant DataPipe as DoMINODataPipe
    
    Note over Test,DataPipe: Before: Scalar values causing issues
    Test->>SynthData: Generate test data
    SynthData->>SynthData: Create out_dict with arrays
    SynthData->>SynthData: Add air_density scalar (np.float32)
    SynthData->>SynthData: Add stream_velocity scalar (np.float32)
    SynthData->>Zarr: Save dict with scalars
    Note over Zarr: Scalars treated as zero-length arrays
    Test->>CAEDataset: Create dataset with keys_to_read_if_available
    CAEDataset->>Zarr: Read data files
    CAEDataset->>CAEDataset: Fill missing keys from keys_to_read_if_available
    CAEDataset->>DataPipe: Provide data
    
    Note over Test,DataPipe: After: Clean separation of concerns
    Test->>SynthData: Generate test data
    SynthData->>SynthData: Create out_dict with arrays only
    Note over SynthData: Scalars removed from synthetic data
    SynthData->>Zarr: Save dict (arrays only)
    Test->>CAEDataset: Create dataset with keys_to_read_if_available
    Note over CAEDataset: global_params from keys_to_read_if_available
    CAEDataset->>Zarr: Read data files
    CAEDataset->>CAEDataset: Fill global_params from fallback
    CAEDataset->>DataPipe: Provide complete data
Loading

1 file reviewed, no comments

Edit Code Review Agent Settings | Greptile

@ktangsali ktangsali changed the base branch from main to 1.3.0-rc October 30, 2025 15:56
@coreyjadams
Copy link
Collaborator Author

/blossom-ci

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

Removes two unused scalar fields (air_density and stream_velocity) from the synthetic test dataset generation in test_domino_datapipe.py. These fields were never read by the DoMINO datapipe or used in any tests.

  • The datapipe uses keys_to_read_if_available with default values for global_params_values and global_params_reference (both [1.225, 10.0])
  • The removed fields were only being written to test data files but never accessed
  • This cleanup makes the test dataset more accurate by only including fields that are actually used

Confidence Score: 5/5

  • This PR is safe to merge with no risk
  • The change only removes unused test data fields that were never read by the datapipe or referenced in any tests. The removed fields (air_density and stream_velocity) were dead code that had no functional impact. All necessary data fields remain intact, and the datapipe uses default values from keys_to_read_if_available instead.
  • No files require special attention

Important Files Changed

File Analysis

Filename Score Overview
test/datapipes/test_domino_datapipe.py 5/5 Removed unused scalar fields (air_density, stream_velocity) from synthetic test dataset generation - these fields were never read by the datapipe

Sequence Diagram

sequenceDiagram
    participant Test as Test Suite
    participant SyntheticData as synthetic_domino_data()
    participant DataFile as Test Data Files
    participant Dataset as CAEDataset
    participant DataPipe as DoMINODataPipe
    
    Test->>SyntheticData: Generate test data
    SyntheticData->>SyntheticData: Create mesh & field data
    Note over SyntheticData: Removed: air_density, stream_velocity<br/>(unused scalar fields)
    SyntheticData->>DataFile: Save to zarr/npz/npy
    Test->>Dataset: Create CAEDataset
    Note over Dataset: keys_to_read_if_available:<br/>global_params_values=[1.225, 10.0]<br/>global_params_reference=[1.225, 10.0]
    Dataset->>DataFile: Read keys_to_read
    DataFile-->>Dataset: Return data
    Dataset->>Dataset: Fill missing keys with defaults
    Test->>DataPipe: Create DoMINODataPipe
    DataPipe->>Dataset: Request data
    Dataset-->>DataPipe: Return complete dataset
    DataPipe-->>Test: Process and return sample
Loading

1 file reviewed, no comments

Edit Code Review Agent Settings | Greptile

@ktangsali ktangsali merged commit 14e0874 into NVIDIA:1.3.0-rc Oct 30, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants