pypromice.utilities.compare submodule added #379

Open
PennyHow wants to merge 8 commits into main from feature/compare-data-variables

PennyHow commented Oct 21, 2025

  • Dataset output comparison added to pypromice.utilities module
  • Previously this functionality lived in .github/workflows as a PR action to compare dataset outputs
  • Comparison now updated to include:
    • mean, max and min absolute difference in variables
    • percentage difference in variables
    • Optional plotting (plot_variable_differences())
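
As a rough illustration of the statistics listed above, the following sketch computes the mean, max and min absolute difference and a percentage difference for a single variable with NumPy. It is not the pypromice implementation: the helper name `difference_stats`, the NaN handling, and the definition of percentage difference (mean absolute difference relative to the mean absolute reference value) are all assumptions made for illustration.

```python
import numpy as np

def difference_stats(a, b):
    """Summarise differences between two equally shaped arrays (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Only compare positions where both arrays hold valid data
    valid = ~(np.isnan(a) | np.isnan(b))
    if not valid.any():
        return None
    abs_diff = np.abs(a[valid] - b[valid])
    ref = np.mean(np.abs(a[valid]))
    return {
        "mean_abs_diff": float(abs_diff.mean()),
        "max_abs_diff": float(abs_diff.max()),
        "min_abs_diff": float(abs_diff.min()),
        # Assumed definition: mean absolute difference relative to
        # the mean absolute value of the reference array
        "pct_diff": float(100.0 * abs_diff.mean() / ref) if ref != 0 else float("nan"),
    }

old = np.array([1.0, 2.0, np.nan, 4.0])
new = np.array([1.0, 2.5, 3.0, 4.5])
stats = difference_stats(old, new)
```

With xarray datasets, the arrays could be taken from `ds1[var].values` and `ds2[var].values` for each shared variable, assuming both datasets share the same time index.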

To run the routines across all sites, the following lines can be used:

import glob
import os
import xarray as xr
from pathlib import Path
from pypromice.utilities import compare

rtol = 1e-6
atol = 1e-12

dir1 = "aws-l3-v1.6.0/sites"
dir2 = "aws-l3-v1.7.0/sites"
outdir1 = "out"

site_names = [Path(s).name for s in glob.glob(f"{dir1}/*")]

for site in site_names:

    out = f"{outdir1}/{site}/"
    os.makedirs(out, exist_ok=True)

    # Load datasets
    ds1 = xr.open_dataset(f"{dir1}/{site}/{site}_hour.nc")
    ds2 = xr.open_dataset(f"{dir2}/{site}/{site}_hour.nc")

    # Compare datasets
    report = compare.compare_datasets(ds1, ds2, rtol=rtol, atol=atol)

    # Generate and save Markdown report
    markdown = compare.format_report_md(report)
    report_file = os.path.join(out, f"{site}_hour.md")
    with open(report_file, "w") as f:
        f.write(markdown)
    print(f"\nMarkdown report saved to {report_file}")

    # Optional: Plot variable differences
    compare.plot_variable_differences(ds1, ds2, report, out)
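
The `rtol` and `atol` arguments passed to `compare_datasets` are not explained in this description. If they follow the usual NumPy convention (an assumption; the check inside `compare_datasets` may differ), two values count as equal when `|a - b| <= atol + rtol * |b|`. A minimal sketch of that convention using `numpy.isclose`, with made-up sample values:

```python
import numpy as np

rtol = 1e-6
atol = 1e-12

# Illustrative values only, not taken from any AWS dataset
reference = np.array([100.0, 100.0, 0.0, np.nan])
candidate = np.array([100.00005, 100.001, 1e-13, np.nan])

# equal_nan=True treats NaN in the same position as a match
close = np.isclose(candidate, reference, rtol=rtol, atol=atol, equal_nan=True)
n_differing = int((~close).sum())
```

Under this convention the relative tolerance dominates for large values, while `atol` only matters for values near zero, which is why it can be set much smaller than `rtol`.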

I've called the module "utilities" for now, but I'm happy to change the name. The submodule also does not necessarily have to live in pypromice.

PennyHow added the enhancement (New feature or request) label Oct 21, 2025
PennyHow requested a review from ladsmund October 21, 2025 15:00

github-actions bot commented Oct 21, 2025

Dataset Comparison Report

✅ Datasets match perfectly!

No differences have been found between the datasets produced using the PR branch and the main branch.
⚠️ This report is generated from a small subset of test data that does not reflect all scenarios. Please check using more input data if you suspect changes to the output data have occurred.
