Update dependencies and enforce dependency checking in physicsnemo by coreyjadams · Pull Request #1357 · NVIDIA/physicsnemo

coreyjadams · 2026-01-28T20:22:20Z

PhysicsNeMo Pull Request

This PR has grown a little bit bigger than anticipated, so let me summarize:

First goal was to move zarr and xarray to optional deps instead of core deps.
Along the way, I realized how many optional deps in datapipes aren't protected against spurious imports. It's not a breaking issue, since we don't import those, but I added protection layers against all of them.
This also let me update some dependencies in pyproject.toml.

For pyproject.toml, a new optional dependency group is created for performance-centric items. Since, in reality, the best performance comes from NVIDIA's libraries, this group consists entirely of NVIDIA packages with some sort of CUDA binding.

I updated uv.lock as well.

Description

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.
The CHANGELOG.md is up to date with these changes.
An issue is linked to this pull request.
If I am implementing a new model or modifying any existing model, I have followed the Models Implementation Coding Standards.

Dependencies


be affected, but optional deps are now aggressively wrapped and protected in datapipes.

greptile-apps · 2026-01-28T20:26:21Z

Greptile Overview

Greptile Summary

This PR successfully moves zarr and xarray from core dependencies to optional dependencies and implements comprehensive import protection for all optional dependencies across datapipes.

Key Changes

Dependency Management: Moved zarr and xarray to optional datapipes-extras group, created new perf group for NVIDIA performance libraries (transformer_engine, cuml-cu13, pylibraft-cu13, cupy-cuda13x)
Import Protection Pattern: Consistently applied check_version_spec() with hard_fail=False and importlib.import_module() across all datapipes to defer import errors until runtime
Protected Dependencies: Added protection for zarr, xarray, DALI, netCDF4, scipy, torch_geometric, pyvista, vtk, tfrecord, dask, wrapt, natten, and sparse_dot_mkl
Centralized Utilities: Created physicsnemo/datapipes/healpix/utils.py to centralize xarray import protection for healpix modules
Import Linter Update: Changed .importlinter exclusion from datapipes to metrics to enforce dependency checking in datapipes
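The import protection pattern summarized in the list above can be sketched roughly as follows. This is a minimal illustration and not the PR's code: `is_available` is a hypothetical stand-in for `check_version_spec(..., hard_fail=False)` built on the standard library, and `open_store` is an invented example function.

```python
import importlib
import importlib.util


def is_available(package_name: str) -> bool:
    """Hypothetical stand-in for check_version_spec(name, hard_fail=False)."""
    return importlib.util.find_spec(package_name) is not None


# Probe for the optional dependency without raising at import time.
ZARR_AVAILABLE = is_available("zarr")

if ZARR_AVAILABLE:
    # Import the module only when the package is actually present.
    zarr = importlib.import_module("zarr")


def open_store(path: str):
    """Hypothetical reader entry point; it fails only when it is called."""
    if not ZARR_AVAILABLE:
        raise ImportError(
            "zarr is required to read this store but is not installed."
        )
    return zarr.open(path, mode="r")
```

With this shape, importing the module never fails because of a missing optional package; the error is deferred to the first call that needs it.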

Issues Found

The implementation is thorough and follows a consistent pattern. One minor concern identified below.

Important Files Changed

Filename	Overview
pyproject.toml	Moved zarr and xarray to optional deps, created new 'perf' group for NVIDIA performance libraries
.importlinter	Changed exclusion from datapipes to metrics for external imports, fixed whitespace formatting
physicsnemo/datapipes/readers/zarr.py	Added proper import protection for zarr using check_version_spec and importlib
physicsnemo/datapipes/climate/climate.py	Protected DALI, netCDF4, and scipy imports with version checking and helper functions
physicsnemo/datapipes/healpix/utils.py	New utility module centralizing xarray import protection for healpix datapipes
physicsnemo/domain_parallel/shard_utils/natten_patches.py	Protected wrapt and natten imports, conditional patch registration based on availability

greptile-apps

6 files reviewed, 1 comment


greptile-apps · 2026-01-28T20:26:25Z

physicsnemo/domain_parallel/shard_utils/natten_patches.py

+if WRAPT_AVAILABLE:
+    wrapt = importlib.import_module("wrapt")
+
+NATTEN_AVAILABLE = check_version_spec("natten")


Inconsistent with line 35 and the rest of the PR - should use hard_fail=False for optional dependency

Suggested change

-NATTEN_AVAILABLE = check_version_spec("natten")
+NATTEN_AVAILABLE = check_version_spec("natten", hard_fail=False)

coreyjadams · 2026-01-28T20:28:35Z

This PR also removes fcn_mip_plugin from models and makes sure that all core packages are part of the import scan.

pzharrington · 2026-01-28T22:03:42Z

physicsnemo/datapipes/climate/utils/invariant.py

+        if not XARRAY_AVAILABLE:
+            raise ImportError(
+                "FileInvariant requires xarray to be installed. "
+                "Install with: pip install xarray"


Do we want these error messages to mention the optional deps groups specified in the toml? Maybe something for another PR, but I feel like discoverability of those groups is not that good right now.

For example, Nick set up some import handlers/checkers that are based on deps groups and raise errors accordingly; these might clean up a lot of the dedicated check methods that have been added in this PR. See https://github.com/NVIDIA/earth2studio/blob/main/earth2studio/utils/imports.py

It would be nice to have something like that. I was thinking about how we could reduce this boilerplate... I haven't got a solution yet, I'll look at this too. Do you want this in this PR or just in some nebulous future?

In this PR there are a lot of things like the below, which in some cases led to having to create extra files or just some potentially unnecessary bloat:

def _raise_missing_xarray():
    raise ImportError(
        "xarray is required for physicsnemo.datapipes.healpix but is not available. "
        "Please install xarray with `pip install xarray`."
    )

I think they could be cleaned up nicely by instead having some sort of e.g. physicsnemo.core.raise_missing_optional_dep method or decorator that could print some ImportError message like

<package> is required for <caller/object/source of import error> but is not available. Please install <package> directly, or install the optional dependency <deps-group>

Where the caller provides the bracketed items in the src code, something like

raise_missing_optional_dep("xarray", "datapipes-extras")

Would also standardize the messaging and herd people towards paying attention to deps groups
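A minimal sketch of what such a helper could look like, following the message template above. The function does not exist in physicsnemo as far as this thread shows; the name comes from the suggestion, and the optional `source` parameter is an assumption added for illustration.

```python
from typing import NoReturn


def raise_missing_optional_dep(
    package: str, group: str, source: str = "this feature"
) -> NoReturn:
    """Hypothetical helper: raise a standardized missing-dependency error."""
    raise ImportError(
        f"{package} is required for {source} but is not available. "
        f"Please install {package} directly, or install the optional "
        f"dependency group '{group}'."
    )


# Example call site, mirroring the usage suggested in the thread.
try:
    raise_missing_optional_dep(
        "xarray", "datapipes-extras", source="physicsnemo.datapipes.healpix"
    )
except ImportError as err:
    print(err)
```

Keeping the wording in one place means every call site only supplies the package, the deps group, and optionally the caller.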

We have, in my mind, two problems to solve:

When dependencies are missing, we need to be informative about what is missing and how to get it.

When dependencies are missing, we need to turn ImportError into RuntimeError so that no one ever crashes out with import physicsnemo when they don't even need [healpix | transformer_engine | dali | ...].

Thinking about your suggestion, maybe we can have some sort of fake dependency factory:

class MissingImport:
    def __init__(self, library_name):
        self.lib_name = library_name

    def __getattr__(self, key):
        # f-string so the library name is actually interpolated
        raise RuntimeError(f"Missing {self.lib_name}, please blah blah blah")

then, we could go a step further with

def maybe_import(package_name: str):
    # importlib logic to check and import
    if not available:
        return MissingImport(package_name)
    else:
        return package

It might consolidate the boilerplate and let us automate error raising when it's in python code. It won't help, though, when we do something else like use package.decorator or inherit from something. Those might not be realistic use cases.
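Put together, the idea might run like the sketch below. The availability check via `importlib.util.find_spec` and the error wording are assumptions filled in for illustration; `not_a_real_pkg_xyz` is a deliberately missing package name.

```python
import importlib
import importlib.util


class MissingImport:
    """Placeholder returned in place of a module that is not installed."""

    def __init__(self, library_name: str):
        self.lib_name = library_name

    def __getattr__(self, key: str):
        # Only reached for attributes not found normally (anything other
        # than lib_name), so every real use of the module lands here.
        raise RuntimeError(
            f"Missing {self.lib_name}: cannot access '{key}'. "
            f"Please install {self.lib_name} to use this feature."
        )


def maybe_import(package_name: str):
    """Return the module if it is installed, else a placeholder."""
    if importlib.util.find_spec(package_name) is None:
        return MissingImport(package_name)
    return importlib.import_module(package_name)


json_mod = maybe_import("json")             # installed: the real module
ghost = maybe_import("not_a_real_pkg_xyz")  # missing: a placeholder

print(json_mod.dumps({"ok": True}))

try:
    ghost.open_dataset("data.nc")           # fails here, not at import
except RuntimeError as err:
    print(err)

# The caveat from the comment: module-level uses such as base classes or
# decorators touch the placeholder as soon as the module is imported.
try:
    class Reader(ghost.Base):
        pass
except RuntimeError as err:
    print("class definition failed:", err)
```

The last block shows why inheritance and decorators are not covered: the attribute lookup happens while the class statement runs, which is import time for the enclosing module.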

Yeah I like that, sounds like it would get the two most important birds. Not sure, but for this:

It won't help, though, when we do something else like use package.decorator or inherit from something

We might also be able to use your idea in some decorator form that would raise the error on the __init__ of an inherited class. Not sure if it would be doable to double-decorate something successfully though
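One possible reading of the decorator idea, as a rough sketch: a class decorator that leaves the class untouched when the dependency is present and otherwise swaps in an `__init__` that raises. The name `requires` and the example classes are hypothetical, and this only covers classes defined in the library itself, not base classes that live in the missing package.

```python
import functools
import importlib.util


def requires(package_name: str):
    """Hypothetical class decorator: defer the missing-dependency error to __init__."""
    available = importlib.util.find_spec(package_name) is not None

    def decorate(cls):
        if available:
            return cls  # dependency present: nothing to guard
        original_init = cls.__init__

        @functools.wraps(original_init)
        def guarded_init(self, *args, **kwargs):
            raise RuntimeError(
                f"{cls.__name__} requires {package_name}, which is not installed."
            )

        cls.__init__ = guarded_init
        return cls

    return decorate


@requires("not_a_real_pkg_xyz")  # deliberately missing package name
class HealpixReader:
    def __init__(self, path):
        self.path = path


class CachedReader(HealpixReader):
    """Subclass with no __init__ of its own: it inherits the guard."""


try:
    CachedReader("data.nc")
except RuntimeError as err:
    print(err)
```

Defining both classes succeeds; only instantiation fails. A subclass that overrides `__init__` without calling `super().__init__()` would bypass the guard, which is one of the open questions with this approach.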

physicsnemo/datapipes/healpix/data_modules.py

…lerplate as much as possible.

coreyjadams · 2026-01-30T02:07:37Z

I blew up this PR, a bit, based on some of the discussions with @pzharrington. It looks bigger than it is. Major changes:

New OptionalImport object from core.version_check. This can be used as a lazy import, and it raises a runtime error at first access rather than at import time.
Allow type checking ifs in import linting. So, we can use if TYPE_CHECKING for static checking without breaking the optional import paradigm. This I don't reallllly love since it's duplicating imports, but it's not necessary except in a couple places and it's not complicated.
I made a registry of useful install hints in OptionalImport. So, we don't need to worry much about writing good error messages, since it's consolidated. It specifies the uv dependency group, where to pip install, other instruction locations, etc. We can elaborate on this - I haven't gone through closely, yet, let's make sure it's the right model first.

About 5000 lines of the +5k / -5k here are just replacing boilerplate with better syntax and un-indenting blocks of code that no longer need to be in an if/else.

pzharrington · 2026-01-30T02:22:04Z

physicsnemo/core/version_check.py

+        "dgl",
+        group="gnns",
+        docs_url="https://www.dgl.ai/pages/start.html",
+    ),


dgl is deprecated so we can drop this

dallasfoster · 2026-01-30T17:52:36Z

physicsnemo/datapipes/climate/era5_hdf5.py

 from pathlib import Path
 from typing import Dict, Iterable, List, Tuple, Union

+import h5py


While we are at it, I would argue that h5py could also be made an optional dependency.

dallasfoster · 2026-01-30T17:54:11Z

pyproject.toml

-    "xarray>=2025.6.1",
    "einops>=0.8.1",
    "h5py>=3.15.1",
    "cftime>=1.6.5",


I would target cftime as another dependency that appears optional; it only appears to be used in examples.

Good call. Used to be used in some functionality but it's been removed since.

dallasfoster · 2026-01-30T17:54:52Z

pyproject.toml

@@ -35,16 +35,15 @@
    "s3fs>=2023.5.0",


@NickGeneva do we really need to keep s3fs?

dallasfoster · 2026-01-30T17:57:23Z

pyproject.toml

 dependencies = [
    "onnx>=1.14.0",
    "warp-lang>=1.5.0",
    "pandas>=2.2.0",


What do we think about pandas? @pzharrington it appears that it might be used in healpix utilities which could be made optional-model dependent?

Hmm, it's also used by the insolation function and the drivaernet datapipe, so not totally clear which optional deps group it would belong in. Since it's not causing install problems (AFAIK) and not too bulky/slow to build, I wouldn't mind just leaving it in core.

dallasfoster

A few other suggestions for dependencies.

…t more tests.

* Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. * Update testing * Fix compat tests * Move core model tests to tests/core/ * Add import lint config * Relocate layers * Move graphcast utils into model directory * Relocating util functionalities. * Further clean up and organize tests. * utils tests are passing now * Cleaning up distributed tests * Patching tests working again in nn * Fix sdf test * Fix zenith angle tests * Some organization of tests. Checkpoints is moved into utils. * Remove launch.utils and launch.config. Checkpointing is moved to phsyicsnemo.utils, launch.config is just gone. It was empty. * Most nn tests are passing * Further cleanup. Getting there! * Remove constants file * Add import linting to pre-commit. * Refactor (#1208) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. * Update testing * Fix compat tests * Move core model tests to tests/core/ * Add import lint config * Relocate layers * Move graphcast utils into model directory * Relocating util functionalities. * Add FIGConvNet to crash example (#1207) * Add FIGConvNet to crash example. * Add FIGConvNet to crash example * Update model config * propose fix some typos (#1209) Signed-off-by: John E <jeis4wpi@outlook.com> Co-authored-by: Corey adams <6619961+coreyjadams@users.noreply.github.com> * Further clean up and organize tests. 
* utils tests are passing now * Cleaning up distributed tests * Patching tests working again in nn * Fix sdf test * Fix zenith angle tests * Some organization of tests. Checkpoints is moved into utils. * Remove launch.utils and launch.config. Checkpointing is moved to phsyicsnemo.utils, launch.config is just gone. It was empty. * Most nn tests are passing * Further cleanup. Getting there! * Remove constants file * Add import linting to pre-commit. --------- Signed-off-by: John E <jeis4wpi@outlook.com> Co-authored-by: Alexey Kamenev <alex.kamenev@gmail.com> Co-authored-by: John Eismeier <42679190+jeis4wpi@users.noreply.github.com> * Unmigrate the insolation utils (#1211) * unmigrate the insolation utils * Revert test and compat map * Update importlinter * Move gnn layers and start to fix several model tests. * AFNO is now passing. * Rnn models passing. * Fix improt * Healpix tests are working * Domino and unet working * Refactor (#1216) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. * Update testing * Fix compat tests * Move core model tests to tests/core/ * Add import lint config * Relocate layers * Move graphcast utils into model directory * Relocating util functionalities. * Further clean up and organize tests. * utils tests are passing now * Cleaning up distributed tests * Patching tests working again in nn * Fix sdf test * Fix zenith angle tests * Some organization of tests. Checkpoints is moved into utils. * Remove launch.utils and launch.config. Checkpointing is moved to phsyicsnemo.utils, launch.config is just gone. It was empty. * Most nn tests are passing * Further cleanup. Getting there! * Remove constants file * Add import linting to pre-commit. 
* Move gnn layers and start to fix several model tests. * AFNO is now passing. * Rnn models passing. * Fix improt * Healpix tests are working * Domino and unet working * Update activations path in dlwp tests (#1217) * Update activations path in dlwp tests * Update example paths * Updating to address some test issues * MGN tests passing again * Most graphcast tests passing again * Move nd conv layers. * update fengwu and pangu * Update sfno and pix2pix test * update tests for figconvnet, swinrnn, superresnet * updating more models to pass * Update distributed tests, now passing. * Domain parallel tests now passing. * Fix active learning imports so tests pass in refactor * Fix some metric imports * Remove deploy package * Remove unused test file * unmigrate these files ... again? * Update import linter. * Refactor (#1224) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. * Update testing * Fix compat tests * Move core model tests to tests/core/ * Add import lint config * Relocate layers * Move graphcast utils into model directory * Relocating util functionalities. * Further clean up and organize tests. * utils tests are passing now * Cleaning up distributed tests * Patching tests working again in nn * Fix sdf test * Fix zenith angle tests * Some organization of tests. Checkpoints is moved into utils. * Remove launch.utils and launch.config. Checkpointing is moved to phsyicsnemo.utils, launch.config is just gone. It was empty. * Most nn tests are passing * Further cleanup. Getting there! * Remove constants file * Add import linting to pre-commit. 
* Update crash readme (#1212) * update license headers- second try * update readme * Bump multi-storage-client to v0.33.0 with rust client (#1156) * Move gnn layers and start to fix several model tests. * AFNO is now passing. * Rnn models passing. * Fix improt * Healpix tests are working * Domino and unet working * Add jaxtyping to requirements.txt for crash sample (#1218) * update license headers- second try * Update requirements.txt * Updating to address some test issues * MGN tests passing again * Most graphcast tests passing again * Move nd conv layers. * update fengwu and pangu * Update sfno and pix2pix test * update tests for figconvnet, swinrnn, superresnet * updating more models to pass * Update distributed tests, now passing. * Domain parallel tests now passing. * Fix active learning imports so tests pass in refactor * Fix some metric imports * Remove deploy package * Remove unused test file * unmigrate these files ... again? * Update import linter. --------- Co-authored-by: Mohammad Amin Nabian <m.a.nabiyan@gmail.com> Co-authored-by: Yongming Ding <yongmingd@nvidia.com> * Cleaning up diffusion models. Not quite done yet. * Restore deleted files * Updating more tests. * Further updates to tests. Datapipes almost working. * Refactor (#1231) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. * Update testing * Fix compat tests * Move core model tests to tests/core/ * Add import lint config * Relocate layers * Move graphcast utils into model directory * Relocating util functionalities. * Further clean up and organize tests. 
* utils tests are passing now * Cleaning up distributed tests * Patching tests working again in nn * Fix sdf test * Fix zenith angle tests * Some organization of tests. Checkpoints is moved into utils. * Remove launch.utils and launch.config. Checkpointing is moved to phsyicsnemo.utils, launch.config is just gone. It was empty. * Most nn tests are passing * Further cleanup. Getting there! * Remove constants file * Add import linting to pre-commit. * Update crash readme (#1212) * update license headers- second try * update readme * Bump multi-storage-client to v0.33.0 with rust client (#1156) * Move gnn layers and start to fix several model tests. * AFNO is now passing. * Rnn models passing. * Fix improt * Healpix tests are working * Domino and unet working * Add jaxtyping to requirements.txt for crash sample (#1218) * update license headers- second try * Update requirements.txt * Updating to address some test issues * Replace 'License' link with 'Dev blog' link (#1215) Co-authored-by: Corey adams <6619961+coreyjadams@users.noreply.github.com> * MGN tests passing again * Most graphcast tests passing again * Move nd conv layers. * update fengwu and pangu * Update sfno and pix2pix test * update tests for figconvnet, swinrnn, superresnet * updating more models to pass * Update distributed tests, now passing. * Validation fu added to examples/structural_mechanics/crash/train.py (#1204) * validation added: works for multi-node job. * rename and rearrange validation function * validate_every_n_epochs, save_ckpt_every_n_epochs added in config * corrected bug (args of model) in inference * args in validation code updated * val path added and args name changed * validation split added -> write_vtp=False * fixed inference bug * bug fix: write_vtp * Domain parallel tests now passing. * Fix active learning imports so tests pass in refactor * Fix some metric imports * Remove deploy package * Remove unused test file * unmigrate these files ... again? * Update import linter. 
* Add saikrishnanc-nv to github actors (#1225) * Integrate Curator instructions to the Crash example (#1213) * Integrate Curator instructions * Update docs * Formatting changes * Adding code of conduct (#1214) * Adding code of conduct Adopting the code of conduct from the https://www.contributor-covenant.org/ * Update CODE_OF_CONDUCT.MD Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Create .markdownlintignore * Revise README for PhysicsNeMo resources and guidance Updated the 'Getting Started' section and added new resources for learning AI Physics. * Update README.md --------- Co-authored-by: Mohammad Amin Nabian <m.a.nabiyan@gmail.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: Corey adams <6619961+coreyjadams@users.noreply.github.com> * Cleaning up diffusion models. Not quite done yet. * Restore deleted files * Updating more tests. * Further updates to tests. Datapipes almost working. --------- Co-authored-by: Mohammad Amin Nabian <m.a.nabiyan@gmail.com> Co-authored-by: Yongming Ding <yongmingd@nvidia.com> Co-authored-by: ram-cherukuri <104155145+ram-cherukuri@users.noreply.github.com> Co-authored-by: Deepak Akhare <dakhare@nvidia.com> Co-authored-by: Sai Krishnan Chandrasekar <157182662+saikrishnanc-nv@users.noreply.github.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * update import paths * Starting to clean up dependency tree. * Refactor (#1233) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. 
* Update testing * Fix compat tests * Move core model tests to tests/core/ * Add import lint config * Relocate layers * Move graphcast utils into model directory * Relocating util functionalities. * Further clean up and organize tests. * utils tests are passing now * Cleaning up distributed tests * Patching tests working again in nn * Fix sdf test * Fix zenith angle tests * Some organization of tests. Checkpoints is moved into utils. * Remove launch.utils and launch.config. Checkpointing is moved to phsyicsnemo.utils, launch.config is just gone. It was empty. * Most nn tests are passing * Further cleanup. Getting there! * Remove constants file * Add import linting to pre-commit. * Update crash readme (#1212) * update license headers- second try * update readme * Bump multi-storage-client to v0.33.0 with rust client (#1156) * Move gnn layers and start to fix several model tests. * AFNO is now passing. * Rnn models passing. * Fix improt * Healpix tests are working * Domino and unet working * Add jaxtyping to requirements.txt for crash sample (#1218) * update license headers- second try * Update requirements.txt * Updating to address some test issues * Replace 'License' link with 'Dev blog' link (#1215) Co-authored-by: Corey adams <6619961+coreyjadams@users.noreply.github.com> * MGN tests passing again * Most graphcast tests passing again * Move nd conv layers. * update fengwu and pangu * Update sfno and pix2pix test * update tests for figconvnet, swinrnn, superresnet * updating more models to pass * Update distributed tests, now passing. * Validation fu added to examples/structural_mechanics/crash/train.py (#1204) * validation added: works for multi-node job. 
* rename and rearrange validation function * validate_every_n_epochs, save_ckpt_every_n_epochs added in config * corrected bug (args of model) in inference * args in validation code updated * val path added and args name changed * validation split added -> write_vtp=False * fixed inference bug * bug fix: write_vtp * Domain parallel tests now passing. * Fix active learning imports so tests pass in refactor * Fix some metric imports * Remove deploy package * Remove unused test file * unmigrate these files ... again? * Update import linter. * Add saikrishnanc-nv to github actors (#1225) * Integrate Curator instructions to the Crash example (#1213) * Integrate Curator instructions * Update docs * Formatting changes * Adding code of conduct (#1214) * Adding code of conduct Adopting the code of conduct from the https://www.contributor-covenant.org/ * Update CODE_OF_CONDUCT.MD Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Create .markdownlintignore * Revise README for PhysicsNeMo resources and guidance Updated the 'Getting Started' section and added new resources for learning AI Physics. * Update README.md --------- Co-authored-by: Mohammad Amin Nabian <m.a.nabiyan@gmail.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: Corey adams <6619961+coreyjadams@users.noreply.github.com> * Cleaning up diffusion models. Not quite done yet. * Restore deleted files * Updating more tests. 
* Fixed minor bug in shape validation in SongUNet (#1230) Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Add Zarr reader for Crash (#1228) * Add Zarr reader for Crash * Update README * Update validation logic of point data in Zarr reader Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update examples/structural_mechanics/crash/zarr_reader.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Add a test for 2D feature arrays * Update examples/structural_mechanics/crash/zarr_reader.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Further updates to tests. Datapipes almost working. * update import paths * Starting to clean up dependency tree. --------- Signed-off-by: Charlelie Laurent <claurent@nvidia.com> Co-authored-by: Mohammad Amin Nabian <m.a.nabiyan@gmail.com> Co-authored-by: Yongming Ding <yongmingd@nvidia.com> Co-authored-by: ram-cherukuri <104155145+ram-cherukuri@users.noreply.github.com> Co-authored-by: Deepak Akhare <dakhare@nvidia.com> Co-authored-by: Sai Krishnan Chandrasekar <157182662+saikrishnanc-nv@users.noreply.github.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: Charlelie Laurent <84199758+CharlelieLrt@users.noreply.github.com> * Added coding standards for model implementations as a custom context for greptile (#1219) * Added initial set of coding standards for model implementations Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Fixed typos + review comments + added details Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Added more rules for models Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Added model rules to PR checklist Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Added cusror rules for models 
Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Linked the wiki page to the PR template Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Fixed typo in PR checklist Signed-off-by: Charlelie Laurent <claurent@nvidia.com> --------- Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Fixing and adjusting a broad suite of tests. * Update test/domain_parallel/conftest.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Minor fix * Refactor (#1234) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. * Update testing * Fix compat tests * Move core model tests to tests/core/ * Add import lint config * Relocate layers * Move graphcast utils into model directory * Relocating util functionalities. * Further clean up and organize tests. * utils tests are passing now * Cleaning up distributed tests * Patching tests working again in nn * Fix sdf test * Fix zenith angle tests * Some organization of tests. Checkpoints is moved into utils. * Remove launch.utils and launch.config. Checkpointing is moved to phsyicsnemo.utils, launch.config is just gone. It was empty. * Most nn tests are passing * Further cleanup. Getting there! * Remove constants file * Add import linting to pre-commit. * Update crash readme (#1212) * update license headers- second try * update readme * Bump multi-storage-client to v0.33.0 with rust client (#1156) * Move gnn layers and start to fix several model tests. * AFNO is now passing. * Rnn models passing. 
* Fix improt * Healpix tests are working * Domino and unet working * Add jaxtyping to requirements.txt for crash sample (#1218) * update license headers- second try * Update requirements.txt * Updating to address some test issues * Replace 'License' link with 'Dev blog' link (#1215) Co-authored-by: Corey adams <6619961+coreyjadams@users.noreply.github.com> * MGN tests passing again * Most graphcast tests passing again * Move nd conv layers. * update fengwu and pangu * Update sfno and pix2pix test * update tests for figconvnet, swinrnn, superresnet * updating more models to pass * Update distributed tests, now passing. * Validation fu added to examples/structural_mechanics/crash/train.py (#1204) * validation added: works for multi-node job. * rename and rearrange validation function * validate_every_n_epochs, save_ckpt_every_n_epochs added in config * corrected bug (args of model) in inference * args in validation code updated * val path added and args name changed * validation split added -> write_vtp=False * fixed inference bug * bug fix: write_vtp * Domain parallel tests now passing. * Fix active learning imports so tests pass in refactor * Fix some metric imports * Remove deploy package * Remove unused test file * unmigrate these files ... again? * Update import linter. * Add saikrishnanc-nv to github actors (#1225) * Integrate Curator instructions to the Crash example (#1213) * Integrate Curator instructions * Update docs * Formatting changes * Adding code of conduct (#1214) * Adding code of conduct Adopting the code of conduct from the https://www.contributor-covenant.org/ * Update CODE_OF_CONDUCT.MD Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Create .markdownlintignore * Revise README for PhysicsNeMo resources and guidance Updated the 'Getting Started' section and added new resources for learning AI Physics. 
* Update README.md --------- Co-authored-by: Mohammad Amin Nabian <m.a.nabiyan@gmail.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: Corey adams <6619961+coreyjadams@users.noreply.github.com> * Cleaning up diffusion models. Not quite done yet. * Restore deleted files * Updating more tests. * Fixed minor bug in shape validation in SongUNet (#1230) Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Add Zarr reader for Crash (#1228) * Add Zarr reader for Crash * Update README * Update validation logic of point data in Zarr reader Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Update examples/structural_mechanics/crash/zarr_reader.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Add a test for 2D feature arrays * Update examples/structural_mechanics/crash/zarr_reader.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Further updates to tests. Datapipes almost working. * update import paths * Starting to clean up dependency tree. * Add AR RT and OT schemes to Crash FIGConvNet (#1232) * Add AR and OT schemes for FIGConvNet * Add tests * Soothe the linter * Fix the tests * Fixing and adjusting a broad suite of tests. 
* Update test/domain_parallel/conftest.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Minor fix --------- Signed-off-by: Charlelie Laurent <claurent@nvidia.com> Co-authored-by: Mohammad Amin Nabian <m.a.nabiyan@gmail.com> Co-authored-by: Yongming Ding <yongmingd@nvidia.com> Co-authored-by: ram-cherukuri <104155145+ram-cherukuri@users.noreply.github.com> Co-authored-by: Deepak Akhare <dakhare@nvidia.com> Co-authored-by: Sai Krishnan Chandrasekar <157182662+saikrishnanc-nv@users.noreply.github.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: Charlelie Laurent <84199758+CharlelieLrt@users.noreply.github.com> Co-authored-by: Alexey Kamenev <alex.kamenev@gmail.com> * Not seeing any errors in testing ... * Breakdown of rules into smaller rules (#1236) * Breakdown of rules into smaller rules Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Fix mismatches in rule IDs referenced in rule text Signed-off-by: Charlelie Laurent <claurent@nvidia.com> --------- Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Refactor (#1240) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. * Update testing * Fix compat tests * Move core model tests to tests/core/ * Add import lint config * Relocate layers * Move graphcast utils into model directory * Relocating util functionalities. * Further clean up and organize tests. * utils tests are passing now * Cleaning up distributed tests * Patching tests working again in nn * Fix sdf test * Fix zenith angle tests * Some organization of tests. Checkpoints is moved into utils. * Remove launch.utils and launch.config. 
Checkpointing is moved to physicsnemo.utils, launch.config is just gone. It was empty. * Most nn tests are passing * Further cleanup. Getting there! * Remove constants file * Add import linting to pre-commit. * Move gnn layers and start to fix several model tests. * AFNO is now passing. * Rnn models passing. * Fix import * Healpix tests are working * Domino and unet working * Updating to address some test issues * MGN tests passing again * Most graphcast tests passing again * Move nd conv layers. * update fengwu and pangu * Update sfno and pix2pix test * update tests for figconvnet, swinrnn, superresnet * updating more models to pass * Update distributed tests, now passing. * Domain parallel tests now passing. * Fix active learning imports so tests pass in refactor * Fix some metric imports * Remove deploy package * Remove unused test file * unmigrate these files ... again? * Update import linter. * Cleaning up diffusion models. Not quite done yet. * Restore deleted files * Updating more tests. * Further updates to tests. Datapipes almost working. * update import paths * Starting to clean up dependency tree. * Fixing and adjusting a broad suite of tests. * Update test/domain_parallel/conftest.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Minor fix * Not seeing any errors in testing ... 
* Formatting active learning module docstrings (#1238) * docs: fixing Protocol class reference formatting Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * docs: removing mermaid diagram from protocols Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * docs: adding active learning index * docs: revising docstrings for sphinx formatting * docs: fix placeholder URL for active learning main docs --------- Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> --------- Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: Kelvin Lee <kin.long.kelvin.lee@gmail.com> * Refactor (#1247) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. * Update testing * Fix compat tests * Move core model tests to tests/core/ * Add import lint config * Relocate layers * Move graphcast utils into model directory * Relocating util functionalities. * Further clean up and organize tests. * utils tests are passing now * Cleaning up distributed tests * Patching tests working again in nn * Fix sdf test * Fix zenith angle tests * Some organization of tests. Checkpoints is moved into utils. * Remove launch.utils and launch.config. Checkpointing is moved to phsyicsnemo.utils, launch.config is just gone. It was empty. * Most nn tests are passing * Further cleanup. Getting there! * Remove constants file * Add import linting to pre-commit. * Move gnn layers and start to fix several model tests. * AFNO is now passing. * Rnn models passing. 
* Fix import * Healpix tests are working * Domino and unet working * Updating to address some test issues * MGN tests passing again * Most graphcast tests passing again * Move nd conv layers. * update fengwu and pangu * Update sfno and pix2pix test * update tests for figconvnet, swinrnn, superresnet * updating more models to pass * Update distributed tests, now passing. * Domain parallel tests now passing. * Fix active learning imports so tests pass in refactor * Fix some metric imports * Remove deploy package * Remove unused test file * unmigrate these files ... again? * Update import linter. * Cleaning up diffusion models. Not quite done yet. * Restore deleted files * Updating more tests. * Further updates to tests. Datapipes almost working. * update import paths * Starting to clean up dependency tree. * Fixing and adjusting a broad suite of tests. * Update test/domain_parallel/conftest.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Minor fix * Not seeing any errors in testing ... * A new X-MeshGraphNet example for reservoir simulation. 
(#1186) * X-MGN for reservoir simulation Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * installation bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * well object docstring fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * more well object docstring fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve path_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fix while space in config Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fix version inconsistency in requirement.txt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * add versions for some libs in requirement.txt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve exception handling in mldlow_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve mldlow_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve datetiem in mlflow_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve exception handling in inference Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve inference Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve ecl_reader Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * formatting Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve preprocessor Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve preprocessor loop Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * grad accum bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * total loss bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * added some safe guard for connection indexing Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * bug fix Signed-off-by: 
Tsubasa Onishi <tonishi@nvidia.com> * cleanup Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update ecl_reader Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup * cleanup * update configs * Update README.md style guide rule changes * Update README.md * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve docstring fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update license yr Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup well Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup preproc fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup preproc fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cimprove infrence fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve datetime Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve train.py fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve requirement Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * ilcense header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve ecl reader logging Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * license header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve graph builder (parallel) + added results to readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * delete some unsed files Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * address PR comments Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve inference grdecl header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme Signed-off-by: Tsubasa Onishi 
<tonishi@nvidia.com> * improve readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * support time series Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update config Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * minor update Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve graph builder Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update ecl_reader logging Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * replace pickle with json Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * add license headers Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * remove unused png files Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * remove unsed import Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * remove emojis Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * replace print with logger Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update docstring Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * minor updates Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> --------- Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> Co-authored-by: megnvidia <mmiranda@nvidia.com> * Add knn to autodoc table. (#1244) --------- Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: tonishi-nv <tonishi@nvidia.com> Co-authored-by: megnvidia <mmiranda@nvidia.com> * Enable import linting on internal imports. * Remove ensure_available function, it's confusing * Add logging imports to utils, and fix imports in examples. 
* Update imports in minimal examples * Update structural mechanics examples * Update import paths: reservoir_sim * Update import paths: additive manufacturing * Update import paths: topodiff * Update import paths: weather part 1 * Update import paths: weather part 2 * Update import paths: molecular dynamics * Update import paths: geophysics * Update import paths: cfd + external_aero 1 * Update import paths: cfd + external_aero 2 * Remove more DGL examples * Remove more DGL examples * cfd examples 3 * Last batch of example import fixes! * Enforce and protect external deps in utils. * Remove DGL. :party: * Don't force models yet * Refactor (#1249) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. * Update testing * Fix compat tests * Move core model tests to tests/core/ * Add import lint config * Relocate layers * Move graphcast utils into model directory * Relocating util functionalities. * Further clean up and organize tests. * utils tests are passing now * Cleaning up distributed tests * Patching tests working again in nn * Fix sdf test * Fix zenith angle tests * Some organization of tests. Checkpoints is moved into utils. * Remove launch.utils and launch.config. Checkpointing is moved to phsyicsnemo.utils, launch.config is just gone. It was empty. * Most nn tests are passing * Further cleanup. Getting there! * Remove constants file * Add import linting to pre-commit. * Move gnn layers and start to fix several model tests. * AFNO is now passing. * Rnn models passing. * Fix improt * Healpix tests are working * Domino and unet working * Updating to address some test issues * MGN tests passing again * Most graphcast tests passing again * Move nd conv layers. 
* update fengwu and pangu * Update sfno and pix2pix test * update tests for figconvnet, swinrnn, superresnet * updating more models to pass * Update distributed tests, now passing. * Domain parallel tests now passing. * Fix active learning imports so tests pass in refactor * Fix some metric imports * Remove deploy package * Remove unused test file * unmigrate these files ... again? * Update import linter. * Cleaning up diffusion models. Not quite done yet. * Restore deleted files * Updating more tests. * Further updates to tests. Datapipes almost working. * update import paths * Starting to clean up dependency tree. * Fixing and adjusting a broad suite of tests. * Update test/domain_parallel/conftest.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Minor fix * Not seeing any errors in testing ... * A new X-MeshGraphNet example for reservoir simulation. (#1186) * X-MGN for reservoir simulation Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * installation bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * well object docstring fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * more well object docstring fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve path_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fix while space in config Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fix version inconsistency in requirement.txt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * add versions for some libs in requirement.txt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve exception handling in mldlow_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve mldlow_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve datetiem in mlflow_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve exception handling in inference Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve inference Signed-off-by: Tsubasa Onishi 
<tonishi@nvidia.com> * improve ecl_reader Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * formatting Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve preprocessor Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve preprocessor loop Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * grad accum bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * total loss bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * added some safe guard for connection indexing Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update ecl_reader Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup * cleanup * update configs * Update README.md style guide rule changes * Update README.md * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve docstring fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update license yr Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup well Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup preproc fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup preproc fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cimprove infrence fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve datetime Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme Signed-off-by: Tsubasa Onishi 
<tonishi@nvidia.com> * improve train.py fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve requirement Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * ilcense header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve ecl reader logging Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * license header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve graph builder (parallel) + added results to readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * delete some unsed files Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * address PR comments Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve inference grdecl header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * support time series Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update config Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * minor update Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve graph builder Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update ecl_reader logging Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * replace pickle with json Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * add license headers Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * remove unused png files Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * remove unsed import Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * remove emojis Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * replace print with logger Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update docstring Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * minor updates Signed-off-by: Tsubasa Onishi 
<tonishi@nvidia.com> * update readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> --------- Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> Co-authored-by: megnvidia <mmiranda@nvidia.com> * Add knn to autodoc table. (#1244) * Enable import linting on internal imports. * Remove ensure_available function, it's confusing * Add logging imports to utils, and fix imports in examples. * Update imports in minimal examples * Update structural mechanics examples * Update import paths: reservoir_sim * Update import paths: additive manufacturing * Update import paths: topodiff * Update import paths: weather part 1 * Update import paths: weather part 2 * Update import paths: molecular dynamics * Update import paths: geophysics * Update import paths: cfd + external_aero 1 * Update import paths: cfd + external_aero 2 * Remove more DGL examples * Remove more DGL examples * cfd examples 3 * Last batch of example import fixes! * Enforce and protect external deps in utils. * Remove DGL. 
:party: * Don't force models yet --------- Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: tonishi-nv <tonishi@nvidia.com> Co-authored-by: megnvidia <mmiranda@nvidia.com> * Automated model registry (#1252) * Deleted RegistreableModule Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Removed 'PhysicsNeMo' suffix in Module.from_torch method Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Implemented automatic registration for Module subclasses Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Fixed unused name Signed-off-by: Charlelie Laurent <claurent@nvidia.com> --------- Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Metadata name deprecation (#1257) * Initiated deprecation of field 'name' in ModelMetaData Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Removed all occurences of 'name' field in ModelMetaData Signed-off-by: Charlelie Laurent <claurent@nvidia.com> --------- Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Refactor (#1258) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. * Update testing * Fix compat tests * Move core model tests to tests/core/ * Add import lint config * Relocate layers * Move graphcast utils into model directory * Relocating util functionalities. * Further clean up and organize tests. * utils tests are passing now * Cleaning up distributed tests * Patching tests working again in nn * Fix sdf test * Fix zenith angle tests * Some organization of tests. Checkpoints is moved into utils. * Remove launch.utils and launch.config. 
Checkpointing is moved to physicsnemo.utils, launch.config is just gone. It was empty. * Most nn tests are passing * Further cleanup. Getting there! * Remove constants file * Add import linting to pre-commit. * Move gnn layers and start to fix several model tests. * AFNO is now passing. * Rnn models passing. * Fix import * Healpix tests are working * Domino and unet working * Updating to address some test issues * MGN tests passing again * Most graphcast tests passing again * Move nd conv layers. * update fengwu and pangu * Update sfno and pix2pix test * update tests for figconvnet, swinrnn, superresnet * updating more models to pass * Update distributed tests, now passing. * Domain parallel tests now passing. * Fix active learning imports so tests pass in refactor * Fix some metric imports * Remove deploy package * Remove unused test file * unmigrate these files ... again? * Update import linter. * Cleaning up diffusion models. Not quite done yet. * Restore deleted files * Updating more tests. * Further updates to tests. Datapipes almost working. * update import paths * Starting to clean up dependency tree. * Fixing and adjusting a broad suite of tests. * Update test/domain_parallel/conftest.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Minor fix * Not seeing any errors in testing ... * A new X-MeshGraphNet example for reservoir simulation. 
(#1186) * X-MGN for reservoir simulation Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * installation bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * well object docstring fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * more well object docstring fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve path_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fix while space in config Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fix version inconsistency in requirement.txt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * add versions for some libs in requirement.txt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve exception handling in mldlow_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve mldlow_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve datetiem in mlflow_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve exception handling in inference Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve inference Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve ecl_reader Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * formatting Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve preprocessor Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve preprocessor loop Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * grad accum bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * total loss bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * added some safe guard for connection indexing Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * bug fix Signed-off-by: 
Tsubasa Onishi <tonishi@nvidia.com> * cleanup Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update ecl_reader Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup * cleanup * update configs * Update README.md style guide rule changes * Update README.md * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve docstring fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update license yr Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup well Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup preproc fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup preproc fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cimprove infrence fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve datetime Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve train.py fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve requirement Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * ilcense header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve ecl reader logging Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * license header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve graph builder (parallel) + added results to readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * delete some unsed files Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * address PR comments Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve inference grdecl header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme Signed-off-by: Tsubasa Onishi 
<tonishi@nvidia.com> * improve readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * support time series Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update config Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * minor update Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve graph builder Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update ecl_reader logging Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * replace pickle with json Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * add license headers Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * remove unused png files Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * remove unsed import Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * remove emojis Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * replace print with logger Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update docstring Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * minor updates Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> --------- Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> Co-authored-by: megnvidia <mmiranda@nvidia.com> * Add knn to autodoc table. (#1244) * Enable import linting on internal imports. * Remove ensure_available function, it's confusing * Add logging imports to utils, and fix imports in examples. 
* Update imports in minimal examples * Update structural mechanics examples * Update import paths: reservoir_sim * Update import paths: additive manufacturing * Update import paths: topodiff * Update import paths: weather part 1 * Update import paths: weather part 2 * Update import paths: molecular dynamics * Update import paths: geophysics * Update import paths: cfd + external_aero 1 * Update import paths: cfd + external_aero 2 * Remove more DGL examples * Remove more DGL examples * cfd examples 3 * Last batch of example import fixes! * Enforce and protect external deps in utils. * Remove DGL. :party: * Don't force models yet * Update version (#1193) * Fix depenedncies to enable hello world (#1195) * Remove zero-len arrays from test dataset (#1198) * Merge updates to Gray Scott example (#1239) * Remove pyevtk * update dependency * update dimensions * ci issues * Interpolation model example (#1149) * Temporal interpolation training recipe * Add README * Docs changes based on comments * Update docstrings and README * Add temporal interpolation animation * Add animation link * Add shape check in loss * Updates of configs + trainer * Update config comments * Update README.md style guide edits * Added wandb logging Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Reformated sections in docstring for GeometricL2Loss Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Update README and configs * README changes + type hint fixes * Update README.md * Draft of validation script * Update validation and README * Fixed command in README.md for temporal_interpolation example Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Removed unused import in datapipe/climate_interp.py Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Updated license headers in temporal_interpolation example Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Renamed methods to avoid implicit shadowing in Trainer class Signed-off-by: Charlelie Laurent <claurent@nvidia.com> 
* Cosmetic changes in train.py and removed unused import in validate.py Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Added clamp in validate.py to make sure step does not go out of bounds Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Added the temporal_interpolation example to the docs + updated CHANGELOG.md Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Addressing remaining comments * Merged two data source classes in climate_interp.py Signed-off-by: Charlelie Laurent <claurent@nvidia.com> --------- Signed-off-by: Charlelie Laurent <claurent@nvidia.com> Co-authored-by: Charlelie Laurent <claurent@nvidia.com> Co-authored-by: megnvidia <mmiranda@nvidia.com> Co-authored-by: Charlelie Laurent <84199758+CharlelieLrt@users.noreply.github.com> * update versions --------- Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> Signed-off-by: Charlelie Laurent <claurent@nvidia.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: tonishi-nv <tonishi@nvidia.com> Co-authored-by: megnvidia <mmiranda@nvidia.com> Co-authored-by: Kaustubh Tangsali <71059996+ktangsali@users.noreply.github.com> Co-authored-by: Jussi Leinonen <jleinonen@nvidia.com> Co-authored-by: Charlelie Laurent <claurent@nvidia.com> Co-authored-by: Charlelie Laurent <84199758+CharlelieLrt@users.noreply.github.com> Co-authored-by: Kaustubh Tangsali <ktangsali@nvidia.com> * Remove IPDB * Few more dep fixes. * Refactor (#1261) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. 
* Update testing * Fix compat tests * Move core model tests to tests/core/ * Add import lint config * Relocate layers * Move graphcast utils into model directory * Relocating util functionalities. * Further clean up and organize tests. * utils tests are passing now * Cleaning up distributed tests * Patching tests working again in nn * Fix sdf test * Fix zenith angle tests * Some organization of tests. Checkpoints is moved into utils. * Remove launch.utils and launch.config. Checkpointing is moved to phsyicsnemo.utils, launch.config is just gone. It was empty. * Most nn tests are passing * Further cleanup. Getting there! * Remove constants file * Add import linting to pre-commit. * Move gnn layers and start to fix several model tests. * AFNO is now passing. * Rnn models passing. * Fix improt * Healpix tests are working * Domino and unet working * Updating to address some test issues * MGN tests passing again * Most graphcast tests passing again * Move nd conv layers. * update fengwu and pangu * Update sfno and pix2pix test * update tests for figconvnet, swinrnn, superresnet * updating more models to pass * Update distributed tests, now passing. * Domain parallel tests now passing. * Fix active learning imports so tests pass in refactor * Fix some metric imports * Remove deploy package * Remove unused test file * unmigrate these files ... again? * Update import linter. * Cleaning up diffusion models. Not quite done yet. * Restore deleted files * Updating more tests. * Further updates to tests. Datapipes almost working. * update import paths * Starting to clean up dependency tree. * Fixing and adjusting a broad suite of tests. * Update test/domain_parallel/conftest.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Minor fix * Not seeing any errors in testing ... * Enable import linting on internal imports. * Remove ensure_available function, it's confusing * Add logging imports to utils, and fix imports in examples. 
* Update imports in minimal examples * Update structural mechanics examples * Update import paths: reservoir_sim * Update import paths: additive manufacturing * Update import paths: topodiff * Update import paths: weather part 1 * Update import paths: weather part 2 * Update import paths: molecular dynamics * Update import paths: geophysics * Update import paths: cfd + external_aero 1 * Update import paths: cfd + external_aero 2 * Remove more DGL examples * Remove more DGL examples * cfd examples 3 * Last batch of example import fixes! * Enforce and protect external deps in utils. * Remove DGL. :party: * Don't force models yet * Remove IPDB * Few more dep fixes. * Enhance checkpoint configuration for DLWP Healpix and GraphCast (#1253) * feat(weather): Improve configuration for DLWP Healpix and GraphCast examples - Added configurable checkpoint directory to DLWP Healpix config and training script. - Implemented Trainer logic to use specific checkpoint directory. - Updated utils.py to respect exact checkpoint path. - Made Weights & Biases entity and project configurable in GraphCast example. * fix(dlwp_healpix): remove deprecated configs - Removed the deprecated `verbose` parameter from the `CosineAnnealingLR` configuration in DLWP HEALPix, which was causing a TypeError. - Removed unused configs from examples/weather/dlwp_healpix/ * Transolver volume (#1242) * Implement transolver ++ physics attention * Enable ++ in Transolver. * Fix temperature correction terms. * Starting work adapting the domino datapipe techniques to transolver. * Working towards transolver volume training by mergeing with domino dataset. Surface dataloading is prototyped, not finished yet. * Updating * Remove printout * Enable transolver for volumetric data * Update transolver training script to support either surface or volume data. Applied some cleanup to make the datapipe similar to domino, which is a step towards unification. 
* Updating datapipe * Tweak transolver volume configs * Add transolverX model * Enable nearly-uniform sampling of very very large arrays * limit benchmarking to train epoch, enable profiler in config * Update volume config slightly * Update training scripts to properly enable data preloading * Working towards adding a muon optimzier in transolver * Add peter's implementation of muon with a combined optimizer. switch to a flat LR. * Add updated inference script that can also calculate drag and lift * Add better docstrings for typhon * Move typhon to experimental * Move forwards docstring * Adding typhon model and configs. * Update readme. * Update * Remove extra model. Update recipes. * Update cae_dataset.py Implement abstract methods in base classes. * Update Physics_Attention.py Ensure plus parameter is passed to base class. * Update test_mesh_datapipe.py Update import path for mesh datapipe. * Fix ruff issues --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: Dileep Ranganathan <8152399+dran-dev@users.noreply.github.com> * Add external import coding standards. * Update external import standards. * Ensure vtk functions are protected. * Protect pyvista import * Closing more import gaps * Remove DGL from meshgraphkan * All models now comply with external import linting. * Remove DGL datapipes * cae datapipes in compliance * Update pyproject.toml * Add version numbers to deps * Refactor (#1261) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. 

* Update for model standards * Migrate loss to metrics * format * Fix CI test

) * chore: initial structure for so2 and so3 layers Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * feat: added warp functions for wigner D-matrices up to l=5 * test: added placeholder unit tests for wigner functions * feat: adding utility function for masking l,m * docs: updating changelog with SO2Convolution mention * feat: adding SO2Convolution definition * feat: defined namespace for symmetry ops * refactor: adding optional edge modulation * chore: removing unused modules Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * chore: adding init in experimental nn space Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * test: adding unit tests for SO2 convolution Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * refactor: making SO2 convolution outputs more holistic Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * feat: adding gate activation layer Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * docs & refactor: adding note on reference implementation and adding shape validation Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * test: finalizing unit test suite Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * chore: removing unused kernels file Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * test: clean up activation unit tests Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * docs: adding meta MIT license as third party Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * feat: adding option to specify activation function Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * docs: updating docstrings to use general nonlinearity instead of hard coded SiLU Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * refactor: making classes inherit from physicsnemo Module Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * docs: moving forward and outputs docs to class docstring Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * test: increasing tolerances for single precision tests * test: increasing tolerances again 
--------- Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com>

* update license headers- second try * update end year in license headers * update copyright.txt * Update CONTRIBUTING.md * Update run_benchmarks.sh * resolve conflicts

* Add concatenation wrapper for legacy diffusion models * docstring fix * safer torch import * Address feedback, polish docstrings * Add warning for tensor arg * lint * Update dit defaults for doctest * Actually update defaults * license headers

* Update transolver to comply with model standards * Updating transolver for more compliance issues. * Finish most transolver updates. * Use ... for abstract method * Updates for docstrings, typehints consistency. * Address checkpoint restore issues from transolver. Update geotransolver for latest changes * Update license headers * Fix one more license check * fix mlp tests

…A#1290) * Much more aggressive testing against entrypoints and registry. * Fixing docstring test: was missing an import, but also failing with a warp deprecation warning. Updated to wp.Device and made warp start up quietly. * Undo removal of context since the CI container is too old. * Fix the stupid EntryPoint issue in docstring tests I hope. * Ensure the license check actually works properly with precommit * Fix header in test file.

coreyjadams · 2026-02-06T16:48:03Z

Something happened in this merge to blow this up to 1400+ files. I'm going to rebase this work onto a fresh PR.

coreyjadams added 2 commits January 28, 2026 13:56

Major overhaul of datapipe dependencies. No functionality should be affected, but optional deps are now aggressively wrapped and protected in datapipes.

3cc14e2

Update datapipe deps

597657c

coreyjadams requested review from NickGeneva, dallasfoster, ktangsali, mnabian and pzharrington as code owners January 28, 2026 20:22

Add last nvidia package in perf

7017449

greptile-apps bot reviewed Jan 28, 2026

Ensure metrics is included in the import scan too

3e1fcb7

Ensure zarr import is not bare in tests

e06f424
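A bare `import zarr` at the top of a test module fails at collection time when the package is absent. The sketch below shows one standard-library way to guard such a test. It is an illustration only, not the test code from this PR; the names `HAS_ZARR` and `TestZarrBackedData` are hypothetical.

```python
import importlib.util
import unittest

# True only when the optional package can be located; nothing is imported yet.
HAS_ZARR = importlib.util.find_spec("zarr") is not None


@unittest.skipUnless(HAS_ZARR, "zarr is not installed")
class TestZarrBackedData(unittest.TestCase):
    """Hypothetical test case that depends on the optional zarr package."""

    def test_module_exposes_open(self):
        # Import lazily so the module itself stays importable without zarr.
        import zarr

        self.assertTrue(hasattr(zarr, "open"))


if __name__ == "__main__":
    unittest.main()
```

In a pytest-based suite, `zarr = pytest.importorskip("zarr")` at module level gives the same effect in one line: the module is skipped rather than reported as an error.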

pzharrington reviewed Jan 28, 2026

physicsnemo/datapipes/healpix/data_modules.py (outdated)

coreyjadams added 4 commits January 28, 2026 16:21

Entirely remove Dask dependency. Yay

f1a3aa1

Merge branch 'main' into zarr_xarray_optional

4ba1ef5

Merge branch 'NVIDIA:main' into zarr_xarray_optional

dee89c1

Major overhaul of the import protection system. Goal is to reduce boilerplate as much as possible.

a8bd39e
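To illustrate what low-boilerplate import protection can look like, here is a minimal sketch of a decorator that defers the availability check to call time. This is not the helper implemented in this PR; `is_available` and `requires` are hypothetical names, and only standard-library calls are used.

```python
import importlib.util
from functools import wraps


def is_available(name: str) -> bool:
    """Return True if `name` can be located, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package or a module without a spec counts as unavailable.
        return False


def requires(*packages: str):
    """Hypothetical decorator: raise a clear ImportError when the function is
    called and any of the listed optional packages is missing."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            missing = [p for p in packages if not is_available(p)]
            if missing:
                raise ImportError(
                    f"{func.__name__} requires optional packages: {', '.join(missing)}"
                )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@requires("json")  # stdlib module used here as a stand-in for an optional dep
def dump(obj):
    import json  # imported inside the function, after the check has passed

    return json.dumps(obj)


@requires("surely_not_installed_pkg_xyz")
def needs_missing():
    return "unreachable when the package is absent"
```

With this pattern the guarded module can always be imported, and the user only sees an error, naming the missing package, when they call a function that needs it.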

coreyjadams requested review from RishikeshRanade and peterdsharpe as code owners January 30, 2026 01:56

pzharrington reviewed Jan 30, 2026

dallasfoster reviewed Jan 30, 2026

pyproject.toml

@@ -35,16 +35,15 @@

"s3fs>=2023.5.0",

Collaborator

dallasfoster Jan 30, 2026

@NickGeneva do we really need to keep s3fs?

dallasfoster reviewed Jan 30, 2026

coreyjadams and others added 2 commits February 3, 2026 16:42

Updating the optional import functionality just a little. Adding a lot more tests.

3f4e41f

pzharrington and others added 3 commits February 5, 2026 19:42

Update GraphCast for model standards (NVIDIA#1358)

63de7ad

* Update for model standards * Migrate loss to metrics * format * Fix CI test

Update end year in license headers (NVIDIA#1362)

14f8474

* update license headers- second try * update end year in license headers * update copyright.txt * Update CONTRIBUTING.md * Update run_benchmarks.sh * resolve conflicts

coreyjadams requested review from CharlelieLrt, laserkelvin and megnvidia as code owners February 6, 2026 01:43

pzharrington and others added 8 commits February 5, 2026 20:03

Update GraphCast for model standards (NVIDIA#1358)

dad7e14

* Update for model standards * Migrate loss to metrics * format * Fix CI test

Update GraphCast for model standards (NVIDIA#1358)

267b224

* Update for model standards * Migrate loss to metrics * format * Fix CI test

coreyjadams marked this pull request as draft February 6, 2026 16:48

coreyjadams closed this Feb 11, 2026

coreyjadams deleted the zarr_xarray_optional branch February 13, 2026 16:57

	NATTEN_AVAILABLE = check_version_spec("natten")
	NATTEN_AVAILABLE = check_version_spec("natten", hard_fail=False)

greptile-apps bot commented Jan 28, 2026

coreyjadams commented Jan 28, 2026

pzharrington Jan 28, 2026 (edited)

coreyjadams commented Jan 30, 2026

dallasfoster left a comment

coreyjadams commented Feb 6, 2026

6 participants