Update dependencies and enforce dependency checking in physicsnemo #1357
coreyjadams wants to merge 22 commits into NVIDIA:main
be affected, but optional deps are now aggressively wrapped and protected in datapipes.
Greptile Overview
Greptile Summary
This PR successfully moves
Key Changes
Issues Found
The implementation is thorough and follows a consistent pattern. One minor concern identified below.
Important Files Changed
if WRAPT_AVAILABLE:
    wrapt = importlib.import_module("wrapt")

NATTEN_AVAILABLE = check_version_spec("natten")
Inconsistent with line 35 and the rest of the PR - should use hard_fail=False for optional dependency
Suggested change:
- NATTEN_AVAILABLE = check_version_spec("natten")
+ NATTEN_AVAILABLE = check_version_spec("natten", hard_fail=False)
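The difference this suggestion points at can be sketched with a stand-in checker. The function `check_installed` below is hypothetical and is not the physicsnemo `check_version_spec` implementation; it only assumes, from the review comment, that `hard_fail=False` makes the check report availability as a boolean instead of raising.

```python
from importlib import metadata


def check_installed(distribution: str, hard_fail: bool = True) -> bool:
    """Hypothetical stand-in: report whether a distribution is installed."""
    try:
        metadata.version(distribution)
    except metadata.PackageNotFoundError:
        if hard_fail:
            # Required dependency: fail immediately at import time.
            raise ImportError(f"{distribution} is required but not installed")
        # Optional dependency: report the absence and let the caller decide.
        return False
    return True


# Optional dependency: record availability and keep importing.
FAKE_AVAILABLE = check_installed("surely-not-a-real-package", hard_fail=False)
```

With the default `hard_fail=True`, the same call would raise while the module is being imported, which is the behaviour the comment wants to avoid for an optional package.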
This PR also removes
if not XARRAY_AVAILABLE:
    raise ImportError(
        "FileInvariant requires xarray to be installed. "
        "Install with: pip install xarray"
Do we want these error messages to mention the optional deps groups specified in the toml? May be something for another PR, but I feel like discoverability of those groups is not that good right now
For ex. Nick set up some import handlers/checkers that are based on deps groups and raise errors accordingly -- might clean up a lot of the dedicated check methods that have been added in this PR. See https://github.com/NVIDIA/earth2studio/blob/main/earth2studio/utils/imports.py
It would be nice to have something like that. I have been thinking about how we could reduce this boilerplate... I haven't got a solution yet; I'll look at this too. Do you want this in this PR or just in some nebulous future?
In this PR there are a lot of things like the below, which in some cases led to having to create extra files or just some potentially unnecessary bloat:
def _raise_missing_xarray():
    raise ImportError(
        "xarray is required for physicsnemo.datapipes.healpix but is not available. "
        "Please install xarray with `pip install xarray`."
    )
I think they could be cleaned up nicely by instead having some sort of e.g. physicsnemo.core.raise_missing_optional_dep method or decorator that could print some ImportError message like
<package> is required for <caller/object/source of import error> but is not available. Please install <package> directly, or install the optional dependency <deps-group>
Where the caller provides the bracketed items in the src code, something like
raise_missing_optional_dep("xarray", "datapipes-extras")
Would also standardize the messaging and herd people towards paying attention to deps groups
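A minimal sketch of what such a helper could look like, assuming only the message template and call shape proposed in this comment. The `source` parameter and the exact wording are illustrative assumptions, and nothing here is an existing physicsnemo API.

```python
from typing import Optional


def raise_missing_optional_dep(
    package: str, group: Optional[str] = None, source: Optional[str] = None
) -> None:
    """Raise an ImportError naming the package and, if given, its deps group."""
    # `source` (hypothetical) names the module or object that needs the package.
    needed_by = f" for {source}" if source else ""
    message = (
        f"{package} is required{needed_by} but is not available. "
        f"Please install {package} directly"
    )
    if group:
        # Point users at the optional dependency group as well.
        message += f", or install the optional dependency group '{group}'"
    raise ImportError(message + ".")
```

A call such as `raise_missing_optional_dep("xarray", "datapipes-extras", source="physicsnemo.datapipes.healpix")` would then produce one standardized message in place of each dedicated `_raise_missing_*` function.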
We have, in my mind, two problems to solve:
- When dependencies are missing, we need to be informative about what is missing and how to get it.
- When dependencies are missing, we need to turn ImportError into RuntimeError so that no one ever crashes out with import physicsnemo when they don't even need [healpix | transformer_engine | dali | ...].
Thinking about your suggestion, maybe we can have some sort of fake dependency factory:
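class MissingImport:
    # Placeholder returned in place of a package that is not installed.
    def __init__(self, library_name):
        self.lib_name = library_name

    def __getattr__(self, key):
        # Only called for attributes not found normally, so lib_name still resolves.
        raise RuntimeError(f"Missing {self.lib_name}, please blah blah blah")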
then, we could go a step further with
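import importlib
import importlib.util

def maybe_import(package_name: str):
    # Return the imported package, or a MissingImport placeholder if it is not installed.
    if importlib.util.find_spec(package_name) is None:
        return MissingImport(package_name)
    return importlib.import_module(package_name)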
It might consolidate the boilerplate and let us automate error raising when it's in Python code. It won't help, though, when we do something else like use package.decorator or inherit from something. Those might not be realistic use cases.
Yeah I like that, sounds like it would get the two most important birds. Not sure, but for this:
> It won't help, though, when we do something else like use package.decorator or inherit from something
We might also be able to use your idea in some decorator form that would raise the error on the __init__ of an inherited class. Not sure if it would be doable to double-decorate something successfully though
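One way the decorator form could look is sketched below. The name `requires_package` is hypothetical, and the sketch only covers classes defined in our own code: a class that inherits directly from a type in the missing package would still need a placeholder base class.

```python
import functools
import importlib.util


def requires_package(package_name: str):
    """Class decorator: defer the missing-dependency error to instantiation."""
    available = importlib.util.find_spec(package_name) is not None

    def decorate(cls):
        if available:
            # Dependency present: leave the class untouched.
            return cls
        original_init = cls.__init__

        @functools.wraps(original_init)
        def guarded_init(self, *args, **kwargs):
            raise RuntimeError(
                f"{cls.__name__} requires {package_name}, which is not installed."
            )

        # Importing the module still works; only construction fails.
        cls.__init__ = guarded_init
        return cls

    return decorate


@requires_package("surely_not_a_real_package")
class NeedsMissingPackage:
    def __init__(self, value):
        self.value = value
```

Stacking two such decorators on one class should work, because each one replaces whatever `__init__` is current at the time it runs; with several missing packages, the error names the one given in the outermost decorator.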
I blew up this PR a bit, based on some of the discussions with @pzharrington. It looks bigger than it is. Major changes:
About 5000 lines of the +5k / -5k here are just replacing boilerplate with better syntax and un-indenting blocks of code that no longer need to be in an if/else.
physicsnemo/core/version_check.py
Outdated
    "dgl",
    group="gnns",
    docs_url="https://www.dgl.ai/pages/start.html",
),
dgl is deprecated so we can drop this
from pathlib import Path
from typing import Dict, Iterable, List, Tuple, Union

import h5py
While we are at it, I would argue that h5py could also be made an optional dependency.
pyproject.toml
Outdated
    "xarray>=2025.6.1",
    "einops>=0.8.1",
    "h5py>=3.15.1",
    "cftime>=1.6.5",
I would target cftime as another dependency that appears optional; it only appears to be used in examples.
Good call. Used to be used in some functionality but it's been removed since.
@@ -35,16 +35,15 @@
    "s3fs>=2023.5.0",
dependencies = [
    "onnx>=1.14.0",
    "warp-lang>=1.5.0",
    "pandas>=2.2.0",
What do we think about pandas? @pzharrington it appears that it might be used in healpix utilities which could be made optional-model dependent?
Hmm, it's also used by the insolation function and the drivaernet datapipe, so not totally clear which optional deps group it would belong in. Since it's not causing install problems (AFAIK) and not too bulky/slow to build, I wouldn't mind just leaving it in core.
dallasfoster left a comment
A few other suggestions for dependencies.
* Move filesystems and version_check to core
* Fix version check tests
* Reorganize distributed, domain_parallel, and begin nn / utils cleanup.
* Move modules and meta to core. Move registry to core. No tests fixed yet.
* Add missing init files
* Update build system and specify some deps.
* Reorganize tests.
* Update init files
* Clean up neighbor tools.
* Update testing
* Fix compat tests
* Move core model tests to tests/core/
* Add import lint config
* Relocate layers
* Move graphcast utils into model directory
* Relocating util functionalities.
* Further clean up and organize tests.
* utils tests are passing now
* Cleaning up distributed tests
* Patching tests working again in nn
* Fix sdf test
* Fix zenith angle tests
* Some organization of tests. Checkpoints is moved into utils.
* Remove launch.utils and launch.config. Checkpointing is moved to phsyicsnemo.utils, launch.config is just gone. It was empty.
* Most nn tests are passing
* Further cleanup. Getting there!
* Remove constants file
* Add import linting to pre-commit.
* Refactor (#1208)
  * Add FIGConvNet to crash example (#1207)
    * Add FIGConvNet to crash example.
    * Add FIGConvNet to crash example
    * Update model config
  * propose fix some typos (#1209)
    Signed-off-by: John E <jeis4wpi@outlook.com>
    Co-authored-by: Corey adams <6619961+coreyjadams@users.noreply.github.com>
  Signed-off-by: John E <jeis4wpi@outlook.com>
  Co-authored-by: Alexey Kamenev <alex.kamenev@gmail.com>
  Co-authored-by: John Eismeier <42679190+jeis4wpi@users.noreply.github.com>
* Unmigrate the insolation utils (#1211)
  * unmigrate the insolation utils
  * Revert test and compat map
* Update importlinter
* Move gnn layers and start to fix several model tests.
* AFNO is now passing.
* Rnn models passing.
* Fix improt
* Healpix tests are working
* Domino and unet working
* Refactor (#1216)
* Update activations path in dlwp tests (#1217)
  * Update activations path in dlwp tests
  * Update example paths
* Updating to address some test issues
* MGN tests passing again
* Most graphcast tests passing again
* Move nd conv layers.
* update fengwu and pangu
* Update sfno and pix2pix test
* update tests for figconvnet, swinrnn, superresnet
* updating more models to pass
* Update distributed tests, now passing.
* Domain parallel tests now passing.
* Fix active learning imports so tests pass in refactor
* Fix some metric imports
* Remove deploy package
* Remove unused test file
* unmigrate these files ... again?
* Update import linter.
* Refactor (#1224)
  * Update crash readme (#1212)
    * update license headers- second try
    * update readme
  * Bump multi-storage-client to v0.33.0 with rust client (#1156)
  * Add jaxtyping to requirements.txt for crash sample (#1218)
    * update license headers- second try
    * Update requirements.txt
  Co-authored-by: Mohammad Amin Nabian <m.a.nabiyan@gmail.com>
  Co-authored-by: Yongming Ding <yongmingd@nvidia.com>
* Cleaning up diffusion models. Not quite done yet.
* Restore deleted files
* Updating more tests.
* Further updates to tests. Datapipes almost working.
* Refactor (#1231)
  * Replace 'License' link with 'Dev blog' link (#1215)
    Co-authored-by: Corey adams <6619961+coreyjadams@users.noreply.github.com>
  * Validation fu added to examples/structural_mechanics/crash/train.py (#1204)
    * validation added: works for multi-node job.
    * rename and rearrange validation function
    * validate_every_n_epochs, save_ckpt_every_n_epochs added in config
    * corrected bug (args of model) in inference
    * args in validation code updated
    * val path added and args name changed
    * validation split added -> write_vtp=False
    * fixed inference bug
    * bug fix: write_vtp
  * Add saikrishnanc-nv to github actors (#1225)
  * Integrate Curator instructions to the Crash example (#1213)
    * Integrate Curator instructions
    * Update docs
    * Formatting changes
  * Adding code of conduct (#1214)
    * Adding code of conduct
      Adopting the code of conduct from the https://www.contributor-covenant.org/
    * Update CODE_OF_CONDUCT.MD
      Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
    * Create .markdownlintignore
    * Revise README for PhysicsNeMo resources and guidance
      Updated the 'Getting Started' section and added new resources for learning AI Physics.
    * Update README.md
    Co-authored-by: Mohammad Amin Nabian <m.a.nabiyan@gmail.com>
    Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
    Co-authored-by: Corey adams <6619961+coreyjadams@users.noreply.github.com>
  Co-authored-by: Mohammad Amin Nabian <m.a.nabiyan@gmail.com>
  Co-authored-by: Yongming Ding <yongmingd@nvidia.com>
  Co-authored-by: ram-cherukuri <104155145+ram-cherukuri@users.noreply.github.com>
  Co-authored-by: Deepak Akhare <dakhare@nvidia.com>
  Co-authored-by: Sai Krishnan Chandrasekar <157182662+saikrishnanc-nv@users.noreply.github.com>
  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* update import paths
* Starting to clean up dependency tree.
* Refactor (#1233)
  * Fixed minor bug in shape validation in SongUNet (#1230)
    Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
  * Add Zarr reader for Crash (#1228)
    * Add Zarr reader for Crash
    * Update README
    * Update validation logic of point data in Zarr reader
      Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
    * Update examples/structural_mechanics/crash/zarr_reader.py
      Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
    * Add a test for 2D feature arrays
    * Update examples/structural_mechanics/crash/zarr_reader.py
      Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
    Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
  Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
  Co-authored-by: Mohammad Amin Nabian <m.a.nabiyan@gmail.com>
  Co-authored-by: Yongming Ding <yongmingd@nvidia.com>
  Co-authored-by: ram-cherukuri <104155145+ram-cherukuri@users.noreply.github.com>
  Co-authored-by: Deepak Akhare <dakhare@nvidia.com>
  Co-authored-by: Sai Krishnan Chandrasekar <157182662+saikrishnanc-nv@users.noreply.github.com>
  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
  Co-authored-by: Charlelie Laurent <84199758+CharlelieLrt@users.noreply.github.com>
* Added coding standards for model implementations as a custom context for greptile (#1219)
  * Added initial set of coding standards for model implementations Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
  * Fixed typos + review comments + added details Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
  * Added more rules for models Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
  * Added model rules to PR checklist Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
  * Added cusror rules for models Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
  * Linked the wiki page to the PR template Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
  * Fixed typo in PR checklist Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
  Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
* Fixing and adjusting a broad suite of tests.
* Update test/domain_parallel/conftest.py
  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Minor fix
* Refactor (#1234)
  * Add AR RT and OT schemes to Crash FIGConvNet (#1232)
    * Add AR and OT schemes for FIGConvNet
    * Add tests
    * Soothe the linter
    * Fix the tests
  Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
  Co-authored-by: Mohammad Amin Nabian <m.a.nabiyan@gmail.com>
  Co-authored-by: Yongming Ding <yongmingd@nvidia.com>
  Co-authored-by: ram-cherukuri <104155145+ram-cherukuri@users.noreply.github.com>
  Co-authored-by: Deepak Akhare <dakhare@nvidia.com>
  Co-authored-by: Sai Krishnan Chandrasekar <157182662+saikrishnanc-nv@users.noreply.github.com>
  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
  Co-authored-by: Charlelie Laurent <84199758+CharlelieLrt@users.noreply.github.com>
  Co-authored-by: Alexey Kamenev <alex.kamenev@gmail.com>
* Not seeing any errors in testing ...
* Breakdown of rules into smaller rules (#1236)
  * Breakdown of rules into smaller rules Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
  * Fix mismatches in rule IDs referenced in rule text Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
  Signed-off-by: Charlelie Laurent <claurent@nvidia.com>
* Refactor (#1240)
  * Formatting active learning module docstrings (#1238)
    * docs: fixing Protocol class reference formatting Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com>
    * docs: removing mermaid diagram from protocols Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com>
    * docs: adding active learning index
    * docs: revising docstrings for sphinx formatting
    * docs: fix placeholder URL for active learning main docs
    Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com>
  Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com>
  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
  Co-authored-by: Kelvin Lee <kin.long.kelvin.lee@gmail.com>
* Refactor (#1247)
  * A new X-MeshGraphNet example for reservoir simulation. (#1186)
    * X-MGN for reservoir simulation Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * installation bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * well object docstring fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * more well object docstring fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * improve path_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * fix while space in config Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * fix version inconsistency in requirement.txt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * add versions for some libs in requirement.txt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * improve exception handling in mldlow_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * improve mldlow_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * improve datetiem in mlflow_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * improve exception handling in inference Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * improve inference Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * improve ecl_reader Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * formatting Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * improve preprocessor Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * improve preprocessor loop Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * grad accum bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * total loss bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * added some safe guard for connection indexing Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com>
    * bug fix Signed-off-by:
Tsubasa Onishi <tonishi@nvidia.com> * cleanup Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update ecl_reader Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup * cleanup * update configs * Update README.md style guide rule changes * Update README.md * fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve docstring fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update license yr Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup well Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup preproc fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup preproc fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cimprove infrence fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve datetime Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve train.py fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme fmt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve requirement Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * ilcense header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve ecl reader logging Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * cleanup Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * license header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve graph builder (parallel) + added results to readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * delete some unsed files Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * address PR comments Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve inference grdecl header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve readme Signed-off-by: Tsubasa Onishi 
<tonishi@nvidia.com> * improve readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * support time series Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update config Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * minor update Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve graph builder Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update ecl_reader logging Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * replace pickle with json Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * add license headers Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * remove unused png files Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * remove unsed import Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * remove emojis Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * replace print with logger Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update docstring Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * minor updates Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> --------- Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> Co-authored-by: megnvidia <mmiranda@nvidia.com> * Add knn to autodoc table. (#1244) --------- Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: tonishi-nv <tonishi@nvidia.com> Co-authored-by: megnvidia <mmiranda@nvidia.com> * Enable import linting on internal imports. * Remove ensure_available function, it's confusing * Add logging imports to utils, and fix imports in examples. 
* Update imports in minimal examples * Update structural mechanics examples * Update import paths: reservoir_sim * Update import paths: additive manufacturing * Update import paths: topodiff * Update import paths: weather part 1 * Update import paths: weather part 2 * Update import paths: molecular dynamics * Update import paths: geophysics * Update import paths: cfd + external_aero 1 * Update import paths: cfd + external_aero 2 * Remove more DGL examples * Remove more DGL examples * cfd examples 3 * Last batch of example import fixes! * Enforce and protect external deps in utils. * Remove DGL. :party: * Don't force models yet * Refactor (#1249) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. * Update testing * Fix compat tests * Move core model tests to tests/core/ * Add import lint config * Relocate layers * Move graphcast utils into model directory * Relocating util functionalities. * Further clean up and organize tests. * utils tests are passing now * Cleaning up distributed tests * Patching tests working again in nn * Fix sdf test * Fix zenith angle tests * Some organization of tests. Checkpoints is moved into utils. * Remove launch.utils and launch.config. Checkpointing is moved to phsyicsnemo.utils, launch.config is just gone. It was empty. * Most nn tests are passing * Further cleanup. Getting there! * Remove constants file * Add import linting to pre-commit. * Move gnn layers and start to fix several model tests. * AFNO is now passing. * Rnn models passing. * Fix improt * Healpix tests are working * Domino and unet working * Updating to address some test issues * MGN tests passing again * Most graphcast tests passing again * Move nd conv layers. 
* update fengwu and pangu * Update sfno and pix2pix test * update tests for figconvnet, swinrnn, superresnet * updating more models to pass * Update distributed tests, now passing. * Domain parallel tests now passing. * Fix active learning imports so tests pass in refactor * Fix some metric imports * Remove deploy package * Remove unused test file * unmigrate these files ... again? * Update import linter. * Cleaning up diffusion models. Not quite done yet. * Restore deleted files * Updating more tests. * Further updates to tests. Datapipes almost working. * update import paths * Starting to clean up dependency tree. * Fixing and adjusting a broad suite of tests. * Update test/domain_parallel/conftest.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * Minor fix * Not seeing any errors in testing ... * A new X-MeshGraphNet example for reservoir simulation. (#1186) * X-MGN for reservoir simulation Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * installation bug fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * well object docstring fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * more well object docstring fix Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve path_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fix while space in config Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * fix version inconsistency in requirement.txt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * add versions for some libs in requirement.txt Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve exception handling in mldlow_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve mldlow_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve datetiem in mlflow_utils Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve exception handling in inference Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * improve inference Signed-off-by: Tsubasa Onishi 
<tonishi@nvidia.com> * update readme Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> * update header Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> --------- Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> Co-authored-by: megnvidia <mmiranda@nvidia.com> * Add knn to autodoc table. (#1244) * Enable import linting on internal imports. * Remove ensure_available function, it's confusing * Add logging imports to utils, and fix imports in examples. * Update imports in minimal examples * Update structural mechanics examples * Update import paths: reservoir_sim * Update import paths: additive manufacturing * Update import paths: topodiff * Update import paths: weather part 1 * Update import paths: weather part 2 * Update import paths: molecular dynamics * Update import paths: geophysics * Update import paths: cfd + external_aero 1 * Update import paths: cfd + external_aero 2 * Remove more DGL examples * Remove more DGL examples * cfd examples 3 * Last batch of example import fixes! * Enforce and protect external deps in utils. * Remove DGL. 
:party: * Don't force models yet --------- Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: tonishi-nv <tonishi@nvidia.com> Co-authored-by: megnvidia <mmiranda@nvidia.com> * Automated model registry (#1252) * Deleted RegistreableModule Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Removed 'PhysicsNeMo' suffix in Module.from_torch method Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Implemented automatic registration for Module subclasses Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Fixed unused name Signed-off-by: Charlelie Laurent <claurent@nvidia.com> --------- Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Metadata name deprecation (#1257) * Initiated deprecation of field 'name' in ModelMetaData Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Removed all occurences of 'name' field in ModelMetaData Signed-off-by: Charlelie Laurent <claurent@nvidia.com> --------- Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Refactor (#1258) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. * Update testing * Fix compat tests * Move core model tests to tests/core/ * Add import lint config * Relocate layers * Move graphcast utils into model directory * Relocating util functionalities. * Further clean up and organize tests. * utils tests are passing now * Cleaning up distributed tests * Patching tests working again in nn * Fix sdf test * Fix zenith angle tests * Some organization of tests. Checkpoints is moved into utils. * Remove launch.utils and launch.config. 
* Update imports in minimal examples * Update structural mechanics examples * Update import paths: reservoir_sim * Update import paths: additive manufacturing * Update import paths: topodiff * Update import paths: weather part 1 * Update import paths: weather part 2 * Update import paths: molecular dynamics * Update import paths: geophysics * Update import paths: cfd + external_aero 1 * Update import paths: cfd + external_aero 2 * Remove more DGL examples * Remove more DGL examples * cfd examples 3 * Last batch of example import fixes! * Enforce and protect external deps in utils. * Remove DGL. :party: * Don't force models yet * Update version (#1193) * Fix depenedncies to enable hello world (#1195) * Remove zero-len arrays from test dataset (#1198) * Merge updates to Gray Scott example (#1239) * Remove pyevtk * update dependency * update dimensions * ci issues * Interpolation model example (#1149) * Temporal interpolation training recipe * Add README * Docs changes based on comments * Update docstrings and README * Add temporal interpolation animation * Add animation link * Add shape check in loss * Updates of configs + trainer * Update config comments * Update README.md style guide edits * Added wandb logging Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Reformated sections in docstring for GeometricL2Loss Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Update README and configs * README changes + type hint fixes * Update README.md * Draft of validation script * Update validation and README * Fixed command in README.md for temporal_interpolation example Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Removed unused import in datapipe/climate_interp.py Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Updated license headers in temporal_interpolation example Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Renamed methods to avoid implicit shadowing in Trainer class Signed-off-by: Charlelie Laurent <claurent@nvidia.com> 
* Cosmetic changes in train.py and removed unused import in validate.py Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Added clamp in validate.py to make sure step does not go out of bounds Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Added the temporal_interpolation example to the docs + updated CHANGELOG.md Signed-off-by: Charlelie Laurent <claurent@nvidia.com> * Addressing remaining comments * Merged two data source classes in climate_interp.py Signed-off-by: Charlelie Laurent <claurent@nvidia.com> --------- Signed-off-by: Charlelie Laurent <claurent@nvidia.com> Co-authored-by: Charlelie Laurent <claurent@nvidia.com> Co-authored-by: megnvidia <mmiranda@nvidia.com> Co-authored-by: Charlelie Laurent <84199758+CharlelieLrt@users.noreply.github.com> * update versions --------- Signed-off-by: Tsubasa Onishi <tonishi@nvidia.com> Signed-off-by: Charlelie Laurent <claurent@nvidia.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: tonishi-nv <tonishi@nvidia.com> Co-authored-by: megnvidia <mmiranda@nvidia.com> Co-authored-by: Kaustubh Tangsali <71059996+ktangsali@users.noreply.github.com> Co-authored-by: Jussi Leinonen <jleinonen@nvidia.com> Co-authored-by: Charlelie Laurent <claurent@nvidia.com> Co-authored-by: Charlelie Laurent <84199758+CharlelieLrt@users.noreply.github.com> Co-authored-by: Kaustubh Tangsali <ktangsali@nvidia.com> * Remove IPDB * Few more dep fixes. * Refactor (#1261) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. 
* Update imports in minimal examples * Update structural mechanics examples * Update import paths: reservoir_sim * Update import paths: additive manufacturing * Update import paths: topodiff * Update import paths: weather part 1 * Update import paths: weather part 2 * Update import paths: molecular dynamics * Update import paths: geophysics * Update import paths: cfd + external_aero 1 * Update import paths: cfd + external_aero 2 * Remove more DGL examples * Remove more DGL examples * cfd examples 3 * Last batch of example import fixes! * Enforce and protect external deps in utils. * Remove DGL. :party: * Don't force models yet * Remove IPDB * Few more dep fixes. * Enhance checkpoint configuration for DLWP Healpix and GraphCast (#1253) * feat(weather): Improve configuration for DLWP Healpix and GraphCast examples - Added configurable checkpoint directory to DLWP Healpix config and training script. - Implemented Trainer logic to use specific checkpoint directory. - Updated utils.py to respect exact checkpoint path. - Made Weights & Biases entity and project configurable in GraphCast example. * fix(dlwp_healpix): remove deprecated configs - Removed the deprecated `verbose` parameter from the `CosineAnnealingLR` configuration in DLWP HEALPix, which was causing a TypeError. - Removed unused configs from examples/weather/dlwp_healpix/ * Transolver volume (#1242) * Implement transolver ++ physics attention * Enable ++ in Transolver. * Fix temperature correction terms. * Starting work adapting the domino datapipe techniques to transolver. * Working towards transolver volume training by mergeing with domino dataset. Surface dataloading is prototyped, not finished yet. * Updating * Remove printout * Enable transolver for volumetric data * Update transolver training script to support either surface or volume data. Applied some cleanup to make the datapipe similar to domino, which is a step towards unification. 
* Updating datapipe * Tweak transolver volume configs * Add transolverX model * Enable nearly-uniform sampling of very very large arrays * limit benchmarking to train epoch, enable profiler in config * Update volume config slightly * Update training scripts to properly enable data preloading * Working towards adding a muon optimzier in transolver * Add peter's implementation of muon with a combined optimizer. switch to a flat LR. * Add updated inference script that can also calculate drag and lift * Add better docstrings for typhon * Move typhon to experimental * Move forwards docstring * Adding typhon model and configs. * Update readme. * Update * Remove extra model. Update recipes. * Update cae_dataset.py Implement abstract methods in base classes. * Update Physics_Attention.py Ensure plus parameter is passed to base class. * Update test_mesh_datapipe.py Update import path for mesh datapipe. * Fix ruff issues --------- Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: Dileep Ranganathan <8152399+dran-dev@users.noreply.github.com> * Add external import coding standards. * Update external import standards. * Ensure vtk functions are protected. * Protect pyvista import * Closing more import gaps * Remove DGL from meshgraphkan * All models now comply with external import linting. * Remove DGL datapipes * cae datapipes in compliance * Update pyproject.toml * Add version numbers to deps * Refactor (#1261) * Move filesystems and version_check to core * Fix version check tests * Reorganize distributed, domain_parallel, and begin nn / utils cleanup. * Move modules and meta to core. Move registry to core. No tests fixed yet. * Add missing init files * Update build system and specify some deps. * Reorganize tests. * Update init files * Clean up neighbor tools. 
* Update for model standards * Migrate loss to metrics * format * Fix CI test
) * chore: initial structure for so2 and so3 layers Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * feat: added warp functions for wigner D-matrices up to l=5 * test: added placeholder unit tests for wigner functions * feat: adding utility function for masking l,m * docs: updating changelog with SO2Convolution mention * feat: adding SO2Convolution definition * feat: defined namespace for symmetry ops * refactor: adding optional edge modulation * chore: removing unused modules Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * chore: adding init in experimental nn space Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * test: adding unit tests for SO2 convolution Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * refactor: making SO2 convolution outputs more holistic Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * feat: adding gate activation layer Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * docs & refactor: adding note on reference implementation and adding shape validation Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * test: finalizing unit test suite Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * chore: removing unused kernels file Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * test: clean up activation unit tests Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * docs: adding meta MIT license as third party Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * feat: adding option to specify activation function Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * docs: updating docstrings to use general nonlinearity instead of hard coded SiLU Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * refactor: making classes inherit from physicsnemo Module Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * docs: moving forward and outputs docs to class docstring Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com> * test: increasing tolerances for single precision tests * test: increasing tolerances again 
--------- Signed-off-by: Kelvin Lee <kinlongkelvi@nvidia.com>
* update license headers- second try * update end year in license headers * update copyright.txt * Update CONTRIBUTING.md * Update run_benchmarks.sh * resolve conflicts
* Add concatenation wrapper for legacy diffusion models * docstring fix * safer torch import * Address feedback, polish docstrings * Add warning for tensor arg * lint * Update dit defaults for doctest * Actually update defaults * license headers
* Update transolver to comply with model standards * Updating transolver for more compliance issues. * Finish most transolver updates. * Use ... for abstract method * Updates for docstrings, typehints consistency. * Address checkpoint restore issues from transolver. Update geotransolver for latest changes * Update license headers * Fix one more license check * fix mlp tests
…A#1290) * Much more aggressive testing against entrypoints and registry. * Fixing docstring test: was missing an import, but also failing with a warp deprecation warning. Updated to wp.Device and made warp start up quietly. * Undo removal of context since the CI container is too old. * Fix the stupid EntryPoint issue in docstring tests I hope. * Ensure the license check actually works properly with precommit * Fix header in test file.
|
Something happened in this merge to blow this up to 1400+ files. I'm going to rebase this work onto a fresh PR. |
PhysicsNeMo Pull Request
This PR has grown a little bit bigger than anticipated, so let me summarize:
For pyproject.toml, a new optional dependency group is created for "performance"-centric items. Since, in reality, the best performance comes from NVIDIA's libraries, this section consists entirely of NVIDIA packages with some sort of CUDA binding.
I updated uv.lock as well.
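As a rough illustration of how a guarded optional import could point users at an optional dependency group such as "performance", here is a minimal sketch. The helper names `optional_dep_available` and `require_optional`, and the `package` argument, are hypothetical and not taken from this PR; the PR's own checks may look different.

```python
import importlib.util


def optional_dep_available(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it.

    Note: only top-level names are handled here; for dotted names,
    find_spec imports the parent package and may raise if it is missing.
    """
    return importlib.util.find_spec(module_name) is not None


def require_optional(module_name: str, extra: str, package: str) -> None:
    """Raise an ImportError that names the optional dependency group.

    `package` is the distribution name and `extra` the group declared in
    pyproject.toml; both are supplied by the caller (assumptions here).
    """
    if not optional_dep_available(module_name):
        raise ImportError(
            f"'{module_name}' is required for this feature but is not installed. "
            f"Install it with: pip install '{package}[{extra}]'"
        )
```

A caller would run `require_optional(...)` at the top of any function that needs the optional package, so the failure happens at use time rather than at import time of the whole library.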
Description
Checklist
Dependencies
Review Process
All PRs are reviewed by the PhysicsNeMo team before merging.
Depending on which files are changed, GitHub may automatically assign a maintainer for review.
We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI’s assessment of merge readiness and is not a qualitative judgment of your work, nor is
it an indication that the PR will be accepted / rejected.
AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.