Training data ontology #76
…ogy-for-model-training-data
added MAD dataset description.
@PythonFZ, writing the following points so that we don't forget
Pull Request Overview
This PR introduces a training data ontology for MLIPX by adding new test cases, enhancing spec comparison capabilities, and providing VS Code schema integration. Key changes include:
- New tests for MLIP spec comparisons and relaxation comparisons.
- Expanded MLIP specification support with new YAML files and improved spec resolving.
- A new CLI command to install VS Code schemas and an update to dependency versions.
Reviewed Changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/test_spec.py | Added tests to validate MLIP spec comparisons and dataset resolution. |
| tests/test_relax_compare.py | Introduced tests for comparing relaxation nodes and warning generation. |
| tests/conftest.py | Added a temporary project fixture for DVC integration tests. |
| pyproject.toml | Updated version and added a pydantic dependency. |
| mlipx/spec/spec.py | Expanded MLIP spec definitions and dataset loader functionality. |
| mlipx/spec/compare.py | Implemented recursive spec comparison with metadata stripping. |
| mlipx/nodes/structure_optimization.py | Integrated spec comparison into node comparison logic. |
| mlipx/nodes/generic_ase.py | Added a spec field and get_spec method to support YAML-based MLIP specs. |
| mlipx/cli/main.py | Introduced an install_vscode_schema command for VS Code integration. |
| mlipx/abc.py | Added docstrings for the new get_spec protocol method. |
| .vscode/settings.json | Configured YAML schema paths for MLIP specs. |
| .github/workflows/pytest.yaml | Added a CI workflow for testing across multiple Python versions. |
| docs/source/contributing.rst | Updated documentation to include tips for training data metadata. |
| mlipx/__init__.pyi | Exported the new spec module in the public API. |
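For readers who want a feel for what "recursive spec comparison with metadata stripping" could look like, here is a minimal sketch. It is not the code in mlipx/spec/compare.py: the function names, the `METADATA_KEYS` set, and the spec field names and values are all assumptions made up for illustration.

```python
from typing import Any

# Hypothetical set of keys treated as purely descriptive metadata.
# The keys actually stripped by mlipx may differ.
METADATA_KEYS = frozenset({"description", "url", "doi"})


def strip_metadata(value: Any) -> Any:
    """Return a copy of ``value`` with metadata keys removed at every depth."""
    if isinstance(value, dict):
        return {
            key: strip_metadata(sub)
            for key, sub in value.items()
            if key not in METADATA_KEYS
        }
    if isinstance(value, list):
        return [strip_metadata(item) for item in value]
    return value


def _diff(a: Any, b: Any, path: str) -> list[str]:
    """Collect dotted paths at which ``a`` and ``b`` differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        diffs: list[str] = []
        for key in sorted(a.keys() | b.keys()):
            sub_path = f"{path}.{key}" if path else str(key)
            if key not in a or key not in b:
                # Key present on one side only.
                diffs.append(sub_path)
            else:
                diffs.extend(_diff(a[key], b[key], sub_path))
        return diffs
    return [] if a == b else [path or "<root>"]


def compare_specs(a: dict, b: dict) -> list[str]:
    """Compare two specs, ignoring metadata; an empty list means they match."""
    return _diff(strip_metadata(a), strip_metadata(b), "")


# Made-up specs: they differ in one physical setting and in a description.
spec_a = {"data": {"functional": "PBE", "description": "first dataset"}}
spec_b = {"data": {"functional": "PBE0", "description": "second dataset"}}
print(compare_specs(spec_a, spec_b))  # ['data.functional']
```

Returning the differing paths rather than a plain boolean makes it easier to turn a mismatch into a readable warning, which seems to be what the relaxation comparison tests check for.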
Comments suppressed due to low confidence (1)
mlipx/cli/main.py:202
- The install_vscode_schema command uses json.dumps but does not import the json module. Adding 'import json' at the top of the file should resolve this issue.
mlips_schema_path.write_text(json.dumps(MLIPS.model_json_schema(), indent=2))
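As a rough illustration of the fix, the sketch below writes a JSON schema to disk and registers it under the `yaml.schemas` key of `.vscode/settings.json`, with `import json` at the top. It is not the actual command in mlipx/cli/main.py: the function signature, the file names, and the glob pattern are assumptions, and a plain `dict` stands in for the output of `MLIPS.model_json_schema()` so the example needs only the standard library.

```python
import json
from pathlib import Path


def install_vscode_schema(project_dir: Path, schema: dict, name: str = "mlips") -> Path:
    """Write ``schema`` into .vscode/ and point the YAML extension at it.

    Hypothetical helper: file names and the glob pattern are illustrative only.
    """
    vscode_dir = project_dir / ".vscode"
    vscode_dir.mkdir(parents=True, exist_ok=True)

    # This is the call that fails with NameError when `json` is not imported.
    schema_path = vscode_dir / f"{name}.schema.json"
    schema_path.write_text(json.dumps(schema, indent=2))

    # Merge into existing settings instead of overwriting them.
    settings_path = vscode_dir / "settings.json"
    settings = json.loads(settings_path.read_text()) if settings_path.exists() else {}
    schemas = settings.setdefault("yaml.schemas", {})
    schemas[f"./.vscode/{schema_path.name}"] = [f"**/{name}.yaml"]
    settings_path.write_text(json.dumps(settings, indent=2))

    return schema_path
```

One caveat with this approach: VS Code accepts comments in `settings.json`, while `json.loads` does not, so a real command would need to handle a settings file that is not strict JSON.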
@sandipde The PR at this point does not contain different levels of compatibility. This is something I would like to spend a little more time on before adding it. Besides this, I think the main structure and the compare functionality are there. I'd like to implement the compatibility levels in separate PRs later on and get this version into main asap. Please let me know if you think something important is missing that should be added to this version now.
@PythonFZ fine by me.
- Instead of developing a new schema, look into https://github.com/MolSSI/QCSchema
- Consider https://github.com/apax-hub/apax/blob/202796914de20f90c78d98912f25ebec8953e220/apax/cli/apax_app.py#L98 over the current code, which loads from remote.