Add notebooks, tests, and environment setup with improved project metadata for refactor initial release #591
Changes from 10 commits
61723a5
4948e41
46f5d4f
293918c
2c29608
aac6a4d
865336a
f31179e
13a1eaa
f51a8ee
e2ea30e
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,22 +1,26 @@ | ||
| name: tests | ||
|
|
||
| on: [push, pull_request] | ||
|
|
||
| jobs: | ||
| run-tests: | ||
| runs-on: ubuntu-22.04 | ||
| defaults: | ||
| run: | ||
| shell: bash -el {0} | ||
|
|
||
| steps: | ||
| - name: checkout repository | ||
| uses: actions/checkout@v4.1.2 | ||
| - uses: actions/checkout@v4 | ||
|
|
||
| - name: create environment | ||
| uses: conda-incubator/setup-miniconda@v3 | ||
| - uses: actions/setup-python@v5 | ||
| with: | ||
| mamba-version: "*" | ||
| activate-environment: geo_deep_env | ||
| environment-file: environment.yml | ||
| python-version: "3.10" | ||
| cache: "pip" | ||
|
|
||
| - name: test with pytest | ||
| - name: Install dependencies | ||
| run: | | ||
| pytest tests/ | ||
| python -m pip install --upgrade pip | ||
| pip install -r requirements.txt | ||
|
|
||
| - name: Install pytest | ||
| run: pip install pytest | ||
|
|
||
| - name: Run tests | ||
| run: pytest tests/ | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,4 +1,11 @@ | ||
| *__pycache__** | ||
| *.idea** | ||
| *.vscode** | ||
| *geo_deep_learning/notebooks* | ||
|
|
||
| # Specific folders name | ||
| waterloo_subset_512/ | ||
| mlruns/ | ||
| .ipynb_checkpoints/ | ||
|
|
||
| # Specific files | ||
| environment_full_conda_bckp.yml |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -15,3 +15,8 @@ repos: | |
| - id: check-yaml | ||
| - id: check-json | ||
| - id: check-added-large-files | ||
|
|
||
| exclude: | | ||
| (?x)( | ||
| ^notebooks/.*\.ipynb$ | ||
| ) | ||
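To check which paths the `exclude` pattern above would skip, it can be tried with Python's `re` module, which is the regex flavour pre-commit uses for `files` and `exclude`. This is a minimal sketch; the helper name `is_excluded` and the sample paths are hypothetical and not taken from the repository.

```python
import re

# Same verbose-mode pattern as the `exclude` entry in the pre-commit config.
EXCLUDE = re.compile(
    r"""(?x)(
        ^notebooks/.*\.ipynb$
    )"""
)


def is_excluded(path: str) -> bool:
    """Return True if the repo-relative path matches the exclude pattern."""
    return EXCLUDE.search(path) is not None


# Hypothetical paths, for illustration only.
for path in [
    "notebooks/00_quickstart.ipynb",
    "notebooks/README.md",
    "docs/notebooks/demo.ipynb",
]:
    print(path, is_excluded(path))
```

Because the pattern is anchored with `^`, only notebooks under a top-level `notebooks/` folder are skipped; a notebook nested elsewhere would still be checked.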
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| """ | ||
| Geo Deep Learning (GDL). | ||
|
|
||
| A geospatial deep learning framework for segmentation and related tasks. | ||
| Provides utilities for dataset preparation, model training, evaluation, | ||
| and deployment with PyTorch Lightning. | ||
|
|
||
| Modules include: | ||
| - datasets: data loading and preprocessing for geospatial sources | ||
| - models: deep learning architectures for segmentation | ||
| - datamodules: PyTorch Lightning DataModules for training pipelines | ||
| - tasks_with_models: high-level training tasks coupled with models | ||
| - tools: utilities for data handling and workflow management | ||
| """ |
|
Collaborator
It's useful to show the working results for each cell. For example, the training example was terminated. I ran the notebook cells and they all work, so it's best to show the working result. Also, set `accelerator="gpu"` if you can; it's much faster, and I suspect you terminated early because of the slow CPU epoch runs. Lastly, add `ckpt_path` so the best model is used when evaluating; it gives much better results, bringing hope! `trainer.test(model, datamodule=dm, ckpt_path=trainer.checkpoint_callback.best_model_path)`
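The checkpoint suggestion can be wrapped in a small helper so the notebook falls back gracefully when no checkpoint was saved. This is only a sketch: the function name `evaluate_best_checkpoint` is hypothetical, and `trainer`, `model`, and `dm` are assumed to be the PyTorch Lightning objects already created in the notebook.

```python
def evaluate_best_checkpoint(trainer, model, datamodule):
    """Run trainer.test() on the best saved checkpoint, if there is one.

    `trainer` is assumed to be a PyTorch Lightning Trainer; any object with
    the same `test` method and `checkpoint_callback` attribute also works.
    """
    callback = getattr(trainer, "checkpoint_callback", None)
    # best_model_path is an empty string until a checkpoint has been saved.
    best_path = getattr(callback, "best_model_path", "") or None
    # With ckpt_path=None, Lightning tests the weights currently held by `model`.
    return trainer.test(model, datamodule=datamodule, ckpt_path=best_path)


# Hypothetical notebook usage (requires pytorch-lightning, so left commented):
# trainer = Trainer(accelerator="auto", max_epochs=5)  # "auto" picks a GPU if present
# trainer.fit(model, datamodule=dm)
# results = evaluate_best_checkpoint(trainer, model, dm)
```

Lightning also accepts `ckpt_path="best"` as a shorthand when a checkpoint callback is configured.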
|
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| # Notebooks | ||
|
|
||
| This folder contains example notebooks to help you get started with **Geo Deep Learning (GDL)**. | ||
|
|
||
| ## Available Notebooks | ||
|
|
||
| - **`00_quickstart.ipynb`** | ||
| Minimal end-to-end demo: | ||
| 1. Prepare a small sample dataset | ||
| 2. Train a UNet++ model on CPU | ||
| 3. Run inference & visualize predictions | ||
|
|
||
| This version calls **GDL’s core classes directly** (no config files). | ||
| It is meant as the simplest entry point to verify everything works. | ||
|
|
||
| - **`01_quickstart_config.ipynb`** *(coming soon)* | ||
| Same workflow as above, but using **LightningCLI** and GDL’s config files. | ||
| This is the recommended way for reproducible experiments. | ||
|
|
||
| ## Requirements | ||
| - GDL repository cloned locally | ||
| - Environment with proper dependencies (see `requirements.txt` or `pyproject.toml`) | ||
|
|
||
| ## Troubleshooting | ||
| `ModuleNotFoundError: No module named 'geo_deep_learning'` (or other module) | ||
|
|
||
| In general, this problem occurs when the paths are not properly defined. Make sure | ||
| to add the repo to your PYTHONPATH. | ||
|
|
||
| Example inside a notebook: | ||
|
|
||
| ```python | ||
| import sys | ||
| sys.path.append("..") | ||
| ``` |
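A relative `".."` depends on the notebook's working directory, so a slightly more robust variant is to walk up until the repository root is found. The sketch below is illustrative only: the helper `add_repo_to_path` is hypothetical, and it assumes the repository root is the folder that contains `pyproject.toml`.

```python
import sys
from pathlib import Path


def add_repo_to_path(start: Path, marker: str = "pyproject.toml") -> Path:
    """Put the first parent directory containing `marker` on sys.path."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            # Avoid duplicate entries when the cell is re-run.
            if str(candidate) not in sys.path:
                sys.path.insert(0, str(candidate))
            return candidate
    raise FileNotFoundError(f"No {marker} found at or above {start}")


# In a notebook, the working directory is usually the notebook's own folder:
# repo_root = add_repo_to_path(Path.cwd())
```

Installing the package in editable mode with `pip install -e .` avoids path manipulation altogether, if that fits the workflow.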
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| """Notebooks package for demo and examples.""" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,3 +1,47 @@ | ||
| [build-system] | ||
| requires = ["setuptools>=61"] | ||
| build-backend = "setuptools.build_meta" | ||
|
|
||
| [project] | ||
| name = "geo-deep-learning" | ||
| version = "0.1.0a0" | ||
| description = "Geospatial deep learning framework for segmentation tasks" | ||
| readme = "README.md" | ||
| authors = [ | ||
| { name = "Victor Alhassan", email = "victor.alhassan@NRCan-RNCan.gc.ca" }, | ||
| { name = "Luca Romanini", email = "luca.romanini@NRCan-RNCan.gc.ca" }, | ||
|
Collaborator
We need to list all the developers from our team who worked on this, unless it's a fully new product.
Collaborator
Author
I was not sure about that part because I treated this section as 'who to contact'. As Victor pretty much did a full overhaul of the code, I thought he could be the sole author until we contribute bigger PRs, but I also did not want him to carry the burden of being the only person to contact. There is also a maintainers tag we could use; let me know what feels best and fairest for all. |
||
| ] | ||
| requires-python = ">=3.10" | ||
| license = { file = "LICENSE" } | ||
| classifiers = [ | ||
| "Development Status :: 3 - Alpha", | ||
| "Intended Audience :: Science/Research", | ||
| "License :: OSI Approved :: MIT License", | ||
| "Programming Language :: Python :: 3.10", | ||
| "Topic :: Scientific/Engineering :: Artificial Intelligence", | ||
| "Topic :: Scientific/Engineering :: GIS", | ||
| ] | ||
| keywords = ["pytorch", "deep learning", "machine learning", "remote sensing", "satellite imagery", "earth observation", "geospatial"] | ||
|
|
||
| # Dependencies are pulled from requirements.txt | ||
| dynamic = ["dependencies"] | ||
|
|
||
| [tool.setuptools.dynamic] | ||
| dependencies = { file = ["requirements.txt"] } | ||
|
|
||
| [project.optional-dependencies] | ||
| dev = ["pytest", "ruff", "pre-commit"] | ||
|
|
||
| [project.urls] | ||
| Homepage = "https://github.com/NRCan/geo-deep-learning" | ||
| Repository = "https://github.com/NRCan/geo-deep-learning" | ||
| Issues = "https://github.com/NRCan/geo-deep-learning/issues" | ||
|
|
||
|
|
||
| # -------------------------- | ||
| # Ruff configuration | ||
| # -------------------------- | ||
|
|
||
| [tool.ruff] | ||
| exclude = [ | ||
| ".bzr", ".direnv", ".eggs", ".git", ".git-rewrite", ".hg", | ||
|
|
@@ -6,6 +50,7 @@ exclude = [ | |
| ".vscode", "__pypackages__", "_build", "buck-out", "build", "dist", | ||
| "node_modules", "site-packages", "venv" | ||
| ] | ||
| src = ["geo_deep_learning"] | ||
| line-length = 88 | ||
| indent-width = 4 | ||
| target-version = "py310" | ||
|
|
@@ -18,11 +63,9 @@ ignore = [ | |
| "ANN101", "ANN102", # allow skipping `self`, `cls` annotations | ||
| "EXE002", # ignore missing executable bit on scripts with shebangs | ||
| "ERA001", # ignore commented out code | ||
| "TC002", # allow third-party imports in type annotations without TYPE_CHECKING | ||
| ] | ||
|
|
||
| # You can limit to specific rule groups instead of ALL: | ||
| # select = ["E", "F", "W", "I", "N", "UP", "B", "C4", "SIM", "D", "PT"] | ||
|
|
||
| [tool.ruff.lint] | ||
| fixable = ["ALL"] | ||
| unfixable = [] | ||
|
|
@@ -34,3 +77,17 @@ quote-style = "double" | |
| indent-style = "space" | ||
| skip-magic-trailing-comma = false | ||
| line-ending = "auto" | ||
|
|
||
| [tool.ruff.lint.isort] | ||
| # Treat both the package and legacy alias names as first-party | ||
| known-first-party = [ | ||
| "geo_deep_learning", | ||
| "tools", | ||
| "models", | ||
| "datasets", | ||
| "datamodules", | ||
| "tasks_with_models", | ||
| ] | ||
|
|
||
| [tool.ruff.lint.per-file-ignores] | ||
| "tests/*" = ["S101"] | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| ipykernel==6.29.5 | ||
| ipywidgets==8.1.5 | ||
| kornia==0.8.1 | ||
| matplotlib==3.10.5 | ||
| mlflow==2.22.0 | ||
| notebook==7.4.5 | ||
| numpy==1.26.4 | ||
| pandas==2.3.2 | ||
| psutil==7.0.0 | ||
| pytorch-lightning==2.5.0.post0 | ||
| pyyaml==6.0.2 | ||
| rasterio==1.4.3 | ||
| segmentation-models-pytorch==0.5.0 | ||
| timm==1.0.19 | ||
| torchgeo==0.5.2 | ||
| torchmetrics==1.6.0 | ||
| torchvision==0.19.1 | ||
| whitebox==2.3.6 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| """Unit test package for geo-deep-learning.""" |


Since we already use Miniconda as the environment for running our GDL scripts, should we also run the tests inside the same Miniconda environment? This would help ensure consistency between how the scripts are run and how they’re tested.

Yeah, that was my first intention, but it kept failing for some reason. I will experiment on another branch to see if I can fix it. Otherwise, I think it can be tackled later in a dedicated issue.

Coming back to this, as I forgot the real reasoning behind it when I first answered.
The repo is built using a pyproject.toml file that complies with PyPI requirements for building a package. With this, you can either specify the dependencies directly in the toml file or point to an external file (here requirements.txt). In both cases, the toml file relies on pip. There's a compatibility issue between pip and conda in how versions are 'locked' in the requirements, so a common txt/yaml file is not straightforward.
Another option would be to also include an environment.yaml for conda installs, but that feels like duplication to me, with a risk of not updating both files each time.
Anyway, I guess it will be a good topic to discuss at the next GDL bi-weekly.
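
If both files were kept, the risk of them drifting apart could be reduced with a small check run in CI. The sketch below is only an illustration under stated assumptions: the helpers `parse_pins` and `find_drift` are hypothetical, the `environment.yml` content is invented for the example, and the parser only understands simple `name==version` (pip) and `name=version` (conda) pins. It also ignores the fact that package names can differ between PyPI and conda channels.

```python
import re

# Matches "name==1.2.3" (pip) and "- name=1.2.3" (conda); other lines are skipped.
PIN = re.compile(r"^\s*(?:-\s*)?([A-Za-z0-9_.\-]+)\s*={1,2}\s*([^\s#=]+)")


def parse_pins(text: str) -> dict[str, str]:
    """Collect pinned versions keyed by a normalised package name."""
    pins = {}
    for line in text.splitlines():
        match = PIN.match(line)
        if match:
            name, version = match.groups()
            pins[name.lower().replace("_", "-")] = version
    return pins


def find_drift(requirements_txt: str, environment_yml: str) -> dict[str, tuple[str, str]]:
    """Return packages pinned in both files whose versions disagree."""
    pip_pins = parse_pins(requirements_txt)
    conda_pins = parse_pins(environment_yml)
    return {
        name: (pip_pins[name], conda_pins[name])
        for name in sorted(pip_pins.keys() & conda_pins.keys())
        if pip_pins[name] != conda_pins[name]
    }


# Two pins copied from requirements.txt; the environment file is made up.
requirements = "numpy==1.26.4\nrasterio==1.4.3\n"
environment = """name: geo_deep_env
dependencies:
  - python=3.10
  - numpy=1.26.4
  - rasterio=1.4.2
"""
print(find_drift(requirements, environment))
```

An alternative that avoids duplication is to have `environment.yml` install the pip requirements through its `pip:` section, so `requirements.txt` stays the single source of pins.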