Description
This PR adds support for the ODIAC (Open-source Data Inventory for Anthropogenic CO2) Fossil Fuel CO2 Emissions dataset to TorchGeo.
Motivation
The ODIAC dataset provides high-resolution (1 km, monthly) global CO2 emission data, which is widely used in climate change research, atmospheric modeling, and environmental machine learning applications. Adding it to TorchGeo makes the dataset easily accessible within the PyTorch ecosystem for geospatial tasks.
Dataset Details
- Data files (`.tif`), downloaded as individual compressed archives (`.tif.gz`) per month.
Implementation
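One step in this section, decompressing a downloaded `.tif.gz` archive into `<root>/<year>/<filename.tif>`, can be illustrated with the standard library alone. This is a minimal sketch and not the code from this PR: the helper name `extract_monthly_archive` and its parameters are hypothetical, and it assumes the year is already known to the caller rather than parsed from the filename.

```python
import gzip
import shutil
from pathlib import Path


def extract_monthly_archive(archive: Path, root: Path, year: int) -> Path:
    """Decompress one ``.tif.gz`` archive into ``<root>/<year>/<filename.tif>``.

    Hypothetical helper for illustration; the PR's actual logic may differ.
    """
    target_dir = root / str(year)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Dropping the final ".gz" suffix turns "name.tif.gz" into "name.tif".
    target = target_dir / archive.with_suffix("").name
    if not target.exists():  # skip files extracted on a previous run
        with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)  # stream, so large rasters stay off the heap
    return target
```

Streaming with `shutil.copyfileobj` avoids reading a whole global raster into memory, and the existence check keeps repeated runs cheap.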
- Adds `torchgeo.datasets.ODIAC`, inheriting from `torchgeo.datasets.RasterDataset`.
- Supports selecting data by `version` (e.g., 2023, 2022), `years`, and `months`. Defaults to the latest supported version and all available years/months for that version.
- Supports automatic download (`download=True`) of the required monthly `.tif.gz` files directly from the source URLs.
- Supports checksum verification (`checksum=True`) for downloaded `.tif.gz` files (checksums populated based on provided lists).
- Handles decompression (`gzip`) of downloaded files into a structured directory format (`<root>/<year>/<filename.tif>`).
- Uses the `RasterDataset` machinery for file indexing (R-tree), querying (`__getitem__`), CRS/resolution handling, and caching. The index is built manually in `_build_index` to handle the specific file structure and date parsing.
- Includes a `plot` method using the 'magma' colormap for visualizing emission intensity.
Checklist
- The dataset inherits from `RasterDataset` (which extends `GeoDataset`).
- The dataset is implemented in `torchgeo/datasets/odiac.py`.
- Added the import (`from .odiac import ODIAC`) and added `ODIAC` to `__all__` in `torchgeo/datasets/__init__.py`.
- Added a `tests/data/odiac/data.py` script that generates fake test data with the correct directory structure and filenames.
- Added unit tests in `tests/datasets/test_odiac.py`.
- Added documentation to `docs/api/datasets.rst` under the heading `ODIAC CO2 Emissions`.
- Added an entry to `docs/api/datasets/geo_datasets.csv`.
- Ran the linters (`pre-commit run --all-files`) and addressed issues.
- Built the documentation (`cd docs && make html`) and verified rendering.
Known Issues / Limitations
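The checksum limitation in this section amounts to "verify when an MD5 is known, otherwise proceed unverified". A minimal sketch of that behavior follows. It is not the PR's code: the function names are hypothetical, and keying `md5s` by filename is an assumption made for illustration.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_if_known(path: Path, md5s: dict[str, str]) -> bool:
    """Return True if verified, False if no checksum is recorded.

    Assumes ``md5s`` maps filenames to expected MD5 hex digests (hypothetical
    layout). Raises RuntimeError when a recorded checksum does not match.
    """
    expected = md5s.get(path.name)
    if expected is None:
        return False  # no checksum recorded: proceed without verification
    if md5_of(path) != expected:
        raise RuntimeError(f"MD5 mismatch for {path.name}")
    return True
```

Returning `False` rather than raising for a missing entry mirrors the fallback described here; a caller that wants stricter behavior could treat `False` as an error instead.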
- The `md5s` dictionary currently contains checksums primarily for the ODIAC2023 and ODIAC2022 monthly files. Checksums for all historical versions/years/months are not included but can be added if required. The download will proceed without checksum verification if a specific checksum is missing.
- The local documentation build (`make html`) might show unrelated warnings originating from the `pytorch-sphinx-theme` submodule demo files.
References