Getting started with OlmoEarth: from embeddings to fine-tuning for land use / land cover classification, using an African Wildlife Foundation (AWF) dataset in southern Kenya near Amboseli National Park.
- `OlmoEarthTutorial.ipynb`: follow-along notebook with code cells to run (recommended).
- `OlmoEarthTutorialCompleted.ipynb`: fully executed notebook with outputs, for reference.
- `OlmoEarthTutorial.pdf`: static PDF of the executed notebook.
The tutorial is designed for Google Colab with a GPU runtime (T4 is sufficient). It also runs locally on a machine with a CUDA GPU or Apple Silicon (MPS).
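
Before running the notebook, it can help to confirm which accelerator is visible. The snippet below is a minimal sketch, not code from the tutorial: `pick_device` is a hypothetical helper, and it assumes PyTorch is pulled in by the tutorial's dependencies. If PyTorch is not installed yet, it falls back to reporting `cpu`.

```python
def pick_device() -> str:
    """Return 'cuda', 'mps', or 'cpu' depending on what PyTorch can see."""
    try:
        import torch  # assumed to be installed with the tutorial's dependencies
    except ImportError:
        return "cpu"  # PyTorch not available yet; nothing to probe

    if torch.cuda.is_available():
        return "cuda"  # NVIDIA GPU, e.g. a Colab T4 runtime

    # torch.backends.mps is missing on older PyTorch builds, so probe defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"  # Apple Silicon

    return "cpu"


print(pick_device())
```

If this prints `cpu` on Colab, the runtime type is probably still set to CPU.
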
- Open colab.research.google.com.
- File → Upload notebook and select `OlmoEarthTutorial.ipynb` from this repo.
- Runtime → Change runtime type → GPU.
- Run the cells top to bottom. The first cell installs dependencies; the dataset (~1.8 GB) is downloaded from Hugging Face inside the notebook.

To run locally:

```bash
git clone https://github.com/allenai/olmoearth_ml4rs_tutorial.git
cd olmoearth_ml4rs_tutorial
python -m venv .venv && source .venv/bin/activate
pip install olmoearth_pretrain rslearn scikit-learn matplotlib einops \
    huggingface_hub 'jsonargparse[signatures]>=4.27.7' \
    jupyter rasterio scipy
jupyter notebook OlmoEarthTutorial.ipynb
```

| Approach | Time | GPU memory |
|---|---|---|
| Embeddings + kNN / linear probe | minutes | ~2–3 GB |
| Fine-tune (4 epochs) | ~15–20 min | ~4–6 GB |
| Fine-tune (30 epochs) | ~2–3 hours | ~4–6 GB |
Default settings complete in roughly 30–45 minutes end-to-end.
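
To budget time for a different number of fine-tuning epochs, the table's 4-epoch figure can be scaled. This is a rough sketch only: `estimate_minutes` is a hypothetical helper, and it assumes runtime grows linearly with epoch count, ignoring fixed costs such as data loading and evaluation.

```python
def estimate_minutes(epochs, ref_epochs=4, ref_minutes=(15, 20)):
    """Scale the 4-epoch timing range from the table linearly to `epochs`.

    Linear scaling is an assumption, not a measured result.
    """
    low, high = ref_minutes
    return (low / ref_epochs * epochs, high / ref_epochs * epochs)


low, high = estimate_minutes(30)
print(f"{low / 60:.1f}-{high / 60:.1f} hours")  # prints "1.9-2.5 hours"
```

The 30-epoch estimate falls within the low end of the ~2–3 hours listed in the table.
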