Leaf Lens

Ivan Felipe Rodriguez¹🌿, Thomas Fel¹,²🌿, Gaurav Gaonkar¹, Mohit Vaishnav¹, Herbert Meyer³,
Peter Wilf⁴, Thomas Serre¹🍂

¹ Center for Computational Brain Science, Brown University
² Kempner Institute, Harvard University
³ Florissant Fossil Beds National Monument, National Park Service
⁴ Department of Geosciences, Pennsylvania State University

🌿 Joint first authors | 🍂 Corresponding author

Explore concepts and family classification | Identify unknown fossils

Overview

Leaf Lens is the companion platform to our study "Decoding Fossil Leaves with Artificial Intelligence: An application to the Florissant Formation". This website provides an interactive exploration of how deep neural networks learn to classify fossil angiosperm leaves—one of paleobotany's most persistent challenges.

Our deep learning framework overcomes data scarcity by augmenting sparse fossil data with synthetic examples and aligning extant and fossil leaf domains through representation learning. We demonstrate this approach on the late Eocene Florissant flora of Colorado, achieving well over 90% top-5 accuracy for family-level classification across 142 dicot angiosperm families, compared with a chance level of just 3.5%.
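
The 3.5% chance level is what uniform guessing yields under top-5 scoring with 142 families, since 5/142 ≈ 3.5%; under top-1 scoring it would be about 0.7%. The following is a minimal sketch of that arithmetic, assuming every family is equally likely and the five guesses are distinct. The function name `topk_chance` is illustrative and not part of the Leaf Lens code.

```python
from fractions import Fraction


def topk_chance(num_classes: int, k: int) -> Fraction:
    """Chance that k distinct, uniformly random guesses contain the true class.

    Assumes balanced classes; with imbalanced classes the chance level differs.
    """
    if not 1 <= k <= num_classes:
        raise ValueError("k must satisfy 1 <= k <= num_classes")
    return Fraction(k, num_classes)


# 142 families, as stated in the overview.
print(f"top-1 chance: {float(topk_chance(142, 1)):.1%}")  # 0.7%
print(f"top-5 chance: {float(topk_chance(142, 5)):.1%}")  # 3.5%
```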

Project goals

Our primary objective is to leverage Explainable AI techniques to understand the concepts that matter most for neural networks when classifying leaves. By revealing these concepts, we aim to provide:

Insights into the model's decision-making process, identifying the key features used for classification
A deeper understanding of the relationships between biological taxonomy and computational representations
Visual and interactive tools for exploring how concepts and families are structured within the learned representations

Our system addresses a fundamental challenge: the extreme scarcity of taxonomically vetted fossil specimens. While modern leaf specimens are abundant, fossilization processes—compression, mineralization, fragmentation—create a challenging domain shift between living and fossil forms.

Key highlights

Number of families: 142 dicot angiosperm families
Total dataset: Over 34,000 images (extant and fossil leaves)
Florissant fossils: 3,200 taxonomically vetted specimens spanning 23 families
Classification performance: Well over 90% top-5 accuracy (chance: 3.5%)
Discovered concepts: 2,000+ unique visual concepts extracted via sparse dictionary learning
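
For readers unfamiliar with sparse dictionary learning, the sketch below illustrates only its sparse-coding half: given a fixed set of dictionary atoms, it finds a sparse set of coefficients that reconstructs a feature vector. It uses ISTA (iterative soft-thresholding) on a toy two-atom dictionary. This is an illustrative assumption, not the procedure used in the study; the README does not specify the solver, and all names here (`sparse_code`, `atoms`, `lam`) are hypothetical.

```python
import math


def soft_threshold(values, threshold):
    """Shrink each value toward zero by `threshold`; small values become 0."""
    return [math.copysign(max(abs(v) - threshold, 0.0), v) for v in values]


def sparse_code(x, atoms, lam=0.1, n_iter=200):
    """Minimize 0.5 * ||x - sum_j z[j] * atoms[j]||^2 + lam * ||z||_1 via ISTA."""
    k, d = len(atoms), len(x)
    # The squared Frobenius norm upper-bounds the gradient's Lipschitz
    # constant, so 1 / lipschitz is a safe (if conservative) step size.
    lipschitz = sum(v * v for atom in atoms for v in atom)
    z = [0.0] * k
    for _ in range(n_iter):
        recon = [sum(z[j] * atoms[j][i] for j in range(k)) for i in range(d)]
        resid = [recon[i] - x[i] for i in range(d)]
        grad = [sum(atoms[j][i] * resid[i] for i in range(d)) for j in range(k)]
        stepped = [z[j] - grad[j] / lipschitz for j in range(k)]
        z = soft_threshold(stepped, lam / lipschitz)
    return z


# Toy example: two orthonormal atoms; the weak second component is zeroed out.
codes = sparse_code([1.0, 0.05], [[1.0, 0.0], [0.0, 1.0]], lam=0.1)
print([round(c, 3) for c in codes])
```

Full dictionary learning alternates a step like this with an update of the atoms themselves; the L1 penalty is what keeps each image described by only a few concepts.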

Features

Interactive visualizations of over 2,000 learned concepts and their relations in embedding space
Family-level exploration of 142 dicot families with representative samples and explanatory maps
Concept pages presenting feature visualizations, top activating examples, and their taxonomic relevance
Comparisons between real fossils and high-fidelity synthetic fossils used for generative augmentation
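
As an illustration of what "top activating examples" means on a concept page, the sketch below ranks images by their activation for a single concept. It assumes a hypothetical matrix with one row per image and one column per concept; the actual data layout used by the site is not described here, and `top_activating` is an illustrative name.

```python
import heapq


def top_activating(activations, concept, k=3):
    """Return indices of the k images with the highest activation for `concept`.

    `activations` is a list of rows (one per image), each a list of
    per-concept activation values.
    """
    return heapq.nlargest(
        k,
        range(len(activations)),
        key=lambda image_idx: activations[image_idx][concept],
    )


# Hypothetical activations for 4 images and 2 concepts.
activations = [
    [0.1, 0.9],
    [0.8, 0.0],
    [0.5, 0.4],
    [0.0, 0.7],
]
print(top_activating(activations, concept=0, k=2))  # [1, 2]
print(top_activating(activations, concept=1, k=2))  # [0, 3]
```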

Broader implications

This research addresses one of paleobotany's central challenges, the accurate identification of fossil angiosperm leaves, and demonstrates how state-of-the-art AI can be applied to scientific domains with limited training data. Using concept-based interpretability methods, our system surfaces botanically meaningful cues: it visually summarizes the subtle morphological features that define families across fossil and extant specimens, which in turn suggests new diagnostic characters.

Beyond the Florissant Formation, this cross-domain strategy is readily generalizable to other fossil deposits, positioning this approach for broad use in understanding the evolution and ecological dynamics of ancient terrestrial ecosystems.

Funding and acknowledgments

This material is based upon work supported by the U.S. National Science Foundation under Award Nos. EAR-1925481 (T.S.) and EAR-1925755 (P.W.), and by ANR-3IA Artificial and Natural Intelligence Toulouse Institute (ANR-19-PI3A-0004).

Computing support was provided by the Center for Computation and Visualization (CCV) at Brown University (via NIH Office of the Director grant S10OD025181). We also acknowledge Google's Cloud TPU hardware resources via the TensorFlow Research Cloud (TFRC) program.

Citations

If you make use of Leaf Lens in your research, please cite:

Main paper:

Rodriguez, I.F., Fel, T., Gaonkar, G., Vaishnav, M., Meyer, H., Wilf, P., & Serre, T. (2025). Decoding Fossil Leaves with Artificial Intelligence: An application to the Florissant Formation.

@article{rodriguez2025fossils,
  title  = {Decoding Fossil Leaves with Artificial Intelligence: 
            An application to the Florissant Formation},
  author = {Rodriguez, Ivan Felipe and Fel, Thomas and Gaonkar, Gaurav and 
            Vaishnav, Mohit and Meyer, Herbert and Wilf, Peter and Serre, Thomas},
  year   = {2025}
}

Dataset:

Wilf, P., Wing, S.L., Meyer, H.W., Rose, J.A., Saha, R., Serre, T., Cúneo, N.R., Donovan, M.P., Erwin, D.M., Gandolfo, M.A., Gonzalez-Akre, E., Herrera, F., Hu, S., Iglesias, A., Johnson, K.R., Karim, T.S., & Zou, X. (2021). An image dataset of cleared, x-rayed, and fossil leaves vetted to plant family for human and machine learning. PhytoKeys, 187, 93–128. https://doi.org/10.3897/phytokeys.187.72350

@article{wilf2021leaves,
  title   = {An image dataset of cleared, x-rayed, and fossil leaves vetted 
             to plant family for human and machine learning},
  author  = {Wilf, Peter and Wing, Scott L. and Meyer, Herbert W. and 
             Rose, Jacob A. and Saha, Rohit and Serre, Thomas and 
             Cúneo, N. Rubén and Donovan, Michael P. and Erwin, Diane M. and 
             Gandolfo, Maria A. and Gonzalez-Akre, Erika and Herrera, Fabiany and 
             Hu, Shusheng and Iglesias, Ari and Johnson, Kirk R. and 
             Karim, Talia S. and Zou, Xiaoyu},
  journal = {PhytoKeys},
  volume  = {187},
  pages   = {93--128},
  year    = {2021},
  doi     = {10.3897/phytokeys.187.72350}
}
