Does Machine Learning outperform Logistic Regression in predicting individual tree mortality?

💻 💾 📊 Original data, code and results related to the study

📖 Manuscript DOI: Does machine learning outperform logistic regression in predicting individual tree mortality?

📂 Repository DOI:

✨ Highlights

- Six classification algorithms (five Machine Learning algorithms plus Logistic binomial Regression) were compared in predicting individual tree mortality.
- Effects of dataset size, variable set, thinning, inventory length, and cross-validation were studied.
- Random Forest performed best in all the proposed case studies except cross-validation.
- Logistic binomial Regression appears to be the more robust algorithm under cross-validation.

📖 Abstract

Tree mortality is a crucial process in forest dynamics and a key component of forest growth models and simulators. Factors such as competition, drought, and pathogens drive tree mortality, but the underlying mechanisms are challenging to model. Current environmental changes further complicate modeling approaches because they influence and alter all the factors involved in mortality. However, innovative classification algorithms can dig deep into data to find patterns that model or even explain these relationships. We use Logistic binomial Regression as the reference algorithm for predicting individual tree mortality. However, different machine learning (ML) alternatives already applied to other forest modeling topics can be used for this purpose. Here, we compare the performance of five different ML algorithms (Decision Trees, Random Forest, Naive Bayes, K-Nearest Neighbour, and Support Vector Machine) against Logistic binomial Regression in individual tree mortality classification under 40 different case studies and a cross-validation case study. The data correspond to Norway spruce long-term experimental plots, with a total of 75,522 tree records and an average mortality rate of 10.28%. Across the case studies, general performance improved as expected when more variables were used, while larger datasets decreased the performance of the algorithms. Performance was also higher in unmanaged plots than in thinned ones. Random Forest outperformed the other algorithms in all cases except cross-validation, where it was the weakest. Our results demonstrate the potential of ML in assessing tree mortality. When the model application is not clearly defined and/or model interpretability is needed, Logistic binomial Regression is still the best tool for evaluating individual tree mortality.

📁 Repository Contents

- 📂 1_data: raw and processed data; check here for a detailed description
- 📂 2_code: the code used for data curation, analysis, and the outputs included in the document; check here for a detailed description
- 📂 3_figures: figures, charts, tables, and additional resources included in the document; check here for a detailed description
- 📂 4_bibliography: all the literature cited or consulted during the creation of the document

🤔 How to use the resources of this repository

💫 To download the contents of this repository, you can follow this guide.

♻️ To reproduce the analysis, users must:

💾 Data:
- WorldClim data required for the simulations must be downloaded from its official website
💻 Prerequisites (installation and code): R must be installed to run the code, together with the libraries loaded in each script (RStudio was also used to develop the code). Some analyses, specifically training the RF models, require substantial computing power and can cause out-of-memory errors on a standard computer. Access to high-performance computing services is highly recommended in those cases.
📜 Usage: follow the numerical order of the scripts to reproduce each step correctly
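
Numerical order is not always the same as alphabetical order: if a script number reaches two digits, a plain alphabetical listing places `10_...` before `2_...`. The helper below is a minimal sketch of how the run order could be derived from the leading number of each file name. It is written in Python only for illustration (the repository's own code is R), the `*.R` file pattern and the flat layout of `2_code` are assumptions, and the file names in the usage note are hypothetical.

```python
import re
from pathlib import Path


def numeric_prefix(name):
    """Sort key: the leading integer of a file name, then the name itself.

    Names without a leading number are placed last.
    """
    match = re.match(r"(\d+)", name)
    number = int(match.group(1)) if match else float("inf")
    return (number, name)


def ordered_scripts(names):
    """Return script names sorted by their leading number (2_ before 10_)."""
    return sorted(names, key=numeric_prefix)


if __name__ == "__main__":
    # "2_code" is the code folder listed in this README.
    # Assumption: the scripts sit directly in that folder and end in ".R".
    code_dir = Path("2_code")
    if code_dir.is_dir():
        names = [p.name for p in code_dir.glob("*.R")]
        for name in ordered_scripts(names):
            # Print the command instead of running it, so nothing heavy starts by accident.
            print(f"Rscript {code_dir / name}")
```

For example, `ordered_scripts(["10_models.R", "2_clean.R", "1_load.R"])` returns the names in the order 1, 2, 10. The script only prints the `Rscript` commands, so each step can be reviewed before it is launched.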

🔗 About the authors

Aitor Vázquez Veloso:

Astor Toraño Caicoya:

Felipe Bravo Oviedo:

Peter Biber:

Enno Uhl:

Hans Pretzsch:

ℹ License

The content of this repository is under the MIT license.

📝 How to cite this repository?

You can use the citation file, or copy the citation directly in APA or BibTeX format using the "Cite this repository" button on the right-hand side of the repository page; here are more details.
