Does Machine Learning outperform Logistic Regression in predicting individual tree mortality?

💻 💾 📊 Original data, code and results related to the study


💡🧠 Each file and folder carries the numeric code of the script used to generate it, so that every dataset, script, and output in the same step share the same code

⚠️ 📜 Remember to update the paths inside the scripts to match your working directory if you plan to use the code
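
One possible way to handle the path update is to define the project root once and build every path from it. The sketch below is only an illustration: `project_root` and `project_path()` are hypothetical names that do not appear in the original scripts, and the example file is the cleaned dataset listed below as input of the neighborhood scripts.

```r
# Hypothetical project root: replace with the folder where you cloned the repository.
project_root <- "path/to/your/clone"

# Hypothetical helper that builds a path relative to the project root.
project_path <- function(...) file.path(project_root, ...)

# Example: the cleaned dataset read by the neighborhood scripts.
clean_csv <- project_path(
  "1_data", "1_processed", "0_initial_df_clean", "initial_df_clean.csv"
)
print(clean_csv)
```

With this pattern, only `project_root` needs to change when the repository is moved to another machine.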


📁 Folder Content

  • 📜 0_data_curation.r

    • 💡 Purpose: Prepare the initial data: structure, IDs, and imputation of missing data.
    • 💾 ➡️ 💻 Input: 1_raw/final/VF_daten.xlsx, 1_raw/final/Fi-Daten__age.xlsx
    • 💻 ➡️ 💾 Output: 1_data/1_processed/0_initial_df_clean/initial_df_clean.csv
  • 📜 1.0_neighborhood_main.r, 1.1_neighborhood_functions.r

    • 💡 Purpose: Calculate the variables needed for the analysis using a circular subplot of radius 0.33*h around each tree.
    • 💾 ➡️ 💻 Input: 1_data/1_processed/0_initial_df_clean/initial_df_clean.csv
    • 💻 ➡️ 💾 Output: 1_data/1_processed/1_neighborhood/*, trees_r33.csv, subplot_stats_r33.csv, neighborhood_stats_r33.csv
  • 📜 2_climate_data.r

    • 💡 Purpose: Calculate climate variables by plot location. ⚠️ Requires prior download of climate data from WorldClim; contact the authors for details if needed.
    • 💾 ➡️ 💻 Input: 1_raw/final/Koordinaten.xlsx
    • 💻 ➡️ 💾 Output: 1_data/1_processed/2_clima/df_complete_r33.csv
  • 📜 3_feature_visualization.r

    • 💡 Purpose: Generate graphs and manually study variable relationships.
    • 💾 ➡️ 💻 Input: 1_data/1_processed/2_clima/df_complete_r33.csv
    • 💻 ➡️ 💾 Output: None
  • 📜 4.0_split_dataset.r, 4.1_split_variables.r, 4.2_functions_var_combis.r

    • 💡 Purpose: Split datasets (size and thinning) and variables for case studies.
    • 💾 ➡️ 💻 Input: 1_data/1_processed/2_clima/df_complete_r33.csv
    • 💻 ➡️ 💾 Output: 1_data/1_processed/4_datasets/*
  • 📜 5.0_run_analysis.r, 5.1_LR_analysis.r, 5.2_DT_analysis.r, 5.3_RF_analysis.r, 5.4_NB_analysis.r, 5.5_KNN_analysis.r, 5.6_SVM_analysis.r

    • 💡 Purpose: Run the analyses (all classifiers except ANN) in R for the different case studies.
    • 💾 ➡️ 💻 Input: 1_data/1_processed/4_datasets/*
    • 💻 ➡️ 💾 Output: 1_data/1_processed/5_analysis/**case_study**/*, metrics.RData, models.RData
  • 📜 6_HPC

    • 💡 Purpose: Run simulations on the iuFOR HPC, split by case study.
    • 💾 ➡️ 💻 Input: 1_data/1_processed/4_datasets/*
    • 💻 ➡️ 💾 Output: 1_data/1_processed/5_analysis/**case_study**/*, metrics.RData, models.RData
  • 📜 7_metrics_compilation.r

    • 💡 Purpose: Extract R analysis metrics and create a checkpoint.
    • 💾 ➡️ 💻 Input: 1_data/1_processed/5_analysis/*, **case_study**/metrics.RData, ann/preds/**case_study**/*, ann/timer/**case_study**/*
    • 💻 ➡️ 💾 Output: 1_data/1_processed/6_final_results/**case_study**/final_metrics.RData
  • 📜 8.0_performance_graphs.r, 8.1_functions_performance_graphs.r, 8.2_classifiers_comparison.r, 8.3_graph_functions.r, 8.4_application_thinning.r, 8.5_application_thinning_comparison.r, 8.6_time_and_performance_graphs.r

    • 💡 Purpose: Compare analysis metrics across case studies using graphs. The graphs in the original paper were produced with scripts 8.5 and 8.6.
    • 💾 ➡️ 💻 Input: 1_data/1_processed/6_final_results/**case_study**/final_metrics.RData
    • 💻 ➡️ 💾 Output: 2_scripts/4_figures/*
  • 📜 9.0_location_map.r, 9.1_neighbour_graphs.r, 9.2_mortality_graphs.r, 9.3_df_mortality_rates.r, 9.4_paper_tables.r

    • 💡 Purpose: Generate graphs (location, neighborhood) and obtain tables for the original paper.
    • 💾 ➡️ 💻 Input: 1_raw/final/Koordinaten.xlsx, 1_data/1_processed/2_clima/df_complete_r33.csv, 1_data/1_processed/1_neighborhood/trees_r33.csv
    • 💻 ➡️ 💾 Output: 2_scripts/4_figures/*
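
As an illustration of the neighborhood step (scripts 1.0 and 1.1 above), the sketch below counts, for each tree, the other trees that fall inside a circle of radius 0.33*h centred on it. It is a minimal sketch and not the code in the repository: the column names `x`, `y`, and `h`, the toy values, and the function `count_neighbours()` are assumptions made for the example, and the real scripts compute further subplot and neighborhood statistics.

```r
# Toy data with hypothetical columns: x, y (coordinates) and h (the h in 0.33*h).
trees <- data.frame(
  id = 1:4,
  x  = c(0, 3, 10, 0),
  y  = c(0, 4, 0, 6),
  h  = c(30, 15, 30, 21)
)

# Count the neighbours inside a circle of radius `factor * h` around each tree.
count_neighbours <- function(df, factor = 0.33) {
  d <- as.matrix(dist(df[, c("x", "y")]))  # pairwise Euclidean distances
  radius <- factor * df$h                  # one radius per focal tree
  # Row i holds the distances from focal tree i; the diagonal (the tree
  # itself) is excluded so a tree is never its own neighbour.
  inside <- d <= radius & row(d) != col(d)
  rowSums(inside)
}

trees$n_neighbours <- count_neighbours(trees)
print(trees)
```

Because the radius depends on the focal tree, the relation is not symmetric: a tree can count a neighbour that does not count it back.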

📚 Additional Information

A flowchart detailing the training and testing process (scripts from groups 5 and 6) is shown here:

[Figure: flowchart of the training and testing process]
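
For readers without access to the figure, the general idea of training and testing a classifier can be sketched in base R. This is a generic illustration under stated assumptions, not the procedure implemented in script groups 5 and 6: the data are simulated, the variables `dbh` and `dead` are hypothetical, and the 70/30 random split and the 0.5 classification threshold are choices made only for the example.

```r
set.seed(42)

# Simulated data: hypothetical predictor `dbh` and binary outcome `dead`.
n    <- 500
dbh  <- runif(n, 5, 60)
dead <- rbinom(n, 1, plogis(1 - 0.08 * dbh))
df   <- data.frame(dbh = dbh, dead = dead)

# Assumed 70/30 random split into training and test sets.
train_idx <- sample(seq_len(n), size = round(0.7 * n))
train <- df[train_idx, ]
test  <- df[-train_idx, ]

# Fit a logistic regression on the training set only.
fit <- glm(dead ~ dbh, data = train, family = binomial)

# Predict mortality probabilities on the held-out test set and classify at 0.5.
prob <- predict(fit, newdata = test, type = "response")
pred <- as.integer(prob >= 0.5)

# Share of test trees whose status is predicted correctly.
accuracy <- mean(pred == test$dead)
print(accuracy)
```

The same split could be reused for any other classifier so that all models are compared on identical test trees.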