Weverton C. F. Trindade, Luis F. Arias-Giraldo, Luis Osorio-Olvera, A.
Townsend Peterson, and Marlon E. Cobos

- Package description
- Installing the package
- Workflow in kuenm2
- Checking the vignettes
- Note on AI usage
- Contributing
## Package description

kuenm2 is a new version of kuenm (Cobos et al. 2019), an R package designed to make the process of ecological niche modeling (ENM) easier, faster, more reproducible, and, at the same time, more robust. The aim of this package is to facilitate crucial steps in the ENM process: data preparation, model calibration, exploration of selected models, model projections, and analyses of uncertainty and variability.
This new version of the package reduces the dependency on a strictly
organized working directory (now required only if projections to
multiple scenarios are needed). Instead, kuenm2 functions generate
specific R objects that store all the necessary information for
subsequent steps. kuenm2 fits maximum entropy (Maxnet) models or
generalized linear models (GLMs). Maxnet models are created as described
in Phillips et al. (2017),
and GLMs are constructed as in Cobos and Peterson
(2023). The ENM workflow
requires at minimum a data.frame containing occurrence record
coordinates (longitude and latitude) and a SpatRaster object with
predictor variables. Alternatively, users can provide a data.frame
containing a column that indicates presence and background records
(0/1) and other columns with predictor variable values.
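As an illustration of these minimal inputs (the file path, column names, and coordinate values below are made up for the example):

```r
library(terra)

# Occurrence records: one row per record, with longitude and latitude
occurrences <- data.frame(
  species = "Species_A",           # optional identifier column
  x = c(-100.1, -99.5, -101.2),    # longitude
  y = c(35.2, 36.0, 34.8)          # latitude
)

# Predictor variables as a SpatRaster (e.g., bioclimatic layers)
predictors <- terra::rast(list.files("path/to/layers", pattern = "\\.tif$",
                                     full.names = TRUE))
```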
## Installing the package

Note: an internet connection is required to install the package.

To install the latest release of kuenm2, use the following line of code:
```r
# Installing from CRAN
install.packages("kuenm2")
```

The development version of kuenm2 can be installed using the code below.
```r
# Installing and loading packages
if (!require(remotes)) {
  install.packages("remotes")
}

# To install the package use
remotes::install_github("marlonecobos/kuenm2")

# To install the package and its vignettes use (if needed, add: force = TRUE)
remotes::install_github("marlonecobos/kuenm2", build_vignettes = TRUE)
```

### Having problems?
If you have any problems during installation of the development version
from GitHub, restart R session, close other RStudio sessions you may
have open, and try again. If during the installation you are asked to
update packages, do so if you don’t need a specific version of one or
more of the packages to be installed. If any of the packages gives an
error when updating, please install it alone using install.packages(),
then try installing kuenm2 again.
To load the package use:

```r
# Load kuenm2
library(kuenm2)
```

## Workflow in kuenm2

The kuenm2 package facilitates the following steps in the ENM process: basic data cleaning, data preparation, model calibration, model exploration, model projections, projection comparisons, and exploration of variability and uncertainty. The figure below shows a schematic view of how the package works. A brief description of the steps that can be performed with the package is presented below. For a complete description and demonstration of the steps, see the package vignettes listed in the section Checking the vignettes.
kuenm2 automates essential pre-processing steps for ecological niche modeling. Data cleaning tools help sort columns, remove missing values and duplicates (including duplicates within raster cells), exclude coordinates at (0,0), filter based on coordinate decimal precision, and adjust occurrences that fall just outside valid raster boundaries.
For data preparation, users can input raw occurrence records and raster
layers or provide a pre-processed data frame. Two primary functions,
prepare_data() and prepare_user_data(), guide users through the
process of choosing modeling algorithms, defining parameter sets (e.g.,
feature classes, regularization multipliers, and predictor variables),
and selecting data partitioning strategies. Additional tools allow users
to visualize environmental conditions in the calibration data and to
assess similarities between training and testing partitions, a feature
exclusive to kuenm2 among ENM tools.
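A sketch of this step is shown below; the argument names are illustrative and may differ from the actual signature of `prepare_data()` (see `?prepare_data` after installation):

```r
library(kuenm2)

# 'occurrences' and 'predictors' are the occurrence data.frame and
# SpatRaster described earlier; argument names here are illustrative
prepared <- prepare_data(
  algorithm = "maxnet",            # or "glm"
  occ = occurrences,               # longitude/latitude records
  raster_variables = predictors    # SpatRaster of predictors
)
```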
The prepared data can be explored to understand what it looks like in
environmental and geographic spaces. This can give users an idea of what
to expect from models, or indicate whether changes in the way data were
prepared are necessary. The main functions in this step of the workflow
are explore_calibration_hist(), plot_calibration_hist(),
explore_partition_geo(), explore_partition_extrapolation(), and
plot_explore_partition().
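For instance, the calibration data could be summarized and plotted roughly as follows (arguments are illustrative; consult each function's help page):

```r
# 'prepared' is the object returned by the data-preparation step
hist_summary <- explore_calibration_hist(data = prepared)
plot_calibration_hist(hist_summary)
```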
Model calibration in kuenm2 is computationally intensive, involving
training and testing of candidate models using various partitioning
strategies such as k-fold cross-validation, subsampling, and
bootstrapping. Custom partitions generated using external tools (e.g.,
block or checkerboard methods) can also be imported. Models are
evaluated using multiple performance criteria that emphasize sensitivity
due to the frequent lack of true absence data. These criteria include:
bimodality in response curves, partial ROC for statistical significance,
omission rates for predictive performance, and Akaike Information
Criterion (AIC) for model complexity. Model calibration is executed
using the calibration() function.
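In its simplest form, this step might look like the sketch below (argument names are illustrative; see `?calibration` for the actual interface):

```r
# 'prepared' is the object returned by the data-preparation step
calib <- calibration(data = prepared)
# 'calib' stores evaluation results for all candidate models and the
# parameter sets selected under the criteria listed above
```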
To support a deeper understanding of the calibration process, a step to
explore extrapolation risks during the process of training and testing
models can be performed in kuenm2. The function
partition_response_curves() helps produce visualizations of response
curves from the training data, overlaid with environmental conditions at
testing sites for selected models (another option exclusive to kuenm2).
Once the best models are selected, they must be fitted using
fit_selected() to analyze their characteristics and behavior. Fitted
models can be explored to assess variable importance
(variable_importance()) and to examine response curves
(response_curve() and all_response_curves()). This offers insights into
the influence of predictors in models and aids ecological
interpretation.
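A rough sketch of these calls, assuming `calib` is the result of model calibration (exact arguments may differ):

```r
# Fit the selected models
fitted <- fit_selected(calib)

# Explore the fitted models
variable_importance(fitted)   # contribution of each predictor
all_response_curves(fitted)   # response curves for all variables
```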
If an independent set of records (not included for model calibration) is
available, it can be used to re-evaluate the predictive performance of
selected models using independent_evaluation(). Partial ROC and
omission rates are calculated for these new records to measure model
performance, and extrapolation risks are also assessed for these points
using the MOP metric (exclusive to kuenm2). Results from this evaluation
can help further filter the selected models or inform the decision to
incorporate the independent records into the calibration dataset.
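Hypothetically, with an independent set of records (the object and argument names below are illustrative, not the documented interface):

```r
# 'new_records' is an independent data.frame of longitude/latitude
# coordinates not used during calibration (values made up here)
new_records <- data.frame(x = c(-100.8, -99.9), y = c(35.5, 34.9))

# Re-evaluate the fitted models with the independent records
independent_evaluation(fitted, new_occ = new_records)
```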
Fitted models can be projected to new environmental scenarios, including
larger geographic areas or different temporal conditions (e.g., past or
future climates). For single scenario projections, use
predict_selected(); for multiple scenarios, use project_selected().
Before projecting to complex future scenarios (e.g., multiple time
periods, emission pathways, or GCMs), users should first prepare and
organize data using functions in kuenm2 that facilitate the process
(e.g., organize_future_worldclim()).
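These steps could be sketched as below; the arguments shown are illustrative and the actual interfaces may differ (check each function's help page):

```r
# Single-scenario prediction onto current predictors
pred <- predict_selected(fitted, raster_variables = predictors)

# For multiple scenarios, 'proj_data' would be the object produced by an
# organizing function such as organize_future_worldclim()
projections <- project_selected(fitted, projection_data = proj_data)
```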
When projecting to multiple scenarios, kuenm2 provides tools to quantify
and characterize differences using projection_changes(). This function
evaluates spatial changes and agreement levels among projections, aiding
the interpretation of temporal or scenario-based shifts in suitability.
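Assuming `projections` holds results for multiple scenarios, this comparison might be invoked as (arguments illustrative):

```r
# Quantify spatial changes and agreement among the multiple projections
changes <- projection_changes(projections)
```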
To assess the consistency and reliability of model outputs, users can
explore projection variability (projection_variability()), which
accounts for differences arising from model replicates, parameter sets,
and climate models. With the MOP metric (projection_mop()) users can
evaluate extrapolation risks by identifying novel environmental
conditions in projection areas, offering a spatially explicit
representation of model uncertainty.
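Both analyses could be sketched as follows, assuming `projections`, `prepared`, and `proj_data` come from the earlier steps of the workflow (arguments are illustrative):

```r
# Variability arising from replicates, parameter sets, and climate models
var_layers <- projection_variability(projections)

# MOP: identify novel environmental conditions (extrapolation risk) in the
# projection scenarios relative to the calibration data
mop_results <- projection_mop(data = prepared, projection_data = proj_data)
```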
## Checking the vignettes

Users can check the kuenm2 vignettes for a full explanation of the
package's functionality. The vignettes can be read online at the kuenm2
site, under the menu Articles.

To build the vignettes when installing the package from GitHub, make
sure to use the argument build_vignettes = TRUE.
Check each of the vignettes with the code below:
```r
# Guide to basic data cleaning before the ENM process
vignette("basic_data_cleaning")

# Guide to prepare data for the ENM process
vignette("prepare_data")

# Guide to train and evaluate candidate models, and select based on performance
vignette("model_calibration")

# Guide to explore selected models, variable importance, response curves
vignette("model_exploration")

# Guide to predict models in geographic space (single scenarios)
vignette("model_predictions")

# Guide to project models in geographic space (multiple scenarios)
vignette("model_projections")

# Guide to explore variability and uncertainty in projections (multiple scenarios)
vignette("variability_and_uncertainty")

# Guide to make projections to multiple scenarios with layers from CHELSA Climate
vignette("projections_chelsa")
```

## Note on AI usage

To maintain high standards of code quality and documentation, we have used AI (LLM) tools in our package. We used these tools for grammatical polishing and for exploring technical implementation strategies for specialized functions. We manually checked and tested all code and documentation refined with these tools.
## Contributing

We welcome contributions to improve kuenm2! To maintain the integrity
and performance of the package, we follow a few core principles:

- **Quality over Quantity:** We prioritize well-thought-out, stable improvements over frequent, minor changes. Please ensure your code is well documented and follows the existing style of the package.
- **Minimal Dependencies:** One of the goals of kuenm2 is to remain efficient. We prefer solutions that use base R or existing dependencies. Proposals that introduce new package dependencies will be strictly evaluated for their necessity.
- **AI-Assisted Code:** If you use AI agents (like Copilot, ChatGPT, or Claude) to generate code alternatives or improvements, please:
  - Manually verify the logic and accuracy of the output.
  - Demonstrate the benefit via a benchmark for speed, or a clear explanation of how it fixes a bug or improves readability. Describe this in your pull request.
- **Testing:** New features should include examples, and tests should be performed to ensure they work as intended and do not break existing workflows.

If you have an idea for a major change, please open an Issue first to discuss it with the maintainers.
