Machine Learning Using R
Machine learning is becoming increasingly important, not only in data science, but also in bioinformatics. Machine learning gives you the opportunity to find hidden patterns in your data and draw predictive conclusions. In this workshop, you’ll learn key machine learning concepts and approaches, and get hands-on practice with simple examples.
You’ll be introduced to topics including:
- Supervised and unsupervised machine learning
- Training data and test data sets
- Bias and variance
- Cross validation
- Performance evaluation
The hands-on practice will focus on the algorithms:
- K-means clustering (unsupervised)
- Random forest (supervised)
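As a small preview of two of the ideas listed above (a training/test split and K-means clustering), here is a minimal sketch that uses only base R and the built-in `iris` data set. The data set, the 80/20 split, the seed, and `centers = 3` are assumptions chosen for illustration; they are not taken from the workshop material.

```r
set.seed(42)  # assumed seed, used only to make the sketch reproducible

# Training/test split: hold out 20% of the rows (assumed ratio)
n <- nrow(iris)
train_idx <- sample(n, size = round(0.8 * n))
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# K-means (unsupervised): cluster the scaled numeric columns, ignoring the labels
features <- scale(iris[, 1:4])
km <- kmeans(features, centers = 3, nstart = 20)

# Cross-tabulate cluster assignments against the known species
print(table(cluster = km$cluster, species = iris$Species))
```

In a supervised setting such as random forest, a model would be fitted on `train` and evaluated on `test`; K-means does not use the labels at all.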
Prior experience with R is required. Please review the Introduction to R for data analysis.
Coming soon ...
Before the workshop, please ensure that R and RStudio are installed, and all required packages are set up. Completing the steps below will ensure a smooth experience during the hands-on sessions.
Install R and RStudio
- R (version 4.0 or higher): https://www.r-project.org/
- RStudio (free Desktop version): https://posit.co/download/rstudio-desktop/
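To confirm that the installed R meets the version requirement, a quick check can be run in the R console. This is a minimal sketch; the variable names `min_version`, `current`, and `r_ok` are illustrative.

```r
# Compare the running R version against the workshop minimum (4.0)
min_version <- "4.0.0"
current <- as.character(getRversion())
r_ok <- getRversion() >= min_version

if (r_ok) {
  message("R ", current, " meets the minimum version ", min_version)
} else {
  message("R ", current, " is older than ", min_version, "; please update R")
}
```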
Install Required R Packages
Open RStudio and run the following commands in the console to install all required packages. If prompted to install from source, type 'y' to confirm.

```r
# Core packages
install.packages("tidyverse")     # Data manipulation and visualization
install.packages("skimr")         # Summary statistics
install.packages("naniar")        # Missing data visualization

# Machine learning packages
install.packages("caret")         # ML framework and model training
install.packages("randomForest")  # Random Forest algorithm

# Clustering packages
install.packages("cluster")       # Clustering algorithms
install.packages("factoextra")    # Clustering visualization

# Bioconductor packages
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("SamSPECTRAL")  # Spectral clustering
```
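If some of these packages are already installed, the CRAN step can be limited to what is missing. The sketch below is one possible approach, not part of the official setup; `find_missing()` and `install_if_missing()` are hypothetical helper names introduced here.

```r
# Hypothetical helper: return the packages from `pkgs` that are not yet installed
find_missing <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Hypothetical helper: install only the missing packages from CRAN
install_if_missing <- function(pkgs) {
  to_install <- find_missing(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# CRAN packages named in the instructions above
cran_pkgs <- c("tidyverse", "skimr", "naniar", "caret",
               "randomForest", "cluster", "factoextra")

# Show what would be installed, without installing anything yet
print(find_missing(cran_pkgs))
```

Calling `install_if_missing(cran_pkgs)` then performs the installation. `SamSPECTRAL` is a Bioconductor package and still needs to be installed with `BiocManager::install()`.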
Verify installations:
After installation, verify that all packages load correctly by running the following commands. If all packages load without errors, you're ready!

```r
# Load all packages to check for errors
library(tidyverse)
library(skimr)
library(naniar)
library(caret)
library(randomForest)
library(cluster)
library(factoextra)
library(SamSPECTRAL)
```
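When several packages are involved, it can help to see at a glance which ones fail to load. The following sketch uses `requireNamespace()`, which checks that a package's namespace can be loaded without attaching it; `check_packages()` is a hypothetical helper name, not something the workshop provides.

```r
# Hypothetical helper: try to load each package's namespace and
# return a named logical vector (TRUE = loads, FALSE = does not)
check_packages <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

workshop_pkgs <- c("tidyverse", "skimr", "naniar", "caret", "randomForest",
                   "cluster", "factoextra", "SamSPECTRAL")

status <- check_packages(workshop_pkgs)
failed <- names(status)[!status]

if (length(failed) == 0) {
  message("All workshop packages can be loaded.")
} else {
  message("Could not load: ", paste(failed, collapse = ", "))
}
```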
- If you encounter any errors, please restart RStudio and try re-installing the problematic package(s).
- If `caret` installation fails, you may need to install an additional dependency: `install.packages("e1071")`
- Restart R after installing multiple packages to ensure all dependencies load correctly.
- If a package fails to install, copy the exact error message and email us before the workshop.
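To make an error report easier to put together, the load error and basic session details can be collected in one place. This is a sketch under assumptions: `load_error()` is a hypothetical helper, `"caret"` is only an example package name, and the helper captures errors raised while loading a package, not messages printed during `install.packages()`.

```r
# Hypothetical helper: return the error message produced when loading `pkg`,
# or NA if the package loads without an error
load_error <- function(pkg) {
  tryCatch({
    loadNamespace(pkg)
    NA_character_
  }, error = function(e) conditionMessage(e))
}

pkg <- "caret"  # example: replace with the package that is failing
report <- c(
  paste("Package:", pkg),
  paste("Error:", load_error(pkg)),
  capture.output(sessionInfo())  # R version, platform, loaded packages
)
writeLines(report)
```

The printed text can be pasted into the email together with any messages shown during installation.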