Machine Learning Using R

Ayushi Agrawal edited this page Jan 9, 2026 · 6 revisions

Description

Machine learning is becoming increasingly important, not only in data science, but also in bioinformatics. Machine learning gives you the opportunity to find hidden patterns in your data and draw predictive conclusions. In this workshop, you’ll learn key machine learning concepts and approaches, and get hands-on practice with simple examples.

You’ll be introduced to topics including:

  • Supervised and unsupervised machine learning
  • Training data and test data sets
  • Bias and variance
  • Cross-validation
  • Performance evaluation

The hands-on practice will focus on the algorithms:

  • K-means clustering (unsupervised)
  • Random forest (supervised)
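
To give a flavor of the unsupervised side, here is a minimal sketch of k-means clustering using only base R. The built-in iris data set and the choice of three clusters are assumptions made for illustration; they are not taken from the workshop materials.

```r
# Minimal sketch: k-means clustering with base R (stats::kmeans).
# The iris data set and centers = 3 are illustrative assumptions.
set.seed(42)                                   # make the random starts reproducible

features <- scale(iris[, 1:4])                 # standardize the four numeric columns
fit <- kmeans(features, centers = 3, nstart = 25)

# Compare the unsupervised cluster assignments with the known species labels
cluster_vs_species <- table(cluster = fit$cluster, species = iris$Species)
print(cluster_vs_species)
```

The species labels are never shown to kmeans(); the table only checks afterwards how well the clusters line up with them.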

Prior experience with R is required. If you need a refresher, please review the Introduction to R for data analysis.


Learning Path

Intermediate  


Materials

Coming soon ...


Pre-Workshop Instructions

Before the workshop, please ensure that R and RStudio are installed, and all required packages are set up. Completing the steps below will ensure a smooth experience during the hands-on sessions.

  1. Install R and RStudio

  2. Install Required R Packages
    Open RStudio and run the following commands in the console to install all required packages. If prompted to install from source, type 'y' to confirm.

    # Core packages
    install.packages("tidyverse")      # Data manipulation and visualization
    install.packages("skimr")          # Summary statistics
    install.packages("naniar")         # Missing data visualization
    
    # Machine learning packages
    install.packages("caret")          # ML framework and model training
    install.packages("randomForest")   # Random Forest algorithm
    
    # Clustering packages
    install.packages("cluster")        # Clustering algorithms
    install.packages("factoextra")     # Clustering visualization
    
    # Bioconductor packages
    if (!requireNamespace("BiocManager", quietly = TRUE))
        install.packages("BiocManager")
    BiocManager::install("SamSPECTRAL") # Spectral clustering

  3. Verify Installations
    After installation, verify that all packages load correctly by running the following commands. If all packages load without errors, you're ready!

    # Load all packages to check for errors
    library(tidyverse)
    library(skimr)
    library(naniar)
    library(caret)
    library(randomForest)
    library(cluster)
    library(factoextra)
    library(SamSPECTRAL)
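
If you would rather get a single summary than watch for errors one library() call at a time, a check along the following lines may help. This is only a sketch: the variable names required_pkgs, is_installed, and missing_pkgs are made up for illustration, and the package list is copied from the install commands above.

```r
# Minimal sketch: report which required packages are not yet installed.
# required_pkgs, is_installed and missing_pkgs are illustrative names.
required_pkgs <- c("tidyverse", "skimr", "naniar", "caret",
                   "randomForest", "cluster", "factoextra", "SamSPECTRAL")

# requireNamespace() returns FALSE (instead of raising an error) when a package is absent
is_installed <- vapply(required_pkgs, requireNamespace,
                       logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) == 0) {
  message("All required packages are installed.")
} else {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
```

Any package named in the output can then be reinstalled with the matching command from step 2.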

Troubleshooting

  • If you encounter any errors, please restart RStudio and try re-installing the problematic package(s).
  • If caret installation fails, you may need to install an additional dependency:
    install.packages("e1071")

✅ Tips

  • Restart R after installing multiple packages to ensure all dependencies load correctly.
  • If a package fails to install, copy the exact error message and email us before the workshop.
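
One possible way to capture the exact error text is sketched below. It assumes the problem shows up when the package is loaded; the package name notARealPackage is a hypothetical stand-in for whichever package is failing on your machine.

```r
# Minimal sketch: capture the exact error text from a failed package load.
# "notARealPackage" is a hypothetical placeholder; replace it with the failing package.
pkg <- "notARealPackage"

error_text <- tryCatch({
  library(pkg, character.only = TRUE)
  NA_character_                        # no error occurred, nothing to report
}, error = function(e) conditionMessage(e))

if (!is.na(error_text)) {
  cat("Error while loading", pkg, ":\n", error_text, "\n")
}

# R version and loaded packages, useful context to include in the email
print(sessionInfo())
```

Pasting both the error text and the sessionInfo() output into your email should make the problem easier to diagnose.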
