AI_Omics_Internship_2025

Intro to R for Bioinformatics 🚀

This repository documents my weekly progress in learning R for bioinformatics programming as part of the AI Omics Research Internship (2025).

📂 Lecture 1 (Class Ib): Getting Started with R

🔑 Topics Covered

Setting the working directory properly
Creating and organizing project folders
How R code works: functions, syntax, and execution
Variables and data types in R (numeric, integer, character, factor, logical)
Importing CSV files and working with categorical data
Saving scripts, outputs, and the R workspace

🎥 Lecture Recording
📌 Course GitHub Repo

📖 Key Learnings

Working Directory: how to set and use working folders so R knows where to look for files and save outputs.
Project Organization: creating structured subfolders (data/, scripts/, results/) for reproducible research.
R Basics:
- Functions (mean(), plot(), hist(), etc.)
- Variables and assignment (<-)
- Simple data visualizations (scatterplot, histogram, barplot)
Data Types in R:
- Numeric vs Integer
- Character / String
- Factors for categorical variables
- Logical data (TRUE/FALSE)
Data Handling:
- Importing .csv files with read.csv()
- Checking structure with str()
- Converting variables into factors or numeric codes (as.factor(), ifelse())
Saving Outputs:
- Export cleaned datasets with write.csv()
- Save workspace and objects (save(), save.image())
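
As a minimal sketch of the data types listed above, the following uses made-up values (none of them come from patient_info.csv) to show how class() reports each type and how a factor stores its categories.

```r
# Illustrative values only; none of these come from the course dataset.
age_numeric  <- 42.5         # numeric (double) is the default for numbers
age_integer  <- 42L          # the L suffix makes an integer
patient_name <- "Patient_A"  # character string
is_smoker    <- TRUE         # logical

# A factor stores categorical data as integer codes plus text labels.
diagnosis <- factor(c("Cancer", "Normal", "Cancer"),
                    levels = c("Normal", "Cancer"))

# Check how R classifies each object.
class(age_numeric)
class(age_integer)
class(patient_name)
class(is_smoker)
class(diagnosis)

# The labels, and the integer codes stored underneath them.
levels(diagnosis)
as.integer(diagnosis)
```

Setting levels explicitly fixes the order of the categories, which otherwise defaults to alphabetical.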

🧬 Assignment / Tasks

Set Working Directory
- Create a new folder AI_Omics_Internship_2025 and set it as the working directory.
Create Project Folder
- In RStudio, make a new project called Module_I.
- Inside, create subfolders: raw_data/, clean_data/, scripts/, results/, plots/.
Data Cleaning Task
- Download patient_info.csv from GitHub.
- Import the dataset into R.
- Inspect structure (str()).
- Identify variables with incorrect data types.
- Convert them to appropriate formats (e.g., factors, numeric).
Feature Engineering
- Create a new binary variable for smoking status:
  - 1 = Yes
  - 0 = No
Save Outputs
- Save cleaned dataset as clean_data/patient_info_clean.csv.
- Save script as scripts/class_Ib.R.
- Upload both into this GitHub repository.
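
The folder setup in the tasks above can also be scripted instead of done by hand. This is a sketch only: `project_root` is a hypothetical location placed under a temporary directory so the example runs anywhere, and in practice it would point to the Module_I project folder.

```r
# Hypothetical location for the project; replace with your own Module_I path.
project_root <- file.path(tempdir(), "Module_I")

# Subfolder names taken from the assignment.
subfolders <- c("raw_data", "clean_data", "scripts", "results", "plots")

for (folder in subfolders) {
  # recursive = TRUE also creates project_root if it does not exist yet;
  # showWarnings = FALSE keeps reruns quiet when a folder is already there.
  dir.create(file.path(project_root, folder),
             recursive = TRUE, showWarnings = FALSE)
}

# Confirm which folders now exist.
list.dirs(project_root, full.names = FALSE, recursive = FALSE)
```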

📌 Example Code Snippets

# Set working directory
setwd("C:/Users/YourName/Documents/AI_Omics_Internship_2025")

# Import CSV
data <- read.csv("raw_data/patient_info.csv")

# Inspect structure
str(data)

# Convert gender to factor
data$gender_fac <- as.factor(data$gender)

# Create binary smoking variable
data$smoking_binary <- ifelse(data$smoking == "Yes", 1, 0)

# Save cleaned dataset
write.csv(data, file = "clean_data/patient_info_clean.csv", row.names = FALSE)
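
The Key Learnings also mention save() and save.image(), which the snippets above do not show. A minimal sketch, using a small made-up data frame and a temporary folder as stand-ins for the real dataset and the results/ folder:

```r
# Made-up stand-in for the cleaned dataset.
patients <- data.frame(
  patient_id     = c("P1", "P2", "P3"),
  smoking_binary = c(1, 0, 1)
)

# Temporary folder used so the example runs anywhere;
# in the project this would be results/.
out_dir        <- tempdir()
object_file    <- file.path(out_dir, "patients.RData")
workspace_file <- file.path(out_dir, "workspace.RData")

# save() stores only the objects you name.
save(patients, file = object_file)

# save.image() stores every object in the global workspace.
save.image(file = workspace_file)

# load() restores objects under their original names.
rm(patients)
load(object_file)
str(patients)
```

Saving a single named object keeps the file small and makes it clear what a later load() will bring back.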



# Intro to R for Bioinformatics 🚀

This repository documents my weekly progress in learning **R for bioinformatics programming** as part of the AI Omics Internship (2025).  

## 📂 Contents
- **Lecture Notes & Scripts**: R scripts from weekly lessons.  
- **Assignments**: My solutions to assignments with explanations.  
- **Projects**: Applications of R in bioinformatics data analysis.  

## 📖 This Week's Focus
### Topic: Differential Expression Analysis & Gene Classification
- Learned how to:
  - Define and use **functions in R**.
  - Apply logical conditions to classify genes as *Upregulated*, *Downregulated*, or *Not Significant*.
  - Handle **missing data** (`NA`) using replacement strategies.
  - Add new columns to data frames (`$status`) for classification results.
  - Save and organize results into a dedicated folder (`Results/`).
  - Summarize results using `table()` to count gene categories.
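
A minimal sketch of the missing-data and saving steps above, on a made-up results table. Replacing a missing `padj` with 1 (treating the gene as not significant) is one possible strategy and an assumption here, not necessarily the one used in class. The output folder sits under a temporary directory so the example runs anywhere.

```r
# Made-up differential expression results with some missing padj values.
deg <- data.frame(
  gene  = c("GeneA", "GeneB", "GeneC", "GeneD"),
  logFC = c(2.3, -1.8, 0.4, 1.5),
  padj  = c(0.01, NA, 0.20, NA)
)

# Count missing values per column before changing anything.
colSums(is.na(deg))

# Assumed strategy: a missing padj becomes 1, i.e. not significant.
deg$padj[is.na(deg$padj)] <- 1

# Create the output folder if needed, then write the table.
results_dir <- file.path(tempdir(), "Results")
dir.create(results_dir, showWarnings = FALSE)
write.csv(deg, file.path(results_dir, "deg_no_missing.csv"),
          row.names = FALSE)
```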

### Assignment
Classify genes based on `logFC` and `padj` values:
- **Upregulated**: `logFC > 1 & padj < 0.05`
- **Downregulated**: `logFC < -1 & padj < 0.05`
- **Not Significant**: otherwise  
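
The three rules above can also be applied to a whole data frame at once. This sketch uses nested `ifelse()` on made-up values; the data frame is illustrative only, and it assumes `padj` contains no missing values.

```r
# Made-up results table; values are illustrative only.
deg <- data.frame(
  gene  = c("GeneA", "GeneB", "GeneC", "GeneD"),
  logFC = c(2.3, -1.8, 0.4, 1.5),
  padj  = c(0.01, 0.03, 0.20, 0.30)
)

# ifelse() is vectorised, so & compares the columns element by element.
deg$status <- ifelse(deg$logFC > 1 & deg$padj < 0.05, "Upregulated",
              ifelse(deg$logFC < -1 & deg$padj < 0.05, "Downregulated",
                     "Not Significant"))

# Count how many genes fall into each category.
table(deg$status)
```

GeneD shows why both conditions matter: its `logFC` exceeds 1, but its `padj` is above 0.05, so it is labelled Not Significant.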

📌 Example function implemented:  

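```r
classify_gene <- function(logFC, padj) {
  # Classify a single gene; logFC and padj must be single, non-missing values.
  # && is the scalar logical AND that if() expects (& is element-wise).
  if (logFC > 1 && padj < 0.05) {
    return("Upregulated")
  } else if (logFC < -1 && padj < 0.05) {
    return("Downregulated")
  } else {
    return("Not Significant")
  }
}
```
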