Graduate Student in Biotechnology | Computational Biologist & Bioinformatics Analyst
Building reproducible pipelines and interactive tools for genomics and clinical data analysis
I am a computational biologist who combines strong biological understanding with data science and programming skills. In the last two months I completed three major projects, including an end-to-end reproducible RNA-Seq pipeline built from raw FASTQ files and an analysis of clinical patient data.
I specialize in turning complex biological and clinical data into actionable insights through interactive dashboards, reproducible workflows, and clear scientific communication. My work spans infectious disease genomics, immunology, and healthcare data analysis.
Currently open to research positions, internships, and full-time roles in Bioinformatics, Computational Biology, and Healthcare Data Analysis.
Bioinformatics & Genomics
- Advanced R & Bioconductor (DESeq2, apeglm, org.Mm.eg.db, AnnotationDbi, pheatmap)
- RNA-Seq analysis (raw FASTQ → alignment → quantification → differential expression)
- Interactive dashboard development (Shiny + shinydashboard)
- Genomic data retrieval (GEO, SRA, NCBI)
Data Science & Programming
- Python (Pandas, NumPy, Matplotlib, Seaborn)
- SQL & Database querying (SQLite)
- Data visualization (ggplot2, plotly, Seaborn)
- Reproducible research (renv, session_info, set.seed, RDS objects)
- Tools: Galaxy, STAR, featureCounts, Git/GitHub, Posit Cloud, WSL
1. RNA-Seq DESeq2 Pipeline + Shiny Dashboard — V2 (True Raw Counts)
→ Repository
Rebuilt the entire analysis from raw SRA FASTQ files (HISAT2 alignment + featureCounts quantification) for Shemer et al., Immunity 2020, a study of IL-10 receptor-deficient microglia. Ran differential expression with DESeq2 and apeglm log fold change shrinkage.
Key Achievements:
- Identified 1,597 DEGs with clean mutant/control separation (PC1 = 79.7% variance)
- Fully reproducible pipeline with three layers of reproducibility
- Deployed interactive Shiny dashboard (volcano plots, PCA, heatmaps, searchable table, CSV download) on Posit Cloud
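To illustrate how a DEG count like the one above is typically derived, here is a minimal Python sketch of thresholding a differential expression results table. It is not the project's R code: the cutoffs (adjusted p-value below 0.05, absolute log2 fold change of at least 1), the `classify_gene` helper, and the example rows are all assumptions made for illustration, since the cutoffs actually used are not stated here.

```python
import math

# Hypothetical cutoffs; the values used in the actual pipeline are not stated.
PADJ_CUTOFF = 0.05
LFC_CUTOFF = 1.0


def classify_gene(log2fc, padj):
    """Label one gene as 'up', 'down', or 'ns' (not significant)."""
    # DESeq2 can report NA for padj (e.g. after independent filtering);
    # such genes are treated as not significant here.
    if padj is None or math.isnan(padj) or padj >= PADJ_CUTOFF:
        return "ns"
    if log2fc >= LFC_CUTOFF:
        return "up"
    if log2fc <= -LFC_CUTOFF:
        return "down"
    return "ns"


# Made-up rows standing in for a results table (not real data).
results = [
    {"gene": "GeneA", "log2fc": 2.3, "padj": 0.001},
    {"gene": "GeneB", "log2fc": -1.8, "padj": 0.01},
    {"gene": "GeneC", "log2fc": 0.4, "padj": 0.0005},
    {"gene": "GeneD", "log2fc": 3.0, "padj": float("nan")},
]

labels = {row["gene"]: classify_gene(row["log2fc"], row["padj"]) for row in results}
n_degs = sum(1 for label in labels.values() if label != "ns")
print(labels, n_degs)
```

The same three labels are what a volcano plot usually colors by, so one helper can drive both the DEG count and the plot legend.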
2. RNA-Seq DESeq2 Pipeline + Shiny Dashboard — V1
→ Same Repository
Initial reproduction of Figure 3E from Shemer et al., Immunity 2020, completed even though raw counts were not available on GEO. Built and publicly deployed a production-grade interactive dashboard.
Impact:
- Reproduced key findings (1,563 DEGs) and validated core biological markers
- Generated 6,300+ LinkedIn impressions and engagement from researchers at Illumina, Novartis, Pfizer, and GSK
- Transparently documented GEO data availability issues affecting the bioinformatics community
The V1 → V2 progression demonstrates my commitment to iterative scientific improvement and methodological rigor.
3. Patient Data Analysis — Heart Disease Clinical Dataset
→ Repository
Exploratory analysis of 1,025 patient records to identify risk factors for heart disease diagnosis.
Key Achievements:
- Combined Python + SQL workflow (8 standalone SQL queries on SQLite)
- Built a six-panel diagnostic visualization dashboard
- Identified maximum heart rate as the strongest differentiator and challenged assumptions around cholesterol levels
- Produced a fully documented, readable Jupyter Notebook with interpretation
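The combined Python + SQL pattern can be sketched as follows, using only the standard library. This is an illustration under stated assumptions, not the notebook's code: the `patients` table, its column names, the `mean_by_diagnosis` helper, and the four rows are hypothetical stand-ins for the real dataset.

```python
import sqlite3

# In-memory database with a hypothetical schema (not the real dataset's).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients ("
    "age INTEGER, max_heart_rate INTEGER, cholesterol INTEGER, diagnosis INTEGER)"
)

# Made-up rows for illustration only; diagnosis is 1 (disease) or 0 (no disease).
rows = [
    (52, 168, 212, 1),
    (61, 125, 203, 0),
    (45, 175, 240, 1),
    (58, 130, 248, 0),
]
conn.executemany("INSERT INTO patients VALUES (?, ?, ?, ?)", rows)

ALLOWED_COLUMNS = {"age", "max_heart_rate", "cholesterol"}


def mean_by_diagnosis(connection, column):
    """Return {diagnosis: mean of column} using a standalone SQL query."""
    # Column names cannot be bound as SQL parameters, so validate against
    # a whitelist before interpolating into the query string.
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unsupported column: {column}")
    query = (
        f"SELECT diagnosis, AVG({column}) FROM patients "
        "GROUP BY diagnosis ORDER BY diagnosis"
    )
    return dict(connection.execute(query).fetchall())


summary = mean_by_diagnosis(conn, "max_heart_rate")
print(summary)
```

Keeping each question as its own SQL query and returning plain Python objects makes it straightforward to hand the result to Pandas or a plotting library in the next notebook cell.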
- Bangladesh Dengue Genomic Surveillance (Shiny Dashboard)
- Yeast Genome Composition Analysis
- Palmer Penguins Reproducible Data Science
- RNA-Seq & Multi-omics Data Analysis
- Infectious Disease Genomics & Variant Surveillance
- Clinical & Healthcare Data Analysis
- Development of Reproducible Bioinformatics Tools
- Computational Biology & Public Health
- Email: faiyaj.mdabrar@gmail.com
- ORCID: 0009-0005-9646-4508
Open to collaborations, thesis/internship opportunities, or full-time roles. Happy to discuss how my recent RNA-Seq pipelines or clinical data experience can add value to your team.
⭐️ Thank you for visiting! My recent projects reflect continuous growth in delivering production-quality, reproducible bioinformatics and data analysis work. Feel free to explore the repositories and reach out anytime.
Last updated: May 2026