Home
This workshop gives graduate students at public universities the skills and tools they need to analyze biological data using high-performance computing resources.
Participants will gain hands-on experience with industry-standard command-line interface (CLI) tools for DNA and RNA sequencing analysis, sequence manipulation and alignment, and pipeline management for automating complex workflows. They will also learn about differential expression analysis for identifying genes with altered expression levels, data visualization techniques for presenting results effectively, and the basics of artificial intelligence (AI) and machine learning (ML) in bioinformatics.
Upon completion of this workshop, participants will be able to use these tools and methods to address real-world biological challenges and contribute to bioinformatics research.
Required Skills
| Skill | Description |
|---|---|
| Basic understanding of biology | This workshop assumes a basic understanding of biological concepts, such as DNA, RNA, genes, and genomes. |
| Familiarity with the command line (optional, but helpful) | While not required, familiarity with the command line will help you navigate the tools covered in the workshop. |
| Enthusiasm for learning new computational skills | A strong interest in learning new computational skills is essential for success in this workshop. |
- Time: Thursdays at 2 PM (please register through the U of A Data Science Institute DataLab website or click on this link to fill out the form)
- Where: Science & Engineering Library Room 212.
- Zoom link: https://arizona.zoom.us/j/89667081542
All sessions are recorded and uploaded to the University of Arizona's DataLab YouTube channel, where you can also find the other DataLab series: Natural Language Processing (NLP), Generative AI, and NextGen Geospatial.
| Date | Title | Description | Instructors | Material Link/Recording |
|---|---|---|---|---|
| (01/30) | Sequence manipulation, alignment, and assessment | Enhance your knowledge of sequence manipulation for sequencing data preparation, improve your alignment skills using various tools, and learn how to assess alignment quality. | Michele Cosi | Material, Recording TBD |
| (02/06) | A Beginner's Guide to RNA-seq with DESeq2 | This workshop will guide you through DESeq2, a powerful tool for identifying genes that exhibit significant expression changes between biological conditions. Attendees will gain hands-on experience running a working differential expression analysis pipeline on real-world data and will learn how to create figures and interpret results. | Michele Cosi | Material, Recording TBD |
| (02/13) | RNA-Seq Data Analysis in R: From Raw Counts to Differential Expression Analysis | RNA-Seq is a widely used method for analyzing gene expression. This workshop focuses on differential expression analysis using R, guiding participants through essential steps from raw count data to identifying differentially expressed genes. | Simona Merlini, Michele Cosi | Material, Recording TBD |
| (02/20) | Downstream Analysis of RNA-Seq Results in R: GSEA, PPI Networks, and Biological Interpretation | Building on the foundation of identifying differentially expressed genes (DEGs) in the previous workshop, it's time to unlock the biological meaning behind those findings. In this session, we'll dive into powerful downstream analyses to uncover the rich insights hidden in your RNA-seq data. | Simona Merlini, Michele Cosi | Material, Recording TBD |
| (02/27) | QTL mapping with qtl2 | This session introduces genetic mapping using qtl2, an R package that allows researchers to identify specific chromosomal regions that contribute to variation in phenotypes (Quantitative Trait Loci, QTL) and to characterize the action, interaction, number, and locations of these regions. | Michele Cosi | Material TBD, Recording TBD |
| (03/06) | Introduction to GWAS | A genome-wide association study (GWAS) allows researchers to find links between genetic variants, such as single nucleotide polymorphisms (SNPs), and phenotypic traits. This workshop covers the key concepts of GWAS, focusing on applications with PLINK, PRSice, and R. Learn how to conduct a GWAS, interpret results, and apply polygenic risk score (PRS) analysis, which aggregates SNP data into a single genetic assessment. | Michele Cosi | Material TBD, Recording TBD |
| (03/13) | No workshop, Spring Break | | | |
| (03/20) | De-novo Detection and Annotation of Transposable Elements | Transposable elements (TEs) are repetitive sequences that drive genomic innovation but pose challenges in annotation. This workshop introduces TE biology and a pipeline for de-novo detection and annotation using RepeatModeler2 and RepeatMasker. Participants will learn to identify TE-derived sequences, address bioinformatics challenges, and explore resources like TE-hub, gaining practical skills to integrate TE annotation into genomic analyses. | Clément Goubert, Michele Cosi | Material TBD, Recording TBD |
| (03/27) | Explore Current AI/ML Trends and Tools in Bioinformatics | Explore AI and ML applications in bioinformatics, and understand their role in revolutionizing biological research and solving complex challenges. | Clément Goubert, Michele Cosi | Material TBD, Recording TBD |
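
As a small taste of the GWAS session listed above, a polygenic risk score is commonly computed as a weighted sum: for each SNP, the number of effect alleles an individual carries (0, 1, or 2) is multiplied by that SNP's estimated effect size, and the products are added up. The sketch below is illustrative only and is not the workshop's material; the function name `polygenic_risk_score` and the toy numbers are hypothetical, and real analyses would typically rely on tools such as PLINK or PRSice.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Return the weighted sum of allele dosages for one individual.

    dosages: copies of the effect allele per SNP (0, 1, or 2).
    effect_sizes: per-SNP effect estimates (e.g. GWAS betas), same order.
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Hypothetical toy data: three SNPs for a single individual.
dosages = [1, 0, 2]
effect_sizes = [0.5, 0.25, -0.125]

print(polygenic_risk_score(dosages, effect_sizes))  # 1*0.5 + 0*0.25 + 2*(-0.125) = 0.25
```

A negative effect size lowers the score, so SNPs whose effect allele is protective pull the total down.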
- A Bioinformatics Wiki. C. Lizarraga. Data Science Institute. UArizona.
- Artificial Intelligence and Machine Learning in Bioinformatics.
- A survey of best practices for RNA-seq data analysis. Conesa, A., Madrigal, P., Tarazona, S. et al. Genome Biol 17, 13 (2016). https://doi.org/10.1186/s13059-016-0881-8.
- awesome-bioinformatics
- awesome-biological-visualizations
- From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, Del Angel G, Levy-Moonshine A, Jordan T, Shakir K, Roazen D, Thibault J, Banks E, Garimella KV, Altshuler D, Gabriel S, DePristo MA. Curr Protoc Bioinformatics. 2013;43(1110):11.10.1-11.10.33. doi: 10.1002/0471250953.bi1110s43. PMID: 25431634; PMCID: PMC4243306.
- Genome Browser
- RNA-seq and Differential Expression. High Performance Research Computing. Texas A&M University.
- TeSS (Training eSupport System).
Updated: 01/21/2025 (M. Cosi)
UArizona DataLab, Data Science Institute, University of Arizona, 2024.