Home

This workshop helps graduate students at public universities develop the skills and learn the tools required in today's AI/ML-focused science.
Ranging from the basic moving parts to AI's role in Open Science, the workshop explains where to obtain compute, covers software environments, reproducibility, and the role of workflows, and culminates in building an end-to-end Machine Learning (ML) workflow.
Required Skills
Skill | Description |
---|---|
Basic understanding of Linux | This workshop assumes a basic understanding of the Linux Operating System |
Familiarity with Machine Learning (and related software) | While not required, it is suggested to have a basic understanding of Machine Learning and its concepts. |
Enthusiasm for learning new computational skills | A strong interest in learning new computational skills is essential for success in this workshop. |
Time: Tuesdays @2PM
REGISTRATION: Link
(for Zoom link/in person information, please sign up at the U of A Data Science Institute DataLab website)
All sessions are recorded and uploaded to the University of Arizona's DataLab YouTube channel, where you can also find the other DataLab series: Natural Language Processing (NLP), Generative AI, and NextGen Geospatial.
Date | Title/Topic | Description | Instructors | Material Link/Recording |
---|---|---|---|---|
(01/28) | The moving parts of Functional Open Science | Explore the essential components of Open Science, including reproducibility, version control with Git, the importance of workflows, and tools and resources such as Hugging Face. This session provides an introduction to the ecosystem that enables modern science to be collaborative, transparent, and scalable. Participants will learn to use containers to ensure reproducibility, Git for version control, and platforms like Hugging Face for machine learning workflows. | Michele Cosi / Carlos Lizárraga | Material, Recording |
(02/04) | AI's Role and Tools in Open Science | Learn how AI is changing the world of Open Science. This session covers the principles of Open Science and the transformative role of AI in driving research forward, discussing key AI/ML tools such as PyTorch alongside open datasets and community-driven resources. Attendees will explore how AI enhances reproducibility, promotes transparency, and accelerates discovery. | Michele Cosi / Carlos Lizárraga / Enrique Noriega | Material, Recording |
(02/11) | Learning to Work in the Cloud: JetStream2 and Reproducibility | Researchers often have all the knowledge their work requires but lack a key component: compute. In this session, attendees will learn how JetStream2 can address the need for GPUs and other compute while also supporting reproducibility. Learn how to access and use JetStream2, a cloud computing platform, for scalable data processing, training ML models, and managing collaborative projects. This session covers the basics of cloud infrastructure, setting up accounts, and using JetStream2 effectively for scientific research. | Michele Cosi | Material, Recording |
(02/18) | Handling Images & Videos pt. 1 | Discover techniques for processing and analyzing image and video data. This session introduces foundational tools and libraries, such as OpenCV and Gradio, for handling visual data in machine learning workflows. Participants will learn how to preprocess images, handle different file formats, and extract meaningful features for analysis. | Michele Cosi | Material, Recording |
(02/25) | Handling Images & Videos pt. 2 | Continuing from the previous workshop, this session aims to solidify concepts and techniques applicable to handling images and videos in order to train and test AI/ML models. | Michele Cosi | Material, Recording |
(03/04) | Training and Testing Models | This session covers critical concepts like data splitting (training, validation, and test sets), evaluating model performance, and hyperparameter tuning. Participants will explore common pitfalls and best practices for achieving reliable results, using concepts and code developed in previous sessions. | Michele Cosi, Mithün Paul | Material, Recording |
(03/18) | End-to-end ML Workflow pt.1 | The core of the workshop: attendees will apply the tools and techniques acquired thus far. In this and the following session, attendees will learn how to build a complete AI/ML pipeline, from data preparation and labelling through training, testing, and real-world applications. This session focuses on the first part of the pipeline: data preparation and labelling. | Michele Cosi, Carlos Lizárraga, Leonardo Soto Hernandez, Mithün Paul | Material, Recording |
(03/25) | End-to-end ML Workflow pt.2 | Continuing from the previous session, this workshop aims to use the previously labelled data in order to train, test and run a model. | Michele Cosi, Carlos Lizárraga, Leonardo Soto Hernandez, Mithün Paul | Material, Recording |
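As a small preview of the data splitting covered in the 03/04 session, the sketch below shows one possible way to divide a dataset into training, validation, and test sets using only the Python standard library. It is an illustration and not the workshop's own material: the function name `split_dataset`, the default fractions, and the `frame_###.png` file names are all hypothetical.

```python
import random


def split_dataset(items, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle items and split them into (train, val, test) lists.

    The fractions and seed are illustrative defaults, not recommendations.
    """
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")

    # Copy first so the caller's list is left untouched, then shuffle with a
    # seeded generator so the same split can be reproduced later.
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)

    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)

    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]  # everything left over is training data
    return train, val, test


# Hypothetical file names standing in for a real image dataset.
image_files = [f"frame_{i:03d}.png" for i in range(100)]
train, val, test = split_dataset(image_files)
print(len(train), len(val), len(test))
```

Fixing the seed keeps the split identical between runs, which fits the workshop's emphasis on reproducibility. The three subsets never overlap, so the test set stays unseen during training and tuning.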
- A Bioinformatics Wiki. C. Lizarraga. Data Science Institute. UArizona.
- Artificial Intelligence and Machine Learning in Bioinformatics.
Updated: 03/18/2025 (M. Cosi)
UArizona Data Lab, Data Science Institute, University of Arizona.