Home

Functional Open Science Skills for AI/ML Applications

Spring 2025 Workshop: Functional Open Science Skills for AI/ML Applications

This workshop helps graduate students at public universities develop the skills and learn the tools required in today's AI/ML-focused science.

Ranging from the basic moving parts of Open Science to AI's role in it, this workshop explains where to obtain compute, covers software environments, reproducibility, and the role of workflows, and culminates in building an end-to-end Machine Learning (ML) workflow.

Required Skills

Skill	Description
Basic understanding of Linux	This workshop assumes a basic understanding of the Linux Operating System
Familiarity with Machine Learning (and related software)	While not required, it is suggested to have a basic understanding of Machine Learning and its concepts.
Enthusiasm for learning new computational skills	A strong interest in learning new computational skills is essential for success in this workshop.

Workshop Program

Time: Tuesdays @2PM

REGISTRATION: Link

(for Zoom link/in person information, please sign up at the U of A Data Science Institute DataLab website)

All sessions are recorded and uploaded to the University of Arizona's DataLab YouTube channel, where you can also find the other DataLab series: Natural Language Processing (NLP), Generative AI, NextGen Geospatial.

Date	Title/Topic	Description	Instructors	Material Link/Recording
(01/28)	The moving parts of Functional Open Science	Explore the essential components of Open Science, including reproducibility, version control with Git, the importance of workflows, and tools and resources such as Hugging Face. This session provides an introduction to the ecosystem that enables modern science to be collaborative, transparent, and scalable. Participants will learn to use containers to ensure reproducibility, leverage Git for version control, and apply platforms like Hugging Face for machine learning workflows.	Michele Cosi / Carlos Lizárraga	Material, Recording
(02/04)	AI's Role and Tools in Open Science	Learn how AI is changing the world of Open Science. This session covers the principles of Open Science and the transformative role of AI in driving research forward, discussing key AI/ML tools such as PyTorch alongside open datasets and community-driven resources. Attendees will explore how AI enhances reproducibility, promotes transparency, and accelerates discovery.	Michele Cosi / Carlos Lizárraga / Enrique Noriega	Material, Recording
(02/11)	Learning to Work in the Cloud: JetStream2 and Reproducibility	Researchers often have all the knowledge necessary for their work but lack a key component: compute. In this session, attendees will learn how JetStream2 can meet the need for GPUs and other compute while also supporting reproducibility. Learn how to access and use JetStream2, a cloud computing platform, for scalable data processing, training ML models, and managing collaborative projects. This session covers the basics of cloud infrastructure, setting up accounts, and using JetStream2 effectively for scientific research.	Michele Cosi	Material, Recording
(02/18)	Handling Images & Videos pt. 1	Discover techniques for processing and analyzing image and video data. This session introduces foundational tools and libraries, such as OpenCV and Gradio, for handling visual data in machine learning workflows. Participants will learn how to preprocess images, handle different file formats, and extract meaningful features for analysis.	Michele Cosi	Material, Recording
(02/25)	Handling Images & Videos pt. 2	Continuing from the previous workshop, this session aims to solidify concepts and techniques applicable to handling images and videos in order to train and test AI/ML models.	Michele Cosi	Material, Recording
(03/04)	Training and Testing Models	This session covers critical concepts like data splitting (training, validation, and test sets), evaluating model performance, and hyperparameter tuning. Participants will explore common pitfalls and best practices for achieving reliable results, using concepts and code developed in previous sessions.	Michele Cosi, Mithün Paul	Material, Recording
(03/18)	End-to-end ML Workflow pt. 1	The core of the workshop: attendees will apply the tools and techniques acquired thus far. In this and the following session, attendees will learn how to build a complete AI/ML pipeline, from data preparation and labeling through training and testing to real-world applications. This session focuses on the first part of the pipeline: data preparation and labeling.	Michele Cosi, Carlos Lizárraga, Leonardo Soto Hernandez, Mithün Paul	Material, Recording

References:

A Bioinformatics Wiki. C. Lizarraga. Data Science Institute. UArizona.
Artificial Intelligence and Machine Learning in Bioinformatics.

Updated: 03/18/2025 (M. Cosi)

UArizona Data Lab, Data Science Institute, University of Arizona.

CC BY-NC-SA 4.0
