SC25 Tutorial tut158s3
Welcome to the workshop materials for "Orchestrating Complex HPC and AI/ML Workflows on Kubernetes Using Flux and AWS". This tutorial will teach you how Flux's hierarchical resource management and graph-based scheduling capabilities extend Kubernetes to support diverse workflows.
This workshop is designed to be completed using VS Code in your browser. All materials are provided as markdown files that you can view in VS Code's built-in preview mode.
- Opening Files: Click on any `.md` file in the VS Code file explorer
- Preview Mode: Press `Ctrl+Shift+V` (or `Cmd+Shift+V` on Mac) to open markdown preview
- Side-by-Side: Press `Ctrl+K V` to open preview alongside the source
- Following Links: Click any link in the preview to navigate between sections
- Integrated Terminal: Use ``Ctrl+` `` (backtick) to open the terminal
- File Explorer: Use `Ctrl+Shift+E` to focus on the file explorer
- Quick Open: Use `Ctrl+P` to quickly open files by name
- Zoom: Use `Ctrl++` and `Ctrl+-` to adjust text size
This workshop progresses from foundational infrastructure concepts to advanced Flux capabilities, culminating in deploying MuMMI (Multiscale Machine-learned Modeling Infrastructure), a scientific workflow that combines large-scale simulations with machine learning.
| Module | Topic | Estimated Time | Status |
|---|---|---|---|
| Workshop Setup | VS Code connection and navigation | 5-10 minutes | ✅ Available |
| Module 1 | HPC on Kubernetes (Amazon EKS) | 30-60 minutes | ✅ Available |
| Module 2 | Flux and LAMMPS | TBD | 🚧 Coming Soon |
| Module 3 | MuMMI Workflows | TBD | 🚧 Coming Soon |
Module 1 covers the following steps:
- Install Tools (eksctl, kubectl, helm)
- Create and validate EKS cluster
- Create persistent volume with FSx for Lustre
- Setup monitoring
- Deploy MPI Operator
- Run GROMACS MPI job
- Cleanup and optional scale-out
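
The first step in the list above installs `eksctl`, `kubectl`, and `helm`. Before moving on to cluster creation, it can help to confirm that all three are on your `PATH`. The following is a minimal sketch, not part of the workshop materials; it assumes Python 3 is available in your environment, and the helper name `find_missing_tools` is hypothetical.

```python
import shutil

# Tool names taken from the "Install Tools" step above.
REQUIRED_TOOLS = ("eksctl", "kubectl", "helm")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH.

    Hypothetical helper for illustration; it only checks that an
    executable exists, not that its version is suitable.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools are on PATH.")
```

If the script reports a missing tool, revisit the install step before creating the EKS cluster.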
The workshop demonstrates how to orchestrate complex workflows using:
- Amazon EKS: Managed Kubernetes service
- Flux: Advanced job scheduling and resource management
- FSx for Lustre: High-performance parallel file system
- GROMACS: Molecular dynamics simulation software
- MuMMI: Multiscale Machine-learned Modeling Infrastructure
By the end of this workshop, you will:
- Understand how to deploy and manage HPC workloads on Kubernetes
- Learn Flux's hierarchical resource management capabilities
- Experience running real scientific applications (GROMACS, LAMMPS)
- Explore integration of HPC simulations with machine learning workflows
- Gain hands-on experience with AWS services for HPC
Prerequisites:
- Basic familiarity with containers and Kubernetes concepts
- Understanding of command-line interfaces
- AWS account access (provided during the workshop)
- No prior experience with Flux or HPC required
Ready to begin? Start with the Workshop Setup to connect to your development environment and learn how to navigate these materials.
If you encounter issues during the workshop:
- Check the troubleshooting sections in each module
- Ask questions in the workshop chat or raise your hand
- Consult the workshop instructors
Ready to start? → Begin with Workshop Setup
