A comprehensive workshop demonstrating AI-driven genomics workflow automation using AWS HealthOmics, Strands Agents, and multi-agent systems.
This workshop teaches you to build AI agents that automate genomics workflows on AWS HealthOmics. You'll create agents that manage workflows, monitor runs, analyze results, and troubleshoot issues autonomously.
- Strands Agents Framework - Python framework for building AI agents
- AWS HealthOmics - Managed genomics service for workflow execution
- Model Context Protocol (MCP) - Tool connectivity for external systems
- Multi-Agent Systems - Coordinated agents for complex genomics pipelines
- Core concepts and architecture
- Building your first HealthOmics agent
- MCP integration and tool connectivity
- Interactive experimentation
- Advanced agent orchestration
- Workflow management and monitoring
- Performance optimization strategies
- Coordinated multi-agent systems
- Specialized agents for different pipeline stages
- End-to-end automation workflows
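
As a rough illustration of the first module's topics (a first HealthOmics agent plus MCP tool connectivity), the sketch below connects a Strands agent to the HealthOmics MCP server. It assumes the packages from the requirements are installed, that AWS credentials with Amazon Bedrock and HealthOmics access are configured, and that the server is launched through `uvx`. The function name `ask_healthomics_agent` and the system prompt are hypothetical and are not taken from the workshop notebooks.

```python
def ask_healthomics_agent(prompt: str):
    """Send one prompt to a Strands agent backed by HealthOmics MCP tools."""
    # Imports are local so this module still loads where the packages are absent.
    from mcp import StdioServerParameters, stdio_client
    from strands import Agent
    from strands.tools.mcp import MCPClient

    # Assumption: the MCP server runs as a subprocess started through uvx.
    healthomics_mcp = MCPClient(
        lambda: stdio_client(
            StdioServerParameters(
                command="uvx",
                args=["awslabs.aws-healthomics-mcp-server@latest"],
            )
        )
    )

    # The MCP session has to stay open while the agent uses its tools.
    with healthomics_mcp:
        tools = healthomics_mcp.list_tools_sync()
        agent = Agent(
            tools=tools,
            # Hypothetical prompt, for illustration only.
            system_prompt="You help manage AWS HealthOmics workflows and runs.",
        )
        return agent(prompt)
```

Calling `ask_healthomics_agent("List my HealthOmics workflows.")` would start the server, let the model pick among the server's tools, and return the agent's answer. The notebooks may structure this differently.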
- AWS Account with HealthOmics access
- Python 3.12+
- Basic understanding of genomics workflows
- Familiarity with WDL/CWL workflow languages
- Clone the repository

      git clone <repository-url>
      cd sample-healthomics-automation-with-ai-agents

- Install dependencies

      pip install -r notebooks/requirements.txt

- Build workflow

      cd somatic_variant_calling
      zip mutect2.zip main.wdl
      aws s3 cp mutect2.zip s3://<your-bucket>/<your-prefix>/mutect2.zip

- Deploy infrastructure

      aws cloudformation deploy \
        --template-file infrastructure/infrastructure_cfn.yaml \
        --stack-name genomics-ai-workshop \
        --capabilities CAPABILITY_NAMED_IAM \
        --parameter-overrides \
          OmicsResourcesS3Bucket=<your-bucket> \
          OmicsResourcesS3Prefix=<your-prefix> \
          OmicsWorkflowDefinitionZipS3=mutect2.zip

- Start the workshop
  - Open notebooks/01-strands-agents-introduction.ipynb
  - Follow the step-by-step instructions
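
The "Build workflow" step above can also be scripted in Python. The sketch below is one possible equivalent of the `zip` and `aws s3 cp` commands, not part of the repository. The helper names `build_workflow_zip` and `upload_workflow_zip` are hypothetical, and the demonstration at the bottom uses a placeholder WDL file and makes no AWS calls.

```python
import tempfile
import zipfile
from pathlib import Path


def build_workflow_zip(wdl_path: Path, zip_path: Path) -> Path:
    """Package a single WDL file at the root of a zip archive."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname keeps the file at the archive root, like `zip mutect2.zip main.wdl`.
        archive.write(wdl_path, arcname=wdl_path.name)
    return zip_path


def upload_workflow_zip(zip_path: Path, bucket: str, prefix: str) -> str:
    """Upload the archive to s3://<bucket>/<prefix>/<zip name> and return its URI."""
    import boto3  # imported lazily; needs configured AWS credentials

    key = f"{prefix.strip('/')}/{zip_path.name}"
    boto3.client("s3").upload_file(str(zip_path), bucket, key)
    return f"s3://{bucket}/{key}"


# Local demonstration with a placeholder WDL file (no AWS calls are made here).
with tempfile.TemporaryDirectory() as workdir:
    wdl = Path(workdir) / "main.wdl"
    wdl.write_text("version 1.0\nworkflow placeholder {}\n")
    archive_path = build_workflow_zip(wdl, Path(workdir) / "mutect2.zip")
    with zipfile.ZipFile(archive_path) as archive:
        archived_names = archive.namelist()
```

In a real run you would call `upload_workflow_zip` with the same bucket and prefix that are passed to the CloudFormation parameters.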

    ├── notebooks/                                  # Interactive Jupyter notebooks
    │   ├── 01-strands-agents-introduction.ipynb
    │   ├── 02-genomics-supervisor-agent.ipynb
    │   ├── 03-multi-agent-genomics-pipeline.ipynb
    │   ├── civic-data/                             # Sample genomics data
    │   │   ├── AssertionSummaries.tsv
    │   │   ├── ClinicalEvidenceSummaries.tsv
    │   │   ├── FeatureSummaries.tsv
    │   │   └── VariantSummaries.tsv
    │   ├── data_discovery_agent.py                 # Data discovery agent implementation
    │   ├── interpretation_and_reporting_agent.py   # Reporting agent
    │   ├── mcp_clients.py                          # MCP client configurations
    │   ├── qc_agent.py                             # Quality control agent
    │   ├── run_graph_agent.py                      # Run monitoring agent
    │   ├── workflow_orchestrator_agent.py          # Workflow orchestration agent
    │   ├── test_workflow_orchestrator.py           # Test utilities
    │   └── requirements.txt                        # Python dependencies
    ├── infrastructure/                             # AWS CloudFormation templates
    │   ├── infrastructure_cfn.yaml                 # Main infrastructure
    │   └── start_workflow/                         # Lambda functions
    │       ├── start_workflow_lambda.py            # Workflow starter Lambda
    │       └── build.sh                            # Build script dependencies
    ├── somatic-variant-calling-pipeline/           # Sample WDL workflow
    │   └── main.wdl                                # Mutect2 workflow
    ├── CODE_OF_CONDUCT.md
    ├── CONTRIBUTING.md
    ├── LICENSE
    └── README.md

- Data Discovery Agent - Find and catalog genomics datasets
- QC Agent - Quality control and validation
- Workflow Orchestrator - Manage workflow execution
- Interpretation & Reporting - Analyze results and generate reports
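
To make the division of labor between these four agents concrete, here is a minimal sketch of a sequential hand-off. Plain Python functions stand in for the LLM-backed agents, and the `PipelineState` class, the stage bodies, and the sample ID are all hypothetical. The workshop's own coordination logic lives in the notebooks and may look quite different.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineState:
    """Shared context handed from one stage to the next."""
    sample_id: str
    notes: dict = field(default_factory=dict)


Stage = Callable[[PipelineState], PipelineState]


def data_discovery(state: PipelineState) -> PipelineState:
    # Stand-in: a real agent would search for and catalog datasets.
    state.notes["datasets"] = [f"{state.sample_id}_tumor", f"{state.sample_id}_normal"]
    return state


def quality_control(state: PipelineState) -> PipelineState:
    # Stand-in check: a tumor/normal pair must be present.
    state.notes["qc_passed"] = len(state.notes.get("datasets", [])) == 2
    return state


def workflow_orchestrator(state: PipelineState) -> PipelineState:
    # Stand-in: a real agent would start a HealthOmics run here.
    state.notes["run_status"] = "SUBMITTED" if state.notes.get("qc_passed") else "SKIPPED"
    return state


def interpretation_and_reporting(state: PipelineState) -> PipelineState:
    state.notes["report"] = f"{state.sample_id}: run {state.notes['run_status']}"
    return state


def run_pipeline(state: PipelineState, stages: List[Stage]) -> PipelineState:
    """Apply each stage in order, passing the shared state along."""
    for stage in stages:
        state = stage(state)
    return state


result = run_pipeline(
    PipelineState("sampleA"),
    [data_discovery, quality_control, workflow_orchestrator, interpretation_and_reporting],
)
```

Keeping each stage behind the same call signature means a stand-in can later be swapped for a real agent without changing the coordinator.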
- Workflow Management - Create, deploy, and version workflows
- Real-time Monitoring - Track execution with automatic polling
- Performance Analysis - Resource optimization recommendations
- Failure Diagnostics - Automated troubleshooting
- Validation - WDL/CWL syntax checking and best practices
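
The "automatic polling" behind real-time monitoring can be sketched as a loop that keeps asking for a run's status until it reaches a terminal state. The version below is an assumption about how such a loop might look, not the workshop's implementation. The `wait_for_run` and `healthomics_status_getter` names are hypothetical, and the status getter is injected so the loop can be tried offline.

```python
import time
from typing import Callable

# Run statuses after which HealthOmics will not change the run any further.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED", "DELETED"}


def wait_for_run(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until a terminal status or until max_polls is reached."""
    status = get_status()
    polls = 1
    while status not in TERMINAL_STATUSES and polls < max_polls:
        sleep(poll_seconds)
        status = get_status()
        polls += 1
    return status


def healthomics_status_getter(run_id: str) -> Callable[[], str]:
    """Build a status getter backed by the HealthOmics GetRun API."""
    import boto3  # imported lazily; needs configured AWS credentials

    omics = boto3.client("omics")
    return lambda: omics.get_run(id=run_id)["status"]


# Offline demonstration with a scripted sequence of statuses and no real sleeping.
fake_statuses = iter(["PENDING", "STARTING", "RUNNING", "COMPLETED"])
final_status = wait_for_run(
    lambda: next(fake_statuses), poll_seconds=0, sleep=lambda _seconds: None
)
```

Against a real run, `wait_for_run(healthomics_status_getter("<run-id>"))` would poll every 30 seconds. The `max_polls` cap keeps a stuck run from blocking an agent indefinitely.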
- HealthOmics Workflows - Pre-configured Mutect2 somatic variant calling
- HealthOmics Workflow Run - Run a test Mutect2 workflow with publicly available data
- S3 Storage - Workflow results and genomics data
- IAM Roles - Secure access management
- SageMaker Notebook - Interactive development environment
- ECR Repositories - Container image management
- Tumor/normal pair analysis
- Scatter-gather parallelization
- VCF to MAF conversion
- Configurable "cooking show" mode for demonstrations
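
Scatter-gather parallelization splits the work into independent shards, processes them concurrently, and merges the results in order. The sketch below shows that pattern in plain Python with a toy per-shard task. It is a conceptual illustration only and does not reproduce how `main.wdl` is written; the function names and the "variant" rule are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open range [start, end)


def split_intervals(start: int, end: int, shards: int) -> List[Interval]:
    """Scatter: split [start, end) into contiguous, roughly equal intervals."""
    size, remainder = divmod(end - start, shards)
    intervals = []
    cursor = start
    for index in range(shards):
        width = size + (1 if index < remainder else 0)
        intervals.append((cursor, cursor + width))
        cursor += width
    return intervals


def call_variants(interval: Interval) -> List[int]:
    """Placeholder shard task; a real shard would run the variant caller."""
    low, high = interval
    # Toy rule: treat every position divisible by 1000 as a variant.
    return [position for position in range(low, high) if position % 1000 == 0]


def scatter_gather(start: int, end: int, shards: int) -> List[int]:
    """Run the shard task on each interval in parallel, then merge in order."""
    intervals = split_intervals(start, end, shards)
    with ThreadPoolExecutor(max_workers=shards) as pool:
        per_shard = list(pool.map(call_variants, intervals))  # map preserves order
    # Gather: concatenate shard results back into one list.
    return [position for shard in per_shard for position in shard]


variants = scatter_gather(1, 10_001, shards=4)
```

Because the intervals are contiguous and `map` preserves their order, the gathered list comes back sorted without an extra merge step.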
By completing this workshop, you will:
- Build Production AI Agents - Create robust agents using the Strands Agents framework
- Integrate MCP Tools - Connect agents to external systems seamlessly
- Automate Genomics Workflows - End-to-end pipeline automation
- Implement Multi-Agent Systems - Coordinate specialized agents
- Optimize Performance - Resource usage and cost optimization
- Handle Failures - Automated error detection and recovery
- Strands Agents - AI agent framework
- AWS HealthOmics - Genomics workflow service
- Amazon Bedrock - Foundation models (Claude)
- Model Context Protocol - Tool integration standard
- WDL - Workflow Description Language
- GATK - Genomics analysis toolkit
strands-agents>=1.0.0
boto3
pandas>=2.3.0
bedrock-agentcore
awslabs-aws-healthomics-mcp-server
awslabs.aws-api-mcp-server>=0.0.13
uv
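
Before opening the notebooks, it can help to confirm that the packages listed above are installed in the active environment. The sketch below is one way to do that with the standard library. The `requirement_name` and `missing_requirements` helpers are hypothetical, and the check looks only at whether a package is present, not at whether its version satisfies the specifier.

```python
import re
from importlib import metadata
from typing import Iterable, List


def requirement_name(line: str) -> str:
    """Strip version specifiers, e.g. 'pandas>=2.3.0' becomes 'pandas'."""
    return re.split(r"[<>=!~\s;\[]", line.strip(), maxsplit=1)[0]


def missing_requirements(lines: Iterable[str]) -> List[str]:
    """Return the names of listed packages that are not installed."""
    missing = []
    for line in lines:
        name = requirement_name(line)
        if not name or name.startswith("#"):
            continue  # skip blank lines and comments
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

For example, `missing_requirements(open("notebooks/requirements.txt"))` returns an empty list when every listed package is present.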
For workshop-related questions:
- Check the notebook documentation
- Review the infrastructure logs
- Consult AWS HealthOmics documentation