BSub Wizard

Interactive wizard for creating LSF bsub commands for the Janelia Research Campus compute cluster.

Overview

The BSub Wizard is a terminal-based application that guides users through building correctly formatted bsub commands for job submission on the Janelia Research Campus compute cluster. Users do not need to memorize LSF syntax or cluster-specific options, and the wizard displays resource and cost information at each step so that requests can be sized sensibly.

Features

🎯 Guided Workflow

8-step wizard covering all aspects of job submission
Smart defaults based on job type and cluster best practices
Real-time validation with helpful error messages
Context-sensitive help explaining each option

🖥️ Job Types Supported

CPU Jobs - General computational tasks
GPU Jobs - ML/AI workloads with comprehensive GPU selection
Interactive Sessions - Development and debugging
MPI/Parallel Jobs - Multi-node parallel processing

⚙️ Resource Configuration

CPU allocation with automatic memory calculation (15GB per slot)
GPU selection from GH200, H200, H100, A100, L4, T4
Architecture requirements (AVX2, AVX512, AMX)
Queue optimization with cost and runtime information
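
The automatic memory calculation can be illustrated with a minimal sketch. It assumes memory scales linearly at the 15GB-per-slot figure listed above; the function names are hypothetical and are not taken from the wizard's source.

```python
import math

# Assumption: each slot carries 15 GB of memory, per the feature list.
GB_PER_SLOT = 15


def memory_for_slots(slots: int) -> int:
    """Return the memory in GB implied by a slot count."""
    if slots < 1:
        raise ValueError("slots must be a positive integer")
    return slots * GB_PER_SLOT


def slots_for_memory(memory_gb: float) -> int:
    """Return the smallest slot count that provides at least memory_gb."""
    if memory_gb <= 0:
        raise ValueError("memory_gb must be positive")
    return math.ceil(memory_gb / GB_PER_SLOT)


print(memory_for_slots(8))   # 8 slots imply 120 GB
print(slots_for_memory(64))  # 64 GB needs 5 slots (75 GB)
```

Rounding up in slots_for_memory means a job may be allocated slightly more memory than requested, which is usually the safer direction.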

📊 Advanced Features

Cost estimation before job submission
Array job support with automatic task calculation
File management with Janelia storage integration
Environment variables and license management
Command preview and script generation

💾 Data Management

Configuration saving/loading for reusable job templates
Script export for integration with existing workflows
Copy to clipboard functionality
JSON configuration format for automation

Installation

Prerequisites

Python 3.8 or higher
Terminal with Unicode support
Access to Janelia compute cluster

Install from Source

git clone https://github.com/janelia/bsub-wizard.git
cd bsub-wizard
pip install -r requirements.txt
python main.py

Install as Package

pip install bsub-wizard
bsub-wizard

Quick Start

1. Launch the Wizard

python main.py

2. Follow the Steps

Welcome - Choose quick start or custom configuration
Job Type - Select CPU, GPU, Interactive, or MPI
Resources - Configure cores, memory, and GPUs
Queue - Select optimal queue with cost information
Runtime - Set time limits and job scheduling
Files - Configure input/output and working directories
Advanced - Set architecture, licenses, environment variables
Review - Generate and copy the final command

3. Use Generated Command

# Example output
bsub -J "my_ml_job" -n 12 -gpu "num=1:gmodel=NVIDIAA100_SXM4_80GB" -q gpu_a100 -W 4:00 -o /groups/mylab/output.log 'python train.py'

Cluster Information

Available Resources

CPU Nodes

Skylake (e10): 48 cores, 768GB RAM per node
Cascade Lake (h07): 48 cores, 768GB RAM per node
Sapphire Rapids (H06): 64 cores, 1TB RAM per node

GPU Types

GPU Model          VRAM    Slots/GPU   Cost/Hour   Best For
GH200 Super Chip   96GB    72          $0.80       Cutting-edge AI research
H200 SXM5          141GB   12          $0.80       Large model training
H100 SXM5          80GB    12          $0.50       High-performance ML
A100 SXM4          80GB    12          $0.20       General ML/AI workloads
Tesla L4           24GB    8-64        $0.10       Cost-effective inference
Tesla T4           16GB    48          $0.10       Development/testing
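
As a rough illustration of cost estimation, the sketch below multiplies the hourly rates from the table by GPU count and runtime. It assumes the Cost/Hour column is charged per GPU per hour, which the table does not state explicitly; the function name and dictionary are hypothetical, and the wizard's own estimate may be computed differently.

```python
from decimal import Decimal

# Hourly rates copied from the GPU table.
# Assumption: the rate applies per GPU per hour.
GPU_RATE_PER_HOUR = {
    "GH200": Decimal("0.80"),
    "H200": Decimal("0.80"),
    "H100": Decimal("0.50"),
    "A100": Decimal("0.20"),
    "L4": Decimal("0.10"),
    "T4": Decimal("0.10"),
}


def estimate_gpu_cost(gpu: str, num_gpus: int, hours: float) -> Decimal:
    """Estimate the cost in dollars as rate x GPUs x hours."""
    if gpu not in GPU_RATE_PER_HOUR:
        raise ValueError(f"unknown GPU type: {gpu!r}")
    if num_gpus < 1 or hours <= 0:
        raise ValueError("num_gpus and hours must be positive")
    return GPU_RATE_PER_HOUR[gpu] * num_gpus * Decimal(str(hours))


# One A100 for 24 hours under the stated assumption.
print(estimate_gpu_cost("A100", 1, 24))
```

Decimal is used instead of float so that small currency amounts do not pick up binary rounding noise.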

Queue Types

interactive - Real-time sessions (8h default, 48h max)
local - Long-running CPU jobs (30 days max)
short - Quick jobs under 1 hour
gpu_* - Specialized GPU queues by hardware type
mpi - Multi-node parallel processing
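
The queue limits above lend themselves to a simple validation check. The following is a minimal sketch, not the wizard's actual validator: it parses an LSF-style [hours:]minutes runtime string and compares it with the maximums listed for the interactive, local, and short queues. It assumes the short queue limit is 60 minutes, and it skips queues whose limits are not stated here (gpu_*, mpi). All names are hypothetical.

```python
from typing import Optional

# Maximum runtimes in minutes, taken from the queue list above.
# Assumption: "under 1 hour" for the short queue means at most 60 minutes.
QUEUE_MAX_MINUTES = {
    "interactive": 48 * 60,
    "local": 30 * 24 * 60,
    "short": 60,
}


def parse_runtime(limit: str) -> int:
    """Convert an LSF-style '[hours:]minutes' string to minutes."""
    parts = limit.split(":")
    if len(parts) > 2 or not all(p.isdigit() for p in parts):
        raise ValueError(f"invalid runtime limit: {limit!r}")
    if len(parts) == 2:
        return int(parts[0]) * 60 + int(parts[1])
    return int(parts[0])


def check_runtime(queue: str, limit: str) -> Optional[str]:
    """Return an error message if limit exceeds the queue maximum, else None."""
    minutes = parse_runtime(limit)
    max_minutes = QUEUE_MAX_MINUTES.get(queue)
    if max_minutes is not None and minutes > max_minutes:
        return (f"{limit} exceeds the {queue} queue maximum "
                f"of {max_minutes} minutes")
    return None


print(check_runtime("local", "12:00"))  # within the 30-day limit
print(check_runtime("short", "2:00"))   # too long for the short queue
```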

Storage Locations

/groups/ - Primary research file system (backed up)
/nrs/ - Non-redundant storage for large datasets
/scratch/ - Node-local high-speed temporary storage

Examples

CPU Job

bsub -J "data_analysis" -n 8 -q local -W 12:00 -o /groups/mylab/analysis.log 'python analyze_data.py'

GPU Training Job

bsub -J "model_training" -n 12 -gpu "num=1:gmodel=NVIDIAA100_SXM4_80GB" -q gpu_a100 -W 24:00 -o /groups/mylab/training.log 'python train_model.py'

Interactive Session

bsub -n 4 -W 8:00 -Is -XF /bin/bash

Array Job

bsub -J "parallel_analysis[1-100]%10" -n 4 -q local -o /groups/mylab/output_%I.log 'python process.py $LSB_JOBINDEX'
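
To show how the task count follows from an array specification like the one above, here is a small sketch that parses a job name of the form name[start-end]%limit, with an optional :step inside the brackets. It handles only a single index range, which is narrower than what LSF accepts, and the class and function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class ArraySpec(NamedTuple):
    name: str
    tasks: int
    concurrency: Optional[int]  # the %N limit on simultaneously running tasks


_ARRAY_RE = re.compile(
    r"^(?P<name>[^\[\]]+)\[(?P<start>\d+)-(?P<end>\d+)(?::(?P<step>\d+))?\]"
    r"(?:%(?P<limit>\d+))?$"
)


def parse_array_spec(spec: str) -> ArraySpec:
    """Parse 'name[start-end[:step]][%limit]' and count the tasks."""
    m = _ARRAY_RE.match(spec)
    if m is None:
        raise ValueError(f"unsupported array spec: {spec!r}")
    start, end = int(m["start"]), int(m["end"])
    step = int(m["step"] or 1)
    if step < 1 or end < start:
        raise ValueError(f"invalid index range in {spec!r}")
    tasks = (end - start) // step + 1
    limit = int(m["limit"]) if m["limit"] else None
    return ArraySpec(m["name"], tasks, limit)


spec = parse_array_spec("parallel_analysis[1-100]%10")
print(spec.tasks, spec.concurrency)  # 100 tasks, at most 10 running at once
```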

Configuration Files

Saving Configurations

The wizard can save job configurations as JSON files for reuse:

{
  "job_type": "gpu",
  "job_name": "my_training_job",
  "slots": 12,
  "queue": "gpu_a100",
  "runtime_limit": "24:00",
  "gpu_config": {
    "gpu_type": "NVIDIAA100_SXM4_80GB",
    "num_gpus": 1,
    "gpu_mode": "exclusive_process"
  },
  "command": "python train.py"
}

Loading Configurations

python main.py --config my_job_config.json
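
For automation outside the wizard, a saved configuration can be turned into a command line with a few lines of Python. This is a minimal sketch under stated assumptions: it uses only the fields shown in the JSON example above, it is not the wizard's own generator, and the function name build_bsub_command is hypothetical. Passing gpu_mode through as the mode= setting of the -gpu option is also an assumption about how the wizard maps that field.

```python
import json
import shlex


def build_bsub_command(config: dict) -> str:
    """Assemble a bsub command string from a saved configuration."""
    args = ["bsub", "-J", config["job_name"], "-n", str(config["slots"])]
    gpu = config.get("gpu_config")
    if gpu:
        # Assumption: gpu_mode corresponds to the mode= key of -gpu.
        args += ["-gpu", "num={num_gpus}:gmodel={gpu_type}:mode={gpu_mode}".format(**gpu)]
    args += ["-q", config["queue"], "-W", config["runtime_limit"]]
    args.append(config["command"])
    # shlex.join quotes any argument that contains shell-special characters.
    return shlex.join(args)


saved = """
{
  "job_type": "gpu",
  "job_name": "my_training_job",
  "slots": 12,
  "queue": "gpu_a100",
  "runtime_limit": "24:00",
  "gpu_config": {
    "gpu_type": "NVIDIAA100_SXM4_80GB",
    "num_gpus": 1,
    "gpu_mode": "exclusive_process"
  },
  "command": "python train.py"
}
"""
print(build_bsub_command(json.loads(saved)))
```

Building the argument list first and quoting it at the end keeps job commands that contain spaces or quotes intact when the result is pasted into a shell.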

Keyboard Shortcuts

Enter - Next step
Escape - Previous step
Ctrl+S - Save configuration
Ctrl+L - Load configuration
F1 - Help
Q or Ctrl+C - Quit

Advanced Usage

Environment Variables

Set custom environment variables for your jobs:

export CUDA_VISIBLE_DEVICES=0
export OMP_NUM_THREADS=4

Custom Resource Requirements

Use LSF resource expressions:

-R"select[mem>64000]"
-R"rusage[idl=6]"

Architecture Targeting

-R"select[avx512]"  # Require AVX512 support
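
Several requirements can be combined into a single -R string. The sketch below composes select and rusage sections from Python values; it covers only these two sections, joins select conditions with && and rusage entries with a colon, and uses a hypothetical function name. The resource names (mem, avx512, idl) are the ones used in the examples above.

```python
from typing import Mapping, Optional, Sequence


def resource_string(select: Sequence[str] = (),
                    rusage: Optional[Mapping[str, int]] = None) -> str:
    """Build an LSF resource requirement string from its parts."""
    sections = []
    if select:
        # All select conditions must hold, so join them with logical AND.
        sections.append("select[" + " && ".join(select) + "]")
    if rusage:
        pairs = ":".join(f"{name}={value}" for name, value in rusage.items())
        sections.append(f"rusage[{pairs}]")
    return " ".join(sections)


req = resource_string(select=["mem>64000", "avx512"], rusage={"idl": 6})
print(f'-R "{req}"')
```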

Troubleshooting

Common Issues

"Permission Denied" Errors

Ensure your account is enabled for cluster access
Submit a helpdesk ticket for account activation

Jobs Not Starting

Check queue limits and availability
Verify resource requirements are reasonable
Use bjobs to monitor job status

High Costs

Review GPU selection (L4/T4 are more economical)
Set appropriate runtime limits
Consider using the short queue for quick jobs

Getting Help

Built-in Help - Press F1 in the wizard
Cluster Documentation - Available on the Janelia wiki
Scientific Computing Team - Submit helpdesk ticket
GitHub Issues - Report bugs and feature requests

Development

Setting Up Development Environment

git clone https://github.com/janelia/bsub-wizard.git
cd bsub-wizard
pip install -e ".[dev]"

Running Tests

pytest tests/

Code Formatting

black wizard/
flake8 wizard/
mypy wizard/

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Areas for Contribution

Additional job templates
Queue optimization algorithms
Integration with other cluster tools
Documentation improvements
Testing and bug reports

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Janelia Scientific Computing Team for cluster expertise
Textual framework for the excellent TUI capabilities
Janelia Research Campus for supporting open source development

Changelog

Version 1.0.0

Initial release with full wizard functionality
Support for all Janelia cluster resources
Comprehensive validation and cost estimation
Configuration save/load capabilities
Complete job script generation

Note: This tool is specifically designed for the Janelia Research Campus compute cluster. For other LSF clusters, configuration may need to be adapted.

For the latest updates and documentation, visit: https://github.com/janelia/bsub-wizard
