Skip to content

Latest commit

 

History

History
76 lines (56 loc) · 2.61 KB

File metadata and controls

76 lines (56 loc) · 2.61 KB

Running Long-Term Jobs on a Server Using screen or nohup

When you need to run Python scripts on a server for an extended period—such as downloading ERA5 or CMEMS datasets—you can use the screen or nohup commands.
This allows tasks to continue running even if the network connection to the server is interrupted.


Commonly Used screen Commands (interactive debugging)

  • Create a new screen session Bash
      # Create a new screen session
      screen -S <session_name>
      conda activate <your_envirnment_name>
      python your_python_script.py
    
      # Detach from the current screen session (the task keeps running)
      # Press the following key sequence:
      Ctrl + A, then press D
    
      # List all existing screen sessions
      screen -ls
    
      # Reattach to a specific screen session
      screen -r <session_id_or_name>
    
      # Terminate a screen session
      screen -X -S <session_id_or_name> quit
      # (Alternatively, you can reattach to the session and type `exit` to close it)
    

A More Professional bash + nohup + log Workflow (production runs)

For long-running Python jobs (e.g., downloading ERA5 or CMEMS data, model training, or large-scale batch processing), a bash + nohup + explicit logging workflow is more professional, reproducible, and suitable for production or HPC environments.

  • 1. Create a Dedicated Bash Script run_job.sh
    #!/bin/bash
    
    # Environment setup
    source ~/.bashrc
    conda activate <your_environment_name>
    
    # Working directory
    cd /path/to/your/project || exit 1
    
    # Run Python script
    python your_python_script.py
    
  • 2. Make the script executable and Launch the Job Using nohup with Log Redirection Bash
    # Make the script executable
    chmod +x run_job.sh 
    
    nohup ./run_job.sh > job_$(date +%Y%m%d_%H%M%S).log 2>&1 &
    # 'nohup' keeps the job running after SSH disconnects
    # '>' edirects standard output (stdout) to a log file
    # '2>&1' edirects standard error (stderr) to the same log file
    # '&' runs the job in the background
    # Timestamped log filenames prevent overwriting and improve traceability
    
  • 3. Check Running Processes & Monitor Logs in Real Time Bash
    ps -u $USER | grep your_python_script.py
    tail -f logs/job_YYYYMMDD_HHMMSS.log
    
  • 4. Stop a Running Job Bash
    ps -ef | grep your_python_script.py
    kill <PID>