SkyPilot v0.3.1

@concretevitamin released this 04 Jun 17:29

This patch release ships several important enhancements and bug fixes:

Enhancements

  • On-demand H100 GPUs from Lambda are now supported: sky launch --gpus h100
    • To use it, remove any previous Lambda catalog: rm -rf ~/.sky/catalogs/v5/lambda
  • Managed spot: make job cancellation during failover more robust to mitigate a rare FAILED_SETUP error (#1998)
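The H100 steps above, put together as a minimal sketch. The two commands come from the notes; the catalog_dir variable and the check for the sky CLI on PATH are only illustrative additions.

```shell
# Remove any previously downloaded Lambda catalog, as the note above instructs.
# catalog_dir is an illustrative variable name, not a SkyPilot setting.
catalog_dir="$HOME/.sky/catalogs/v5/lambda"
rm -rf "$catalog_dir"

# Request an on-demand H100. The guard only skips the launch when the
# sky CLI is not installed; it is not part of SkyPilot itself.
if command -v sky >/dev/null 2>&1; then
    sky launch --gpus h100
fi
```

rm -rf succeeds even if the catalog directory does not exist, so the snippet is safe to rerun.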

Fixes

  • Provisioner / Backend
    • Fix provision failover encountering FileNotFoundError (#2005)
    • Fix user-level ray cluster causing SkyPilot cluster to be in INIT state (#2020)
  • Logging
    • Fix certain logs of multi-node jobs not being streamed due to Ray 2.4 log dedup (#2026)
    • Fix logs being created under the current working directory ($PWD/~/sky_logs) in some cases (#2009)
  • Managed spot
    • Fix sky spot launch --retry-until-up to make it actually retry until up (#2004)
  • Storage
    • Fix a rare storage cloud check error if sky check has never been called (#2017)
  • On-prem
    • Fix detecting A5000 and A6000 GPUs (#2023)
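For the $PWD/~/sky_logs fix listed under Logging, here is a generic sketch of how a path of that shape can arise in a shell when a tilde is not expanded. It only illustrates the path pattern; it is not SkyPilot's code and not a confirmed root cause of #2009. The names demo_dir and log_dir are hypothetical.

```shell
# Work in a throwaway directory so nothing in the real home directory is touched.
demo_dir="$(mktemp -d)"

# A quoted tilde is not expanded, so this creates a literal "~" directory
# under the current working directory: $PWD/~/sky_logs.
(cd "$demo_dir" && mkdir -p "~/sky_logs")
ls -d "$demo_dir/~/sky_logs"

# $HOME expands inside double quotes, giving the intended absolute path.
log_dir="$HOME/sky_logs"
echo "$log_dir"
```

If a stray directory named ~ shows up in a project folder, remove it with a quoted or ./-prefixed path (for example rm -r "./~"), because an unquoted ~ would expand to the home directory.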

Full Changelog: v0.3.0...v0.3.1