Official AWS Batch step operator #3919

@SebastianScherer88

Description

Contact Details [Optional]

[email protected]

Feature Description

It would be great to get an official version of this interesting AWS Batch step operator into the main ZenML library.

Problem or Use Case

I think it's quite common for people to have

  • AWS infra and
  • heterogeneous compute requirements in their pipeline steps - not everything needs to run on SageMaker

Running a local (Docker) orchestrator that can push individual components to powerful remote execution engines like AWS Batch sounds super useful to me. It's definitely something I would be interested in (I work in ML as an ops engineer).

Happy to contribute based on the linked reference plugin implementation provided by you guys.

Proposed Solution

A hardened, more configurable version of the linked plugin implementation that

  • allows for step resource configuration that gets mapped canonically (where possible) onto AWS Batch resource specs
  • defaults to AWS infra settings that are compatible with the current Terraform setup utils (where possible)
  • integrates with every orchestrator that honours the canonical steplauncher approach (i.e. not the LocalDockerOrchestrator)
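
To make the first bullet concrete, here is a minimal sketch of what such a mapping could look like. It is not taken from the linked plugin: `StepResources` and `to_batch_resource_requirements` are hypothetical names used for illustration only, and the memory field is assumed to already be expressed in MiB. The output shape follows the `resourceRequirements` list that AWS Batch accepts (entries of type `VCPU`, `MEMORY` and `GPU` with string values).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class StepResources:
    """Hypothetical stand-in for a step's resource configuration."""

    cpu_count: Optional[float] = None
    gpu_count: Optional[int] = None
    memory_mib: Optional[int] = None  # assumption: memory already given in MiB


def to_batch_resource_requirements(resources: StepResources) -> List[Dict[str, str]]:
    """Translate step resources into an AWS Batch `resourceRequirements` list.

    Unset fields are skipped so that the job definition defaults apply.
    """
    requirements: List[Dict[str, str]] = []
    if resources.cpu_count is not None:
        # ":g" renders 4 or 4.0 as "4" and keeps fractions such as 0.25
        requirements.append({"type": "VCPU", "value": f"{resources.cpu_count:g}"})
    if resources.memory_mib is not None:
        requirements.append({"type": "MEMORY", "value": str(resources.memory_mib)})
    if resources.gpu_count:
        # Only request GPUs when a positive count is configured
        requirements.append({"type": "GPU", "value": str(resources.gpu_count)})
    return requirements


if __name__ == "__main__":
    print(to_batch_resource_requirements(
        StepResources(cpu_count=4, gpu_count=1, memory_mib=16384)
    ))
```

A list like this could then be passed as `containerOverrides={"resourceRequirements": ...}` when submitting the job via boto3's `submit_job`. How the real step operator would read these values from the step configuration is left open here.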

Alternatives Considered

The official SageMaker step operator. AWS Batch would be a cheaper (no price markup for SageMaker ml.... instance types) and more flexible way of launching scalable custom compute jobs.

Additional Context

Implementation draft (unofficial AWS Batch step operator plugin)

Priority

Low - Nice to have

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Labels

  • backend: Implementations related to the backend
  • contribution: Labels for externally contributed implementations

Projects

Status

In Progress

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
