Create a simple, generic, walkthrough example #14

@llewelld

Description

User story: As a walkthrough developer, I'd like to have a simple example workflow to demonstrate the main stages in an HPC workflow, so that I can build other walkthroughs around it.

The walkthroughs in this repository typically take three forms:

  1. Hints and tips that can be read quickly for specific tasks that relate to specific HPC systems.
  2. Explanations of how to deploy specific packages to specific HPC systems.
  3. Generic processes that can be applied to HPC systems for performing specific tasks.

For the third form, a generic workflow is usually needed for demonstration purposes. Using the same example workflow across walkthroughs, or at least a generic one that can be built on, makes each walkthrough easier to write. It also helps the reader, who can start immediately from a familiar example.

This task is therefore to create an example workflow that can be used for developing future walkthroughs.

The example should:

  1. Be deployable to multiple HPC systems (at least Baskerville, DAWN, Isambard-AI and Azure).
  2. Capture the main features that an HPC workflow typically employs. For example:
     - Batch scripts for use with Slurm sbatch.
     - A workflow that also works with srun.
     - Use of PyTorch.
     - Potentially both training and inference pipelines.
     - The ability to distribute across multiple GPUs and multiple nodes.
  3. Be as lightweight, minimal and easily understandable as possible.
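
As a starting point for discussion, the requirements above might be met by something like the following minimal sketch. It is only an illustration under stated assumptions, not a proposed final design: the function names `slurm_dist_env` and `train`, the toy linear model and the synthetic data are all hypothetical. It assumes that each process is launched as one Slurm task (so that Slurm sets `SLURM_PROCID`, `SLURM_NTASKS` and `SLURM_LOCALID`), that the batch script exports `MASTER_ADDR` and `MASTER_PORT`, and that every task can see all GPUs on its node.

```python
import os


def slurm_dist_env(env=None):
    """Derive process layout from Slurm's per-task environment variables.

    Falls back to a single-process layout when the variables are absent,
    so the same script also runs outside Slurm.
    """
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("SLURM_PROCID", 0)),
        "world_size": int(env.get("SLURM_NTASKS", 1)),
        "local_rank": int(env.get("SLURM_LOCALID", 0)),
    }


def train(steps=10):
    """Train a toy model, using DistributedDataParallel if there are several tasks."""
    # Imported lazily so that slurm_dist_env has no PyTorch dependency.
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    info = slurm_dist_env()
    use_cuda = torch.cuda.is_available()
    # Assumption: each task can see every GPU on its node, so the local
    # rank can be used directly as the CUDA device index.
    device = torch.device("cuda", info["local_rank"]) if use_cuda else torch.device("cpu")
    if use_cuda:
        torch.cuda.set_device(device)

    model = torch.nn.Linear(8, 1).to(device)
    distributed = info["world_size"] > 1
    if distributed:
        # Assumption: MASTER_ADDR and MASTER_PORT are exported by the batch script.
        dist.init_process_group(
            backend="nccl" if use_cuda else "gloo",
            init_method="env://",
            rank=info["rank"],
            world_size=info["world_size"],
        )
        model = DDP(model, device_ids=[info["local_rank"]] if use_cuda else None)

    optimiser = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(steps):
        # Synthetic regression task: predict the sum of the inputs.
        x = torch.randn(32, 8, device=device)
        y = x.sum(dim=1, keepdim=True)
        loss = torch.nn.functional.mse_loss(model(x), y)
        optimiser.zero_grad()
        loss.backward()
        optimiser.step()

    if info["rank"] == 0:
        print(f"final loss: {loss.item():.4f}")
    if distributed:
        dist.destroy_process_group()
```

Reading the layout from Slurm's own variables means the same file could be started with srun interactively or from inside an sbatch script, with a final call to `train()` added at the bottom of the script. An inference pipeline could reuse `slurm_dist_env` in the same way.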

Metadata

Labels: enhancement
