New RFC: EESSI/EasyBuild and nf-core #103

@ocaisa

Have you read the RFC docs?

  • Yes, I have read and understood the RFC docs

Summary

Leverage the European Environment for Scientific Software Installations (EESSI, pronounced as "easy") to provide portability of pipelines between CPU and GPU architectures. EESSI currently supports 14 different CPU builds across x86 and ARM CPU families and all major CUDA compute capabilities, using a best-fit approach to match the instructions available on the current CPU to the builds that EESSI ships.
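As a rough illustration of what "best fit" can mean here, the sketch below picks the most specialised build whose required CPU features are all present on the host. The build labels, the feature sets, and the `best_fit` helper are hypothetical and chosen only for illustration; they are not EESSI's actual target names or detection logic.

```python
# Hypothetical build labels mapped to the CPU features each build requires.
# These are illustrative only, not the builds EESSI actually ships.
BUILDS = {
    "generic": {"sse2"},
    "avx2": {"sse2", "avx", "avx2"},
    "avx512": {"sse2", "avx", "avx2", "avx512f"},
}


def best_fit(host_flags, builds=BUILDS):
    """Return the compatible build requiring the most features, or None."""
    host = set(host_flags)
    # A build is compatible when every feature it requires is on the host.
    compatible = {name: req for name, req in builds.items() if req <= host}
    if not compatible:
        return None
    # Prefer the build that exploits the largest number of host features.
    return max(compatible, key=lambda name: len(compatible[name]))


if __name__ == "__main__":
    print(best_fit({"sse2", "avx", "avx2", "fma"}))
```

A host without `avx512f` falls back to the `avx2` build in this sketch, and a host with none of the listed features gets no match.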

Champion

@ocaisa

Background & Motivation

EESSI is a collaborative project that creates a common, high-performance scientific software stack for use on high-performance computing (HPC) systems and other Linux-based devices like laptops, workstations and cloud environments. It provides a way for researchers to access the same software on their laptop, in the cloud, or across different European HPC sites, ensuring portability, performance, and reproducibility. The software is distributed via CernVM-FS (CVMFS), a read-only file system, making it easy to stream and use without complex installation processes.

In the past, EESSI has used Nextflow in demonstrations to show how a portable workflow pipeline, including its complete execution environment, could be shipped via a GitHub repository. Without full support for the nf-core/modules required by a pipeline, however, the utility of this approach is limited.

We (EESSI) and others in the EasyBuild community (EasyBuild is the backend used to install software within EESSI) are interested in exploring support for nf-core pipelines via EasyBuild and EESSI.

Goals

Practically speaking, with respect to EasyBuild this would likely mean setting the following targets for specific pipelines (and pipeline versions):

  • Map the nf-core/module requirements of a pipeline to software available in EasyBuild (adding new software if necessary)
    • This would be done inside an easyconfig, which is a recipe for an installation (in this case, of a specific version of the pipeline)
    • Dependencies are explicitly listed (including their versions), and the required nf-core/modules would likely also need to be explicitly listed. We note that https://github.com/nf-core/modules does not have releases, so we would pin each nf-core/module by commit.
    • A mapping is likely required between the EasyBuild naming and the nf-core naming (this would be done centrally)
  • Test the installation of the pipeline
    • This is done in a general way via an easyblock (Python code)
    • This requires testing the individual nf-core/modules, as the versions the nf-core/modules use in their environment files are unlikely to match the versions currently shipped via EasyBuild (EasyBuild has a strong preference for the latest versions of packages). We would rely on the tests for nf-core/modules to verify that a specific module works as expected with the software stack supplied by EasyBuild.
    • We would also run the tests for the pipeline itself before proceeding with the final installation.
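The mapping and pinning steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not an existing EasyBuild API: the `NFCORE_TO_EASYBUILD` table, the `resolve_dependencies` helper, the commit strings, and the version numbers are all hypothetical. The output uses the `(name, version)` tuple form that easyconfigs use for their `dependencies` list.

```python
# Hypothetical, centrally maintained mapping from nf-core/module names
# to EasyBuild software names.
NFCORE_TO_EASYBUILD = {
    "bwa/mem": "BWA",
    "samtools/sort": "SAMtools",
    "fastqc": "FastQC",
}


def resolve_dependencies(pinned_modules, available):
    """Map pinned nf-core/modules to EasyBuild dependencies.

    pinned_modules: dict of nf-core/module name -> commit it is pinned to
    available: dict of EasyBuild software name -> version that is available
    Returns (dependencies, missing), where dependencies is a list of
    (name, version) tuples and missing lists modules with no match yet.
    """
    deps, missing = [], []
    for module, commit in sorted(pinned_modules.items()):
        if not commit:
            # nf-core/modules has no releases, so a commit pin is mandatory.
            raise ValueError(f"{module} is not pinned to a commit")
        eb_name = NFCORE_TO_EASYBUILD.get(module)
        if eb_name is None or eb_name not in available:
            missing.append(module)  # software would need adding to EasyBuild
            continue
        dep = (eb_name, available[eb_name])
        if dep not in deps:  # several modules may share one tool
            deps.append(dep)
    return deps, missing


if __name__ == "__main__":
    # Placeholder commits and illustrative versions only.
    pinned = {"bwa/mem": "aaaaaaa", "samtools/sort": "bbbbbbb", "multiqc": "ccccccc"}
    available = {"BWA": "0.7.18", "SAMtools": "1.21"}
    print(resolve_dependencies(pinned, available))
```

Modules returned in `missing` correspond to the "adding new software if necessary" case: either the central mapping lacks an entry or EasyBuild does not yet provide the software.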

Once the pipeline is installed, an end user could use environment modules to load the specific pipeline:

module load nf-core::sarek/3.6.1

Shipping the pipeline via EESSI is a relatively trivial step once a working EasyBuild recipe is in place.

From the Nextflow perspective, we would be interested in understanding how EESSI might be supported as an execution environment. Since we would control the installation of Nextflow being used (Nextflow itself would be a dependency of the EasyBuild recipe), there are likely a number of options here.

Non-Goals

  • We are not looking to support all nf-core pipelines at once. Support would be added case by case, with the aim of creating a general mechanism for adding a pipeline to EasyBuild/EESSI.

References

Metadata

    Status

    proposed
