Conference call notes 20250226

Kenneth Hoste edited this page Feb 26, 2025 · 1 revision

Notes on the 265th EasyBuild conference call, Wednesday 26 Feb 2025 (09:00 UTC / 10:00 CET)

Attendees

List of attendees (14):

  • Jasper Grimm (University of York, UK)
  • Alexander Grund (ZIH, Dresden, Germany)
  • Leonardo Honfi Camilo (Wageningen University, The Netherlands)
  • Kenneth Hoste (HPC-UGent, Belgium)
  • Adam Huffman (Big Data Institute, Oxford, UK)
  • Georgios Kafanas (University of Luxembourg)
  • Kurt Lust (UAntwerpen / LUMI)
  • Alan O'Cais (Univ. of Barcelona, CECAM)
  • Jure Pečar (EMBL)
  • Andrea Piserchia (E4)
  • Jörg Saßmannshausen (Imperial College London, UK)
  • Roberto Scipioni (Red Oak Consulting, UK)
  • Alain van Hoof (TU Eindhoven, Netherlands)
  • Cintia Willemyns (Vrije Universiteit Brussel, Belgium)

Agenda

  • overview of recent developments
  • outlook to EasyBuild 5.0 release
  • 2025a update of common toolchains
  • Q&A

Recent developments

  • latest EasyBuild release:
  • next (stable) EasyBuild release:
    • EasyBuild v5.0.0 🔥
      • current target release date: Wed 12 March 2025
        • 5.0.x branches will likely be collapsed into develop after the release of EasyBuild 5.0.0
          • this avoids a situation where --from-pr is broken for everyone while no EasyBuild release is available in which it still works...
      • next EasyBuild conf call on Wed 12 March may be replaced with a presentation on EasyBuild 5.0
    • additional EasyBuild 4.9.x versions could still be done via 4.9.x branches, but none are planned currently
  • EasyBuild v5.0.0
    • project board: https://github.com/orgs/easybuilders/projects/18/views/2
    • detailed notes on latest developments here
    • to test development version of EasyBuild 5.0:
      # set up Python virtual environment, and jump into it
      python3 -m venv eb5
      source eb5/bin/activate
      
      # install EasyBuild 5.0 development version into it
      pip install https://github.com/easybuilders/easybuild-framework/archive/5.0.x.tar.gz
      pip install https://github.com/easybuilders/easybuild-easyblocks/archive/5.0.x.tar.gz
      pip install https://github.com/easybuilders/easybuild-easyconfigs/archive/5.0.x.tar.gz
      
      # go!
      eb --version

Merged PRs

  • docs (merged PRs)

    • ...
  • framework (merged PRs)

    • bug fixes
      • incompatibility with Lmod 8.7.56 due to module show producing non-zero exit code for non-existent module (issue #4759)
        • fixed in Lmod 8.7.57: module show will (again) always exit with a zero exit code when $LMOD_QUIET is set (which EasyBuild has been doing for years)
      • [5.0.x] trouble with huge build directories (issue #22247)
        • fixed by not keeping debug symbols by default (PR #4764)
      • [5.0.x] correctly deal with easyblocks that still use deprecated make_module_req_guess method: remove environment variables if they're not present in guesses (PR #4763)
    • enhancements
      • ...
    • changes
      • [5.0.x] Deprecate use of parallel easyconfig parameter and fix updating the template value (PR #4580)
      • [5.0.x] Let jobs retweak easyconfigs themselves (PR #4669)
    • code cleanup
      • ...
  • easyblocks (merged PRs)

    • bug fixes
      • [5.0.x] Fix building PyTorch when using setup.py as the build command (PR #3574, fixes issue #3570)
      • [5.0.x] Fix $PYTHONPATH for hermetic python in TensorFlow builds with EB 5.x (PR #3568)
      • [develop] Set Cargo variables also for extensions (PR #3576)
      • [5.0.x] fix compatibility with --module-only in AOCC easyblock (PR #3594)
      • [5.0.x] Fix bug in FlexiBLAS easyblock to allow AOCL-BLAS to be default (PR #3605)
      • [5.0.x] fix Molpro easyblock in module-only mode (PR #3615)
    • enhancements
      • [develop] Explicitly mention that the PyTorch easyblock needs updating when failing for this reason (PR #3255)
      • [5.0.x] enhance FlexiBLAS easyblock to add support for AOCL-BLAS backend (PR #3589)
      • [5.0.x] enhance handling of PETSC_ARCH in SLEPc easyblock (PR #3629)
    • updates
      • [5.0.x] revamp NEURON easyblock (PR #3618)
    • changes
      • [5.0.x] adapt easyblocks to use module_load_environment instead of deprecated make_module_req_guess
      • [5.0.x] update easyblocks to use EasyConfig.parallel property (PR #3557)
    • new easyblocks
      • ...
    • code cleanup
      • [5.0.x] remove custom easyblock for MTL4 (PR #3617)
      • [5.0.x] remove unused Primer3 easyblock (PR #3621)
      • [5.0.x] set minimum supported version of PETSc to v3.9 (PR #3627)
  • easyconfigs (merged PRs)

    • ~57 easyconfig PRs were merged since last conf call
    • bug fixes/reports
      • [develop] unset $BUILD_VERSION set by torchvision easyblock in preinstallopts for torchaudio in easyconfigs for PyTorch-bundle 2.1.2 (PR #22258)
      • [develop,5.0.x] add missing dependency on pybind11 for contourpy in matplotlib v3.9.2 (PRs #22294 + #22301)
        • fallout caused by making pybind11 a build dependency of SciPy-bundle v2024.05 (PR #22170)
      • [develop] Avoid using $HOME/.cargo when installing poetry by using CargoPythonBundle easyblock (PR #22257)
    • enhancements
      • [develop] enable plugins that require HDF5 + Boost dependencies for Visit v3.4.1 (PR #22334)
    • (noteworthy) new software
      • ...
    • noteworthy software updates
      • ...
    • cleanup
      • ...
    • changes
      • [5.0.x] migrate easyconfig for NEURON v8.2.6 to use custom easyblock for NEURON (PR #22324)
      • [develop] use snappy v1.2.1 (instead of v1.1.10) as dependency for Arrow 17.0.0 & MariaDB 11.7.0 (PR #22333)
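
The deprecation of the parallel easyconfig parameter in favour of an EasyConfig.parallel property (framework PR #4580, easyblocks PR #3557) can be illustrated with a toy class. This is a minimal sketch and not EasyBuild's implementation: ToyEasyConfig, template_values and the warning text are made up for illustration.

```python
import warnings


class ToyEasyConfig:
    """Toy stand-in for an easyconfig object (NOT the real EasyBuild class)."""

    def __init__(self, params):
        self._params = dict(params)
        self._parallel = int(self._params.pop('parallel', 1))

    @property
    def parallel(self):
        """Preferred way to obtain the level of build parallelism."""
        return self._parallel

    @parallel.setter
    def parallel(self, value):
        self._parallel = max(1, int(value))

    def template_values(self):
        """Derive template values from the property, so they follow updates."""
        return {'parallel': str(self._parallel)}

    def __getitem__(self, key):
        if key == 'parallel':
            # old-style access still works, but points callers to the property
            warnings.warn("use the .parallel property instead of ['parallel']",
                          DeprecationWarning, stacklevel=2)
            return self._parallel
        return self._params[key]


ec = ToyEasyConfig({'name': 'example', 'parallel': 4})
ec.parallel = 8
build_cmd = "make -j %(parallel)s" % ec.template_values()
```

Because the template value is computed from the property on demand, changing ec.parallel is reflected in build_cmd without a separate update step.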

Open (active) PRs

  • docs (open PRs + issues)

    • ...
  • framework (open PRs + issues)

    • bug fixes
      • [5.0.x] Add context manager for allowing unresolved templates and make the state members private (PR #4735)
      • [develop] show readable error message when applying patch without (extracted) source (PR #4738)
      • [develop] Avoid processing the same EasyConfig multiple times (PR #4767)
    • enhancements
      • [5.0.x] Copy build log and artifacts to a permanent location after failures (PR #4601)
        • mostly new code, so worth considering for inclusion in EasyBuild v5.0.0 (but not a priority)
      • [5.0.x] Problem using $CPATH in modulefiles overwriting system paths (issue #3331)
        • almost done, but we need a way to deal with hardcoded use of 'CPATH' in modextrapaths
        • we'll probably introduce support for something like:
          modextrapaths = {SEARCH_PATH_HEADERS: 'include/example'}
      • [develop] initial work towards integrating easy_update functionality (PR #4714)
      • [develop] enhance apply_regex_substitutions (PR #4758)
        • relevant for enhancements to PyTorch easyblock
      • [develop] ignore other classes if software specific easyblock class was found (PR #4769)
        • relevant for enhancements to PyTorch easyblock
      • [develop] Introduce check_readelf_rpath easyconfig parameter to optionally skip RPATH checks (PR #4768)
      • [develop] add support for specifying dependencies required to obtain source files via source_deps easyconfig parameter (PR #4766)
    • code cleanup
      • ...
    • changes
      • [5.0.x] With new clang based intel compilers (ifx, icx, icpx) we should use -march=native (issue #4744)
  • easyblocks (open PRs + issues)

    • bug fixes/reports
      • ...
    • enhancements
      • [5.0.x] enhance LLVM easyblock for compilation of clang/flang + other llvm-projects (PR #3373)
        • Davide has tested a lot of installations with a pure LLVM-based toolchain on top of this
      • [5.0.x] Add build_target parameter to PythonPackage (PR #3575)
      • [develop] Use unittest XML files to parse PyTorch test results (PR #3633)
    • updates
      • [develop] Adapt cp2k regtest argument (PR #3623)
    • changes
      • [5.0.x] Use context managers for templating changes in Bundle easyblock (PR #3547)
    • code cleanup
      • ...
    • new easyblocks
      • [5.0.x] custom easyblock for VSCode (PR #3638)
  • easyconfigs (open PRs + issues)
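
The modextrapaths idea noted for issue #3331 above (a symbolic key instead of a hardcoded 'CPATH') could work roughly as sketched below. This is only an illustration under assumptions: the sentinel value, the mode names and the mapping to environment variables are hypothetical, and expand_modextrapaths is not an EasyBuild function.

```python
# Hypothetical sentinel; the notes only propose "something like" SEARCH_PATH_HEADERS
SEARCH_PATH_HEADERS = 'SEARCH_PATH_HEADERS'

# Assumed mapping from a site-level mode to the environment variables it implies
HEADER_VARS_BY_MODE = {
    'cpath': ['CPATH'],
    'include_paths': ['C_INCLUDE_PATH', 'CPLUS_INCLUDE_PATH', 'OBJC_INCLUDE_PATH'],
}


def expand_modextrapaths(modextrapaths, mode='cpath'):
    """Replace the symbolic key by concrete variable names for the given mode."""
    expanded = {}
    for key, subdirs in modextrapaths.items():
        if isinstance(subdirs, str):
            subdirs = [subdirs]
        names = HEADER_VARS_BY_MODE[mode] if key == SEARCH_PATH_HEADERS else [key]
        for name in names:
            expanded.setdefault(name, []).extend(subdirs)
    return expanded


spec = {SEARCH_PATH_HEADERS: 'include/example', 'PATH': 'bin'}
as_cpath = expand_modextrapaths(spec, mode='cpath')
as_include_paths = expand_modextrapaths(spec, mode='include_paths')
```

With such an indirection an easyconfig would not need to name $CPATH itself, so a site could switch to other variables without touching easyconfigs.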

2025a common toolchains

  • (2024b is skipped to catch up with original schedule for defining common toolchain versions)
    • EasyBuild v5.0 is an ideal excuse for that break in continuity...
  • GCC 14.2 as a base (see easyconfigs PR #21114)
  • we should define candidate toolchains using latest version of all components
  • easyconfigs using GCCcore/14.2.0 toolchain available for Python 3.13.1, Perl + Perl-bundle-CPAN 5.40.0
  • effort sort of on hold until EasyBuild v5.0.0 is released...
  • easyconfigs PR #22125 for FlexiBLAS, OpenBLAS, BLIS

Q&A / others

  • 8th EasyBuild User Survey, please fill it out! (closes 28 Feb'25)
  • agenda for EUM'25 is public: https://easybuild.io/eum25/#program
    • all (50) seats for in-person attendance are taken!
    • remote attendance via Zoom will still be possible, registration will be re-opened soon for that
  • (Alan) libfabric in 2025a toolchains?
    • see also https://hpc.guix.info/blog/2024/11/targeting-the-crayhpe-slingshot-interconnect/ + https://github.com/HewlettPackard/shs-libcxi
    • unclear whether Cray libfabric that was open sourced is bug free
    • also used for intra-node communication
    • cxi plugin for OpenMPI is not enough for GPU-to-GPU communications
    • libfabric in EasyBuild isn't as capable as UCX, which can be made CUDA-aware through a plugin
      • can something similar be done via $FI_PROVIDER_PATH?
      • likely not enough for OpenMPI
    • paper in the works on getting OpenMPI running on Cray hardware (cfr. future Cray User Group?)
    • warnings on MPI init during startup when using OpenMPI
      • not seen in experiments done in EESSI community?
  • (Jörg) easyconfig for scikit-hep working (easyconfigs PR #22394)
    • Ninja used in awkward-cpp going berserk
    • maybe $NINJAFLAGS would help, like it did for Qt (see issue #2076)
    • do we need to consider introducing a wrapper around ninja that actually runs ninja -j X?
      • could be done in a generic way:
        wrap_cmds = {'ninja': "ninja -j %(parallel)s"}
  • (Jure) fun with rocm-smi upstream in EPEL
  • (Alexander) hwloc we use in OpenMPI is not CUDA-aware
    • hwloc is built with --without-cuda
    • do we need a custom hwloc easyconfig that is built CUDA-aware?
      • hwloc-2.9.2-GCCcore-13.2.0-CUDA-12.1.1.eb
      • module load foss
      • module swap hwloc/2.9.2-GCCcore-13.2.0-CUDA-12.1.1
        • this would cause Lmod to mark the OpenMPI module as inactive
    • introduce hwloc-CUDA?
      • load along with UCX-CUDA?
    • CUDA as build dep for hwloc?
      • not sure if that could work on CPU-only systems...
      • forces people to install CUDA & accept the EULA
    • maybe via hwloc support for plugins?
      • (Alan) seems like it could work
      • via --enable-plugin=cuda-nvml
  • (Andrea) STREAM homepage is down?
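
The generic wrap_cmds idea raised by Jörg above could be prototyped along these lines. This is a rough sketch, assuming a wrapper is a small shell script placed ahead of the real command in $PATH; wrap_cmds is only a proposal from the call, render_wrapper is a hypothetical helper, and /usr/bin/ninja is a made-up location.

```python
import shlex


def render_wrapper(cmd, replacement, template_values, real_path):
    """Return the text of a shell wrapper that runs `replacement` in place of `cmd`."""
    words = shlex.split(replacement % template_values)
    if not words or words[0] != cmd:
        raise ValueError("replacement for %r must start with %r" % (cmd, cmd))
    # call the real binary by absolute path, so the wrapper cannot call itself
    words[0] = real_path
    quoted = ' '.join(shlex.quote(word) for word in words)
    return '#!/bin/sh\nexec %s "$@"\n' % quoted


wrap_cmds = {'ninja': "ninja -j %(parallel)s"}
script = render_wrapper('ninja', wrap_cmds['ninja'], {'parallel': 8},
                        real_path='/usr/bin/ninja')  # hypothetical location
```

Resolving the real command to an absolute path before the wrapper directory is prepended to $PATH avoids the wrapper recursing into itself.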