Conference call notes 20211027

Kenneth Hoste edited this page Oct 26, 2021 · 7 revisions

Notes on the 184th EasyBuild conference call, Wednesday Oct 27th 2021 (15:00 UTC)

Attendees

Alphabetical list of attendees (XXX):

  • Kenneth Hoste (HPC-UGent, Belgium)

  • Sebastian Achilles (Jülich Supercomputing Centre, Germany)

  • Simon Branford (Univ. of Birmingham, UK)

  • Jasper Grimm (University of York, UK)

  • Kurt Lust (Univ. of Antwerp, Belgium + LUMI User Support Team)

  • Robert Mijakovic (LuxProvide)

  • Mikael Öhman (Chalmers University of Technology, Sweden)

  • Bart Oldeman (Compute Canada)

  • Åke Sandgren (Umeå University, Sweden)

  • Alexandre Strube (Jülich Supercomputing Centre, Germany)

Agenda

  • overview of recent developments
  • update on progress towards EasyBuild v4.5.0 release
  • promoting candidates for '2021b' common toolchains
  • Q&A

Recent developments

  • release timeline
    • latest release: EasyBuild v4.4.2 (Sept 7th 2021)
    • next release
  • recent changes
    • framework
      • bug fixes
        • ...
      • enhancements
        • ...
      • changes
        • ...
    • easyblocks
      • bug fixes
        • ...
      • enhancements
        • ...
      • new easyblocks
        • (none)
      • changes
        • (none)
    • easyconfigs
      • ~XXX easyconfig PRs merged since last conf call
      • bug fixes
        • ...
      • enhancements
        • ...
      • new software
        • ...
      • noteworthy software updates
        • ...
      • changes
        • ...
  • to merge/fix/tackle soon
    • framework
      • reported bugs / bug fixes
        • sources for extensions are still downloaded with --module-only (issue #3849)
      • enhancements
        • use separate progress bars for different aspects of the installations being performed (WIP) (PR #3844)
      • changes
        • ...
    • easyblocks
      • reported bugs / bug fixes
        • restore RPATH wrappers for OpenMPI sanity check (WIP) (PR #2582)
        • avoid adding the path to the CUDA install directory to $PATH (PR #2593)
      • enhancements
        • enhance GCC easyblock to add support for AMD GPU offloading (PR #2578)
      • changes
        • don't use --config=mkl for TensorFlow 2.4+ (PR #2583)
          • cf. reported performance problems for CPU-only TensorFlow installations (issue #2577), which can be worked around via export OMP_NUM_THREADS=1
          • blocked by broken TensorFlow tests when not using --config=mkl (see https://github.com/tensorflow/tensorflow/issues/52151)
          • should we make not using --config=mkl opt-in for now, so we can switch to it for selected (latest) TensorFlow versions?
      • new software
        • (nothing major?)
    • easyconfigs
      • bug reports & fixes
        • remove superfluous -DCMAKE_BUILD_TYPE (PR #13384)
        • TensorFlow tf.matmul ends up using CPU backend for 32-bit floats (issue #14120)
          • low/wrong performance for matrix multiplication with certain data types (32-bit float, probably also 16-bit)
          • TensorFlow seems to favor MKL on CPU over GPU, unclear why...
          • should we auto-disable use of mkl-dnn when building on GPU?
          • working fine in TensorFlow 2.6.0 (check changelog?)
      • enhancements
        • Add CI check for CMAKE_BUILD_TYPE (PR #14008)
      • changes
        • update to UCX 1.11.2 as dependency for OpenMPI 4.1.1 (PR #14090)
      • new software
      • noteworthy software updates
        • SciPy-bundle with intel/2021a (PR #12964)
          • need to look into handful of failing tests...
        • intel/2021.09 (PR #14085)
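
As an illustration of the CMAKE_BUILD_TYPE items above (PR #13384, PR #14008), here is a minimal sketch of how a hard-coded build type could be detected in an easyconfig's configopts value. This is not the check implemented in PR #14008: the helper name `find_cmake_build_type` and the regular expression are hypothetical, and the sketch assumes configopts is a single string rather than a list.

```python
import re

# Hypothetical pattern (an assumption, not taken from EasyBuild):
# matches -DCMAKE_BUILD_TYPE=<value>, optionally with a space after -D
# or a type suffix such as :STRING before the equals sign.
CMAKE_BUILD_TYPE_RE = re.compile(r"-D\s*CMAKE_BUILD_TYPE(?::\w+)?=(\S+)")


def find_cmake_build_type(configopts):
    """Return the build type hard-coded in configopts, or None if there is none."""
    match = CMAKE_BUILD_TYPE_RE.search(configopts)
    return match.group(1) if match else None


# Two example configopts strings, made up for illustration.
print(find_cmake_build_type("-DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON"))
print(find_cmake_build_type("-DBUILD_SHARED_LIBS=ON"))
```

A CI check along these lines would flag any easyconfig for which the helper returns a value, so that the build type can be left to the easyblock instead.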

Common toolchains

2021b (WIP!)

  • for now: foss/2021.07 and intel/2021.09 (candidates for 2021b after testing confirms they work well)
    • foss/2021.07: included with EasyBuild v4.4.2 release
    • intel/2021.09: WIP at PR #14085
      • includes intel-compilers 2021.4 release, which supports GCC 11.2
      • includes impi 2021.4 on top of UCX 1.11.2
  • PR #14090 (+ PR #14091) suggests bumping UCX to 1.11.2 (from 1.11.0) for the OpenMPI involved in foss/2021.07
  • failing tests for SciPy-bundle with foss/2021.07 (PR #13789)
  • toolchain working group to follow up on this (?)

Q&A

  • ...