@@ -9,21 +9,42 @@ Support for new training algorithms:
 - Improved documentation of training algorithm infrastructure.

Support for new network structures:
+ - ATOM WAE model - character-based Wasserstein Autoencoder
+ - Community GAN model for graph data sets

Support for new layers:
 - "DFTAbs" layer that computes the absolute value of the channel-wise
   DFT of the input data
+ - Added support for 3D matrix multiplication
+ - Added scatter and gather neural network layers
+ - Added CPU-based GRU layers using oneDNN
+ - Added batch-wise reduce-sum
+ - ArcFace loss

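The scatter and gather layers added above follow the usual indexed-read/indexed-write semantics; a minimal pure-Python sketch of that behavior (illustrative only, not LBANN's actual layer API):

```python
def gather(values, indices):
    """Indexed read: output[i] = values[indices[i]]."""
    return [values[i] for i in indices]

def scatter(values, indices, output_size):
    """Indexed write with accumulation: output[indices[i]] += values[i]."""
    output = [0.0] * output_size
    for v, i in zip(values, indices):
        output[i] += v
    return output
```

A batch-wise reduce-sum can be viewed as the degenerate case where every input position accumulates into the same output slot.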
Python front-end:
+ - Added 3D U-Net model
+ - Added CosmoFlow model
+ - Ported CANDLE Pilot1 models
+ - Added support for nvprof
+ - Added channelwise fully connected layer
+ - Added support for non-square kernels, padding, stride, and
+   dilation for the convolution module
+ - Added support for the OpenMPI launcher

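The channelwise fully connected layer applies an independent affine map to each channel of its input; a pure-Python sketch of those semantics (names and shapes are illustrative assumptions, not the front-end's API):

```python
def channelwise_fully_connected(x, weights, biases):
    """Apply a separate affine map per channel.

    x: list of channels, each a flat list of inputs.
    weights[c]: output-by-input matrix for channel c.
    biases[c]: output-length bias vector for channel c.
    """
    out = []
    for channel, w, b in zip(x, weights, biases):
        out.append([sum(wi * xi for wi, xi in zip(row, channel)) + bc
                    for row, bc in zip(w, b)])
    return out
```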
Performance optimizations:
 - Use cuDNN 8 RNN API and CUDA Graphs in GRU layer
 - Cache CUDA Graphs for each active mini-batch size
 - Tuned performance of slice, concatenate, and tessellate layers on
-   ARM processors.
+   ARM processors
+ - Parallelized computation of Gaussian random numbers
+ - Optimized tessellate, concatenate, and slice layers on CPU

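Caching a CUDA graph per active mini-batch size amounts to memoizing one captured, replayable execution graph per size. Stripped of the CUDA specifics, the pattern can be sketched as (illustrative Python, not the actual implementation):

```python
class GraphCache:
    """Memoize one captured, replayable graph per mini-batch size.

    `capture` is called once per new size (the expensive step, standing
    in for CUDA graph capture); the cached result is replayed afterwards.
    """
    def __init__(self, capture):
        self._capture = capture
        self._graphs = {}

    def run(self, batch_size, *args):
        if batch_size not in self._graphs:
            self._graphs[batch_size] = self._capture(batch_size)
        return self._graphs[batch_size](*args)
```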
Model portability & usability:

+Experiments & Applications:
+ - Added experiment scripts for ATOM cWAE Gordon Bell simulations
+ - LBANN-ATOM model inference and analysis
+
Internal features:
 - Wrapper classes for the CUDA Graphs API
 - Elementary examples of using complex numbers
@@ -37,8 +58,11 @@ Internal features:
   hook) in the name rather than the current execution context.
 - Added in-memory binary model exchange for LTFB.
 - Added support for ROCm and MIOpen
+ - Added support for oneDNN
 - Updated the bamboo test environment to use a local executable rather
   than hard-coded executables
+ - Overhauled and refactored serialization throughout the code to use the
+   Cereal serialization library
 - Significant cleanup and refactoring of the code base to improve compile
   times. Moving to ensure that code adheres to the standard split of
   header between declaration and implementation functions (for
@@ -47,17 +71,25 @@ Internal features:
   inclusions.
 - The relationship of execution_contexts and training_algorithms was
   clarified. There is still work to do here.
+ - Added DistConv tests for both convolution and pooling layers
+ - Added support for padding in the distributed embedding layer
+ - Added dump model graph callback
+ - Added perturb learning rate callback
+ - Added batched inference algorithm
+ - Switched ATOM tests to use CPU embedding and tessellate layers to
+   minimize noise

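A perturb-learning-rate callback, as named above, typically jitters the optimizer's learning rate at epoch boundaries; a minimal sketch under assumed names (LBANN's real callback interface differs):

```python
import math
import random

class PerturbLearningRate:
    """Multiply the learning rate by a log-normal factor each epoch
    (illustrative callback, not LBANN's API)."""
    def __init__(self, scale=0.1, seed=None):
        self.scale = scale
        self.rng = random.Random(seed)

    def on_epoch_begin(self, optimizer):
        optimizer.learning_rate *= math.exp(self.rng.gauss(0.0, self.scale))
```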
I/O & data readers:
 - Experimental data reader that generates graph random walks with
   HavoqGT
 - Added explicit tournament execution mode
 - Added support to split the training data reader into validation and
   tournament readers
+ - Added node2vec data reader

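The HavoqGT-backed reader above streams graph random walks (e.g., for node2vec-style embeddings). Uniform random walks over an adjacency list can be sketched as (pure Python; the real reader distributes this work via HavoqGT):

```python
import random

def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate uniform random walks over an adjacency-list graph.

    adjacency: dict mapping node -> list of neighbor nodes.
    A walk stops early if it reaches a node with no outgoing edges.
    """
    rng = random.Random(seed)
    walks = []
    for start in adjacency:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks
```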
Build system:
- - Hydrogen v1.5.0
- - Aluminum v0.5.0
+ - Hydrogen v1.5.0+
+ - Aluminum v0.5.0+
 - DiHydrogen v0.2.0 is required
 - C++14 or newer standard with CUDA (CMake: "-DCMAKE_CUDA_STANDARD=14")
 - OpenCV is now an optional dependency via CMake "LBANN_WITH_VISION"
@@ -71,6 +103,8 @@ Build system:
   build_lbann.sh script to set up good defaults on known systems
 - Added application-specific build scripts for users such as ATOM
 - Added support for pulling from Spack mirrors and setting them up
+ - Split embedded Python support from the Python front-end
+ - Switched the Spack-based build script to use Spack's clingo concretizer

Bug fixes:
 - Fixed a bug where LBANN didn't set the Hydrogen RNG seed
@@ -80,6 +114,7 @@ Bug fixes:
   types
 - Fixed the calculation of the linearized response size
 - Fixed the data coordinator's interface to input_layer
+ - Fixed an error with deterministic execution of dropout layers

Retired features:
 - Removed the deprecated JAG leader mode, which was made obsolete when the