
Commit d589936

Prepare for release (#2236)
* Updated release notes.
* Cleaned up ambiguity in finding lbann::exception class.
1 parent 622aa88 commit d589936

File tree: 2 files changed (+62, -13 lines)

ReleaseNotes.txt

Lines changed: 61 additions & 12 deletions
@@ -17,36 +17,42 @@ Support for new training algorithms:
 - Truncation selection exchange for LTFB/PBT
 - Regularized evolution for LTFB/PBT
 - Hyperparameter grid search
+- Multi-GAN training algorithm with multiple discriminators

 Support for new network structures:
 - Edge-conditioned graph neural networks
 - RoBERTa with pretrained weights

 Support for new layers:
-- Added support for 2D Matrices for Scatter and Gather layers
-- Added image rotation layer and composite image transformation layer
-  (rotate, shear, translate)
-- Added distributed tensor parallelism with channelwise decomposition for channelwise fully connected layer
-- Added "binary-with-constant" operators
-- Updated deconvolution layer to match PyTorch's API
-- Updated identity layer to copy tensors to enable tensor parallelism in subsequent layers in the compute graph
-- Added IdentityZero layer that allows alternating generator/discriminator
-  updates for training GANs.
-- Added an External layer that enables separately-compiled library to be loaded dynamically
-- Added support for labels_only mode on data-parallel cross entropy layer
+- Added support for 2D Matrices for Scatter and Gather layers
+- Added support for distributed Scatter and Gather layers
+- DistConv-enabled 3D MatMul
+- Added image rotation layer and composite image transformation layer
+  (rotate, shear, translate)
+- Added distributed tensor parallelism with channelwise decomposition for channelwise fully connected layer
+- Added "binary-with-constant" operators
+- Updated deconvolution layer to match PyTorch's API
+- Updated identity layer to copy tensors to enable tensor parallelism in subsequent layers in the compute graph
+- Added IdentityZero layer that allows alternating generator/discriminator
+  updates for training GANs.
+- Added an External layer that enables separately-compiled library to be loaded dynamically
+- Added support for labels_only mode on data-parallel cross entropy layer

 Python front-end:
-- Added support for buidling and launching jobs on Fugaku
+- Added support for building and launching jobs on Fugaku
 - Added Riken as a known compute center
+- Added Perlmutter as a known compute center
 - Added support for PJM as job launcher
 - Unified convolution/deconvolution interface to better approximate PyTorch.
 - Added circular (periodic) padding transformation for 2D and 3D tensors
+- Added support for the Flux job scheduler

 Performance optimizations:
 - Enabled the input layers to use a view of the I/O buffers in the
   buffered data coordinator
 - Use default-allocated GPU memory for long-lived buffers
 - Optimized GPU kernels for entry-wise operators
+- Optionally use default-allocated GPU memory for long-lived buffers

 Model portability & usability:
 - Weight initialization from NumPy files
@@ -58,12 +64,24 @@ Experiments & Applications:
 - Cosmo 3D GAN
 - MNIST GAN
 - Image GAN
+- Example Distributed Graph Convolution Networks
+- NASNet
+- RoBERTa

 Internal features:
 - Added operator class
 - Added AlternateUpdates callback to be used with IdentityZero layers for
   training GANs.
 - Added support for serializing network architectures to protobuf format.
+- Reformatted headers and implementation files for a more IWYU paradigm.
+- General support for ROCm-enabled DistConv
+- Support for use of the libfabric plugin for RCCL and NCCL
+- Framework-wide improvements in support for ROCm and MIOpen
+- Callback for alternating optimizer layer update
+- Command line argument to hang the LBANN application for debugging
+- Added a cuTT/hipTT backend to the permute layer
+- Added a permute layer utilizing cuTENSOR for the permute implementation
+- Weight initializer from NumPy file

 I/O & data readers:
 - Updated SMILES data reader to use sample lists
@@ -77,6 +95,7 @@ I/O & data readers:
 - Changed the input layer to take a data field and only produce a
   single output. Currently valid data fields are samples, labels,
   and responses.
+- Added support for using arbitrary field names with the HDF5 data reader.
 - Updated the data coordinator and data readers to
   take dynamic data fields rather than fixed fields. Input buffers
   are no longer allocated for fields that are not used in active
@@ -87,8 +106,14 @@ I/O & data readers:
   Data Coordinator.
 - Data coordinator can now directly return packed data fields to
   input layers.
+- Added padding and cutout transformations

 Build system:
+- Added support for using upstream Spack repositories
+- Added support to reuse existing Spack environments, which
+  significantly decreases the startup time of running a CI job
+- Enforce consistent GPU targets in the Spack environment
+- Switched from Bamboo to the GitLab CI framework

 Bug fixes:
 - Fixed GPU kernels that launched with more blocks than allowed
@@ -97,6 +122,30 @@ Bug fixes:
   Transformer
 - Fixed a bug where the input layer performed unnecessary memory
   allocations.
+- Bug fixes within CosmoFlow and U-Net models
+- Fixed a bug in the GPU-based computation of the batchnorm
+  statistics
+- Patch for when a distconv-enabled input layer is followed by a non-distconv layer
+- Bugfix for input layer activations: fixed the input layer so that it
+  only resizes the activation matrix if it isn't already set up
+  to be a view of the data_coordinator's matrix. This addresses a
+  significant performance bug in data ingestion where the
+  activation matrix was a view into the data coordinator's internal buffers.
+- Fixed bad convolution parameters producing incorrect layer shapes.
+- Enabled tensor copy on the distconv-enabled Identity layer
+- General cleanup and improvement in the coverage and robustness of
+  CI testing
+- Fixed a buffer overflow in the SMILES data reader
+- Fixed a bug in truncation selection exchange (TSE)
+- Do not construct bias weights when not needed in conv and FC modules
+- Use the tournament set in LTFB with truncation selection exchange
+- Cleaned up memory leaks in the data reader tests
+- Fixed a buffer overrun, heap overflow, and double allocation of the
+  data store in the SMILES data reader
+- Matched LayerNorm and InstanceNorm layers to PyTorch
+- Make sure GPU grid dims are valid in slice/concat layers
+- Fixed incorrect matrix ordering in K-FAC for the conv layer
+- Bugfix for the polynomial learning rate schedule

 Retired features:
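Of the bug fixes above, the input-layer activations item is the only one the notes walk through in detail: the input layer used to resize its activation matrix even when that matrix had been set up as a view of the data coordinator's buffer, which detached the view and reintroduced a copy on every step. The following is a minimal sketch of that guard; the Matrix type is a hypothetical stand-in (LBANN's real matrix types come from the Hydrogen library), so treat it as an illustration of the check, not LBANN's implementation.

#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a distributed matrix that either owns its
// storage or is a lightweight view into another buffer.
struct Matrix {
  float* data = nullptr;
  std::size_t height = 0, width = 0;
  bool viewing = false;        // true when 'data' is borrowed
  std::vector<float> storage;  // used only when owning

  void resize(std::size_t h, std::size_t w) {
    // Resizing a view would detach it from its source buffer.
    assert(!viewing && "cannot resize a view");
    storage.resize(h * w);
    data = storage.data();
    height = h;
    width = w;
  }

  void make_view_of(Matrix& src) {
    data = src.data;
    height = src.height;
    width = src.width;
    viewing = true;
  }
};

// The guard described in the notes: only resize the activations when
// they are not already a view of the data coordinator's matrix.
void setup_activations(Matrix& activations, Matrix& coordinator_buffer) {
  if (activations.viewing) {
    return;  // already aliasing the I/O buffer; leave the view intact
  }
  activations.resize(coordinator_buffer.height, coordinator_buffer.width);
}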

model_zoo/lbann.cpp

Lines changed: 1 addition & 1 deletion
@@ -231,7 +231,7 @@ int main(int argc, char* argv[])
       stack_profiler::get()->print();
     }
   }
-  catch (exception& e) {
+  catch (lbann::exception& e) {
     if (arg_parser.get<bool>(LBANN_OPTION_STACK_TRACE_TO_FILE)) {
       std::ostringstream ss("stack_trace");
       const auto& rank = get_rank_in_world();
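This one-line change is the "ambiguity in finding lbann::exception class" cleanup named in the commit message. If both the std and lbann namespaces are visible at the catch site through using-directives (assumed here for illustration, as is the body of the exception class below), the unqualified name exception denotes two different classes at once and the handler fails to compile; qualifying it as lbann::exception resolves the lookup. A self-contained sketch of the failure mode and the fix:

#include <exception>
#include <iostream>

namespace lbann {
// Hypothetical stand-in for LBANN's exception type.
class exception : public std::exception {
public:
  explicit exception(char const* msg) : m_msg(msg) {}
  char const* what() const noexcept override { return m_msg; }
private:
  char const* m_msg;
};
} // namespace lbann

using namespace std;    // makes std::exception visible unqualified
using namespace lbann;  // makes lbann::exception visible unqualified

int main() {
  try {
    throw lbann::exception("something went wrong");
  }
  // "catch (exception& e)" would not compile here: unqualified lookup
  // of 'exception' finds both std::exception and lbann::exception and
  // is ambiguous. The qualified name, as in the commit, is unambiguous.
  catch (lbann::exception& e) {
    cerr << e.what() << '\n';
    return 1;
  }
  return 0;
}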
