
Commit d589936

Prepare for release (#2236)
* Updated release notes.
* Cleaned up ambiguity in finding lbann::exception class.
1 parent 622aa88 commit d589936

File tree: 2 files changed (+62, -13 lines)

ReleaseNotes.txt

Lines changed: 61 additions & 12 deletions
@@ -17,36 +17,42 @@ Support for new training algorithms:
 - Truncation selection exchange for LTFB/PBT
 - Regularized evolution for LTFB/PBT
 - Hyperparameter grid search
+- Multi-GAN training algorithm with multiple discriminators

 Support for new network structures:
 - Edge-conditioned graph neural networks
 - RoBERTa with pretrained weights

 Support for new layers:
-- Added support for 2D Matrices for Scatter and Gather layers
-- Added image rotation layer and composite image transformation layer
-  (rotate, shear, translate)
-- Added distributed tensor parallelism with channelwise decomposition for channelwise fully connected layer
-- Added "binary-with-constant" operators
-- Updated deconvolution layer to match PyTorch's API
-- Updated identity layer to copy tensors to enable tensor parallelism in subsequent layers in the compute graph
-- Added IdentityZero layer that allows alternating generator/discriminator
-  updates for training GANs.
-- Added an External layer that enables separately-compiled library to be loaded dynamically
-- Added support for labels_only mode on data-parallel cross entropy layer
+- Added support for 2D Matrices for Scatter and Gather layers
+- Added support for distributed Scatter and Gather layers
+- DistConv-enabled 3D MatMul
+- Added image rotation layer and composite image transformation layer
+  (rotate, shear, translate)
+- Added distributed tensor parallelism with channelwise decomposition for channelwise fully connected layer
+- Added "binary-with-constant" operators
+- Updated deconvolution layer to match PyTorch's API
+- Updated identity layer to copy tensors to enable tensor parallelism in subsequent layers in the compute graph
+- Added IdentityZero layer that allows alternating generator/discriminator
+  updates for training GANs.
+- Added an External layer that enables separately-compiled library to be loaded dynamically
+- Added support for labels_only mode on data-parallel cross entropy layer

 Python front-end:
-- Added support for buidling and launching jobs on Fugaku
+- Added support for building and launching jobs on Fugaku
 - Added Riken as a known compute center
+- Added Perlmutter as a known compute center
 - Added support for PJM as job launcher
 - Unified convolution/deconvolution interface to better approximate PyTorch.
 - Added circular (periodic) padding transformation for 2D and 3D tensors
+- Added support for the Flux job scheduler

 Performance optimizations:
 - Enabled the input layers to use a view of the I/O buffers in the
   buffered data coordinator
 - Use default-allocated GPU memory for long-lived buffers
 - Optimized GPU kernels for entry-wise operators
+- Optionally use default-allocated GPU memory for long-lived buffers

 Model portability & usability:
 - Weight initialization from NumPy files
@@ -58,12 +64,24 @@ Experiments & Applications:
 - Cosmo 3D GAN
 - MNIST GAN
 - Image GAN
+- Example Distributed Graph Convolution Networks
+- NASNet
+- RoBERTa

 Internal features:
 - Added operator class
 - Added AlternateUpdates callback to be used with IdentityZero layers for
   training GANs.
 - Added support for serializing network architectures to protobuf format.
+- Reformatted headers and implementation files for a more IWYU paradigm.
+- General support for ROCm-enabled DistConv
+- Support for use of the libfabric plugin for RCCL and NCCL
+- Framework-wide improvements in support for ROCm and MIOpen
+- Callback for alternating optimizer layer update
+- Command line argument to hang the LBANN application for debugging
+- Added a cuTT/hipTT backend to the permute layer
+- Added a permute layer utilizing cuTENSOR for the permute implementation
+- Weight initializer from NumPy file

 I/O & data readers:
 - Updated SMILES data reader to use sample lists
@@ -77,6 +95,7 @@ I/O & data readers:
 - Changed the input layer to take a data field and only produce a
   single output. Currently valid data fields are samples, labels,
   and responses.
+- Added support for using arbitrary field names with the HDF5 data reader.
 - Updated the data coordinator and data readers to
   take dynamic data fields rather than fixed fields. Input buffers
   are no longer allocated for fields that are not used in active
@@ -87,8 +106,14 @@ I/O & data readers:
   Data Coordinator.
 - Data coordinator can now directly return packed data fields to
   input layers.
+- Added padding and cutout transformations

 Build system:
+- Added support for using upstream Spack repositories
+- Added support to reuse existing Spack environments, which
+  significantly decreases the startup time of running a CI job
+- Enforce consistent GPU targets in the Spack environment
+- Switched from Bamboo to the GitLab CI framework

 Bug fixes:
 - Fixed GPU kernels that launched with more blocks than allowed
@@ -97,6 +122,30 @@ Bug fixes:
   Transformer
 - Fixed a bug where the input layer performed unnecessary memory
   allocations.
+- Bug fixes within CosmoFlow and U-Net models
+- Fixed a bug in the GPU-based computation of the batchnorm
+  statistics
+- Patch for when a distconv-enabled input layer is followed by a non-distconv layer
+- Bugfix for input layer activations: fixed the input layer so that it
+  only resizes the activation matrix if it isn't already set up
+  to be a view of the data_coordinator's matrix. This addresses a
+  significant performance bug in data ingestion where the
+  activation matrix was a view into the data coordinator's internal buffers.
+- Fixed bad convolution parameters producing incorrect layer shapes.
+- Enabled tensor copy on the distconv-enabled Identity layer
+- General cleanup and improvement in the coverage and robustness of
+  CI testing
+- Fixed a buffer overflow in the SMILES data reader
+- Fixed a bug in truncation selection exchange (TSE)
+- Do not construct bias weights when not needed in conv and FC modules
+- Use the tournament set in LTFB with truncation selection exchange
+- Cleaned up memory leaks in the data reader tests
+- Fixed a buffer overrun, heap overflow, and double allocation of the
+  data store in the SMILES data reader
+- Matched LayerNorm and InstanceNorm layers to PyTorch
+- Make sure GPU grid dims are valid in slice/concat layers
+- Fixed incorrect matrix ordering in K-FAC for the conv layer
+- Bugfix for the polynomial learning rate schedule

 Retired features:
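Of the bug fixes above, the input-layer activations item is the only one the notes walk through in detail: the input layer used to resize its activation matrix even when that matrix had been set up as a view of the data coordinator's buffer, which detached the view and reintroduced a copy on every step. The following is a minimal sketch of that guard; the Matrix type is a hypothetical stand-in (LBANN's real matrix types come from the Hydrogen library), so treat it as an illustration of the check, not LBANN's implementation.

#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a distributed matrix that either owns its
// storage or is a lightweight view into another buffer.
struct Matrix {
  float* data = nullptr;
  std::size_t height = 0, width = 0;
  bool viewing = false;        // true when 'data' is borrowed
  std::vector<float> storage;  // used only when owning

  void resize(std::size_t h, std::size_t w) {
    // Resizing a view would detach it from its source buffer.
    assert(!viewing && "cannot resize a view");
    storage.resize(h * w);
    data = storage.data();
    height = h;
    width = w;
  }

  void make_view_of(Matrix& src) {
    data = src.data;
    height = src.height;
    width = src.width;
    viewing = true;
  }
};

// The guard described in the notes: only resize the activations when
// they are not already a view of the data coordinator's matrix.
void setup_activations(Matrix& activations, Matrix& coordinator_buffer) {
  if (activations.viewing) {
    return;  // already aliasing the I/O buffer; leave the view intact
  }
  activations.resize(coordinator_buffer.height, coordinator_buffer.width);
}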

model_zoo/lbann.cpp

Lines changed: 1 addition & 1 deletion
@@ -231,7 +231,7 @@ int main(int argc, char* argv[])
       stack_profiler::get()->print();
     }
   }
-  catch (exception& e) {
+  catch (lbann::exception& e) {
     if (arg_parser.get<bool>(LBANN_OPTION_STACK_TRACE_TO_FILE)) {
       std::ostringstream ss("stack_trace");
       const auto& rank = get_rank_in_world();
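This one-line change is the "ambiguity in finding lbann::exception class" cleanup named in the commit message. If both the std and lbann namespaces are visible at the catch site through using-directives (assumed here for illustration, as is the body of the exception class below), the unqualified name exception denotes two different classes at once and the handler fails to compile; qualifying it as lbann::exception resolves the lookup. A self-contained sketch of the failure mode and the fix:

#include <exception>
#include <iostream>

namespace lbann {
// Hypothetical stand-in for LBANN's exception type.
class exception : public std::exception {
public:
  explicit exception(char const* msg) : m_msg(msg) {}
  char const* what() const noexcept override { return m_msg; }
private:
  char const* m_msg;
};
} // namespace lbann

using namespace std;    // makes std::exception visible unqualified
using namespace lbann;  // makes lbann::exception visible unqualified

int main() {
  try {
    throw lbann::exception("something went wrong");
  }
  // "catch (exception& e)" would not compile here: unqualified lookup
  // of 'exception' finds both std::exception and lbann::exception and
  // is ambiguous. The qualified name, as in the commit, is unambiguous.
  catch (lbann::exception& e) {
    cerr << e.what() << '\n';
    return 1;
  }
  return 0;
}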
