
Commit 80eef8b

bvanessent and tbennun authored
Updated release notes for v0.104 (#2379)
* Updated release notes.
* Fixed whitespace
* Update ReleaseNotes.txt

Co-authored-by: Tal Ben-Nun <[email protected]>
1 parent dc55f88 commit 80eef8b


ReleaseNotes.txt

Lines changed: 30 additions & 6 deletions
@@ -4,10 +4,11 @@ C++ API:
 Support for new training algorithms:

 Support for new network structures:
-- Added GPT-3 transformers and training recipes
+- Added GPT-3 transformers and training recipes

 Support for new layers:
-- Select operator (set tensor value based on predicate)
+- Select operator (set tensor value based on predicate)
+- Model parallelism for channel-wise fully-connected layers

 Python front-end:
 - Support for PyTorch Module conversion to LBANN graphs (requires PyTorch 2.0
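As an aside on the Python front-end item above: the sketch below shows roughly what converting a PyTorch Module might look like. The entry point lbann.torch.compile is a hypothetical name used purely for illustration (the notes do not name the API); only the PyTorch-side code is standard.

import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """A small PyTorch model to convert to an LBANN graph."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

# Hypothetical conversion entry point -- the real function name and
# signature live in the LBANN Python front-end docs (PyTorch >= 2.0
# is required per the note above):
# import lbann.torch
# graph = lbann.torch.compile(TinyNet(), sample_shape=(784,))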
@@ -20,24 +21,42 @@ Performance optimizations:
   layer does not need its activations in the backward pass. This optimization
   can be disabled by setting the environment variable
   DISTCONV_DISABLE_MEM_OPT=1.
-- Allow weights to be distributed across ranks by sharding them. Enable by
-  setting sharded=True in any weights object.
+- Added support for selective weight sharding (also known as
+  Fully-Sharded Data Parallelism, or FSDP). To enable, set sharded=true
+  on weight objects.
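A minimal sketch of the new sharding flag, assuming the Python front-end forwards sharded as a keyword argument on lbann.Weights (the note only says "set sharded=true on weight objects", so the exact placement is an assumption); the rest is ordinary front-end usage.

import lbann

# Weights object with FSDP-style sharding enabled; the `sharded`
# keyword placement is an assumption based on the note above.
sharded_weights = lbann.Weights(
    initializer=lbann.GlorotNormalInitializer(),
    sharded=True,  # distribute this weight tensor across ranks
)

# Attach the sharded weights to a layer as usual.
x = lbann.Input(data_field='samples')
y = lbann.FullyConnected(x, num_neurons=1024, weights=sharded_weights)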
 - Allow distconv to be disabled at runtime with LBANN_DISABLE_DISTCONV=1.
 - Activations are now deallocated when no longer needed via a reference counter,
   disable with LBANN_DISABLE_ACT_GC=1.
 - Added option for LBANN to set the number of OMP threads to modest
   default (4) if the environment doesn't specify anything.
+- Save memory on backpropagation by not replicating gradients between
+  GradientManager and data_type_optimizer
+- Save more memory in FSDP by synchronizing previous outstanding
+  async communication calls and freeing up local gradient contributions
+- FSDP: release full weight views after backprop
+- Batching heads in multi-head attention into single operations
+  instead of on a per-head basis
+- Stacking the weights and biases for queries/keys/values in
+  self-attention

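The optimization toggles above (DISTCONV_DISABLE_MEM_OPT, LBANN_DISABLE_DISTCONV, LBANN_DISABLE_ACT_GC) are plain environment variables. One way to set them from a Python driver script, assuming the launcher accepts an environment dict as in LBANN's example scripts:

import lbann
import lbann.contrib.launcher

# Opt out of individual optimizations for a debugging run. Each
# variable below is quoted verbatim from the notes above.
env = {
    'DISTCONV_DISABLE_MEM_OPT': 1,  # keep activations for backward pass
    'LBANN_DISABLE_DISTCONV': 1,    # disable distconv at runtime
    'LBANN_DISABLE_ACT_GC': 1,      # no reference-counted deallocation
}

# `trainer`, `model`, `data_reader`, and `opt` are assumed to be built
# elsewhere; the `environment=` keyword mirrors LBANN example scripts
# but should be checked against the installed version.
# lbann.contrib.launcher.run(trainer, model, data_reader, opt,
#                            job_name='debug_no_opts',
#                            environment=env)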

 Model portability & usability:
+- Added support for profiling with Caliper

 Experiments & Applications:
+- Updated CosmoFlow model to automatically scale the model
+  architecture and parallelism with input size.
+- Added a PyTorch reference implementation of CosmoFlow.

 Internal features:
-- Fixed a bug where in-place layers sometimes attached a locked view
-  of a matrix to a mutable view.
 - Removed the mini_batch_size parameter from the following functions
   in the layer class hierarchy: fp_setup_inputs, fp_setup_outputs, bp_setup_gradient_wrt_inputs
   and the distconv_adapter class: fp_setup, bp_setup
+- Support global and local gradient norm clipping with the clip_gradient_norm callback
+- Interactive progress bar with the progress_bar callback
+- Evaluate progress callback allows for periodic monitoring during
+  training with independent data set (intra-epoch evaluation)
+- Detailed memory usage profiling with the memory_profiler callback
+- Refactored subgraph parallelism
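A sketch of enabling the new callbacks from the Python front-end. The class names below assume the usual Callback<Name> mapping from the proto callback names (clip_gradient_norm, progress_bar, memory_profiler); the clipping fields in particular are assumptions.

import lbann

# Minimal layer graph so the example stands alone.
x = lbann.Input(data_field='samples')
y = lbann.FullyConnected(x, num_neurons=10)
obj = lbann.L2Norm2(y)

callbacks = [
    # Field names (`value`, `global_norm`) are assumptions about the
    # underlying proto message.
    lbann.CallbackClipGradientNorm(value=1.0, global_norm=True),
    lbann.CallbackProgressBar(),     # interactive progress bar
    lbann.CallbackMemoryProfiler(),  # detailed memory usage profiling
]

model = lbann.Model(
    10,  # epochs
    layers=lbann.traverse_layer_graph(x),
    objective_function=obj,
    callbacks=callbacks,
)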

 I/O & data readers:
 - Renamed percent_of_data_to_use more accurately to fraction_of_data_to_use.
@@ -63,6 +82,11 @@ Build system:
 - Set a default time limit for CI tests to avoid unnecessary stalls

 Bug fixes:
+- Fixed a bug where in-place layers sometimes attached a locked view
+  of a matrix to a mutable view.
+- Fixed a bug when trying to use the legacy HDF5 data reader without data store.
+- Fixed concurrency bugs in the data store
+- Fixed DistConv memory optimization bug

 Retired features:
 - Support for autoencoder strategy in the summarize images callback was removed
