Commit 0433bb8

Merge branch 'release-v0.102'
============================== Release Notes: v0.102 ==============================

Support for new training algorithms:
 - LTFB is now a first-class training algorithm.
 - LTFB now allows multiple metrics. The local algorithm is favored by each trainer and a partner model must win every metric to be declared the tournament winner.
 - The batched iterative optimizer (sgd_training_algorithm) was refactored for consistency.
 - Improved documentation of training algorithm infrastructure.

Support for new network structures:
 - ATOM WAE model - character-based Wasserstein Autoencoder
 - Community GAN model for graph data sets

Support for new layers:
 - "DFTAbs" layer that computes the absolute value of the channel-wise DFT of the input data
 - Added support for 3D Matrix Multiplication
 - Added scatter and gather neural network layers
 - CPU-based GRU layers using oneDNN
 - Added batch-wise reduce-sum
 - ArcFace loss

Python front-end:
 - Added 3D U-Net Model
 - Added CosmoFlow Model
 - Ported CANDLE Pilot1 models
 - Support nvprof
 - Added channelwise fully connected layer
 - Added support for non-square kernels, padding, stride, and dilation for the convolution module
 - Support for OpenMPI launcher

Performance optimizations:
 - Use cuDNN 8 RNN API and CUDA Graphs in GRU layer
 - Cache CUDA Graphs for each active mini-batch size
 - Tuned performance of slice, concatenate, and tessellate layers on ARM processors
 - Parallelize computation of Gaussian random numbers
 - Optimized tessellate, concatenate, and slice layers on CPU

Experiments & Applications:
 - Added experiment scripts for ATOM cWAE Gordon Bell simulations
 - LBANN-ATOM model inference and analysis

Internal features:
 - Wrapper classes for CUDA Graphs API
 - Elementary examples of using complex numbers
 - cuDNN handles are now wrapped in RAII management classes
 - Improved HWLOC compatibility for v1.11 and v2.x
 - Added an enum type of visitor hooks that will eventually be used to allow callbacks or other visitors to operate at user-defined hook points
 - Changed checkpoint logic to checkpoint at the start of epochs and changed the naming scheme to use the callback phase (visitor hook) in the name rather than the current execution context.
 - Added in-memory binary model exchange for LTFB.
 - Added support for ROCm and MIOpen
 - Added support for oneDNN
 - Updated the bamboo test environment to use the local executable rather than hard-coded executables
 - Overhauled and refactored serialization throughout the code to use the Cereal serialization library
 - Significant cleanup and refactoring of the code base to improve compile times. Moving to ensure that code adheres to the standard split of headers between declaration and implementation functions (for templated code). Specifically focused on serialization functions and the comm class. Reduced dependencies caused by overreaching header inclusions.
 - The relationship of execution_contexts and training_algorithms was clarified. There is still work to do here.
 - Added DistConv tests for both convolution and pooling layers
 - Support padding in distributed embedding layer
 - Added dump model graph callback
 - Added perturb learning rate callback
 - Added batched inference algorithm
 - Switched ATOM tests to use CPU embedding and tessellate layers to minimize noise

I/O & data readers:
 - Experimental data reader that generates graph random walks with HavoqGT
 - Added explicit tournament execution mode
 - Added support to split training data reader into validation and tournament readers
 - node2vec data reader

Build system:
 - Hydrogen v1.5.0+
 - Aluminum v0.5.0+
 - DiHydrogen v0.2.0 is required
 - C++14 or newer standard with CUDA (CMake: "-DCMAKE_CUDA_STANDARD=14")
 - OpenCV is now an optional dependency via CMake "LBANN_WITH_VISION"
 - CNPY is now an optional dependency via CMake "LBANN_WITH_CNPY"
 - Adds support in the build_lbann.sh script for concretizing extra packages with the primary LBANN installation
 - New features in the build script to set up / configure the build environment, but stop and allow the user to manually add extra packages
 - Add a set of user-focused build scripts that use the main build_lbann.sh script to set up good defaults on known systems
 - Added application-specific build scripts for users such as ATOM
 - Added support for pulling from Spack mirrors and setting them up
 - Split embedded Python support from Python Front End
 - Switched Spack-based build script to use Spack's clingo concretizer

Bug fixes:
 - Fixed a bug where LBANN didn't set the Hydrogen RNG seed
 - Fixed the CosmoFlow and UNet models in the PFE and addressed issues in the data reader and data coordinator.
 - Fixed the HDF5 data reader to properly specify the supported I/O types
 - Fixed calculation of the linearized response size
 - Fixed the data coordinator's interface to input_layer
 - Fixed error with deterministic execution of dropout layers

Retired features:
 - Removed deprecated JAG leader mode, which was made obsolete when the data reader moved into the data coordinator
 - Removed the deprecated partitioned data reader modes that were used to partition and overlap data sets for multiple models
 - Removed deprecated ActivationDescriptor class
2 parents 0bb0f50 + 033cef5 commit 0433bb8
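
The release notes say that with multiple LTFB metrics the local model is favored and a partner model must win every metric to win the tournament. A minimal sketch of that decision rule follows. It is not LBANN's implementation: the function name `partner_wins`, the `higher_is_better` mapping, and the choice to treat ties as a win for the local model are all assumptions made for illustration.

```python
from typing import Mapping


def partner_wins(
    local: Mapping[str, float],
    partner: Mapping[str, float],
    higher_is_better: Mapping[str, bool],
) -> bool:
    """Return True only if the partner beats the local model on every metric.

    Hypothetical helper: ties and an empty metric set both favor the
    local model, which is one reading of "the local algorithm is favored".
    """
    if not higher_is_better:
        return False  # nothing to compare, so keep the local model
    for name, maximize in higher_is_better.items():
        local_score = local[name]
        partner_score = partner[name]
        if maximize:
            partner_is_better = partner_score > local_score
        else:
            partner_is_better = partner_score < local_score
        if not partner_is_better:
            return False  # a single loss or tie keeps the local model
    return True


# Illustrative scores only; these numbers are not from the commit.
directions = {"accuracy": True, "loss": False}
local_scores = {"accuracy": 0.90, "loss": 0.30}
print(partner_wins(local_scores, {"accuracy": 0.95, "loss": 0.20}, directions))
print(partner_wins(local_scores, {"accuracy": 0.95, "loss": 0.30}, directions))
```

In the second call the partner improves accuracy but only ties on loss, so under the assumed tie rule the local model is kept.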

1,075 files changed: +67030 -21053 lines changed


.clang-format

Lines changed: 165 additions & 0 deletions
@@ -0,0 +1,165 @@
+###############################################################################
+# Copyright (c) 2014-2019, Lawrence Livermore National Security, LLC.
+# Produced at the Lawrence Livermore National Laboratory.
+# Written by the LBANN Research Team (B. Van Essen, et al.) listed in
+# the CONTRIBUTORS file. <[email protected]>
+#
+# LLNL-CODE-697807.
+# All rights reserved.
+#
+# This file is part of LBANN: Livermore Big Artificial Neural Network
+# Toolkit. For details, see http://software.llnl.gov/LBANN or
+# https://github.com/LLNL/LBANN.
+#
+# Licensed under the Apache License, Version 2.0 (the "Licensee"); you
+# may not use this file except in compliance with the License. You may
+# obtain a copy of the License at:
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+# implied. See the License for the specific language governing
+# permissions and limitations under the license.
+###############################################################################
+
+# Basic clang-format specification for LBANN.
+# Based on clang-10 for LC compatibility.
+
+---
+Language: Cpp
+BasedOnStyle: LLVM
+AccessModifierOffset: -2
+AlignAfterOpenBracket: Align
+AlignConsecutiveMacros: false
+AlignConsecutiveAssignments: false
+AlignConsecutiveDeclarations: false
+AlignEscapedNewlines: Right
+AlignOperands: true
+AlignTrailingComments: true
+AllowAllArgumentsOnNextLine: false
+AllowAllConstructorInitializersOnNextLine: true
+AllowAllParametersOfDeclarationOnNextLine: false
+AllowShortBlocksOnASingleLine: Never
+AllowShortCaseLabelsOnASingleLine: false
+AllowShortFunctionsOnASingleLine: All
+AllowShortLambdasOnASingleLine: All
+AllowShortIfStatementsOnASingleLine: Never
+AllowShortLoopsOnASingleLine: false
+AlwaysBreakAfterDefinitionReturnType: None
+AlwaysBreakAfterReturnType: None
+AlwaysBreakBeforeMultilineStrings: false
+AlwaysBreakTemplateDeclarations: MultiLine
+BinPackArguments: false
+BinPackParameters: false
+BraceWrapping:
+  AfterCaseLabel: false
+  AfterClass: true
+  AfterControlStatement: false
+  AfterEnum: true
+  AfterFunction: true
+  AfterNamespace: false
+  AfterObjCDeclaration: false
+  AfterStruct: true
+  AfterUnion: true
+  AfterExternBlock: false
+  BeforeCatch: true
+  BeforeElse: true
+  IndentBraces: false
+  SplitEmptyFunction: false
+  SplitEmptyRecord: true
+  SplitEmptyNamespace: true
+BreakBeforeBinaryOperators: None
+BreakBeforeBraces: Custom
+BreakBeforeInheritanceComma: false
+BreakInheritanceList: BeforeColon
+BreakBeforeTernaryOperators: true
+BreakConstructorInitializersBeforeComma: false
+BreakConstructorInitializers: BeforeColon
+BreakAfterJavaFieldAnnotations: false
+BreakStringLiterals: true
+ColumnLimit: 80
+CommentPragmas: '^ IWYU pragma:'
+CompactNamespaces: false
+ConstructorInitializerAllOnOneLineOrOnePerLine: false
+ConstructorInitializerIndentWidth: 2
+ContinuationIndentWidth: 2
+Cpp11BracedListStyle: true
+DeriveLineEnding: true
+DerivePointerAlignment: false
+DisableFormat: false
+ExperimentalAutoDetectBinPacking: false
+FixNamespaceComments: true
+ForEachMacros:
+  - foreach
+  - Q_FOREACH
+  - BOOST_FOREACH
+IncludeBlocks: Preserve
+IncludeCategories:
+  - Regex: '^"(llvm|llvm-c|clang|clang-c)/'
+    Priority: 2
+    SortPriority: 0
+  - Regex: '^(<|"(catch|gtest|gmock|isl|json)/)'
+    Priority: 3
+    SortPriority: 0
+  - Regex: '.*'
+    Priority: 1
+    SortPriority: 0
+IncludeIsMainRegex: '(Test)?$'
+IncludeIsMainSourceRegex: ''
+IndentCaseLabels: false
+IndentGotoLabels: true
+IndentPPDirectives: None
+IndentWidth: 2
+IndentWrappedFunctionNames: false
+JavaScriptQuotes: Leave
+JavaScriptWrapImports: true
+KeepEmptyLinesAtTheStartOfBlocks: true
+MacroBlockBegin: ''
+MacroBlockEnd: ''
+MaxEmptyLinesToKeep: 1
+NamespaceIndentation: None
+ObjCBinPackProtocolList: Auto
+ObjCBlockIndentWidth: 2
+ObjCSpaceAfterProperty: false
+ObjCSpaceBeforeProtocolList: true
+PenaltyBreakAssignment: 2
+PenaltyBreakBeforeFirstCallParameter: 19
+PenaltyBreakComment: 300
+PenaltyBreakFirstLessLess: 120
+PenaltyBreakString: 1000
+PenaltyBreakTemplateDeclaration: 10
+PenaltyExcessCharacter: 1000000
+PenaltyReturnTypeOnItsOwnLine: 60
+PointerAlignment: Left
+ReflowComments: true
+SortIncludes: true
+SortUsingDeclarations: true
+SpaceAfterCStyleCast: false
+SpaceAfterLogicalNot: false
+SpaceAfterTemplateKeyword: true
+SpaceBeforeAssignmentOperators: true
+SpaceBeforeCpp11BracedList: false
+SpaceBeforeCtorInitializerColon: true
+SpaceBeforeInheritanceColon: true
+SpaceBeforeParens: ControlStatements
+SpaceBeforeRangeBasedForLoopColon: true
+SpaceInEmptyBlock: false
+SpaceInEmptyParentheses: false
+SpacesBeforeTrailingComments: 1
+SpacesInAngles: false
+SpacesInConditionalStatement: false
+SpacesInContainerLiterals: true
+SpacesInCStyleCastParentheses: false
+SpacesInParentheses: false
+SpacesInSquareBrackets: false
+SpaceBeforeSquareBrackets: false
+Standard: c++17
+StatementMacros:
+  - Q_UNUSED
+  - QT_REQUIRE_VERSION
+TabWidth: 8
+UseCRLF: false
+UseTab: Never
+...
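
The `IncludeCategories` entries in the `.clang-format` file above give unmatched (project) headers priority 1, LLVM/Clang headers priority 2, and angle-bracket or test-framework headers priority 3. The sketch below approximates how those settings order one block of includes: the first regex that matches assigns the priority, and lower priorities sort first. It is a simplified model, not clang-format itself. It ignores main-include detection, and the lexicographic tie-break within a category is an assumption. The header names in `block` are illustrative.

```python
import re

# (regex, priority) pairs copied from IncludeCategories in the diff above.
CATEGORIES = [
    (r'^"(llvm|llvm-c|clang|clang-c)/', 2),
    (r'^(<|"(catch|gtest|gmock|isl|json)/)', 3),
    (r'.*', 1),
]


def include_priority(include: str) -> int:
    """Return the priority of the first category whose regex matches."""
    for pattern, priority in CATEGORIES:
        if re.search(pattern, include):
            return priority
    raise AssertionError("unreachable: '.*' matches every include")


def sort_block(includes):
    """Sort one include block by (priority, name); the tie-break is assumed."""
    return sorted(includes, key=lambda inc: (include_priority(inc), inc))


# Hypothetical include block, written the way it appears after '#include '.
block = [
    '<vector>',
    '"lbann/utils/exception.hpp"',
    '"catch/catch.hpp"',
    '"lbann/base.hpp"',
]
for name in sort_block(block):
    print(include_priority(name), name)
```

Because the file sets `IncludeBlocks: Preserve`, this ordering would apply within each existing block of includes rather than merging or regrouping blocks.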

.gitlab-ci.yml

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+
+before_script:
+  - echo "=== before_script section ==="
+
+after_script:
+  - echo "=== after_script section ==="
+
+stages:
+  - compiler
+  - integration
+  - unit
+
+compilerRay:
+  stage: compiler
+  tags:
+    - ray
+    - shell
+  script:
+    - echo "=== compilerRay section ==="
+
+integrationRay:
+  stage: compiler
+  tags:
+    - ray
+    - shell
+  script:
+    - echo "=== integrationRay section ==="
+
+unitRay:
+  stage: compiler
+  tags:
+    - ray
+    - shell
+  script:
+    - echo "=== unitRay section ==="
+    - echo "FINISHED"
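
As committed, the `.gitlab-ci.yml` above declares three stages, but all three jobs set `stage: compiler`. The sketch below groups the jobs by their declared stage to make that visible. The stage and job names are copied from the diff. The helper `jobs_by_stage` is hypothetical, and the commit does not say whether the empty `integration` and `unit` stages are intentional placeholders.

```python
# Stage list and per-job stage assignments, copied from the diff above.
STAGES = ["compiler", "integration", "unit"]
JOB_STAGES = {
    "compilerRay": "compiler",
    "integrationRay": "compiler",
    "unitRay": "compiler",
}


def jobs_by_stage(stages, job_stages):
    """Group job names under each declared stage, keeping stage order.

    Hypothetical helper: raises ValueError if a job names an undeclared stage.
    """
    grouped = {stage: [] for stage in stages}
    for job, stage in job_stages.items():
        if stage not in grouped:
            raise ValueError(f"job {job!r} uses undeclared stage {stage!r}")
        grouped[stage].append(job)
    return grouped


for stage, jobs in jobs_by_stage(STAGES, JOB_STAGES).items():
    print(f"{stage}: {jobs if jobs else 'no jobs'}")
```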

.gitmodules

Lines changed: 13 additions & 8 deletions
@@ -1,11 +1,16 @@
-[submodule "applications/graph/snap"]
-	path = applications/graph/snap
-	url = https://github.com/snap-stanford/snap
-	ignore = dirty
-[submodule "applications/graph/largescale_node2vec"]
-	path = applications/graph/largescale_node2vec
-	url = https://lc.llnl.gov/bitbucket/scm/havoq/largescale_node2vec.git
-	ignore = dirty
 [submodule "applications/ATOM/moses"]
 	path = applications/ATOM/moses
 	url = [email protected]:samadejacobs/moses.git
+[submodule "applications/graph/node2vec/snap"]
+	path = applications/graph/node2vec/snap
+	url = https://github.com/snap-stanford/snap
+	ignore = dirty
+[submodule "applications/graph/node2vec/havoqgt"]
+	path = applications/graph/node2vec/havoqgt
+	url = https://github.com/KIwabuchi/havoqgt
+	branch = develop
+	ignore = dirty
+[submodule "applications/graph/node2vec/largescale_node2vec"]
+	path = applications/graph/node2vec/largescale_node2vec
+	url = https://lc.llnl.gov/bitbucket/scm/havoq/largescale_node2vec.git
+	ignore = dirty
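
The `.gitmodules` change above moves the graph submodules under `applications/graph/node2vec/` and adds `havoqgt`. One way to check the resulting layout is to read the file with Python's standard `configparser`, as sketched below. This is only an approximation: `configparser` is not a full git-config parser, and the helper `submodule_paths` is hypothetical. The embedded text reproduces the post-commit entries from the diff, with the `moses` URL left out because it is obfuscated on this page.

```python
import configparser

# Post-commit .gitmodules entries from the diff above (moses url omitted).
GITMODULES = """
[submodule "applications/ATOM/moses"]
    path = applications/ATOM/moses
[submodule "applications/graph/node2vec/snap"]
    path = applications/graph/node2vec/snap
    url = https://github.com/snap-stanford/snap
    ignore = dirty
[submodule "applications/graph/node2vec/havoqgt"]
    path = applications/graph/node2vec/havoqgt
    url = https://github.com/KIwabuchi/havoqgt
    branch = develop
    ignore = dirty
[submodule "applications/graph/node2vec/largescale_node2vec"]
    path = applications/graph/node2vec/largescale_node2vec
    url = https://lc.llnl.gov/bitbucket/scm/havoq/largescale_node2vec.git
    ignore = dirty
"""


def submodule_paths(text: str) -> dict:
    """Map each submodule name to its configured path."""
    parser = configparser.ConfigParser(interpolation=None)
    # Strip indentation so indented keys are not read as continuation lines.
    parser.read_string("\n".join(line.strip() for line in text.splitlines()))
    return {
        section.split('"')[1]: parser[section]["path"]
        for section in parser.sections()
    }


paths = submodule_paths(GITMODULES)
graph_paths = sorted(p for p in paths.values() if p.startswith("applications/graph/"))
print(graph_paths)
```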

.travis.yml

Lines changed: 0 additions & 50 deletions
This file was deleted.
