Commit 3397ad7

remove wierd primarily useful bit

Signed-off-by: Peter St. John <pstjohn@nvidia.com>
1 parent 970c32d

File tree

5 files changed: +5 −10 lines changed


bionemo-recipes/models/esm2/README.md

Lines changed: 1 addition & 2 deletions
@@ -142,8 +142,7 @@ You can also mix FP8 and FP4 layers by providing both recipes and a mixed `layer

 When `use_quantized_model_init=True` is set in the config, layers are created inside a
 `te.quantized_model_init` context. This tells TransformerEngine to initialize weights directly in
-the target quantized format, avoiding a separate quantization step after initialization. This is
-primarily useful when loading pre-quantized checkpoints.
+the target quantized format, avoiding a separate quantization step after initialization.

 ```python
 config = NVEsmConfig.from_pretrained(

bionemo-recipes/models/mixtral/README.md

Lines changed: 1 addition & 2 deletions
@@ -125,8 +125,7 @@ You can also mix FP8 and FP4 layers by providing both recipes and a mixed `layer

 When `use_quantized_model_init=True` is set in the config, layers are created inside a
 `te.quantized_model_init` context. This tells TransformerEngine to initialize weights directly in
-the target quantized format, avoiding a separate quantization step after initialization. This is
-primarily useful when loading pre-quantized checkpoints.
+the target quantized format, avoiding a separate quantization step after initialization.

 ```python
 config = NVMixtralConfig(

bionemo-recipes/models/qwen/README.md

Lines changed: 1 addition & 2 deletions
@@ -158,8 +158,7 @@ The same pattern applies to Qwen2.5 models using `NVQwen2Config` and `NVQwen2For

 When `use_quantized_model_init=True` is set in the config, layers are created inside a
 `te.quantized_model_init` context. This tells TransformerEngine to initialize weights directly in
-the target quantized format, avoiding a separate quantization step after initialization. This is
-primarily useful when loading pre-quantized checkpoints.
+the target quantized format, avoiding a separate quantization step after initialization.

 ```python
 config = NVQwen3Config.from_pretrained(

bionemo-recipes/recipes/esm2_native_te/README.md

Lines changed: 1 addition & 2 deletions
@@ -174,8 +174,7 @@ claims the remaining layers. For example, if `fp8_layers=[1,2,3]` is set and `fp

 When training with FP8 or FP4, you can initialize model weights directly in the target quantized format by setting
 `config_kwargs.use_quantized_model_init=true`. This tells TransformerEngine to create weights inside a
-`te.quantized_model_init` context, avoiding a separate quantization step after initialization. This is primarily useful
-when loading pre-quantized checkpoints.
+`te.quantized_model_init` context, avoiding a separate quantization step after initialization.

 ```bash
 python train_fsdp2.py --config-name L0_sanity \

bionemo-recipes/recipes/llama3_native_te/README.md

Lines changed: 1 addition & 2 deletions
@@ -140,8 +140,7 @@ python train_fsdp2.py --config-name L0_sanity fp8_config.enabled=true

 When training with FP8, you can initialize model weights directly in the target quantized format by setting
 `config_kwargs.use_quantized_model_init=true`. This tells TransformerEngine to create weights inside a
-`te.quantized_model_init` context, avoiding a separate quantization step after initialization. This is primarily useful
-when loading pre-quantized checkpoints.
+`te.quantized_model_init` context, avoiding a separate quantization step after initialization.

 ```bash
 python train_fsdp2.py --config-name L0_sanity \
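The pattern these READMEs describe, creating layers inside a context so their weights are born in the target quantized format rather than quantized afterwards, can be sketched with a toy Python context manager. All names below (`quantized_model_init`, `ToyLayer`, `build_model`, the `"fp8"`/`"fp32"` tags) are hypothetical stand-ins for illustration, not the TransformerEngine API:

```python
import contextlib

# Toy stand-in for te.quantized_model_init: while the context is active,
# newly constructed layers record that their weights were created directly
# in the quantized format, so no separate quantize() pass is needed later.
_quantized_init_active = False


@contextlib.contextmanager
def quantized_model_init():
    global _quantized_init_active
    _quantized_init_active = True
    try:
        yield
    finally:
        _quantized_init_active = False


class ToyLayer:
    def __init__(self):
        # The weight format is decided at construction time, depending on
        # whether we are inside the quantized-init context.
        self.weight_format = "fp8" if _quantized_init_active else "fp32"


def build_model(use_quantized_model_init: bool):
    # Mirrors the config flag: when true, layers are created inside the
    # context; when false, they start in full precision.
    ctx = quantized_model_init() if use_quantized_model_init else contextlib.nullcontext()
    with ctx:
        return [ToyLayer() for _ in range(2)]


model = build_model(use_quantized_model_init=True)
print([layer.weight_format for layer in model])  # ['fp8', 'fp8']
```

The point of the real feature is the same as in this sketch: the format decision happens at weight-creation time, so there is no intermediate full-precision copy to quantize and discard.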

0 commit comments