Commit 3397ad7

remove wierd primarily useful bit

Signed-off-by: Peter St. John <pstjohn@nvidia.com>
1 parent 970c32d

File tree

5 files changed: +5 −10 lines changed


bionemo-recipes/models/esm2/README.md

Lines changed: 1 addition & 2 deletions
@@ -142,8 +142,7 @@ You can also mix FP8 and FP4 layers by providing both recipes and a mixed `layer

 When `use_quantized_model_init=True` is set in the config, layers are created inside a
 `te.quantized_model_init` context. This tells TransformerEngine to initialize weights directly in
-the target quantized format, avoiding a separate quantization step after initialization. This is
-primarily useful when loading pre-quantized checkpoints.
+the target quantized format, avoiding a separate quantization step after initialization.

 ```python
 config = NVEsmConfig.from_pretrained(

bionemo-recipes/models/mixtral/README.md

Lines changed: 1 addition & 2 deletions
@@ -125,8 +125,7 @@ You can also mix FP8 and FP4 layers by providing both recipes and a mixed `layer

 When `use_quantized_model_init=True` is set in the config, layers are created inside a
 `te.quantized_model_init` context. This tells TransformerEngine to initialize weights directly in
-the target quantized format, avoiding a separate quantization step after initialization. This is
-primarily useful when loading pre-quantized checkpoints.
+the target quantized format, avoiding a separate quantization step after initialization.

 ```python
 config = NVMixtralConfig(

bionemo-recipes/models/qwen/README.md

Lines changed: 1 addition & 2 deletions
@@ -158,8 +158,7 @@ The same pattern applies to Qwen2.5 models using `NVQwen2Config` and `NVQwen2For

 When `use_quantized_model_init=True` is set in the config, layers are created inside a
 `te.quantized_model_init` context. This tells TransformerEngine to initialize weights directly in
-the target quantized format, avoiding a separate quantization step after initialization. This is
-primarily useful when loading pre-quantized checkpoints.
+the target quantized format, avoiding a separate quantization step after initialization.

 ```python
 config = NVQwen3Config.from_pretrained(

bionemo-recipes/recipes/esm2_native_te/README.md

Lines changed: 1 addition & 2 deletions
@@ -174,8 +174,7 @@ claims the remaining layers. For example, if `fp8_layers=[1,2,3]` is set and `fp

 When training with FP8 or FP4, you can initialize model weights directly in the target quantized format by setting
 `config_kwargs.use_quantized_model_init=true`. This tells TransformerEngine to create weights inside a
-`te.quantized_model_init` context, avoiding a separate quantization step after initialization. This is primarily useful
-when loading pre-quantized checkpoints.
+`te.quantized_model_init` context, avoiding a separate quantization step after initialization.

 ```bash
 python train_fsdp2.py --config-name L0_sanity \

bionemo-recipes/recipes/llama3_native_te/README.md

Lines changed: 1 addition & 2 deletions
@@ -140,8 +140,7 @@ python train_fsdp2.py --config-name L0_sanity fp8_config.enabled=true

 When training with FP8, you can initialize model weights directly in the target quantized format by setting
 `config_kwargs.use_quantized_model_init=true`. This tells TransformerEngine to create weights inside a
-`te.quantized_model_init` context, avoiding a separate quantization step after initialization. This is primarily useful
-when loading pre-quantized checkpoints.
+`te.quantized_model_init` context, avoiding a separate quantization step after initialization.

 ```bash
 python train_fsdp2.py --config-name L0_sanity \
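The pattern these READMEs describe, creating layers inside a context so their weights are born in the target quantized format rather than quantized afterwards, can be sketched with a toy Python context manager. All names below (`quantized_model_init`, `ToyLayer`, `build_model`, the `"fp8"`/`"fp32"` tags) are hypothetical stand-ins for illustration, not the TransformerEngine API:

```python
import contextlib

# Toy stand-in for te.quantized_model_init: while the context is active,
# newly constructed layers record that their weights were created directly
# in the quantized format, so no separate quantize() pass is needed later.
_quantized_init_active = False


@contextlib.contextmanager
def quantized_model_init():
    global _quantized_init_active
    _quantized_init_active = True
    try:
        yield
    finally:
        _quantized_init_active = False


class ToyLayer:
    def __init__(self):
        # The weight format is decided at construction time, depending on
        # whether we are inside the quantized-init context.
        self.weight_format = "fp8" if _quantized_init_active else "fp32"


def build_model(use_quantized_model_init: bool):
    # Mirrors the config flag: when true, layers are created inside the
    # context; when false, they start in full precision.
    ctx = quantized_model_init() if use_quantized_model_init else contextlib.nullcontext()
    with ctx:
        return [ToyLayer() for _ in range(2)]


model = build_model(use_quantized_model_init=True)
print([layer.weight_format for layer in model])  # ['fp8', 'fp8']
```

The point of the real feature is the same as in this sketch: the format decision happens at weight-creation time, so there is no intermediate full-precision copy to quantize and discard.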

0 commit comments