Commit a1e1846

docs(site): fix perf section formatting; add code package links
1 parent 89f4cf1

2 files changed (+6, -5 lines)


docs/README.md

Lines changed: 3 additions & 2 deletions
@@ -27,15 +27,16 @@ We performed some tests on a [Vertex AI Training Cluster](https://docs.cloud.goo
 These tests were conducted using ML Flashpoint _alongside_ NeMo's recommended checkpointing (as you would in production), where NeMo's default checkpointing used a 7-10 TB [Filestore](https://cloud.google.com/filestore) instance.
 
 Observations when comparing the hybrid of ML Flashpoint (every 5 steps) and NeMo checkpointing (every 50 steps) to just NeMo's regular checkpointing (every 10 steps):
+
 * Data write times that are up to 20-30x faster, with little to no optimization.
   This is expected to further improve with additional optimizations.
 * Total checkpoint recovery times that are ~7-10x faster (includes the time it takes to do checkpoint detection, cross-node coordination, replication, read into model state and be ready to resume training).
 * For _async_ checkpointing: improvements averaging **3-6%** for _overall job time_, with peaks of **5-10%** improvements.
   These improvements only account for checkpoint save efficiency, representing a "worst case" in the sense that checkpointing purely adds overhead and isn't actually used.
   Any job interruptions will also benefit from the improved checkpoint recovery times.
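To make the "worst case" framing above concrete, here is a back-of-envelope sketch of checkpointing as pure added job time. All step times, save times, and schedules below are invented assumptions for illustration, not measurements from these tests:

```python
# Back-of-envelope model of checkpointing as pure overhead ("worst case":
# no failure occurs, so the saved checkpoints are never read back).
# All timings below are invented assumptions, not measured values.
def total_job_time(steps, step_time_s, ckpt_every, ckpt_block_s):
    """Wall-clock time = compute time + blocking time of periodic saves."""
    return steps * step_time_s + (steps // ckpt_every) * ckpt_block_s

STEPS, STEP_S = 1_000, 10.0  # assumed job shape

# Baseline: regular checkpointing every 10 steps, blocking ~120 s per save.
baseline = total_job_time(STEPS, STEP_S, ckpt_every=10, ckpt_block_s=120.0)

# Hybrid: fast local saves every 5 steps (assumed ~5 s each, reflecting much
# faster writes) plus regular saves every 50 steps (~120 s each).
hybrid = (total_job_time(STEPS, STEP_S, ckpt_every=5, ckpt_block_s=5.0)
          + (STEPS // 50) * 120.0)

print(f"baseline: {baseline:.0f} s, hybrid: {hybrid:.0f} s")
print(f"overhead cut: {(baseline - hybrid) / baseline:.1%} of total job time")
```

The actual improvement depends entirely on the assumed step and save times (and shrinks further with async saves); the point is only that faster, more frequent local saves can add less total overhead than slower, less frequent ones.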

-While [ML runtime goodput](https://cloud.google.com/blog/products/ai-machine-learning/goodput-metric-as-measure-of-ml-productivity) is important, we focus on overall job time as an end-to-end metric, as it is most transparent and accounts for actual cost.
-Goodput can be misleading if improvements to unproductive time actually worsen productive time.
+While [ML runtime goodput](https://cloud.google.com/blog/products/ai-machine-learning/goodput-metric-as-measure-of-ml-productivity) is important, we focus on overall job time as an end-to-end metric, as it is simpler, more transparent, and accounts for actual cost.
+Goodput can be misleading if improvements to unproductive time actually worsen productive time, and the change in total evaluation period (job time) is not taken into account.
 
 ## Design Philosophy

docs/user-guide.md

Lines changed: 3 additions & 3 deletions
@@ -28,7 +28,7 @@ See the project's [README](http://cs/h/cloud-mlnet/ml-flashpoint/+/main:README.m
 
 ### NeMo 2.0 & Pytorch Lightning
 
-Code: See the `ml_flashpoint.adapter.nemo` package.
+Code: See the [`ml_flashpoint.adapter.nemo`](https://github.com/google/ml-flashpoint/tree/main/src/ml_flashpoint/adapter/nemo) package.
 
 !!! note
 
@@ -107,7 +107,7 @@ This reduces blocking time by avoiding duplicate work, at the cost of having a l
 
 ### Megatron-LM
 
-Code: See the `ml_flashpoint.adapter.megatron` package.
+Code: See the [`ml_flashpoint.adapter.megatron`](https://github.com/google/ml-flashpoint/tree/main/src/ml_flashpoint/adapter/megatron) package.
 
 The Megatron strategies depend on the PyTorch DCP implementations.
 Below are instructions for setting up ML Flashpoint checkpointing, which you should configure alongside regular checkpointing to long-term storage.
@@ -195,4 +195,4 @@ else:
 
 ### PyTorch DCP
 
-Code: See the `ml_flashpoint.adapter.pytorch` package.
+Code: See the [`ml_flashpoint.adapter.pytorch`](https://github.com/google/ml-flashpoint/tree/main/src/ml_flashpoint/adapter/pytorch) package.
