THUDM
diff --git a/‎.doctrees/_examples_synced/geo3k_vlm/README.doctree‎
758 Bytes b/‎.doctrees/_examples_synced/geo3k_vlm/README.doctree‎
diff --git a/‎.doctrees/_examples_synced/geo3k_vlm_multi_turn/README.doctree‎
-309 Bytes b/‎.doctrees/_examples_synced/geo3k_vlm_multi_turn/README.doctree‎
diff --git a/‎.doctrees/environment.pickle‎
0 Bytes b/‎.doctrees/environment.pickle‎
diff --git a/‎_examples_synced/geo3k_vlm/README.html‎
Lines changed: 4 additions & 0 deletions b/‎_examples_synced/geo3k_vlm/README.html‎
diff --git a/‎_examples_synced/geo3k_vlm_multi_turn/README.html‎
Lines changed: 4 additions & 1 deletion b/‎_examples_synced/geo3k_vlm_multi_turn/README.html‎
diff --git a/‎_sources/_examples_synced/geo3k_vlm/README.md‎
Lines changed: 5 additions & 0 deletions b/‎_sources/_examples_synced/geo3k_vlm/README.md‎
diff --git a/‎_sources/_examples_synced/geo3k_vlm_multi_turn/README.md‎
Lines changed: 4 additions & 1 deletion b/‎_sources/_examples_synced/geo3k_vlm_multi_turn/README.md‎
diff --git a/‎searchindex.js‎
Lines changed: 1 addition & 1 deletion b/‎searchindex.js‎
diff --git a/‎zh/.doctrees/_examples_synced/geo3k_vlm/README.doctree‎
760 Bytes b/‎zh/.doctrees/_examples_synced/geo3k_vlm/README.doctree‎
diff --git a/‎zh/.doctrees/_examples_synced/geo3k_vlm_multi_turn/README.doctree‎
-300 Bytes b/‎zh/.doctrees/_examples_synced/geo3k_vlm_multi_turn/README.doctree‎
@@ -467,6 +467,10 @@ <h2> Contents </h2>
   <section class="tex2jax_ignore mathjax_ignore" id="vlm-single-turn-rl-fsdp-megatron">
 <h1>VLM Single-Turn RL (FSDP &amp; Megatron)<a class="headerlink" href="#vlm-single-turn-rl-fsdp-megatron" title="Link to this heading">#</a></h1>
 <p>Training VLMs with FSDP or Megatron on a single-turn reasoning task using GRPO on the <a class="reference external" href="https://huggingface.co/datasets/hiyouga/geometry3k">GEO3K dataset</a>. We used the processed version <a class="reference external" href="https://huggingface.co/datasets/chenhegu/geo3k_imgurl">here</a>.</p>
+<p>Note: Make sure the cuDNN version in your environment is 9.16.0.29 to avoid the severe conv3d performance regression in torch 2.9 reported in https://github.com/pytorch/pytorch/issues/168167. If a different version is installed, reinstall cuDNN with:</p>
+<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>pip<span class="w"> </span>install<span class="w"> </span>nvidia-cudnn-cu12<span class="o">==</span><span class="m">9</span>.16.0.29
+</pre></div>
+</div>
 <p align="center">
   <img src="fsdp_vs_megatron.png" alt="FSDP vs Megatron Reward Plot" width="800">
 </p>
 
@@ -457,7 +457,10 @@ <h2> Contents </h2>
   <section class="tex2jax_ignore mathjax_ignore" id="vlm-multi-turn-geo3k-dataset">
 <h1>VLM Multi-Turn (geo3k dataset)<a class="headerlink" href="#vlm-multi-turn-geo3k-dataset" title="Link to this heading">#</a></h1>
 <p>Training a VLM on the <a class="reference external" href="https://huggingface.co/datasets/hiyouga/geometry3k">geo3k dataset</a> with multi-turn reasoning and interactive environment feedback, using GRPO. For the dataset, we used the <a class="reference external" href="https://huggingface.co/datasets/VeraIsHere/geo3k_imgurl_processed">processed version</a>.</p>
-<p><strong>Thanks to slime’s clean design, multi-turn RL aligns with first principles: with a <a class="reference internal" href="#rollout.py#L309"><span class="xref myst">custom rollout function</span></a>, any training backend (e.g. Megatron/FSDP) can use it.</strong></p>
+<p>Note: Make sure the cuDNN version in your environment is 9.16.0.29 to avoid the severe conv3d performance regression in torch 2.9 reported in https://github.com/pytorch/pytorch/issues/168167. If a different version is installed, reinstall cuDNN with:</p>
+<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>pip<span class="w"> </span>install<span class="w"> </span>nvidia-cudnn-cu12<span class="o">==</span><span class="m">9</span>.16.0.29
+</pre></div>
+</div>
 <p>The multi-turn rollout is implemented through a <a class="reference internal" href="#rollout.py#L309"><span class="xref myst">custom generate function</span></a>, overriding the original generate function.</p>
 <p>In terms of the environment interaction, this example initializes a <a class="reference internal" href="#env_geo3k.py"><span class="xref myst">custom interactive environment</span></a> with the APIs below.</p>
 <details>
 
@@ -2,6 +2,11 @@
 
 Training VLMs with FSDP or Megatron on a single-turn reasoning task using GRPO on the [GEO3K dataset](https://huggingface.co/datasets/hiyouga/geometry3k). We used the processed version [here](https://huggingface.co/datasets/chenhegu/geo3k_imgurl).
 
+Note: Make sure the cuDNN version in your environment is 9.16.0.29 to avoid the severe conv3d performance regression in torch 2.9 reported in https://github.com/pytorch/pytorch/issues/168167. If a different version is installed, reinstall cuDNN with:
+```bash
+pip install nvidia-cudnn-cu12==9.16.0.29
+```
+
 <p align="center">
   <img src="fsdp_vs_megatron.png" alt="FSDP vs Megatron Reward Plot" width="800">
 </p>
 
@@ -1,7 +1,10 @@
 # VLM Multi-Turn (geo3k dataset)
 Training a VLM on the [geo3k dataset](https://huggingface.co/datasets/hiyouga/geometry3k) with multi-turn reasoning and interactive environment feedback, using GRPO. For the dataset, we used the [processed version](https://huggingface.co/datasets/VeraIsHere/geo3k_imgurl_processed).
 
-**Thanks to slime's clean design, multi-turn RL aligns with first principles: with a [custom rollout function](rollout.py#L309), any training backend (e.g. Megatron/FSDP) can use it.**
+Note: Make sure the cuDNN version in your environment is 9.16.0.29 to avoid the severe conv3d performance regression in torch 2.9 reported in https://github.com/pytorch/pytorch/issues/168167. If a different version is installed, reinstall cuDNN with:
+```bash
+pip install nvidia-cudnn-cu12==9.16.0.29
+```
 
 The multi-turn rollout is implemented through a [custom generate function](rollout.py#L309), overriding the original generate function.
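
Reviewer note on the cuDNN requirement added to both READMEs: a quick way to confirm which cuDNN torch actually loaded might look like the sketch below. It is not part of this change. The helper names `decode_cudnn_version` and `cudnn_matches` are hypothetical, and the decoding assumes the cuDNN 9 integer encoding (major * 10000 + minor * 100 + patch). The integer reported by torch does not carry the fourth component (`.29`), so this only checks 9.16.0.

```python
def decode_cudnn_version(encoded: int) -> tuple[int, int, int]:
    """Split cuDNN's integer version into (major, minor, patch).

    Assumption: cuDNN 9 encoding, i.e. major * 10000 + minor * 100 + patch.
    """
    major, rest = divmod(encoded, 10000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def cudnn_matches(encoded: int, required: tuple[int, int, int] = (9, 16, 0)) -> bool:
    """Return True when the decoded version equals the required (major, minor, patch)."""
    return decode_cudnn_version(encoded) == required


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed; cannot query cuDNN")
    else:
        # torch.backends.cudnn.version() returns None when cuDNN is unavailable.
        encoded = torch.backends.cudnn.version()
        if encoded is None:
            print("cuDNN is not available in this torch build")
        elif cudnn_matches(encoded):
            print("cuDNN", decode_cudnn_version(encoded), "matches 9.16.0")
        else:
            print("cuDNN", decode_cudnn_version(encoded), "differs from 9.16.0")
```

If the reported version differs, the `pip install nvidia-cudnn-cu12==9.16.0.29` command from the note applies; `pip show nvidia-cudnn-cu12` can then confirm the full four-part version of the installed wheel.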