Skip to content

Commit 1786e4b

Browse files
committed
deploy: 80e0528
1 parent 5f85364 commit 1786e4b

File tree

76 files changed

+278
-134
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

76 files changed

+278
-134
lines changed
3.93 KB
Binary file not shown.

.doctrees/environment.pickle

807 Bytes
Binary file not shown.

_examples_synced/eval/README.html

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -52,7 +52,7 @@
5252
<link rel="search" title="Search" href="../../search.html" />
5353
<meta name="viewport" content="width=device-width, initial-scale=1"/>
5454
<meta name="docsearch:language" content="en"/>
55-
<meta name="docbuild:last-update" content="Dec 22, 2025"/>
55+
<meta name="docbuild:last-update" content="Dec 24, 2025"/>
5656
</head>
5757

5858

@@ -599,7 +599,7 @@ <h2>4) Inside the Skills container<a class="headerlink" href="#inside-the-skills
599599

600600
<div class="footer-item">
601601
<p class="last-updated">
602-
Last updated on Dec 22, 2025.
602+
Last updated on Dec 24, 2025.
603603
<br/>
604604
</p>
605605
</div>

_examples_synced/eval_multi_task/README.html

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -52,7 +52,7 @@
5252
<link rel="search" title="Search" href="../../search.html" />
5353
<meta name="viewport" content="width=device-width, initial-scale=1"/>
5454
<meta name="docsearch:language" content="en"/>
55-
<meta name="docbuild:last-update" content="Dec 22, 2025"/>
55+
<meta name="docbuild:last-update" content="Dec 24, 2025"/>
5656
</head>
5757

5858

@@ -534,7 +534,7 @@ <h2>IFBench Notes<a class="headerlink" href="#ifbench-notes" title="Link to this
534534

535535
<div class="footer-item">
536536
<p class="last-updated">
537-
Last updated on Dec 22, 2025.
537+
Last updated on Dec 24, 2025.
538538
<br/>
539539
</p>
540540
</div>

_examples_synced/fully_async/README.html

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -54,7 +54,7 @@
5454
<link rel="prev" title="Search-R1 lite" href="../search-r1/README.html" />
5555
<meta name="viewport" content="width=device-width, initial-scale=1"/>
5656
<meta name="docsearch:language" content="en"/>
57-
<meta name="docbuild:last-update" content="Dec 22, 2025"/>
57+
<meta name="docbuild:last-update" content="Dec 24, 2025"/>
5858
</head>
5959

6060

@@ -595,7 +595,7 @@ <h2>Config Differences (2 Key Points)<a class="headerlink" href="#config-differe
595595

596596
<div class="footer-item">
597597
<p class="last-updated">
598-
Last updated on Dec 22, 2025.
598+
Last updated on Dec 24, 2025.
599599
<br/>
600600
</p>
601601
</div>

_examples_synced/geo3k_vlm/README.html

Lines changed: 39 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -52,7 +52,7 @@
5252
<link rel="search" title="Search" href="../../search.html" />
5353
<meta name="viewport" content="width=device-width, initial-scale=1"/>
5454
<meta name="docsearch:language" content="en"/>
55-
<meta name="docbuild:last-update" content="Dec 22, 2025"/>
55+
<meta name="docbuild:last-update" content="Dec 24, 2025"/>
5656
</head>
5757

5858

@@ -435,6 +435,7 @@ <h2> Contents </h2>
435435
</div>
436436
<nav aria-label="Page">
437437
<ul class="visible nav section-nav flex-column">
438+
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#data-preparation-for-sft-training">Data Preparation (For SFT Training)</a></li>
438439
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#reproduce">Reproduce</a><ul class="visible nav section-nav flex-column">
439440
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#configuration">Configuration</a></li>
440441
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#supported-models">Supported Models</a></li>
@@ -463,6 +464,38 @@ <h1>VLM Single-Turn RL (FSDP &amp; Megatron)<a class="headerlink" href="#vlm-sin
463464
<p align="center">
464465
<img src="fsdp_vs_megatron.png" alt="FSDP vs Megatron Reward Plot" width="800">
465466
</p>
467+
<section id="data-preparation-for-sft-training">
468+
<h2>Data Preparation (For SFT Training)<a class="headerlink" href="#data-preparation-for-sft-training" title="Link to this heading">#</a></h2>
469+
<p>The <a class="reference external" href="https://huggingface.co/datasets/chenhegu/geo3k_imgurl">geo3k_imgurl</a> dataset contains:</p>
470+
<ul class="simple">
471+
<li><p><code class="docutils literal notranslate"><span class="pre">problem</span></code>: The math problem text (string)</p></li>
472+
<li><p><code class="docutils literal notranslate"><span class="pre">answer</span></code>: The answer (string, e.g., “270”)</p></li>
473+
<li><p><code class="docutils literal notranslate"><span class="pre">images</span></code>: Image data (list)</p></li>
474+
</ul>
475+
<p>For SFT training, we need to format the <code class="docutils literal notranslate"><span class="pre">answer</span></code> field for <code class="docutils literal notranslate"><span class="pre">\boxed{}</span></code> format and the messages. You can use the following script to format the answer field:</p>
476+
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span><span class="w"> </span><span class="nn">datasets</span><span class="w"> </span><span class="kn">import</span> <span class="n">load_dataset</span>
477+
<span class="kn">import</span><span class="w"> </span><span class="nn">pandas</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nn">pd</span>
478+
479+
<span class="n">ds</span> <span class="o">=</span> <span class="n">load_dataset</span><span class="p">(</span><span class="s2">&quot;chenhegu/geo3k_imgurl&quot;</span><span class="p">,</span> <span class="n">split</span><span class="o">=</span><span class="s2">&quot;train&quot;</span><span class="p">)</span>
480+
481+
<span class="k">def</span><span class="w"> </span><span class="nf">format_answer</span><span class="p">(</span><span class="n">answer</span><span class="p">:</span> <span class="nb">str</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">str</span><span class="p">:</span>
482+
<span class="w"> </span><span class="sd">&quot;&quot;&quot;Format answer to include \\boxed{} format.&quot;&quot;&quot;</span>
483+
<span class="k">return</span> <span class="sa">f</span><span class="s2">&quot;Answer: </span><span class="se">\\</span><span class="s2">boxed</span><span class="se">{{</span><span class="si">{</span><span class="n">answer</span><span class="si">}</span><span class="se">}}</span><span class="s2">&quot;</span>
484+
485+
<span class="k">def</span><span class="w"> </span><span class="nf">process_sample</span><span class="p">(</span><span class="n">sample</span><span class="p">):</span>
486+
<span class="n">formatted_answer</span> <span class="o">=</span> <span class="sa">f</span><span class="s2">&quot;Answer: </span><span class="se">\\</span><span class="s2">boxed</span><span class="se">{{</span><span class="si">{</span><span class="n">sample</span><span class="p">[</span><span class="s1">&#39;answer&#39;</span><span class="p">]</span><span class="si">}</span><span class="se">}}</span><span class="s2">&quot;</span>
487+
488+
<span class="n">sample</span><span class="p">[</span><span class="s2">&quot;messages&quot;</span><span class="p">]</span> <span class="o">=</span> <span class="p">[</span>
489+
<span class="p">{</span><span class="s2">&quot;role&quot;</span><span class="p">:</span> <span class="s2">&quot;user&quot;</span><span class="p">,</span> <span class="s2">&quot;content&quot;</span><span class="p">:</span> <span class="n">sample</span><span class="p">[</span><span class="s2">&quot;problem&quot;</span><span class="p">]},</span>
490+
<span class="p">{</span><span class="s2">&quot;role&quot;</span><span class="p">:</span> <span class="s2">&quot;assistant&quot;</span><span class="p">,</span> <span class="s2">&quot;content&quot;</span><span class="p">:</span> <span class="n">formatted_answer</span><span class="p">}</span>
491+
<span class="p">]</span>
492+
<span class="k">return</span> <span class="n">sample</span>
493+
494+
<span class="n">ds</span> <span class="o">=</span> <span class="n">ds</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">process_sample</span><span class="p">)</span>
495+
<span class="n">ds</span><span class="o">.</span><span class="n">to_parquet</span><span class="p">(</span><span class="s2">&quot;/root/datasets/geo3k_imgurl/train_formatted.parquet&quot;</span><span class="p">)</span>
496+
</pre></div>
497+
</div>
498+
</section>
466499
<section id="reproduce">
467500
<h2>Reproduce<a class="headerlink" href="#reproduce" title="Link to this heading">#</a></h2>
468501
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="nb">export</span><span class="w"> </span><span class="nv">WANDB_API_KEY</span><span class="o">=</span>your_wandb_api_key
@@ -475,6 +508,9 @@ <h2>Reproduce<a class="headerlink" href="#reproduce" title="Link to this heading
475508

476509
<span class="c1"># With different model</span>
477510
<span class="nv">SLIME_SCRIPT_MODEL_NAME</span><span class="o">=</span>Qwen3-VL-4B-Instruct<span class="w"> </span>./examples/geo3k_vlm/run_geo3k_vlm.sh
511+
512+
<span class="c1"># SFT</span>
513+
./examples/geo_3k_vlm/run_geo3k_vlm_sft.sh
478514
</pre></div>
479515
</div>
480516
<section id="configuration">
@@ -578,6 +614,7 @@ <h2>B200<a class="headerlink" href="#b200" title="Link to this heading">#</a></h
578614
</div>
579615
<nav class="bd-toc-nav page-toc">
580616
<ul class="visible nav section-nav flex-column">
617+
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#data-preparation-for-sft-training">Data Preparation (For SFT Training)</a></li>
581618
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#reproduce">Reproduce</a><ul class="visible nav section-nav flex-column">
582619
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#configuration">Configuration</a></li>
583620
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#supported-models">Supported Models</a></li>
@@ -622,7 +659,7 @@ <h2>B200<a class="headerlink" href="#b200" title="Link to this heading">#</a></h
622659

623660
<div class="footer-item">
624661
<p class="last-updated">
625-
Last updated on Dec 22, 2025.
662+
Last updated on Dec 24, 2025.
626663
<br/>
627664
</p>
628665
</div>

_examples_synced/low_precision/README.html

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -52,7 +52,7 @@
5252
<link rel="search" title="Search" href="../../search.html" />
5353
<meta name="viewport" content="width=device-width, initial-scale=1"/>
5454
<meta name="docsearch:language" content="en"/>
55-
<meta name="docbuild:last-update" content="Dec 22, 2025"/>
55+
<meta name="docbuild:last-update" content="Dec 24, 2025"/>
5656
</head>
5757

5858

@@ -579,7 +579,7 @@ <h2>TODO<a class="headerlink" href="#todo" title="Link to this heading">#</a></h
579579

580580
<div class="footer-item">
581581
<p class="last-updated">
582-
Last updated on Dec 22, 2025.
582+
Last updated on Dec 24, 2025.
583583
<br/>
584584
</p>
585585
</div>

_examples_synced/multi_agent/README.html

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -54,7 +54,7 @@
5454
<link rel="prev" title="Retool: from SFT to RL" href="../retool/README.html" />
5555
<meta name="viewport" content="width=device-width, initial-scale=1"/>
5656
<meta name="docsearch:language" content="en"/>
57-
<meta name="docbuild:last-update" content="Dec 22, 2025"/>
57+
<meta name="docbuild:last-update" content="Dec 24, 2025"/>
5858
</head>
5959

6060

@@ -583,7 +583,7 @@ <h2>New Arguments<a class="headerlink" href="#new-arguments" title="Link to this
583583

584584
<div class="footer-item">
585585
<p class="last-updated">
586-
Last updated on Dec 22, 2025.
586+
Last updated on Dec 24, 2025.
587587
<br/>
588588
</p>
589589
</div>

_examples_synced/reproducibility/README.html

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -54,7 +54,7 @@
5454
<link rel="prev" title="DeepSeek R1 with 128xH100" href="../../examples/deepseek-r1.html" />
5555
<meta name="viewport" content="width=device-width, initial-scale=1"/>
5656
<meta name="docsearch:language" content="en"/>
57-
<meta name="docbuild:last-update" content="Dec 22, 2025"/>
57+
<meta name="docbuild:last-update" content="Dec 24, 2025"/>
5858
</head>
5959

6060

@@ -545,7 +545,7 @@ <h1>Reproducibility<a class="headerlink" href="#reproducibility" title="Link to
545545

546546
<div class="footer-item">
547547
<p class="last-updated">
548-
Last updated on Dec 22, 2025.
548+
Last updated on Dec 24, 2025.
549549
<br/>
550550
</p>
551551
</div>

_examples_synced/retool/README.html

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -54,7 +54,7 @@
5454
<link rel="prev" title="Fully Asynchronous Rollout Example" href="../fully_async/README.html" />
5555
<meta name="viewport" content="width=device-width, initial-scale=1"/>
5656
<meta name="docsearch:language" content="en"/>
57-
<meta name="docbuild:last-update" content="Dec 22, 2025"/>
57+
<meta name="docbuild:last-update" content="Dec 24, 2025"/>
5858
</head>
5959

6060

@@ -642,7 +642,7 @@ <h2>Safety Features<a class="headerlink" href="#safety-features" title="Link to
642642

643643
<div class="footer-item">
644644
<p class="last-updated">
645-
Last updated on Dec 22, 2025.
645+
Last updated on Dec 24, 2025.
646646
<br/>
647647
</p>
648648
</div>

0 commit comments

Comments
 (0)