README.md: 4 additions & 4 deletions
@@ -11,7 +11,7 @@ We love Tinker. Tinker simplifies LLM post-training for developers and researche
## Quick Start
- Follow the [Pig Latin notebook](examples/sft/pig-latin/piglatin_sft_notebook.ipynb) or [Text-to-SQL notebook](examples/sft/text-to-sql/texttosql_sft_notebook.ipynb) to see supervised fine-tuning in action.
-- Follow the [Text-to-SQL RL recipe](examples/rl/text-to-sql/README.md) to see reinforcement learning in action.
+- Follow the [Text-to-SQL RL recipe](examples/text-to-sql/README.md) to see reinforcement learning in action.
Snippet below shows a sample Reinforcement Learning loop like GRPO, where the 4 API primitives are used to create a generate-and-reward-train loop:
@@ -78,15 +78,15 @@ Detailed guides and runnable examples are structured under `docs/` and `examples
-**Guides:**
- Supervised finetuning:
--[Pig Latin SFT Notebook](examples/sft/pig-latin/piglatin_sft_notebook.ipynb) & [script guide](docs/guides/supervised/pig-latin.md)
+-[Pig Latin SFT Notebook](examples/sft/pig-latin/piglatin_sft_notebook.ipynb) & [script guide](examples/sft/pig-latin/README.md)