
Commit 8365883

add results CoT
1 parent 53cb423 commit 8365883

File tree

1 file changed

_posts/2025-01-10-paper-review-cot.md

Lines changed: 27 additions & 1 deletion
@@ -20,4 +20,30 @@ We have seen a range of benefits when increasing the size of the Language Model,
2. in-context few-shot learning via prompting, where one can "prompt" a model with a few input-output exemplars demonstrating the task (this has been successful with question-answering tasks)

What is few-shot learning?
A machine learning technique where a model is trained to make predictions from only a very small amount of labeled data.

The prompt consists of triples: <input, chain-of-thought, output> <br>

A chain of thought is a series of intermediate natural language reasoning steps that lead to the final output. A prompting-only approach is important because it does not require a large training dataset, and a single model checkpoint can perform many tasks without loss of generality. We can think of this prompting as similar to how humans think when breaking down a multi-step problem: <br>
*After Jane gives 2 flowers to her mom she has 10...then after she gives 3 to her dad she will have 7...so the answer is 7.*
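
To make the prompt format concrete, here is a minimal sketch (not the authors' code; the exemplar wording and the `build_cot_prompt` helper are illustrative assumptions) of how a few-shot chain-of-thought prompt could be assembled from <input, chain-of-thought, output> triples:

```python
# Illustrative sketch: assemble a few-shot chain-of-thought prompt
# from <input, chain-of-thought, output> triples.

# Hypothetical exemplars written in the triple format.
exemplars = [
    {
        "input": "Jane has 12 flowers. She gives 2 to her mom and 3 to her dad. "
                 "How many flowers does she have left?",
        "chain_of_thought": "After Jane gives 2 flowers to her mom she has 10. "
                            "Then after she gives 3 to her dad she will have 7.",
        "output": "The answer is 7.",
    },
    # ...more triples covering other problems would go here...
]

def build_cot_prompt(exemplars, question):
    """Concatenate exemplars so the model sees reasoning steps before each answer."""
    parts = []
    for ex in exemplars:
        parts.append(f"Q: {ex['input']}\nA: {ex['chain_of_thought']} {ex['output']}\n")
    # The new question comes last; the model is expected to continue with its
    # own chain of thought followed by a final answer.
    parts.append(f"Q: {question}\nA:")
    return "\n".join(parts)

prompt = build_cot_prompt(exemplars, "Tom has 5 apples and buys 2 more. How many does he have?")
print(prompt)  # This string would then be sent to a large pretrained language model.
```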
Properties that make this method attractive: <br>

* Allows models to decompose multi-step problems into intermediate steps, which means additional computation can be allocated to problems that require more reasoning steps
* Provides a window into how the model thinks, giving us an idea of how it arrived at a specific output from a specific input
* Can be used for tasks such as math word problems, commonsense reasoning, and symbolic manipulation
* Can be readily elicited from models simply by including chain-of-thought examples in the prompt

Important results

The study finds that chain-of-thought prompting is an emergent ability: it only becomes apparent at larger model scales (roughly 100B parameters or more), and smaller models do not benefit. The prompting yields larger gains on more complex problems, as seen on the **GSM8K** dataset, while gains are negative or nonexistent on the simpler **SingleOp** dataset. <br>
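
The gains on benchmarks like **GSM8K** are measured on the final answer the model produces at the end of its chain of thought. As a rough illustration (an assumption about the setup, not the authors' evaluation code), one could score completions by extracting the number that follows "the answer is":

```python
import re

# Illustrative scoring sketch: pull the final answer out of a generated
# chain of thought and compare it with the gold label.

def extract_answer(completion):
    """Return the last number that follows 'the answer is', or None."""
    matches = re.findall(r"the answer is\s*(-?\d+(?:\.\d+)?)", completion.lower())
    return matches[-1] if matches else None

def accuracy(completions, gold_answers):
    """Fraction of completions whose extracted answer matches the gold answer."""
    correct = sum(extract_answer(c) == g for c, g in zip(completions, gold_answers))
    return correct / len(gold_answers)

# Toy usage with a made-up completion:
completions = ["...then after she gives 3 to her dad she will have 7, so the answer is 7."]
print(accuracy(completions, ["7"]))  # 1.0
```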
Why does CoT prompting work?

* Clarify complex problems: models reason through each step logically
* Improve accuracy: errors reduced in larger models
* Activate pretrained knowledge: sequential reasoning helps the model utilize its prior training
