The difference between DeepSeek-V3 and DeepSeek-R1-Zero #39

helperfunc · 2025-01-25T23:50:03Z


Is it fair to say that the only difference between DeepSeek-V3 and DeepSeek-R1-Zero is whether supervised fine-tuning is used during post-training?

fawern · 2025-01-27T14:13:10Z


Yes, at least for DeepSeek-R1-Zero versus DeepSeek-R1: the primary difference is whether supervised fine-tuning is used during post-training. DeepSeek-R1-Zero is trained exclusively with reinforcement learning, without any supervised fine-tuning. DeepSeek-R1, in contrast, adds a supervised fine-tuning phase to improve the readability and coherence of its outputs.
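To make the comparison concrete, here is a minimal Python sketch of the two recipes as described above. The stage names, the `POST_TRAINING` table, and the `extra_stages` helper are hypothetical labels chosen for illustration, not DeepSeek's actual pipeline configuration, and the real training procedure is likely more involved than two named stages.

```python
# Illustrative only: simplified stage labels, not DeepSeek's real config.
POST_TRAINING = {
    "DeepSeek-R1-Zero": ["reinforcement_learning"],
    "DeepSeek-R1": ["supervised_fine_tuning", "reinforcement_learning"],
}


def extra_stages(model_a: str, model_b: str) -> list[str]:
    """Return the stages model_b runs that model_a does not, in model_b's order."""
    stages_a = set(POST_TRAINING[model_a])
    return [stage for stage in POST_TRAINING[model_b] if stage not in stages_a]


print(extra_stages("DeepSeek-R1-Zero", "DeepSeek-R1"))
# → ['supervised_fine_tuning']
```

Under these assumptions, the only stage R1 has that R1-Zero lacks is the supervised fine-tuning step.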


fawern · 2025-01-27T14:16:29Z


Here are some sources:
https://aipapersacademy.com/deepseek-r1/?utm_source=chatgpt.com
https://www.datacamp.com/blog/deepseek-r1?utm_source=chatgpt.com

