S03-R01 — DeepSeekMath: Pushing the Limits of Mathematical Reasoning
Summary
| Title | DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models |
| URL | https://arxiv.org/abs/2402.03300 |
| Date accessed | 2026-03-29 |
| Publication date | February 2024 |
| Authors | Zhihong Shao, Peiyi Wang, Qihao Zhu, et al. |
| Publication | arXiv (DeepSeek AI) |
Selection Decision
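Selected as the primary source for Group Relative Policy Optimization (GRPO), which this paper introduces. GRPO is a PPO variant that drops the learned value (critic) model and instead estimates each output's advantage from its reward relative to a group of outputs sampled for the same prompt. It has since become one of the most widely used RL algorithms for training open reasoning models.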