# S02-R01 — Direct Preference Optimization: Your Language Model is Secretly a Reward Model
## Summary
| Field | Value |
| --- | --- |
| Title | Direct Preference Optimization: Your Language Model is Secretly a Reward Model |
| URL | https://arxiv.org/abs/2305.18290 |
| Date accessed | 2026-03-29 |
| Publication date | May 2023 (revised July 2024) |
| Authors | Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn |
| Publication | NeurIPS 2023 |
## Selection Decision
Selected as the primary paper introducing DPO. A seminal work with over 6,000 citations, it established the most widely adopted alternative to PPO-based RLHF by optimizing the policy directly on preference pairs rather than training a separate reward model.
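For context on what the method optimizes, here is a minimal sketch of the DPO objective from the paper, assuming sequence-level log-probabilities under the policy and the frozen reference model have already been computed; the function and variable names are illustrative, not taken from the authors' code.

```python
import torch
import torch.nn.functional as F

def dpo_loss(
    policy_chosen_logps: torch.Tensor,    # log pi_theta(y_w | x) per example
    policy_rejected_logps: torch.Tensor,  # log pi_theta(y_l | x)
    ref_chosen_logps: torch.Tensor,       # log pi_ref(y_w | x), reference frozen
    ref_rejected_logps: torch.Tensor,     # log pi_ref(y_l | x)
    beta: float = 0.1,                    # temperature controlling KL tradeoff
) -> torch.Tensor:
    """-log sigmoid(beta * implicit reward margin), averaged over the batch."""
    # The "secret" reward is the policy/reference log-ratio for each completion.
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Push the preferred completion's implicit reward above the rejected one's.
    logits = beta * (chosen_logratio - rejected_logratio)
    return -F.logsigmoid(logits).mean()
```

The single scalar `beta` replaces the reward-model fitting and RL loop of PPO-based RLHF, which is the practical simplification that drove the method's adoption.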