R0040/2026-03-28/Q001/S03/R02
Original KTO paper applying prospect theory to LLM alignment.
## Summary
| Field | Value |
|---|---|
| Title | KTO: Model Alignment as Prospect Theoretic Optimization |
| URL | https://arxiv.org/abs/2402.01306 |
| Date accessed | 2026-03-28 |
| Publication date | 2024-02-02 (revised 2024-11-19) |
| Author(s) | Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela |
| Publication | ICML 2024 |
## Selection Decision
Included in evidence base: Yes
Rationale: The primary source for KTO, published at ICML 2024. Introduces the theoretical framework of "human-aware losses" (HALOs) that unifies DPO and related methods, and demonstrates that binary (desirable/undesirable) feedback can match the performance of pairwise preference data.
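To make the rationale concrete, the following is a minimal sketch (from memory of the paper, not a verified reproduction) of the per-example KTO objective: a logistic value function is applied to the implied reward `beta * log(pi_theta(y|x) / pi_ref(y|x))` relative to a reference point, with separate weights for desirable and undesirable examples. The reference point is a batch-level KL estimate in the paper; here it is treated as a constant for illustration, and the parameter names (`lam_d`, `lam_u`, `ref_point`) are this sketch's own.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def kto_loss(log_ratio: float, desirable: bool,
             ref_point: float = 0.0, beta: float = 0.1,
             lam_d: float = 1.0, lam_u: float = 1.0) -> float:
    """Sketch of the KTO loss for a single example.

    log_ratio: log pi_theta(y|x) - log pi_ref(y|x), the policy/reference
               log-probability ratio for the completion y.
    desirable: binary label (True = thumbs-up, False = thumbs-down),
               i.e. no paired preference is required.
    ref_point: stand-in for the paper's KL-based reference point z_0
               (assumed constant here for simplicity).
    """
    r = beta * log_ratio
    if desirable:
        # Value of a gain relative to the reference point.
        value = lam_d * sigmoid(r - ref_point)
        return lam_d - value
    # Value of avoiding a loss relative to the reference point.
    value = lam_u * sigmoid(ref_point - r)
    return lam_u - value
```

The asymmetric weights `lam_d` and `lam_u` mirror prospect theory's loss aversion: raising the policy's likelihood on a desirable example lowers the loss, while lowering it on an undesirable example does the same, without ever comparing two completions directly.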