SRC07 — KTO: Model Alignment as Prospect Theoretic Optimization
Source
| Field | Value |
|---|---|
| Title | KTO: Model Alignment as Prospect Theoretic Optimization |
| Publisher | ICML 2024 / arXiv |
| Authors | Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela |
| Date | February 2024 (accepted ICML 2024) |
| URL | https://arxiv.org/abs/2402.01306 |
| Type | Peer-reviewed conference paper |
Summary Ratings
| Dimension | Rating |
|---|---|
| Reliability | High |
| Relevance | High |
| Missing data bias | Low |
| Measurement bias | Low |
| Selective reporting bias | Low |
| Randomization bias | N/A |
| Protocol deviation bias | Low |
| COI / Funding bias | Low |
Rationale
| Dimension | Rationale |
|---|---|
| Reliability | Peer-reviewed at ICML 2024, grounded in established economic theory (prospect theory) |
| Relevance | Proposes an alignment objective that learns from binary desirable/undesirable signals instead of pairwise preference comparisons |
| COI / Funding | Authors affiliated with Stanford University and Contextual AI; no single dominant commercial interest |
Evidence Extracts
| Evidence | Summary |
|---|---|
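| SRC07-E01 | KTO learns from binary desirability signals (each output labeled desirable or undesirable) instead of comparative preferences; the paper reports that it matches or exceeds preference-based methods such as DPO at model scales from 1B to 30B parameters |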
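
To make the contrast in SRC07-E01 concrete, the following is a minimal sketch, not the authors' implementation. It assumes preference data arrives as `prompt`/`chosen`/`rejected` records and shows how each pair can be split into two unpaired binary examples, followed by a per-example loss in the general shape the paper describes (a sigmoid value function applied to the policy/reference log-ratio relative to a reference point). The function names, the dictionary keys, and the default values for `beta`, `lambda_d`, and `lambda_u` are illustrative assumptions, and `z_ref` is passed in as a plain number rather than estimated from a batch.

```python
import math


def pairs_to_binary(pairs):
    """Split paired preferences into unpaired binary-labeled examples.

    Hypothetical input format: dicts with "prompt", "chosen", "rejected".
    """
    examples = []
    for pair in pairs:
        examples.append(
            {"prompt": pair["prompt"], "completion": pair["chosen"], "desirable": True}
        )
        examples.append(
            {"prompt": pair["prompt"], "completion": pair["rejected"], "desirable": False}
        )
    return examples


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def kto_style_loss(log_ratio, desirable, z_ref=0.0, beta=0.1, lambda_d=1.0, lambda_u=1.0):
    """Per-example loss of the form lambda_y - v(x, y).

    log_ratio: log pi_theta(y|x) - log pi_ref(y|x) for one example.
    z_ref:     reference point (a KL estimate in the paper; a given constant here).
    Defaults are illustrative assumptions, not recommended settings.
    """
    if desirable:
        # Desirable outputs gain value as the log-ratio rises above the reference point.
        value = lambda_d * sigmoid(beta * (log_ratio - z_ref))
        return lambda_d - value
    # Undesirable outputs gain value as the log-ratio falls below the reference point.
    value = lambda_u * sigmoid(beta * (z_ref - log_ratio))
    return lambda_u - value
```

Because each example carries only its own label, the loss never needs a matched chosen/rejected pair for the same prompt, which is the property the evidence extract highlights.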