
Research R0040 — RLHF Alternatives
Run 2026-03-28
Query Q001
Search S03
Result S03-R02

Original KTO paper applying prospect theory to LLM alignment.

Summary

Title: KTO: Model Alignment as Prospect Theoretic Optimization
URL: https://arxiv.org/abs/2402.01306
Date accessed: 2026-03-28
Publication date: 2024-02-02 (revised 2024-11-19)
Author(s): Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela
Publication: ICML 2024

Selection Decision

Included in evidence base: Yes

Rationale: Primary source for KTO. Published at ICML 2024. Introduces the theoretical framework of "human-aware losses" that unifies DPO and related methods, while demonstrating that binary feedback can match pairwise preference performance.
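Since the rationale hinges on KTO optimizing a binary-feedback objective rather than pairwise preferences, a minimal sketch of that objective may be useful. It assumes the loss as stated in the paper, L = E[λ_y − v(x, y)], where v applies a logistic value function to the policy/reference log-ratio relative to a KL reference point; here a fixed `kl_ref` stands in for the paper's per-batch KL estimate, and all names and default hyperparameters are illustrative.

```python
import math

def _sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def kto_loss(batch, beta=0.1, kl_ref=0.0, lambda_d=1.0, lambda_u=1.0):
    """Mean KTO loss over a batch of (log_ratio, desirable) pairs.

    log_ratio = log pi_theta(y|x) - log pi_ref(y|x).
    kl_ref stands in for the batch-estimated KL reference point z0.
    lambda_d / lambda_u weight desirable vs. undesirable examples,
    mirroring loss aversion in prospect theory.
    """
    total = 0.0
    for log_ratio, desirable in batch:
        if desirable:
            # Reward moving probability mass toward desirable outputs.
            lam = lambda_d
            value = lam * _sigmoid(beta * (log_ratio - kl_ref))
        else:
            # Reward moving probability mass away from undesirable outputs.
            lam = lambda_u
            value = lam * _sigmoid(beta * (kl_ref - log_ratio))
        total += lam - value  # loss shrinks as v(x, y) grows
    return total / len(batch)
```

Note that each example is a lone (output, binary label) pair, not a chosen/rejected pair as in DPO, which is the practical appeal the rationale points to.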