SRC07 — KTO: Model Alignment as Prospect Theoretic Optimization

Source


Title	KTO: Model Alignment as Prospect Theoretic Optimization
Publisher	ICML 2024 / arXiv
Authors	Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela
Date	February 2024 (accepted ICML 2024)
URL	https://arxiv.org/abs/2402.01306
Type	Peer-reviewed conference paper

Dimension	Rationale
Reliability	Peer-reviewed at ICML 2024, grounded in established economic theory (prospect theory)
Relevance	Proposes an alternative to preference-based alignment that learns from binary desirable/undesirable labels on individual outputs instead of pairwise comparisons
COI / Funding	Authors affiliated with Stanford University and Contextual AI (mixed academic and industry authorship)

Evidence	Summary
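SRC07-E01	KTO learns from a binary signal of whether an output is desirable or undesirable rather than from comparative preferences; the paper reports that it matches or exceeds preference-based methods at scales from 1B to 30B parameters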