
Research R0040 — RLHF Alternatives
Run 2026-04-01
Query Q001
Search S04
Result S04-R01
Source SRC04

Ethayarajh et al. -- KTO: Model Alignment as Prospect Theoretic Optimization

Source

Title: KTO: Model Alignment as Prospect Theoretic Optimization
Venue: ICML 2024
Authors: Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela
Date: 2024-02-02 (arXiv submission)
URL: https://arxiv.org/abs/2402.01306
Type: Research paper (peer-reviewed)

Summary

Reliability: High
Relevance: High
Bias, missing data: Low risk
Bias, measurement: Low risk
Bias, selective reporting: Low risk
Bias, randomization: N/A (not an RCT)
Bias, protocol deviation: N/A (not an RCT)
Bias, COI/funding: Some concerns

Rationale

Reliability: Peer-reviewed at ICML 2024, a top-tier machine-learning venue; the authors are affiliated with Stanford and Contextual AI.
Relevance: Introduces KTO, a methodologically novel alignment approach that reduces data-collection requirements by replacing preference pairs with per-example binary labels.
Bias flags: Author affiliations include Contextual AI, which commercializes KTO, a potential conflict of interest; however, the paper underwent peer review at ICML.

Evidence Extracts

SRC04-E01: KTO derives its alignment objective from Kahneman-Tversky prospect theory, requires only a binary desirable/undesirable label per example rather than paired preferences, and matches or exceeds DPO performance at 1B-30B parameter scales (see the loss sketch below).
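For orientation, below is a minimal sketch of the KTO objective as described in the paper: each completion's implicit reward is its log-probability ratio under the policy versus a frozen reference model, and that reward is scored by a prospect-theoretic value function relative to a reference point z0. The function name, tensor layout, hyperparameter defaults, and the batch-mean stand-in for z0 are illustrative assumptions, not the authors' reference implementation (the paper estimates z0 as a KL term over mismatched prompt/completion pairs in the batch).

```python
import torch

def kto_loss(policy_logps, ref_logps, is_desirable,
             beta=0.1, lambda_d=1.0, lambda_u=1.0):
    """Illustrative sketch of the KTO objective (Ethayarajh et al., 2024).

    policy_logps, ref_logps: shape (B,) summed log-probabilities of each
        completion under the trainable policy and the frozen reference model.
    is_desirable: shape (B,) bool tensor; True marks desirable completions.
    Hyperparameter defaults are placeholders, not values from the paper.
    """
    # Implicit reward: log-ratio of policy to reference likelihood.
    rewards = policy_logps - ref_logps

    # Reference point z0. The paper estimates it as a KL divergence over
    # mismatched pairs in the batch; the clamped batch-mean reward used here
    # is a simplifying assumption. No gradient flows through it.
    z0 = rewards.mean().clamp(min=0).detach()

    # Prospect-theoretic value function: gains and losses are measured
    # relative to z0, with separate weights for the two label classes.
    v_desirable = lambda_d * torch.sigmoid(beta * (rewards - z0))
    v_undesirable = lambda_u * torch.sigmoid(beta * (z0 - rewards))
    value = torch.where(is_desirable, v_desirable, v_undesirable)

    lam = torch.where(is_desirable,
                      torch.full_like(rewards, lambda_d),
                      torch.full_like(rewards, lambda_u))
    # Minimizing (lambda_y - value) pushes desirable completions above z0
    # and undesirable ones below it.
    return (lam - value).mean()
```

Per the paper, the ratio lambda_d/lambda_u is the main lever for handling class imbalance between desirable and undesirable examples, while beta controls how quickly the value function saturates away from the reference point.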