SRC05 — Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Source


Title	Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Publisher	TMLR 2023 / arXiv
Authors	Stephen Casper et al. (32 co-authors)
Date	July 2023
URL	https://arxiv.org/abs/2307.15217
Type	Peer-reviewed survey paper

Dimension	Rationale
Reliability	Comprehensive survey of 250+ papers, written by 32 co-authors from multiple institutions
Relevance	Catalogues the specific problems driving the search for RLHF alternatives
COI / Funding	Multi-institutional; no single commercial entity dominates

Evidence	Summary
SRC05-E01	RLHF has both tractable problems and fundamental limitations in each of its three components: human feedback, the reward model, and the policy