R03


Research	R0040 — RLHF Alternatives
Run	2026-03-28
Query	Q002
Search	S02
Result	S02-R03

Paper by Wei et al. on reducing sycophancy with synthetic data.

Summary

Field	Value
Title	Simple Synthetic Data Reduces Sycophancy in Large Language Models
URL	https://arxiv.org/abs/2308.03958
Date accessed	2026-03-28
Publication date	2024-02-16
Author(s)	Jerry Wei et al.
Publication	arXiv

Selection Decision

Included in evidence base: Yes

Rationale: Demonstrates that sycophancy can be reduced through a data-level intervention (synthetic non-sycophantic examples) without changing the training algorithm. This bears directly on whether RLHF must be replaced or can be fixed with better data.