C006 — Claim Definition


Research	R0055 — RLHF Yes-Men Claims
Run	2026-04-01
Claim	C006

Claim as Received

Synthetic non-sycophantic training data produces the same sycophancy reduction as curated anti-sycophancy preference pairs

Claim as Clarified

Synthetic non-sycophantic training data produces the same sycophancy reduction as curated anti-sycophancy preference pairs

BLUF

The claim is materially incorrect. Wei et al. (2024) showed that synthetic data reduces sycophancy, but the reductions were much smaller (4.7-10%, depending on model size) than the 84-85% reported for curated preference pairs. The two approaches are complementary, not equivalent.
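
The size of the gap can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the figures quoted above and assumes the two percentage ranges are directly comparable, which this record does not confirm (the underlying studies may use different metrics and baselines). The names `SYNTHETIC_RANGE`, `CURATED_RANGE`, `ratio_bounds`, and `ranges_overlap` are hypothetical.

```python
# Figures quoted in the BLUF (percent reduction in sycophancy).
# Assumption: both ranges measure the same kind of reduction.
SYNTHETIC_RANGE = (4.7, 10.0)   # synthetic data, varies with model size
CURATED_RANGE = (84.0, 85.0)    # curated preference pairs


def ratio_bounds(smaller, larger):
    """Return the (min, max) ratio of `larger` to `smaller` across both ranges."""
    lo = larger[0] / smaller[1]   # lowest curated figure vs. highest synthetic figure
    hi = larger[1] / smaller[0]   # highest curated figure vs. lowest synthetic figure
    return lo, hi


def ranges_overlap(a, b):
    """Return True if the two closed intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


lo, hi = ratio_bounds(SYNTHETIC_RANGE, CURATED_RANGE)
print(f"Curated reduction is roughly {lo:.1f}x to {hi:.1f}x the synthetic reduction")
print("Ranges overlap:", ranges_overlap(SYNTHETIC_RANGE, CURATED_RANGE))
```

Under that comparability assumption, the curated figure is between roughly 8 and 18 times the synthetic one and the two ranges do not overlap, which is why the claim of equivalence is rated as it is.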

Scope

Domain: AI alignment, sycophancy, enterprise AI
Timeframe: 2022-2026
Testability: Verifiable against published research and documentation

Assessment Summary

Probability: Very unlikely (5-20%)

Confidence: Medium

Hypothesis outcome: H3 prevails — see assessment for details.

[Full assessment in assessment.md.]

Status

Field	Value
Date created	2026-04-01
Date completed	2026-04-01
Researcher profile	Phillip Moore
Prompt version	Unified Research Methodology v1
Revisit by	2026-10-01
Revisit trigger	New synthetic data approaches achieving comparable reduction to curated pairs