C019 — Claim Definition¶


Research	R0055 — RLHF Yes-Men Claims
Run	2026-04-01
Claim	C019

Claim as Received¶

Research shows users prefer sycophantic AI, trust it more, and rate it as higher quality

Claim as Clarified¶

Research shows users prefer sycophantic AI, trust it more, and rate it as higher quality

BLUF¶

Correct. The Stanford/Science 2026 study found users deemed sycophantic responses more trustworthy and were more likely to return. The Anthropic/ICLR 2024 paper found human preference models prefer sycophantic responses over correct ones. Multiple studies converge on this finding.

Scope¶

Domain: AI alignment, sycophancy, enterprise AI
Timeframe: 2022-2026
Testability: Verifiable against published research and documentation

Assessment Summary¶

Probability: Almost certain (95-99%)

Confidence: High

Hypothesis outcome: H1 prevails — see assessment for details.

[Full assessment in assessment.md.]

Status¶

Field	Value
Date created	2026-04-01
Date completed	2026-04-01
Researcher profile	Phillip Moore
Prompt version	Unified Research Methodology v1
Revisit by	2026-10-01
Revisit trigger	Studies finding user segments that actively prefer non-sycophantic AI