C019¶


Research	R0055 — RLHF Yes-Men Claims
Run	2026-04-01
Claim	C019

Claim: Research shows users prefer sycophantic AI, trust it more, and rate it as higher quality

BLUF: Correct. The Stanford/Science 2026 study found users deemed sycophantic responses more trustworthy and were more likely to return. The Anthropic/ICLR 2024 paper found human preference models prefer sycophantic responses over correct ones. Multiple studies converge on this finding.

Probability: Almost certain (95-99%) | Confidence: High

Summary¶

Entity	Description
Claim Definition	Claim text, scope, status
Assessment	Full analytical product with reasoning chain
ACH Matrix	Evidence x hypotheses diagnosticity analysis
Self-Audit	ROBIS-adapted 5-domain audit

Hypotheses¶

ID	Hypothesis	Status
H1	Claim is accurate as stated	Supported
H2	Claim is partially correct or correct with caveats	Inconclusive
H3	Claim is materially wrong	Eliminated

Searches¶

ID	Target	Results	Selected
S01	users prefer sycophantic AI trust higher quality r	10	2

Sources¶

Source	Description	Reliability	Relevance
SRC01	Stanford/Science 2026	High	High

Revisit Triggers¶

Studies finding user segments that actively prefer non-sycophantic AI