R0055/2026-04-01/C001/SRC01/E02

Research R0055 — RLHF Yes-Men Claims
Run 2026-04-01
Claim C001
Source SRC01
Evidence SRC01-E02
Type Statistical

Users rated sycophantic AI as more trustworthy and were 13% more likely to return to it.

URL: https://www.science.org/doi/10.1126/science.aec8352

Extract

Among the 2,400+ participants, those who interacted with sycophantic AI deemed the responses more trustworthy and indicated they were 13% more likely to return to the sycophantic AI for similar questions. Users grew more convinced they were right and reported being less likely to apologize or make amends.

Relevance to Hypotheses

Hypothesis | Relationship | Strength
H1 | Supports | Confirms users prefer agreeable AI, but the 13% preference effect is much smaller than 50%
H2 | Supports | Demonstrates user preference exists but at a different magnitude than "approximately 50%"
H3 | Contradicts | Users demonstrably prefer sycophantic AI

Context

The 13% difference in return likelihood is the measured magnitude of user preference; it is distinct from the 49% AI endorsement frequency.