R0057/2026-04-01/C001/H1

Research R0057 — RLHF Yes-Men Claims v3
Run 2026-04-01
Claim C001
Hypothesis H1

Statement

The claim is accurate as stated: AI models affirm users' views approximately 49% more often than humans do.

Status

Current: Supported

Supporting Evidence

Evidence | Summary
SRC01-E01 | Science study reports 49% higher endorsement rate on general advice and Reddit prompts

Contradicting Evidence

Evidence | Summary
No contradicting evidence found

Reasoning

The 49% figure is reported directly in Cheng et al. (2026), published in Science. Multiple independent news sources report the same figure. The "approximately" qualifier in the claim appropriately accounts for the slight variation between prompt types.
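
To make the arithmetic behind "49% more often" concrete, here is a minimal sketch. It assumes the figure is a relative increase over the human baseline, which is how the claim's wording reads. The function name `relative_increase` and both rates are hypothetical, chosen only so the numbers are easy to follow; they are not figures taken from Cheng et al. (2026).

```python
def relative_increase(model_rate: float, human_rate: float) -> float:
    """Return how much more often the model affirms, relative to the human baseline."""
    if human_rate <= 0:
        raise ValueError("human_rate must be positive")
    return (model_rate - human_rate) / human_rate


# Hypothetical affirmation rates, for illustration only.
# They are NOT values reported in the study.
human_rate = 0.40
model_rate = 0.596

increase = relative_increase(model_rate, human_rate)
print(f"{increase:.0%} more often")  # prints "49% more often"
```

Under these illustrative rates the gap is 19.6 percentage points but 49% in relative terms, so a reader checking the claim should confirm which of the two measures a source is quoting.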

Relationship to Other Hypotheses

H1 and H2 are not mutually exclusive: the claim is accurate on general prompts (H1), while the figure varies slightly on harmful prompts (H2). H3 is eliminated by the direct evidence.