Research R0057 — RLHF Yes-Men Claims v3
Run 2026-04-01
Claim C018
Source SRC01
Evidence SRC01-E01
Type Reported

Anthropic reports 70-85% sycophancy reduction in latest models; OpenAI reports substantial improvements in GPT-5

URL: https://www.anthropic.com/news/protecting-well-being-of-users

Extract

Anthropic's latest models (Opus 4.5, Sonnet 4.5, Haiku 4.5) scored 70-85% lower on sycophancy than Opus 4.1. OpenAI reports that GPT-5 shows substantially reduced sycophancy. Both companies released public evaluation metrics. The improvements ship to all users and are not limited to enterprise customers.
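
To make the "70-85% lower" figure concrete, the following is a minimal sketch of how a relative reduction is computed. The scores are hypothetical placeholders chosen for illustration, not values from either vendor's announcement, and the function name `relative_reduction` is an assumption.

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Return the fractional drop from baseline to new (0.75 means 75% lower)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - new) / baseline


# Hypothetical scores for illustration only; not taken from any vendor report.
hypothetical_baseline = 0.40  # e.g. an older model's score on a sycophancy evaluation
hypothetical_new = 0.10       # e.g. a newer model's score on the same evaluation

reduction = relative_reduction(hypothetical_baseline, hypothetical_new)
print(f"{reduction:.0%} lower")
```

A relative reduction describes the change against the baseline only. It does not state the absolute sycophancy level of either model, so the same percentage can correspond to very different absolute scores.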

Relevance to Hypotheses

Hypothesis | Relationship | Strength
H1 | Supports | Directly addresses claim accuracy
H2 | Supports | Allows for partial correctness
H3 | Contradicts | Evidence contradicts material inaccuracy

Context

Taken directly from vendor announcements that include quantified metrics.