R0057/2026-04-01/C018/H1

Research R0057 — RLHF Yes-Men Claims v3
Run 2026-04-01
Claim C018
Hypothesis H1

Statement

Both Anthropic and OpenAI are working on model-level sycophancy reduction.

Status

Current: Supported

Supporting Evidence

Evidence | Summary
SRC01-E01 | Anthropic reports 70-85% sycophancy reduction in latest models; OpenAI reports substantial improvements in GPT-5

Contradicting Evidence

Evidence | Summary
No contradicting evidence found

Reasoning

Anthropic reports that its latest models (Opus 4.5, Sonnet 4.5, Haiku 4.5) scored 70-85% lower on sycophancy than Opus 4.1, and OpenAI reports that GPT-5 shows a substantial reduction in sycophancy. Both companies released public evaluation metrics. The improvements ship to all users rather than only to enterprise customers, which is consistent with the work being done at the model level.

Relationship to Other Hypotheses

H1 holds that the claim is fully accurate. H2 allows that it is only partially correct. H3 is eliminated by the evidence.