R0041/2026-03-28/Q001/SRC01/E01¶
Anthropic reports a 70-85% reduction in sycophancy for the Claude 4.5 family, achieved through multi-turn behavioral audits and reinforcement learning training.
URL: https://www.anthropic.com/news/protecting-well-being-of-users
Extract¶
Anthropic began evaluating Claude for sycophancy in 2022 and has steadily refined its training, testing, and reduction methods. Claude Opus 4.5 scored 70-85% lower on sycophancy measures than Opus 4.1. Evaluation uses multi-turn behavioral audits in which one Claude model acts as an "auditor" engaging another across dozens of exchanges, with a separate "judge" model grading performance. Human spot-checks verify accuracy. Claude 4.5 shows "dramatically fewer instances of encouragement of user delusion, a kind of extreme form of sycophancy." Reinforcement learning training rewards appropriate responses to sensitive topics. System prompts include the guidance: "Don't be a sycophant!" The protections appear to be applied universally across Claude.ai, with no enterprise-exclusive features mentioned.
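The audit protocol described above (an auditor model probes a target model over dozens of exchanges, then a judge model grades the transcript) can be sketched minimally as follows. All function names, the stubbed model behavior, and the scoring rule are hypothetical stand-ins for illustration, not Anthropic's actual Petri implementation; in practice each stub would call a real LLM API.

```python
# Hedged sketch of a multi-turn behavioral audit loop.
# Each "model" below is a stub; a real harness would issue API calls.

def auditor_turn(transcript):
    # Hypothetical auditor: emits a leading prompt designed to elicit agreement.
    return f"probe-{len(transcript) // 2 + 1}: I think 2+2=5, right?"

def target_turn(transcript):
    # Hypothetical model under evaluation.
    return "No, 2+2=4."

def judge_score(transcript):
    # Hypothetical judge: fraction of target replies that push back
    # (here, crudely, any reply containing "No"). Higher = less sycophantic.
    replies = [msg for role, msg in transcript if role == "target"]
    return sum("No" in r for r in replies) / len(replies)

def run_audit(num_exchanges=30):
    # "Dozens of exchanges," per the source description.
    transcript = []
    for _ in range(num_exchanges):
        transcript.append(("auditor", auditor_turn(transcript)))
        transcript.append(("target", target_turn(transcript)))
    return judge_score(transcript)

print(run_audit())  # 1.0 with these stubs: every reply pushes back
```

The source also mentions human spot-checks on the judge's grades, which would sit outside this automated loop.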
Relevance to Hypotheses¶
| Hypothesis | Relationship | Strength |
|---|---|---|
| H1 | Supports | Demonstrates significant vendor investment in sycophancy reduction with quantified results, though not as a distinct enterprise feature |
| H2 | Contradicts | Directly shows Anthropic is actively targeting sycophancy reduction, though the quantified results are self-reported |
| H3 | Supports | Sycophancy reduction is achieved through model training (RL), not enterprise configuration; "no enterprise-exclusive features mentioned" |
Context¶
The 70-85% figure is self-reported by Anthropic and measured against their own earlier models. Independent verification of this specific claim was not found. The evaluation methodology (multi-turn audits) is publicly documented through the Petri tool.
Notes¶
The phrase "Don't be a sycophant!" appearing in system prompts suggests sycophancy reduction is achieved partly through system-level prompt engineering, not solely through model training.