E01


Research	R0041 — Enterprise Sycophancy
Run	2026-04-01
Query	Q001
Source	SRC02
Evidence	SRC02-E01
Type	Reported

Anthropic's sycophancy reduction claims for Claude Sonnet 4.5

URL: https://www.anthropic.com/news/claude-sonnet-4-5

Extract

Anthropic credits Claude's "extensive safety training" with "reducing concerning behaviors like sycophancy, deception, power-seeking, and the tendency to encourage delusional thinking." The company claims a 70-85% improvement on sycophancy over previous model generations and describes the model as "our most aligned frontier model yet."

The announcement mentions no enterprise-specific API parameters or configurations for controlling sycophancy, and it gives no details on how sycophancy was measured. The improvements are presented as general, model-wide enhancements available to all users, not as enterprise-differentiated features.

Separately, Anthropic began evaluating Claude for sycophancy in 2022 and has "steadily refined how it trains, tests, and reduces sycophancy, with the most recent models being the least sycophantic to date."

Relevance to Hypotheses

Hypothesis	Relationship	Rationale
H1	Contradicts	No enterprise-specific product or API parameters offered
H2	Supports	Demonstrates active, long-running research and measurable improvement
H3	Contradicts	The claimed improvements are substantial, even if self-reported

Context

The 70-85% figure is a vendor self-report without published methodology. The researcher profile notes skepticism toward vendor safety claims, which is warranted here. However, the longitudinal commitment (since 2022) suggests genuine investment.