R0024/2026-03-25/Q004/H1


Research	R0024 — Sycophancy and Addiction
Run	2026-03-25
Query	Q004
Hypothesis	H1

Statement

Yes, multiple AI companies have published quantitative before/after sycophancy reduction metrics and/or made forward-looking commitments to measurable reduction targets.

Status

Current: Partially supported

Some companies (Anthropic, Google) have published before/after metrics. However, these metrics lack standardization, independent verification, and binding commitment mechanisms. OpenAI acknowledged the problem, but its methodology is opaque and carries the caveat that "future measurements may not be directly comparable to past ones."

Supporting Evidence

Evidence	Summary
SRC01-E01	Anthropic published a 70-85% reduction in sycophancy for its 4.5 models relative to 4.1 and open-sourced the Petri evaluation tool
SRC02-E01	OpenAI published post-mortems on the GPT-4o sycophancy incident

Contradicting Evidence

Evidence	Summary
SRC03-E01	A SciELO analysis found that Anthropic places responsibility on users despite acknowledging the problem
SRC04-E01	A coalition of attorneys general (AGs) from 42 states demanded commitments, implying that companies had not already made them

Reasoning

H1 is partially supported. Metrics exist from Anthropic and Google, but describing any company as committed to measurable reduction targets overstates the reality. Anthropic is the strongest example: it published before/after metrics and open-sourced an evaluation tool. Google claimed "measurable reductions" in Gemini 3, but its details are less specific. OpenAI published incident analyses but no systematic before/after metrics. No company has made a binding commitment to ongoing sycophancy reduction targets.

Relationship to Other Hypotheses

H1 is partially supported, H2 is partially eliminated, and H3 best describes the current state.