R0024/2026-03-25/Q004

Query: Have any AI companies publicly committed to measurable sycophancy reduction targets, or published before/after metrics showing sycophancy reduction in their models?

BLUF: Some before/after metrics exist. The most detailed come from Anthropic, which reports a 70-85% sycophancy reduction in its 4.5 model family relative to 4.1 and has open-sourced its Petri evaluation tool. OpenAI published post-mortems on the GPT-4o incident, but the methodology behind them is opaque. Google and DeepSeek claim improvements without detailed data. No company, however, has made binding commitments to ongoing measurable targets with regular reporting and independent verification. A 42-state AG coalition demanded commitments by January 2026, signaling that it considered voluntary industry efforts insufficient.

Answer: H3 (Limited and inconsistent) · Confidence: Medium-High


Summary

| Entity | Description |
| --- | --- |
| Query Definition | Question as received, clarified, ambiguities, sub-questions |
| Assessment | Full analytical product |
| ACH Matrix | Evidence x hypotheses diagnosticity analysis |
| Self-Audit | ROBIS-adapted 4-domain process audit |

Hypotheses

| ID | Statement | Status |
| --- | --- | --- |
| H1 | Companies have published metrics and committed to targets | Partially supported |
| H2 | No meaningful commitments or metrics exist | Eliminated |
| H3 | Some metrics exist but commitments are limited and inconsistent | Supported |

Company Comparison

| Company | Published Metrics | Evaluation Tool | Binding Commitment | Independent Audit |
| --- | --- | --- | --- | --- |
| Anthropic | 70-85% reduction (4.5 vs 4.1) | Petri (open-sourced) | No | No |
| OpenAI | Post-mortem only; methodology opaque | Not disclosed | No | No |
| Google | "Measurable reductions" claimed | Not disclosed | No | No |
| DeepSeek | 47% reduction claimed | Not disclosed | No | No |

Searches

| ID | Target | Type | Outcome |
| --- | --- | --- | --- |
| S01 | Industry sycophancy targets | WebSearch | 1 selected, 9 rejected |
| S02 | Anthropic and OpenAI metrics | WebSearch + WebFetch | 2 selected, 18 rejected |
| S03 | 42-state AG letter | WebSearch | 1 selected, 9 rejected |

Sources

| Source | Description | Reliability | Relevance | Evidence |
| --- | --- | --- | --- | --- |
| SRC01 | Anthropic user wellbeing publication | Medium-High | High | 1 extract |
| SRC02 | OpenAI sycophancy post-mortem | Medium | High | 1 extract |
| SRC03 | SciELO industry complacency analysis | Medium | High | 1 extract |
| SRC04 | 42-state AG coalition letter | High | High | 1 extract |

Revisit Triggers

  • Company responses to the 42-state AG letter becoming public
  • Independent third-party verification of Anthropic's Petri-based claims
  • Establishment of industry-wide standardized sycophancy benchmarks
  • Regulatory mandates requiring sycophancy metrics reporting