R0057/2026-04-01/C005

Claim: Synthetic non-sycophantic training data reduces sycophancy by 4.7-10%.

BLUF: Confirmed. Wei et al. (2023) report sycophancy reductions ranging from 4.7% (Flan-PaLM-62B) to 10.0% (Flan-cont-PaLM-62B) across PaLM model variants.

Probability: Very likely (80-95%) | Confidence: High


Summary

Entity | Description
Claim Definition | Claim text, scope, status
Assessment | Full analytical product with reasoning chain
ACH Matrix | Evidence x hypotheses diagnosticity analysis
Self-Audit | ROBIS-adapted 5-domain audit

Hypotheses

ID | Hypothesis | Status
H1 | The 4.7-10% range is accurate | Supported
H2 | The range is accurate but may not generalize to all models | Not supported
H3 | The reported percentages are wrong | Eliminated

Searches

ID | Target | Results | Selected
S01 | Synthetic data reduces sycophancy 4.7 10 percent | 10 | 1

Sources

Source | Description | Reliability | Relevance
SRC01 | Wei et al. (2023): Simple synthetic data reduces sycophancy | High | High

Revisit Triggers

  • If replication studies show different reduction ranges