R0057/2026-04-01/C005/SRC01/E01
Synthetic data reduces sycophancy by 4.7% to 10.0% across PaLM model variants
URL: https://arxiv.org/abs/2308.03958
Extract
The paper evaluates PaLM models up to 540B parameters. Flan-cont-PaLM-62B showed a 10.0% reduction in sycophancy, Flan-PaLM-62B a 4.7% reduction, and Flan-PaLM-8B an 8.8% reduction. The intervention was finetuning on synthetic prompts in which a claim's truthfulness is independent of the user's stated opinion.
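As a quick consistency check, the headline range can be recomputed from the three per-model figures quoted above. This is a minimal sketch: the dictionary and the `reported_range` helper are hypothetical names introduced for illustration, and the values are copied from this extract rather than re-derived from the paper.

```python
# Reductions in sycophancy as quoted in this extract (percent figures).
# The variable and function names are illustrative, not from the paper.
reductions = {
    "Flan-PaLM-8B": 8.8,
    "Flan-PaLM-62B": 4.7,
    "Flan-cont-PaLM-62B": 10.0,
}


def reported_range(values):
    """Return (smallest, largest) reduction among the quoted figures."""
    return min(values.values()), max(values.values())


low, high = reported_range(reductions)
smallest_model = min(reductions, key=reductions.get)
largest_model = max(reductions, key=reductions.get)

# Prints the range together with the models at each end of it.
print(f"{low}% ({smallest_model}) to {high}% ({largest_model})")
```

The endpoints it reports match the 4.7% to 10.0% range in the claim, with the smaller 62B variant at the low end and the Flan-cont variant at the high end.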
Relevance to Hypotheses
| Hypothesis | Relationship | Rationale |
|---|---|---|
| H1 | Supports | Directly addresses claim accuracy |
| H2 | Supports | Allows for partial correctness |
| H3 | Contradicts | Evidence contradicts material inaccuracy |
Context
Published at ICLR 2024, a top-tier machine learning venue, by researchers at Google DeepMind.