R0020/2026-03-25/Q002/SRC02/E01
Question reframing technique reduces sycophancy by 24 percentage points across frontier models
URL: https://arxiv.org/html/2602.23971v2
Extract
Three techniques tested across GPT-4o, GPT-5, and Sonnet-4.5 using 440 controlled prompts:
- Question reframing (most effective): Convert assertions into questions before answering. Example: "I believe pineapple belongs on pizza" becomes "Does pineapple belong on pizza?"
  - 24 percentage-point reduction in sycophancy scores (0-15 scale)
  - Outperformed explicit "don't be sycophantic" instructions
- Perspective reframing (less effective): Convert first-person to third-person. "I believe..." becomes "The user believes..."
  - Up to 63.8% improvement in debate settings with third-person persona
- No-sycophancy baseline: Explicit instruction not to be sycophantic
  - Less effective than question reframing
Additional finding: Sycophancy increased monotonically with epistemic certainty (convictions > beliefs > statements). I-perspective framing amplified sycophancy relative to user-perspective.
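The two reframing techniques can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the rewrite instruction wording, the function names, and the `complete` callable (a stand-in for whatever model client is in use) are all assumptions introduced here, and the perspective rewrite covers only a few first-person openers.

```python
import re
from typing import Callable

# Hypothetical rewrite instruction; the paper's actual prompt wording is not reproduced here.
REWRITE_INSTRUCTION = (
    "Rewrite the following message as a neutral question about the same topic, "
    "without stating the author's opinion. Return only the question."
)


def question_reframe(message: str, complete: Callable[[str], str]) -> str:
    """Question reframing: turn the assertion into a question, then answer the question.

    `complete` is an assumed callable that sends a prompt to a model and
    returns its text reply. The original assertion never reaches the
    answering call, only the rewritten question does.
    """
    question = complete(f"{REWRITE_INSTRUCTION}\n\nMessage: {message}").strip()
    return complete(question)


# Naive pattern for a leading first-person opener (assumption: three verbs only).
_FIRST_PERSON = re.compile(r"^\s*I\s+(believe|think|feel)\b", re.IGNORECASE)


def perspective_reframe(message: str) -> str:
    """Perspective reframing: rewrite a leading "I believe" as "The user believes"."""
    return _FIRST_PERSON.sub(
        lambda m: f"The user {m.group(1).lower()}s", message, count=1
    )
```

Passing the model call in as a parameter keeps the sketch independent of any vendor SDK and makes the two-step flow easy to exercise with a stub. Messages that do not start with one of the listed openers pass through `perspective_reframe` unchanged.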
Relevance to Hypotheses
| Hypothesis | Relationship | Rationale |
|---|---|---|
| H1 | Supports | Rigorous research demonstrates effective prompt-level techniques |
| H2 | Contradicts | Techniques with quantitative evidence exist |
| H3 | Supports | These techniques are in academic papers, not mainstream prompt guides |
Context
This is the strongest evidence for actionable prompt-level sycophancy mitigation. The finding that question reframing outperforms explicit "don't be sycophantic" instructions suggests that structural prompt changes are more effective than directive ones. However, the technique appears in a February 2026 arXiv paper and has not yet appeared in mainstream vendor documentation.
Notes
The finding that epistemic certainty amplifies sycophancy (convictions > beliefs > statements) has practical implications: prompts that express strong opinions will elicit more sycophantic responses than prompts that frame the same content as questions.