Research R0023 — Counterproductive advice and prompt lifecycle
Run 2026-03-25
Query Q001
Source SRC04
Evidence SRC04-E01
Type Statistical

Expert persona underperforms base model: 68.0% vs. 71.6% accuracy across multiple-choice questions.

URL: https://aclanthology.org/2024.findings-emnlp.888/

Extract

Across 4 popular LLM families and 2,410 factual questions, adding personas in system prompts does not improve model performance over the control setting with no persona. When LLMs are asked to decide between multiple-choice answers, the expert persona underperforms the base model consistently across all categories (overall accuracy: 68.0% vs. 71.6% for the base model).
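The comparison protocol summarized above (an optional persona system prompt vs. a no-persona control, scored on multiple-choice accuracy) can be sketched roughly as follows. This is an illustrative sketch, not the paper's code: `ask_model` and the toy question data are hypothetical stand-ins for a real LLM call and the 2,410-question benchmark.

```python
def ask_model(system_prompt, question):
    # Hypothetical stub for a real LLM call: a real implementation would
    # send `system_prompt` as the system message plus the question and
    # options, and parse the model's chosen option. Here the stub returns
    # a canned answer per condition so the accuracy arithmetic runs.
    has_persona = system_prompt is not None
    return question["stub_answers"][has_persona]

def accuracy(questions, persona=None):
    """Fraction of multiple-choice questions answered correctly under an
    optional persona system prompt (persona=None is the control)."""
    system_prompt = f"You are {persona}." if persona else None
    correct = sum(
        ask_model(system_prompt, q) == q["answer"] for q in questions
    )
    return correct / len(questions)

# Toy data illustrating the reported direction of the effect: the persona
# condition flips one previously correct answer.
questions = [
    {"answer": "B", "stub_answers": {False: "B", True: "B"}},
    {"answer": "C", "stub_answers": {False: "C", True: "A"}},
]
base = accuracy(questions)                         # control, no persona
expert = accuracy(questions, "a physics expert")  # expert persona
print(base, expert)  # in this toy setup, persona accuracy <= base accuracy
```

The paper's actual comparison aggregates this per-condition accuracy over all question categories and model families; the toy data only fixes the direction of the gap for illustration.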

The study examined 162 distinct roles across 6 interpersonal relationship types and 8 expertise domains. While selecting the optimal persona per question boosted accuracy, automatically identifying the best persona proved unsuccessful, "often performing no better than random selection"; persona effects "can be largely random."

Mechanism: adding a detailed expert persona "distracts" the model by activating an "instruction-following mode" that prioritizes tone and style at the expense of "factual recall."

Relevance to Hypotheses

Hypothesis | Relationship | Strength
H1 | Supports | 3.6 percentage point accuracy drop from the expert persona: active degradation, not merely neutral
H2 | Contradicts | Consistent across all categories, not limited to edge cases
H3 | Supports | The mechanism (instruction-following vs. factual recall tradeoff) explains why personas help with style but hurt accuracy

Context

This is an independent replication of the Wharton findings by a different research group using a different methodology. The convergence of the two studies substantially strengthens the evidence that persona prompting degrades factual accuracy.