# R0023/2026-03-25/Q001/S02
WebSearch — Specific techniques found counterproductive: CoT, persona prompting, emotional prompting
## Summary
| Field | Value |
|---|---|
| Source/Database | WebSearch (3 queries combined) |
| Query terms | (1) "prompt engineering" counterproductive harmful techniques "chain of thought" backfires study; (2) "role prompting" "persona prompting" counterproductive study evidence; (3) Meincke Mollick "decreasing value chain of thought" |
| Filters | None |
| Results returned | 30 |
| Results selected | 8 |
| Results rejected | 22 |
## Selected Results
| Result | Title | URL | Rationale |
|---|---|---|---|
| S02-R01 | Prompting Science Report 2: The Decreasing Value of CoT | https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5285532 | Primary empirical study on CoT limitations, Wharton GAIL |
| S02-R02 | When "A Helpful Assistant" Is Not Really Helpful (EMNLP 2024) | https://aclanthology.org/2024.findings-emnlp.888/ | Peer-reviewed study showing personas do not improve LLM performance |
| S02-R03 | Prompting Science Report 4: Playing Pretend | https://gail.wharton.upenn.edu/research-and-insights/playing-pretend-expert-personas/ | Large-scale study: expert personas don't improve factual accuracy |
| S02-R04 | Prompting Science Report 1: Prompt Engineering is Complicated and Contingent | https://gail.wharton.upenn.edu/research-and-insights/tech-report-prompt-engineering-is-complicated-and-contingent/ | Foundational study showing prompt effects are highly variable |
| S02-R05 | Prompting Science Report 3: I'll pay you or I'll kill you | https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5375404 | Empirical test of tipping/threatening prompts showing no reliable effect |
| S02-R06 | Telling an AI model that it's an expert makes it worse (The Register) | https://www.theregister.com/2026/03/24/ai_models_persona_prompting/ | Journalism covering the Wharton persona study, provides accessible summary |
| S02-R07 | Research Shows Where Persona Prompting Works and When It Backfires | https://www.searchenginejournal.com/research-you-are-an-expert-prompts-can-damage-factual-accuracy/570397/ | Industry analysis of persona prompting research |
| S02-R08 | Emotional prompting amplifies disinformation (Frontiers in AI) | https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2025.1543603/full | Peer-reviewed study: emotional prompts increase disinformation risk |
## Rejected Results
| Result | Title | URL | Rationale |
|---|---|---|---|
| S02-R09 | Role Prompting guide (LearnPrompting) | https://learnprompting.org/docs/advanced/zero_shot/role_prompting | Tutorial content, not empirical research |
| S02-R10 | PromptHub blog on role prompting | https://www.prompthub.us/blog/role-prompting-does-adding-personas-to-your-prompts-really-make-a-difference | Blog summary, secondary to primary sources already captured |
| S02-R11 | Emotion Prompting (Medium) | https://medium.com/aimonks/emotionprompt-elevating-ai-with-emotional-intelligence-baee341f521b | Promotional summary, not primary research |
| S02-R12 | EmotionPrompt (TechTalks) | https://bdtechtalks.com/2023/11/06/llm-emotion-prompting/ | Secondary news coverage of EmotionPrompt; the original study reported benefits on some tasks, so it does not address counterproductive effects |
| S02-R13 | Do Persona-Infused LLMs Affect Strategic Reasoning? | https://arxiv.org/html/2512.06867v1 | Different focus — strategic games, not factual accuracy |
| S02-R14 | Influence of persona on conversational agents | https://www.sciencedirect.com/science/article/pii/S0747563225002067 | Embodied conversational agents, different domain |
| S02-R15 | Understanding CoT Prompting (Google Research) | https://research.google/pubs/towards-understanding-chain-of-thought-prompting-an-empirical-study-of-what-matters/ | Explores what matters in CoT, not specifically counterproductive effects |
| S02-R16 | Evaluating Prompt Engineering for Accuracy (arXiv) | https://arxiv.org/pdf/2506.00072 | General accuracy evaluation, not focused on counterproductive techniques |
| S02-R17 | Prompt engineering consistency in clinical guidelines (npj Digital Medicine) | https://www.nature.com/articles/s41746-024-01029-4 | Clinical domain, tangential to counterproductive techniques |
| S02-R18 | Prompting Techniques for SE Tasks | https://arxiv.org/html/2506.05614v1 | Software engineering domain, useful but not primary for this query |
| S02-R19 | Dan Cleary Medium article on role prompting | https://medium.com/@dan_43009/role-prompting-does-adding-personas-to-your-prompts-really-make-a-difference-ad223b5f1998 | Blog post, secondary |
| S02-R20 | PromptHub substack on persona prompting | https://prompthub.substack.com/p/act-like-a-or-maybe-not-the-truth | Newsletter, secondary |
| S02-R21 | Emotion Prompting (Relevance AI) | https://relevanceai.com/prompt-engineering/use-emotion-prompting-to-improve-ai-interactions | Vendor marketing, not research |
| S02-R22 | Emotion and AI (Foundation Inc) | https://foundationinc.co/lab/emotionprompts-llm | Agency blog, secondary |
## Notes
This combined search across three queries was the most productive for Q001, yielding the Wharton Prompting Science Reports series (Reports 1-4) as the strongest evidence cluster. The EMNLP 2024 persona study and the Frontiers emotional-prompting study provide independent peer-reviewed confirmation from outside that group. The rejected results are predominantly secondary reporting or tutorial content.