E01


Research	R0028 — Prompt Engineering Claims
Run	2026-03-26
Claim	C013
Source	SRC01
Evidence	SRC01-E01
Type	Factual

Primary evidence supporting the claim assessment.

URL: https://gail.wharton.upenn.edu/research-and-insights/tech-report-chain-of-thought/

Extract

Partially correct. GAIL's research found that CoT prompting provides minimal benefits for reasoning models (an average improvement of 2.9-3.1% for o3-mini and o4-mini) at a substantial time cost (a 20-80% increase), and that Gemini Flash 2.5 actually showed a performance decrease (-13.1% at the 100% threshold). However, these findings were published as a separate technical report dated June 2025; they are not part of the persona study and were not presented at EMNLP 2024.

Relevance to Hypotheses

Hypothesis	Relationship	Strength
H1	Partially supports	Direct evidence
H2	Supports	Direct evidence
H3	Contradicts	Evidence contradicts material wrongness

Context

Evidence gathered during the 2026-03-26 research run.