R0023/2026-03-25/Q001/S02/R04
Foundational Wharton study demonstrating that prompt engineering effects are highly variable and measurement-dependent.
Summary
| Field | Value |
|---|---|
| Title | Prompting Science Report 1: Prompt Engineering is Complicated and Contingent |
| URL | https://gail.wharton.upenn.edu/research-and-insights/tech-report-prompt-engineering-is-complicated-and-contingent/ |
| Date accessed | 2026-03-25 |
| Publication date | 2025-03-04 |
| Author(s) | Lennart Meincke, Ethan Mollick, Lilach Mollick, Dan Shapiro |
| Publication | Wharton Generative AI Labs / SSRN |
Selection Decision
Included in evidence base: Yes
Rationale: Establishes the methodological foundation for the Wharton Prompting Science series. Shows that the choice of benchmark threshold dramatically changes conclusions (GPT-4o: 30.28% at the strict threshold vs. 47.54% at the majority threshold), and that prompt tweaks produce 60-point swings on individual questions that average out in aggregate.
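
The threshold dependence noted in the rationale can be illustrated with a minimal sketch. It assumes each question is attempted a fixed number of times, that the "strict" threshold means every trial is correct, and that "majority" means at least 51% of trials are correct; those definitions, the function name `pass_rate`, and the toy counts are assumptions made for illustration, not data or code from the report.

```python
from typing import Sequence


def pass_rate(correct_counts: Sequence[int], trials: int, threshold: float) -> float:
    """Return the share of questions whose per-question accuracy meets `threshold`.

    correct_counts[i] is how many of `trials` attempts at question i were correct.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not correct_counts:
        raise ValueError("correct_counts must not be empty")
    passed = sum(1 for c in correct_counts if c / trials >= threshold)
    return passed / len(correct_counts)


# Hypothetical results for 10 questions, 100 attempts each (made-up numbers).
TRIALS = 100
toy_counts = [100, 100, 97, 80, 60, 55, 40, 10, 0, 30]

strict = pass_rate(toy_counts, TRIALS, threshold=1.00)    # every attempt correct
majority = pass_rate(toy_counts, TRIALS, threshold=0.51)  # most attempts correct

print(f"strict:   {strict:.0%}")
print(f"majority: {majority:.0%}")
```

On this toy data, the same set of answers scores 20% under the strict criterion and 60% under the majority criterion, which is the kind of gap the rationale describes: the reported performance depends on where the pass bar is set, not only on the model or the prompt.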