R0054/2026-03-31/C002/SRC03/E01
LLMs have systematic weaknesses in processing negation, explaining why positive-only instructions are insufficient.
URL: https://gadlet.com/posts/negative-prompting/
Extract
Key findings from the search results:
- Research from KAIST found that larger AI models actually perform worse on negated prompts
- GPT-3, GPT-Neo, and other models consistently struggle with negation across multiple benchmarks
- LLMs "may struggle to distinguish between facts and their negations, misunderstand the semantic impact of negative particles, and fail to generalize negation handling robustly, even with instruction tuning"
- Claude documentation suggests "prefer general instructions over prescriptive steps" — but this applies to simple tasks, not complex multi-step analytical processes
- The implication: positive instructions are the primary vehicle, but negative constraints are needed to catch failure modes that positive instructions cannot prevent
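The implication above can be sketched as a prompt-construction pattern. This is a minimal illustration, not a method from the cited sources; every name below (the instruction lists, `build_prompt`) is hypothetical:

```python
# Sketch: pair positive instructions with explicit negative constraints,
# so that known failure modes are named directly rather than left to the
# model to infer from positive framing alone. All content is illustrative.

POSITIVE_INSTRUCTIONS = [
    "Summarize each source in two sentences.",
    "Cite the source URL after each claim.",
]

# Negative constraints target specific failure modes; they are not just
# negated restatements of the positive instructions.
NEGATIVE_CONSTRAINTS = [
    "Do not merge findings from different sources into one claim.",
    "Do not state conclusions that no source supports explicitly.",
]

def build_prompt(task: str) -> str:
    """Combine both instruction types into one prompt string."""
    lines = [task, "", "Instructions:"]
    lines += [f"- {i}" for i in POSITIVE_INSTRUCTIONS]
    lines += ["", "Constraints:"]
    lines += [f"- {c}" for c in NEGATIVE_CONSTRAINTS]
    return "\n".join(lines)

prompt = build_prompt("Review the attached research extracts.")
```

Given the negation weaknesses documented above, the constraints are kept short and concrete; long chains of negated clauses are exactly what the KAIST results suggest models mishandle.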
Relevance to Hypotheses
| Hypothesis | Relationship | Strength |
|---|---|---|
| H1 | Supports | Explains the mechanism: LLMs have documented weaknesses that require both instruction types |
| H2 | Supports | The Claude documentation note provides weak support for the idea that positive-only might sometimes work |
| H3 | Contradicts | Directly contradicts — LLMs have systematic negation weaknesses that cannot be overcome by positive instructions alone |
Context
The KAIST finding that larger models perform worse on negation is counterintuitive and particularly relevant. It suggests that models do not automatically become better at following negative constraints as they become more capable, which reinforces the need for prompt designs that deliberately combine both instruction types.