R0023/2026-03-25/Q001/SRC02/E02
CoT introduces inconsistency that causes errors on questions the model would otherwise answer correctly.
URL: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5285532
Extract
The study found that CoT "introduces inconsistency that causes errors on 'easy' questions the model would otherwise answer correctly." This is distinct from simply failing to help — CoT actively degrades performance by introducing additional reasoning steps that can lead the model astray on straightforward questions. Many models "perform CoT-like reasoning by default, even without explicit instructions," making explicit CoT prompting redundant and potentially harmful.
Relevance to Hypotheses
| Hypothesis | Relationship | Strength |
|---|---|---|
| H1 | Supports | Direct evidence of a popular technique being actively counterproductive — causing errors that would not otherwise occur |
| H2 | Contradicts | This is not an edge case; it is a systematic pattern across models |
| H3 | Supports | The mechanism is context-dependent: models that already reason internally are most affected |
Context
This finding has significant practical implications: the most common prompt engineering advice — "think step by step" — can make a model perform worse when the model already incorporates step-by-step reasoning internally.
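The contrast the study draws is between a prompt as-is and the same prompt with an explicit "think step by step" instruction appended. A minimal sketch of the two variants, where the `build_prompt` helper and the sample question are hypothetical illustrations, not artifacts from the paper:

```python
def build_prompt(question: str, explicit_cot: bool) -> str:
    """Return the question as a direct prompt, or with the common
    chain-of-thought instruction appended (hypothetical helper)."""
    if explicit_cot:
        return f"{question}\nLet's think step by step."
    return question


# An 'easy' question of the kind the study flags as vulnerable.
question = "What is 17 + 25?"

direct = build_prompt(question, explicit_cot=False)
cot = build_prompt(question, explicit_cot=True)

# Per the study's finding, the second variant can reduce accuracy on
# easy questions for models that already reason step by step by default,
# since the explicit instruction adds redundant reasoning steps.
print(direct)
print(cot)
```

The point of the sketch is that the only difference between the two prompts is the appended instruction; any accuracy gap between them is attributable to the explicit CoT trigger itself.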