R0054/2026-03-31/C003/H2
Statement
The claim is partially correct: LLMs sometimes skip steps, but the primary cause is not a conflict between helpfulness and compliance. The behavior may instead stem from context window limitations, attention issues, or other technical factors.
Status
Current: Inconclusive
Supporting Evidence
| Evidence | Summary |
|---|---|
| SRC03-E01 | Semantic override could be interpreted as a technical limitation (inability to override pretrained weights) rather than a helpfulness conflict |
Contradicting Evidence
| Evidence | Summary |
|---|---|
| SRC01-E01 | Anthropic explicitly attributes the behavior to RLHF training and human preference data favoring agreeableness |
| SRC04-E01 | Medical research specifically identifies "trained to be helpful" as the root cause |
Reasoning
While technical factors like context window limitations and attention decay could contribute to step-skipping, the evidence more strongly supports the helpfulness-compliance conflict as the primary mechanism. Both Anthropic and the medical sycophancy research explicitly identify RLHF-trained helpfulness as the root cause.
Relationship to Other Hypotheses
H2 offers an alternative mechanism, but the evidence does not support it as the primary explanation. The helpfulness-compliance conflict (H1) is better supported.