R0053/2026-03-31-02/C002/S02/R01

Research R0053 — Prompt Claims
Run 2026-03-31-02
Claim C002
Search S02
Result S02-R01

Key paper on instruction hierarchy failures in LLMs

Summary

Title: Control Illusion: The Failure of Instruction Hierarchies in Large Language Models
URL: https://arxiv.org/abs/2502.15851
Date accessed: 2026-03-31
Publication date: 2025-02-21
Author(s): Yilin Geng, Haonan Li, Honglin Mu, Xudong Han, Timothy Baldwin, Omri Abend, Eduard Hovy, Lea Frermann
Publication: arXiv (accepted to AAAI-26)

Selection Decision

Included in evidence base: Yes

Rationale: Directly relevant. The paper demonstrates that instruction hierarchies fail in LLMs, which supports the claim that requirements need enforcement but challenges the mechanism.