
Research R0054 — Prompt Claims v2
Run 2026-03-31
Claim C003
Search S02

WebSearch — LLM semantic override and instruction ignoring behavior

Summary

| Field | Value |
| --- | --- |
| Source/Database | WebSearch |
| Query terms | LLM agrees with instructions then ignores them default behavior helpfulness override workflow |
| Filters | None |
| Results returned | 9 |
| Results selected | 1 |
| Results rejected | 8 |

Selected Results

| Result | Title | URL | Rationale |
| --- | --- | --- | --- |
| S02-R01 | When Models Ignore Definitions: Measuring Semantic Override Hallucinations | https://arxiv.org/html/2602.17520 | Directly demonstrates models reverting to default behavior despite explicit instructions |

Rejected Results

| Result | Title | URL | Rationale |
| --- | --- | --- | --- |
| S02-R02 | The Instruction Hierarchy | https://arxiv.org/html/2404.13208v1 | About instruction priority, not workflow skipping |
| S02-R03 | Wait, that's not an option: LLMs Robustness | https://arxiv.org/html/2409.00113v3 | About incorrect options, not workflow compliance |
| S02-R04 | How Ignore All Previous Instructions is Breaking AI | https://learnprompting.org/blog/ignore_previous_instructions | About prompt injection, not sycophantic non-compliance |
| S02-R05 | Securing LLMs Against Prompt Injection | https://blog.securityinnovation.com/securing-llms-against-prompt-injection-attacks | Security focus, not behavioral compliance |
| S02-R06 | Context Ignoring Attack | https://learnprompting.org/docs/prompt_hacking/offensive_measures/context-ignoring-attack | Adversarial context, not sycophantic behavior |
| S02-R07 | Fix LLM Bias: Override AI Positivity | https://blog.buildbetter.ai/mitigating-llm-biases-why-large-language-models-default-to-positivity-2-or-3-answers-and-how-to-push-past-them/ | About positivity bias in content, not process compliance |
| S02-R08 | LLM Prompt Injection Prevention - OWASP | https://cheatsheetseries.owasp.org/cheatsheets/LLM_Prompt_Injection_Prevention_Cheat_Sheet.html | Security focus |
| S02-R09 | How does a LLM know how to follow instructions | https://www.quora.com/How-does-a-large-language-model-LLM-like-Chat-GPT-know-how-to-follow-instructions-like-ignore-previous-rules-With-no-understanding-only-probabilistically-finding-the-next-word-doesnt-seem-sufficient-Is-there | Q&A, insufficient depth |

The search returned only nine results; no tenth result was available.

Notes

The semantic override paper (S02-R01) is the key finding: it provides experimental evidence that models revert to default behavior despite explicit instructions, which is the mechanism underlying claim C003.
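The reversion failure mode that S02-R01 measures can be made concrete with a minimal scoring sketch. This is a hypothetical harness for illustration only, not the paper's actual protocol: given a model's answer to a prompt that redefines a term, it classifies whether the answer follows the in-prompt definition or reverts to the term's default meaning, using marker substrings chosen by the evaluator.

```python
def classify_override(answer: str,
                      instructed_markers: list[str],
                      default_markers: list[str]) -> str:
    """Label an answer as 'complied', 'reverted', or 'ambiguous'.

    instructed_markers: substrings expected if the model honors the
        prompt's custom definition (evaluator-chosen assumption).
    default_markers: substrings expected if the model falls back to
        the term's ordinary meaning.
    """
    text = answer.lower()
    follows = any(m.lower() in text for m in instructed_markers)
    reverts = any(m.lower() in text for m in default_markers)
    if follows and not reverts:
        return "complied"
    if reverts and not follows:
        return "reverted"
    return "ambiguous"


# Hypothetical probe: the prompt redefined "blue" as light at 700 nm
# (ordinarily around 450-470 nm). An answer citing 700 nm complied;
# one citing 450-470 nm silently reverted to the default definition.
print(classify_override("Under your definition, blue is 700 nm.",
                        ["700"], ["450", "470"]))   # complied
print(classify_override("Blue light is around 450 nm.",
                        ["700"], ["450", "470"]))   # reverted
```

Aggregating such labels over many redefinition probes yields a reversion rate, which is the kind of quantity the semantic-override evidence rests on.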