R0053/2026-03-31-02/C002
Claim: Any requirement stated to an AI without enforcement language will be treated as a suggestion — you must tell the AI what it is not allowed to do, not just what to do.
BLUF: The diagnosis is correct — AI does treat weakly-stated requirements as suggestions. But the prescription is wrong — negative constraints ("must not") are often less effective than positive reframing. Enforcement requires explicit, non-negotiable phrasing, not specifically negative framing.
Probability: Roughly even chance (45-55%) | Confidence: Medium
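The distinction H2 turns on can be made concrete. The sketch below is a hypothetical illustration, not drawn from the assessment's sources: the same requirement phrased as a negative constraint (which names the forbidden behavior, the failure mode the Pink Elephant analysis describes) versus a positive reframe with explicit, non-negotiable enforcement language. The `enforce` helper and both prompt strings are assumptions for illustration only.

```python
# Hypothetical prompts illustrating the framing distinction in the BLUF.
# Neither string comes from the assessed sources; both are assumptions.

# Negative constraint: names the forbidden behaviors, which can prime them.
NEGATIVE = (
    "Summarize the report. Do not use bullet points. "
    "Do not speculate beyond the text."
)

# Positive reframe: states the required behavior and marks it as binding.
POSITIVE = (
    "Summarize the report in flowing prose paragraphs only. "
    "REQUIRED: every statement must be directly supported by the text. "
    "This requirement is non-negotiable."
)

def enforce(requirement: str) -> str:
    """Wrap a positively-stated requirement in explicit enforcement language
    (a hypothetical helper, not a documented API)."""
    return f"REQUIRED (non-negotiable): {requirement}"

print(enforce("Respond in prose paragraphs only."))
```

Note that the helper adds enforcement weight without introducing any "do not" phrasing, which is exactly the mechanism H2 credits over H1's negative constraints.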
Summary
| Entity | Description |
| --- | --- |
| Claim Definition | Claim text, scope, status |
| Assessment | Full analytical product with reasoning chain |
| ACH Matrix | Evidence × hypotheses diagnosticity analysis |
| Self-Audit | ROBIS-adapted 5-domain audit (process + source verification) |
Hypotheses
| ID | Hypothesis | Status |
| --- | --- | --- |
| H1 | Claim is accurate — negative constraints are necessary for enforcement | Eliminated |
| H2 | Claim is partially correct — enforcement needed but mechanism is wrong | Supported |
| H3 | Claim is materially wrong — AI reliably follows all clearly stated requirements | Eliminated |
Searches
| ID | Target | Results | Selected |
| --- | --- | --- | --- |
| S01 | Negative vs positive instruction effectiveness | 10 | 2 |
| S02 | LLM instruction hierarchy failures | 10 | 2 |
Sources
| Source | Description | Reliability | Relevance |
| --- | --- | --- | --- |
| SRC01 | Pink Elephant Problem analysis | Medium | High |
| SRC02 | Control Illusion paper (Geng et al., 2025) | High | High |
| SRC03 | Anthropic prompt engineering guidance | High | Medium |
Revisit Triggers
- New controlled studies on enforcement language effectiveness in LLMs
- Changes to RLHF/DPO training that affect instruction compliance
- Anthropic updates its prompt engineering guidance on positive vs negative framing