C002 — Self-Audit


Research	R0053 — Prompt Claims
Run	2026-03-31-02
Claim	C002

ROBIS 4-Domain Audit and Source-Back Verification

Domain 1: Eligibility Criteria

Rating: Low risk

Criterion	Assessment
Defined "enforcement language" before searching	Yes — interpreted as explicit, non-negotiable directive language
Distinguished diagnosis from prescription	Yes — tested the problem claim and the solution claim separately

Notes: The claim's two components, the diagnosis (problem claim) and the prescription (solution claim), were identified and tested independently.

Domain 2: Search Comprehensiveness

Rating: Low risk

Criterion	Assessment
Multiple search strategies used	Yes — negative instruction effectiveness and instruction hierarchy failures
Searches designed to test each hypothesis	Yes — S01 tests mechanism, S02 tests problem existence
All results dispositioned	Yes — 20 results, all dispositioned
Source diversity achieved	Yes — blog analysis, academic paper, vendor guidance

Notes: Two searches (S01 and S02) returned 20 results in total.

Domain 3: Evaluation Consistency

Rating: Low risk

Criterion	Assessment
All sources scored using same framework	Yes
Evidence typed consistently	Yes
ACH matrix applied	Yes
Diagnosticity analysis performed	Yes

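Notes: All sources were scored with the same framework and evidence was typed consistently. The ACH matrix and the diagnosticity analysis were both completed.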

Domain 4: Synthesis Fairness

Rating: Low risk

Criterion	Assessment
All hypotheses given fair hearing	Yes
Contradictory evidence surfaced	Yes — nuance about when negative constraints do work
Confidence calibrated to evidence	Yes
Gaps acknowledged	Yes

Notes: The assessment acknowledges that negative constraints have a role in specific contexts (firm boundaries), which is fair to H1.

Domain 5: Source-Back Verification

Rating: Low risk

Source	Claim in Assessment	Source Actually Says	Match?
SRC01	Negative instructions less effective	"LLMs seem to produce worse output the more DO NOTs are included"	Yes
SRC01	Anthropic advises positive framing	"Tell Claude what to do instead of what not to do"	Yes
SRC02	Instruction hierarchies fail	"fails to establish a reliable instruction hierarchy"	Yes

Discrepancies found: 0

Corrections applied: None needed

Unresolved flags: None

Notes: All three checked claims match the quoted source text.
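The verification table above can be expressed as data and tallied mechanically. The following is a minimal sketch, not the audit's actual tooling: the `SourceCheck` record, the `discrepancies` helper, and the rule that any mismatch raises the rating to "High risk" are all assumptions made for illustration. The `matches` flags are copied from the table's "Match?" column, not recomputed from the sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceCheck:
    """One row of the source-back verification table (hypothetical structure)."""
    source_id: str
    claim: str
    quote: str
    matches: bool  # reviewer's judgment, copied from the "Match?" column


CHECKS = [
    SourceCheck("SRC01", "Negative instructions less effective",
                "LLMs seem to produce worse output the more DO NOTs are included", True),
    SourceCheck("SRC01", "Anthropic advises positive framing",
                "Tell Claude what to do instead of what not to do", True),
    SourceCheck("SRC02", "Instruction hierarchies fail",
                "fails to establish a reliable instruction hierarchy", True),
]


def discrepancies(checks):
    """Return the rows where the assessment's claim does not match the source."""
    return [c for c in checks if not c.matches]


def domain_rating(checks):
    """Assumed rule: zero discrepancies means low risk, anything else high risk."""
    return "Low risk" if not discrepancies(checks) else "High risk"


print(f"Discrepancies found: {len(discrepancies(CHECKS))}")
print(f"Rating: {domain_rating(CHECKS)}")
```

Keeping the reviewer's match judgment as an explicit field, instead of inferring it by string comparison, reflects that a claim is usually a paraphrase of the quote and not a substring of it.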

Overall Assessment

Overall risk of bias: Low risk

The main interpretive caveat is that "enforcement language" in the claim may not mean "negative constraints" specifically. The claim uses both concepts, but its second sentence explicitly says "tell the AI what it is not allowed to do," so the assessment read it as advocating negative framing.
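The overall rating can be related to the five domain ratings with a simple roll-up. This is a sketch under a stated assumption: it takes the overall risk to be the worst rating across domains, and it orders "Unclear risk" between "Low risk" and "High risk". Neither the aggregation rule nor the `overall_risk` function comes from the audit itself, and the actual overall judgment may weigh domains differently.

```python
# Assumed severity ordering; the audit does not state one.
SEVERITY = {"Low risk": 0, "Unclear risk": 1, "High risk": 2}

# Domain ratings as recorded in this audit.
DOMAIN_RATINGS = {
    "Eligibility Criteria": "Low risk",
    "Search Comprehensiveness": "Low risk",
    "Evaluation Consistency": "Low risk",
    "Synthesis Fairness": "Low risk",
    "Source-Back Verification": "Low risk",
}


def overall_risk(ratings):
    """Return the most severe rating among the domains (assumed roll-up rule)."""
    return max(ratings.values(), key=lambda rating: SEVERITY[rating])


print(f"Overall risk of bias: {overall_risk(DOMAIN_RATINGS)}")
```

Under this rule a single domain rated "High risk" would raise the overall rating, which is the conservative choice for a self-audit.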

Researcher Bias Check

Confirmation bias risk: The research methodology itself uses enforcement language extensively, which could bias the assessment toward confirming that the approach works. Mitigated by testing the specific mechanism (negative versus positive framing).
Researcher profile: None provided, so declared biases could not be checked.