R0053/2026-03-31-02/C002 - Self-Audit
ROBIS 4-Domain Audit
Domain 1: Eligibility Criteria
Rating: Low risk
| Criterion | Assessment |
|---|---|
| Defined "enforcement language" before searching | Yes — interpreted as explicit, non-negotiable directive language |
| Distinguished diagnosis from prescription | Yes — tested the problem claim and the solution claim separately |
Notes: The claim's two components were identified and tested independently.
Domain 2: Search Comprehensiveness
Rating: Low risk
| Criterion | Assessment |
|---|---|
| Multiple search strategies used | Yes — negative instruction effectiveness and instruction hierarchy failures |
| Searches designed to test each hypothesis | Yes — S01 tests mechanism, S02 tests problem existence |
| All results dispositioned | Yes — 20 results, all dispositioned |
| Source diversity achieved | Yes — blog analysis, academic paper, vendor guidance |
Notes: 2 searches, 20 results total.
Domain 3: Evaluation Consistency
Rating: Low risk
| Criterion | Assessment |
|---|---|
| All sources scored using same framework | Yes |
| Evidence typed consistently | Yes |
| ACH matrix applied | Yes |
| Diagnosticity analysis performed | Yes |
Notes: Consistent framework applied.
Domain 4: Synthesis Fairness
Rating: Low risk
| Criterion | Assessment |
|---|---|
| All hypotheses given fair hearing | Yes |
| Contradictory evidence surfaced | Yes — nuance about when negative constraints do work |
| Confidence calibrated to evidence | Yes |
| Gaps acknowledged | Yes |
Notes: The assessment acknowledges that negative constraints remain useful in specific contexts (firm boundaries), which gives H1 a fair hearing.
Domain 5: Source-Back Verification
Rating: Low risk
| Source | Claim in Assessment | Source Actually Says | Match? |
|---|---|---|---|
| SRC01 | Negative instructions less effective | "LLMs seem to produce worse output the more DO NOTs are included" | Yes |
| SRC01 | Anthropic advises positive framing | "Tell Claude what to do instead of what not to do" | Yes |
| SRC02 | Instruction hierarchies fail | "fails to establish a reliable instruction hierarchy" | Yes |
Discrepancies found: 0
Corrections applied: None needed
Unresolved flags: None
Notes: All claims verified against sources.
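The source-back check above can be partly automated. The following is a minimal sketch, assuming each quoted passage should appear verbatim in its source after whitespace and case are normalized. The function names and `PLACEHOLDER_SOURCES` are hypothetical; the real source texts are not part of this audit, so the placeholders are built from the quoted passages themselves and only demonstrate the mechanics.

```python
import re

# Rows copied from the Domain 5 table: (source ID, quoted passage).
ROWS = [
    ("SRC01", "LLMs seem to produce worse output the more DO NOTs are included"),
    ("SRC01", "Tell Claude what to do instead of what not to do"),
    ("SRC02", "fails to establish a reliable instruction hierarchy"),
]

# HYPOTHETICAL stand-ins for the full source texts (assumption: not real data).
# Replace these with the actual retrieved documents before relying on the check.
PLACEHOLDER_SOURCES = {
    "SRC01": "[...] " + ROWS[0][1] + " [...] " + ROWS[1][1] + " [...]",
    "SRC02": "[...] " + ROWS[2][1] + " [...]",
}


def normalize(text: str) -> str:
    """Collapse runs of whitespace and ignore case so line wraps do not matter."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def quote_in_source(quote: str, source_text: str) -> bool:
    """Return True if the normalized quote occurs in the normalized source."""
    return normalize(quote) in normalize(source_text)


def find_discrepancies(rows, sources):
    """Return the (source ID, quote) pairs whose quote is not found in its source.

    A missing source ID counts as a discrepancy, because the quote cannot
    be verified at all.
    """
    return [
        (source_id, quote)
        for source_id, quote in rows
        if not quote_in_source(quote, sources.get(source_id, ""))
    ]


if __name__ == "__main__":
    discrepancies = find_discrepancies(ROWS, PLACEHOLDER_SOURCES)
    print(f"Discrepancies found: {len(discrepancies)}")
```

A verbatim match only confirms that the quoted words exist in the source. Whether the quote supports the claim in the "Claim in Assessment" column still requires a human reading.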
Overall Assessment
Overall risk of bias: Low risk
The main nuance is that "enforcement language" in the claim may not specifically mean "negative constraints". The claim uses both concepts, but its second sentence explicitly says "tell the AI what it is not allowed to do", so the assessment interpreted the claim as advocating negative framing.
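One way to read the overall rating is as the worst rating among the five domains above. The sketch below illustrates that rule under stated assumptions: the audit does not say how the overall rating was derived, and both the `RISK_ORDER` ranking (low, then unclear, then high) and the `overall_risk` helper are hypothetical.

```python
# ASSUMPTION: ordinal ranking of rating labels; only "Low risk" appears in this audit.
RISK_ORDER = {"Low risk": 0, "Unclear risk": 1, "High risk": 2}

# Domain ratings as reported in this audit.
DOMAIN_RATINGS = {
    "Eligibility Criteria": "Low risk",
    "Search Comprehensiveness": "Low risk",
    "Evaluation Consistency": "Low risk",
    "Synthesis Fairness": "Low risk",
    "Source-Back Verification": "Low risk",
}


def overall_risk(ratings: dict) -> str:
    """Return the worst (highest-ranked) rating among the domain ratings."""
    return max(ratings.values(), key=lambda rating: RISK_ORDER[rating])


if __name__ == "__main__":
    print(f"Overall risk of bias: {overall_risk(DOMAIN_RATINGS)}")
```

Under this rule a single domain rated higher than "Low risk" would raise the overall rating, which keeps one weak domain from being averaged away.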
Researcher Bias Check
- Confirmation bias risk: The methodology itself uses enforcement language extensively, which could bias the assessment toward confirming that the approach works. Mitigated by testing the specific mechanism (negative vs. positive framing).
- No researcher profile provided: Cannot check declared biases.