
R0053/2026-03-31-02/C002 — Self-Audit

ROBIS 4-Domain Audit

Domain 1: Eligibility Criteria

Rating: Low risk

| Criterion | Assessment |
| --- | --- |
| Defined "enforcement language" before searching | Yes: interpreted as explicit, non-negotiable directive language |
| Distinguished diagnosis from prescription | Yes: tested the problem claim and the solution claim separately |

Notes: The claim's two components were identified and tested independently.

Domain 2: Search Comprehensiveness

Rating: Low risk

| Criterion | Assessment |
| --- | --- |
| Multiple search strategies used | Yes: negative instruction effectiveness and instruction hierarchy failures |
| Searches designed to test each hypothesis | Yes: S01 tests mechanism, S02 tests problem existence |
| All results dispositioned | Yes: 20 results, all dispositioned |
| Source diversity achieved | Yes: blog analysis, academic paper, vendor guidance |

Notes: 2 searches, 20 results total.

Domain 3: Evaluation Consistency

Rating: Low risk

| Criterion | Assessment |
| --- | --- |
| All sources scored using same framework | Yes |
| Evidence typed consistently | Yes |
| ACH matrix applied | Yes |
| Diagnosticity analysis performed | Yes |

Notes: Consistent framework applied.

Domain 4: Synthesis Fairness

Rating: Low risk

| Criterion | Assessment |
| --- | --- |
| All hypotheses given fair hearing | Yes |
| Contradictory evidence surfaced | Yes: nuance about when negative constraints do work |
| Confidence calibrated to evidence | Yes |
| Gaps acknowledged | Yes |

Notes: The assessment acknowledges that negative constraints have a role in specific contexts (firm boundaries), which is fair to H1.

Domain 5: Source-Back Verification

Rating: Low risk

| Source | Claim in Assessment | Source Actually Says | Match? |
| --- | --- | --- | --- |
| SRC01 | Negative instructions less effective | "LLMs seem to produce worse output the more DO NOTs are included" | Yes |
| SRC01 | Anthropic advises positive framing | "Tell Claude what to do instead of what not to do" | Yes |
| SRC02 | Instruction hierarchies fail | "fails to establish a reliable instruction hierarchy" | Yes |

Discrepancies found: 0

Corrections applied: None needed

Unresolved flags: None

Notes: All claims verified against sources.
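
The source-back check above can be partly mechanized. The following is a minimal sketch, not the procedure this audit actually used: it assumes the retrieved text of each source is available as a string and tests only whether each attributed quote appears verbatim after whitespace and quote-mark normalization. The names `QuoteCheck`, `normalize`, and `find_discrepancies` are hypothetical, and the source text in the usage lines is a placeholder, not the real SRC02 document.

```python
import re
from dataclasses import dataclass


@dataclass
class QuoteCheck:
    source_id: str  # e.g. "SRC02"
    claim: str      # paraphrase used in the assessment
    quote: str      # text the assessment attributes to the source


def normalize(text: str) -> str:
    """Lowercase, straighten curly quotes, and collapse runs of whitespace."""
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    text = text.replace("\u2018", "'").replace("\u2019", "'")
    return re.sub(r"\s+", " ", text).strip().lower()


def find_discrepancies(checks, source_texts):
    """Return the checks whose quote is not found in the matching source text."""
    missing = []
    for check in checks:
        haystack = normalize(source_texts.get(check.source_id, ""))
        if normalize(check.quote) not in haystack:
            missing.append(check)
    return missing


# Usage with placeholder source text (an assumption for illustration only).
checks = [
    QuoteCheck(
        "SRC02",
        "Instruction hierarchies fail",
        "fails to establish a reliable instruction hierarchy",
    )
]
source_texts = {
    "SRC02": "[placeholder] fails to establish a reliable\ninstruction  hierarchy [placeholder]"
}
discrepancies = find_discrepancies(checks, source_texts)
print(f"Discrepancies found: {len(discrepancies)}")
```

A verbatim match only confirms that the quote exists in the source. Whether the quote supports the claim in the assessment still needs a human reading, which is what the "Match?" column records.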

Overall Assessment

Overall risk of bias: Low risk

The main nuance is that "enforcement language" in the claim may not specifically mean "negative constraints". The claim uses both concepts, but its second sentence explicitly says "tell the AI what it is not allowed to do," so it was interpreted as advocating negative framing.

Researcher Bias Check

  • Confirmation bias risk: The methodology itself uses enforcement language extensively, which could bias the assessment toward confirming that the approach works. This was mitigated by testing the specific mechanism (negative versus positive framing).
  • No researcher profile provided: Cannot check declared biases.