C003 — Self-Audit


Research	R0054 — Prompt Claims v2
Run	2026-03-31
Claim	C003

ROBIS 4-Domain Audit

Domain 1: Eligibility Criteria

Rating: Low risk

Criterion	Assessment
Criteria defined before searching	Yes — sought research on LLM sycophancy, instruction non-compliance, and workflow skipping
Criteria applied consistently	Yes

Notes: Clear and consistent criteria throughout.

Domain 2: Search Comprehensiveness

Rating: Low risk

Criterion	Assessment
Multiple search strategies used	Yes — sycophancy research, semantic override, and instruction compliance
Searches designed to test each hypothesis	Yes — searched for evidence that LLMs reliably follow complex instructions
All results dispositioned	Yes — 30 results across 2 searches (combined)
Source diversity achieved	Yes — Anthropic primary research, academic survey, independent experiment, medical domain study

Notes: Strong source diversity across four independent research groups.

Domain 3: Evaluation Consistency

Rating: Low risk

Criterion	Assessment
All sources scored using same framework	Yes
Evidence typed consistently	Yes
ACH matrix applied	Yes
Diagnosticity analysis performed	Yes

Notes: Consistent evaluation across all four sources.

Domain 4: Synthesis Fairness

Rating: Some concerns

Criterion	Assessment
All hypotheses given fair hearing	Yes
Contradictory evidence surfaced	None found, which is itself notable
Confidence calibrated to evidence	Yes — acknowledged the extrapolation gap
Gaps acknowledged	Yes — noted that no study specifically tests workflow compliance

Notes: The absence of contradictory evidence could indicate either insufficient search breadth or genuine consensus. Because the four sources are independent of one another, genuine consensus is the more likely explanation.

Domain 5: Source-Back Verification

Rating: Low risk

Source	Claim in Assessment	Source Actually Says	Match?
SRC01	98% capitulation rate for Claude	WebFetch confirmed: "Claude wrongly admitted mistakes in 98% of all questions"	Yes
SRC02	Four root causes identified	WebFetch confirmed the four causes	Yes
SRC03	"Fluent, confident explanations that violate constraints"	WebFetch confirmed this exact phrasing	Yes
SRC04	100% compliance with illogical requests	WebFetch confirmed: "GPT-4o, GPT-4o-mini, and GPT-4 complied... 100% of the time"	Yes

Discrepancies found: 0

Corrections applied: None needed

Unresolved flags: None

Notes: All four claims, both the quantitative figures and the quoted phrasing, were verified against the source material.
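
The discrepancy count in this domain can be reproduced mechanically from the table. The following is a minimal Python sketch, assuming each row is recorded with a boolean match flag; `VerificationRow` and `find_discrepancies` are hypothetical names introduced for illustration and are not part of the audit tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerificationRow:
    """One row of the source-back verification table (hypothetical structure)."""
    source_id: str   # e.g. "SRC01"
    claim: str       # claim as stated in the assessment
    matches: bool    # True when the "Match?" column reads "Yes"


def find_discrepancies(rows):
    """Return the source IDs whose claim did not match the source."""
    return [row.source_id for row in rows if not row.matches]


# Rows transcribed from the table above; all four are marked "Yes".
rows = [
    VerificationRow("SRC01", "98% capitulation rate for Claude", True),
    VerificationRow("SRC02", "Four root causes identified", True),
    VerificationRow("SRC03", "Fluent, confident explanations that violate constraints", True),
    VerificationRow("SRC04", "100% compliance with illogical requests", True),
]

discrepancies = find_discrepancies(rows)
print(f"Discrepancies found: {len(discrepancies)}")
```

Returning the offending source IDs rather than a bare count keeps any mismatch traceable to its row, so it could be listed under the unresolved flags.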

Overall Assessment

Overall risk of bias: Low risk

Strong convergent evidence from four independent sources. The main limitation is the extrapolation from factual sycophancy to process compliance, which is acknowledged in the assessment.

Researcher Bias Check

Confirmation bias risk: Medium. As the developer of a tool designed to counter this behavior, the researcher has a professional interest in confirming that the problem is real. Mitigated by relying on independent academic sources rather than personal anecdotes.
Availability bias risk: Low. The researcher's personal experience with this behavior may make it more salient, but the academic evidence supports the claim independently.