R0048/2026-04-01/Q002 — Self-Audit¶
ROBIS Audit (4 core domains plus source-back verification)¶
Domain 1: Eligibility Criteria¶
Rating: Low risk
| Criterion | Assessment |
|---|---|
| Search vocabulary comprehensive | Yes — searched "sycophancy," "automation bias," "overtrust," "overreliance," "confirmation reinforcement," "acquiescence" |
| Criteria defined before searching | Yes — both AI safety and human-factors terminology mapped in advance |
| Scope appropriate | Yes — corporate, government, policy, and academic sources all included |
Notes: The query itself specified a thorough vocabulary-exploration strategy, which was followed as written.
Domain 2: Search Comprehensiveness¶
Rating: Low risk
| Criterion | Assessment |
|---|---|
| Multiple search strategies used | Yes — three searches targeting different terminology and source types |
| Searches designed to test each hypothesis | Yes — searched for both sycophancy-in-training and adjacent-concept evidence |
| All results dispositioned | Yes — 22 results across 3 searches dispositioned |
| Source diversity achieved | Yes — academic, policy, government, industry, professional sources |
Notes: The null result (no sycophancy in training) is strongly supported by the comprehensive search. If sycophancy appeared in any major training program, it would likely have been found.
Domain 3: Evaluation Consistency¶
Rating: Low risk
| Criterion | Assessment |
|---|---|
| All sources scored consistently | Yes — same GRADE/bias framework applied to all 6 sources |
| Evidence typed consistently | Yes — Analytical, Factual, Statistical, Reported types applied |
| ACH matrix applied | Yes — all evidence evaluated against all 3 hypotheses |
| Diagnosticity analysis performed | Yes — NHS automation bias identified as most diagnostic |
Notes: Consistent evaluation across all sources.
Domain 4: Synthesis Fairness¶
Rating: Low risk
| Criterion | Assessment |
|---|---|
| All hypotheses given fair hearing | Yes — H1 actively searched for; H2 given detailed analysis |
| Contradictory evidence surfaced | Yes — NHS and Microsoft examples surfaced as strongest counterevidence |
| Confidence calibrated to evidence | Yes — Medium-High reflects strong absence finding with internal-training caveat |
| Gaps acknowledged | Yes — internal training content gap explicitly noted |
Notes: The finding that sycophancy is absent from training aligns with the researcher's prior expectation. However, the evidence strongly supports this conclusion independently.
Domain 5: Source-Back Verification¶
Rating: Low risk
| Source | Claim in Assessment | Source Actually Says | Match? |
|---|---|---|---|
| SRC04 | AI is 49% more sycophantic than humans | Fortune reports "AI chatbots affirmed user actions 49% more often than humans" | Yes |
| SRC01 | Georgetown frames as policy problem requiring new interventions | Georgetown lists four intervention categories, none involving existing training | Yes |
| SRC06 | NHS names automation bias | NHS search results reference "cognitive biases including automation bias" | Yes |
| SRC03 | Brookings recommends AI literacy in DOL programs | Brookings advocates "AI literacy" in DOL workforce development | Yes |
Discrepancies found: 0
Corrections applied: None needed
Unresolved flags: None
Notes: All claims verified. The 49% figure is reported from Fortune's coverage of the Science study; the primary Science paper was behind a paywall.
Overall Assessment¶
Overall risk of bias: Low risk
The research process was thorough and the finding is strongly supported. The main residual risk is that the finding confirms the researcher's prior expectation, a situation that always warrants extra scrutiny. That scrutiny was applied here through multiple vocabulary-based search strategies and an active search for counterevidence.
Researcher Bias Check¶
- Confirmation bias risk: HIGH — the finding matches the researcher's declared expectation. Mitigated by a comprehensive multi-vocabulary search and active pursuit of counterevidence (NHS automation bias, Microsoft failure scenarios).
- Availability bias risk: Low — searched across multiple domains and terminology sets.
- Anchoring risk: Low — hypotheses were generated before evidence collection.