
Research R0044 — Expanded Vocabulary Research
Run 2026-03-29
Query Q002
Hypothesis H3

Statement

Evidence of harm exists but stems primarily from automation bias (human over-reliance on AI) rather than from system-side agreeableness: documented harms involve users accepting AI recommendations uncritically, but the systems were not specifically designed to agree. The harm arose from the interaction pattern, not from system sycophancy.

Status

Current: Supported

This is the best-supported hypothesis. The overwhelming majority of documented harm in professional contexts stems from automation bias: clinicians deferring to incorrect AI diagnoses, military operators trusting AI target identification without verification, and financial analysts relying on algorithmic recommendations. In these cases, the AI system was not designed to agree with the user; rather, the user failed to exercise independent judgment about the AI's recommendation. The one clear exception is the OpenAI GPT-4o incident, in which the system was genuinely sycophantic (tuned to optimize for user approval), but that incident primarily affected consumer and mental-health contexts, not professional settings.

Supporting Evidence

SRC03-E01: Healthcare harm from clinician deference to AI, not from AI designed to agree
SRC04-E01: Bowtie analysis: automation bias in CDS from over-reliance, not system agreeableness
SRC05-E01: Military operators privilege action over non-action, reflecting automation bias, not sycophancy
SRC06-E01: Marvin Project: trust-pattern harm from user over-reliance, not system design

Contradicting Evidence

SRC02-E01: OpenAI incident is genuine system-side sycophancy causing harm, contradicting the "only human-side" characterization in H3
SRC01-E01: Science study shows system-side sycophancy effects in laboratory settings

Reasoning

H3 is supported because the professional-context evidence consistently shows automation bias (human over-reliance) rather than system sycophancy (AI designed to agree). The distinction matters for intervention design: if the harm is from human over-reliance, the solution is training and oversight; if the harm is from system sycophancy, the solution is constraining system behavior. The evidence suggests both contribute, but in professional settings the documented cases are predominantly automation bias; the system-side sycophancy evidence (the OpenAI incident and the Science study) comes from consumer and laboratory contexts.

Relationship to Other Hypotheses

H3 refines H1 by specifying that the dominant mechanism differs from what the query implies. The query asks about "AI systems that agree with users," but most documented professional harm involves AI systems that recommend and users who defer, which is a subtle but important difference.