R0044/2026-04-01/Q003 — Query Definition

Query as Received

Has anyone in the regulated industries (aviation, defense, healthcare, finance) published research or guidance that explicitly connects the human-factors concept of "automation bias" or "overtrust" to the AI safety concept of "sycophancy"? Is anyone bridging these two vocabularies?

Query as Clarified

This query asks whether any publication explicitly recognizes that the human-factors tradition (automation bias, overtrust, complacency) and the AI safety tradition (sycophancy, RLHF-induced agreement) are studying the same phenomenon under different names, and makes that connection explicit. The operative word is "explicitly": many papers address one or both sides, but the question is whether anyone has deliberately bridged the two vocabularies.

Embedded assumption surfaced: The query assumes these vocabularies describe the same phenomenon. In fact they address overlapping but not identical mechanisms: automation bias is a human cognitive tendency, while sycophancy is an AI system behavior. The two interact when sycophantic AI output triggers or amplifies automation bias in the human operator.

BLUF

One paper comes close to bridging these vocabularies: Ibrahim et al. (2025), "Measuring and mitigating overreliance is necessary for building human-compatible AI," explicitly connects cognitive-science concepts (automation bias, cognitive offloading) with AI safety concepts (sycophancy, RLHF-induced agreement). A CSET Georgetown brief likewise links user-level and technical-level factors. However, no publication was found that explicitly states that automation bias and sycophancy are two names for the same problem, or that provides a formal vocabulary mapping. The bridge is emerging but not yet formally constructed.

Scope

  • Domain: Cross-disciplinary research connecting human factors and AI safety
  • Timeframe: Current as of April 2026
  • Testability: Verifiable by locating publications that explicitly use both vocabulary sets

Assessment Summary

Probability: N/A (open-ended query)

Confidence: Medium

Hypothesis outcome: H2 (partial bridge exists) is best supported.

[Full assessment in assessment.md.]

Status

Field               Value
Date created        2026-04-01
Date completed      2026-04-01
Researcher profile  Not provided
Prompt version      Unified Research Methodology v1
Revisit by          2026-10-01
Revisit trigger     Publication of a formal vocabulary mapping between the human-factors and AI safety communities