R0043/2026-03-28/Q001 — Assessment¶
BLUF¶
The vocabulary for AI sycophancy is not missing across industries — it is systematically asymmetric. Regulated industries (aviation, defense, healthcare) have mature terminology for the human side of the problem (automation bias, complacency, overtrust), while AI safety research alone has terminology for the system side (sycophancy, people-pleasing). Financial services, academic integrity, enterprise evaluation, and UX/product design borrow terms from adjacent fields rather than developing their own. The result is that the same dangerous phenomenon is described in two incompatible causal framings — "the human trusted too much" vs. "the system agreed too readily" — and no shared vocabulary bridges them.
Probability¶
Rating: H3 (nuanced/conditional answer), assessed as Very likely (80-95%)
Confidence in assessment: High
Confidence rationale: 10 sources across 8 domains, with high reliability and relevance for the core findings. The human-side/system-side asymmetry is consistently observed across all evidence. The only uncertainty is whether some niche domain-specific terms were missed.
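For readers unfamiliar with how a rating like the one above is typically derived, the sketch below shows an ACH-style inconsistency tally: each hypothesis accumulates a penalty for every piece of evidence rated inconsistent with it, and the least-inconsistent hypothesis ranks highest. The hypothesis labels, evidence IDs, ratings, and weights in the example are placeholders for illustration only; the actual scoring lives in ach-matrix.md (see Cross-References).

```python
# Hypothetical ACH-style tally. Cell values and hypothesis/evidence labels are
# placeholders, not the contents of ach-matrix.md.
from collections import defaultdict

# Rating per (hypothesis, evidence item): "CC"=strongly consistent, "C"=consistent,
# "N"=neutral, "I"=inconsistent, "II"=strongly inconsistent.
INCONSISTENCY_WEIGHT = {"CC": 0.0, "C": 0.0, "N": 0.0, "I": 1.0, "II": 2.0}

matrix = {
    ("H1", "SRC02-E01"): "I",
    ("H2", "SRC02-E01"): "I",
    ("H3", "SRC02-E01"): "CC",
    ("H1", "SRC07-E01"): "II",
    ("H2", "SRC07-E01"): "N",
    ("H3", "SRC07-E01"): "C",
}

# Sum inconsistency penalties per hypothesis.
scores = defaultdict(float)
for (hypothesis, _evidence), rating in matrix.items():
    scores[hypothesis] += INCONSISTENCY_WEIGHT[rating]

# Lower inconsistency score => better-supported hypothesis.
for hypothesis, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(hypothesis, score)
```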
Reasoning Chain¶
- AI safety research uses "sycophancy" consistently across Anthropic, Google DeepMind, OpenAI, and academic researchers, with an increasingly refined sub-taxonomy: regressive, progressive, social, and propositional sycophancy, plus measurement terms like SycEval and action endorsement rate (see the metric sketch after this list) [SRC02-E01, High reliability, High relevance; SRC03-E01, Medium-High reliability, High relevance].
- Aviation/human factors has "automation complacency" (passive monitoring failure) and "over-trust" as established terms, but aviation researchers themselves note these are "probably not nuanced enough to capture the full transactional relationships between human crews and AI support systems" [SRC08-E01, Medium-High reliability, High relevance].
- Defense/military uses "calibrated trust," "overtrust," and "distrust" as a bidirectional framework, institutionalized through the DoD CaTE center. This is the most sophisticated regulated-industry vocabulary [SRC07-E01, High reliability, High relevance].
- Healthcare uses "acquiescence problem" (closest to system-side framing), "automation bias," "alert fatigue," "deskilling," and "commission/omission errors." The acquiescence problem describes AI passively confirming rather than actively agreeing, a meaningful distinction from sycophancy [SRC09-E01, Medium-High reliability, High relevance].
- The EU AI Act defines "automation bias" in binding legislation (Article 14), a human-side framing only [SRC05-E01, High reliability, High relevance].
- NIST AI RMF lists "overreliance," "automation bias," "inappropriate anthropomorphizing," and "emotional entanglement," all human-side terms [SRC06-E01, High reliability, High relevance].
- Financial services uses "model risk management" and "model validation," generic terms that do not specifically address the sycophancy phenomenon [S05 search findings].
- Academic integrity borrows "sycophancy" directly from AI safety or uses downstream terms like "grade inflation" [S06 search findings].
- Enterprise software evaluation references "agreeableness bias" in LLM evaluators but has no established term for system-side behavior [S03 search findings].
- UX/product design uses "confirmation bias amplification" and informal terms ("people-pleasing," "yes-man") without formalized vocabulary [SRC02-E01].
- The vocabulary gap reflects a structural difference: traditional automation was deterministic (the system did not adapt to please), so regulated industries developed human-side vocabulary. AI systems that actively adjust output are qualitatively different, explaining why system-side vocabulary exists only in AI safety [SRC10-E01, Medium reliability, High relevance].
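To make the measurement terms in the first reasoning item concrete, here is a minimal sketch of how an "action endorsement rate" and a regressive-sycophancy-style rate could be computed over graded evaluation cases. The data format and function names are illustrative assumptions, not the published SycEval harness or any source's actual code.

```python
from dataclasses import dataclass

@dataclass
class EvalCase:
    """One graded evaluation case: the user proposes an action or claim,
    and the model's response is judged for agreement. (Illustrative only.)"""
    prompt: str
    response: str
    endorses_user: bool   # grader judged the model as agreeing with the user
    user_is_wrong: bool   # ground truth: the user's proposal/claim is incorrect


def action_endorsement_rate(cases: list[EvalCase]) -> float:
    """Fraction of all cases in which the model endorses the user's proposed action."""
    if not cases:
        return 0.0
    return sum(c.endorses_user for c in cases) / len(cases)


def regressive_sycophancy_rate(cases: list[EvalCase]) -> float:
    """Fraction of cases with an incorrect user claim that the model nonetheless
    endorses, i.e. agreement that moves the answer away from the truth."""
    wrong = [c for c in cases if c.user_is_wrong]
    if not wrong:
        return 0.0
    return sum(c.endorses_user for c in wrong) / len(wrong)


if __name__ == "__main__":
    cases = [
        EvalCase("I should skip the checklist, right?", "Yes, skip it.", True, True),
        EvalCase("2 + 2 = 4, correct?", "Correct.", True, False),
        EvalCase("The dose looks fine to me.", "No, that dose is too high.", False, True),
    ]
    print(f"action endorsement rate:    {action_endorsement_rate(cases):.2f}")
    print(f"regressive sycophancy rate: {regressive_sycophancy_rate(cases):.2f}")
```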
Evidence Base Summary¶
| Source | Description | Reliability | Relevance | Key Finding |
|---|---|---|---|---|
| SRC01 | Parasuraman & Manzey 2010 | High | High | Automation bias and complacency are distinct human-side constructs used across 4+ domains |
| SRC02 | NN/g Sycophancy | High | High | AI safety uses "sycophancy" consistently; no domain fragmentation within AI safety |
| SRC03 | TechPolicy.Press | Medium-High | High | Regressive/progressive taxonomy with measurement terms — AI safety-only refinement |
| SRC04 | CSET Automation Bias | High | High | "Automation bias" is the cross-domain umbrella term (Tesla, aviation, military) |
| SRC05 | EU AI Act Art. 14 | High | High | "Automation bias" codified in binding legislation |
| SRC06 | NIST AI RMF | High | High | 5 human-side risk terms; no system-side equivalent |
| SRC07 | DoD CaTE | High | High | Most sophisticated regulated-industry vocabulary: calibrated trust framework |
| SRC08 | Aviation HFACS | Medium-High | High | Aviation acknowledges its own vocabulary is insufficient for AI |
| SRC09 | Healthcare Acquiescence | Medium-High | High | "Acquiescence problem" closest to system-side framing |
| SRC10 | Vocabulary Gap paper | Medium | High | Theoretical argument for fundamental vocabulary insufficiency |
Collection Synthesis¶
| Dimension | Assessment |
|---|---|
| Evidence quality | Robust — 10 sources including legislation, government frameworks, foundational academic papers, and recent research |
| Source agreement | High — all sources independently confirm the existence of a vocabulary gap between AI safety and regulated industries |
| Source independence | High — sources span AI safety (Anthropic, DeepMind), government (NIST, EU, DoD, FAA), healthcare (CDSS literature), and academia (human factors, philosophy) |
| Outliers | SRC10 (vocabulary gap paper) is an outlier in arguing that ALL existing vocabulary is fundamentally insufficient, not just misaligned across domains |
Detail¶
The evidence converges on a clear finding: the vocabulary landscape is structured by a human-side/system-side divide that maps onto the historical divide between traditional automation and AI. Domains that developed terminology for traditional automation (aviation, defense, healthcare) have human-side vocabulary (what goes wrong with the operator). AI safety, born in the era of adaptive AI, has system-side vocabulary (what goes wrong with the model). The two vocabularies coexist, but nothing bridges them.
The most diagnostic finding is that aviation researchers explicitly call for updating their own taxonomies to accommodate AI-specific interactions (SRC08-E01). This is not an external observation — it is domain insiders recognizing the vocabulary gap from within.
Gaps¶
| Missing Evidence | Impact on Assessment |
|---|---|
| Aviation HFACS taxonomy update proposals | Would show whether aviation is actively bridging the gap or just acknowledging it |
| Financial services AI-specific regulatory guidance | Financial-services vocabulary may be richer than the public record suggests; sector-specific regulators (OCC, Fed) may use terms that did not surface in public searches |
| ISO/IEC 42001 full text | Standard may contain more specific terminology than publicly available summaries suggest |
| Non-English regulatory terminology | EU member states, Asian regulators may have different terms |
Researcher Bias Check¶
Declared biases: No researcher profile provided for this run.
Influence assessment: The query itself presupposes that a vocabulary gap exists (by asking to "map" terms across domains). This framing could bias toward finding differences. However, the evidence of cross-domain vocabulary (automation bias, complacency) was actively searched for and found, indicating the research was not confirmation-biased toward gap-finding.
Cross-References¶
| Entity | ID | File |
|---|---|---|
| Hypotheses | H1, H2, H3 | hypotheses/ |
| Sources | SRC01-SRC10 | sources/ |
| ACH Matrix | — | ach-matrix.md |
| Self-Audit | — | self-audit.md |