R0043/2026-03-28/Q001 — Assessment

BLUF

The vocabulary for AI sycophancy is not missing across industries — it is systematically asymmetric. Regulated industries (aviation, defense, healthcare) have mature terminology for the human side of the problem (automation bias, complacency, overtrust), while AI safety research alone has terminology for the system side (sycophancy, people-pleasing). Financial services, academic integrity, enterprise evaluation, and UX/product design borrow terms from adjacent fields rather than developing their own. The result is that the same dangerous phenomenon is described in two incompatible causal framings — "the human trusted too much" vs. "the system agreed too readily" — and no shared vocabulary bridges them.

Probability

Rating: H3 (nuanced/conditional answer), assessed as Very likely (80-95%)

Confidence in assessment: High

Confidence rationale: 10 sources across 8 domains, with high reliability and relevance for the core findings. The human-side/system-side asymmetry is consistently observed across all evidence. The only uncertainty is whether some niche domain-specific terms were missed.

Reasoning Chain

  1. AI safety research uses "sycophancy" consistently across Anthropic, Google DeepMind, OpenAI, and academic researchers, with an increasingly refined sub-taxonomy: regressive, progressive, social, and propositional sycophancy, plus measurement terms like SycEval and action endorsement rate [SRC02-E01, High reliability, High relevance; SRC03-E01, Medium-High reliability, High relevance].

  2. Aviation/human factors has "automation complacency" (passive monitoring failure) and "over-trust" as established terms, but aviation researchers themselves note these are "probably not nuanced enough to capture the full transactional relationships between human crews and AI support systems" [SRC08-E01, Medium-High reliability, High relevance].

  3. Defense/military uses "calibrated trust," "overtrust," and "distrust" as a bidirectional framework, institutionalized through the DoD CaTE center. This is the most sophisticated regulated-industry vocabulary [SRC07-E01, High reliability, High relevance].

  4. Healthcare uses "acquiescence problem" (closest to system-side framing), "automation bias," "alert fatigue," "deskilling," and "commission/omission errors." The acquiescence problem describes AI passively confirming rather than actively agreeing — a meaningful distinction from sycophancy [SRC09-E01, Medium-High reliability, High relevance].

  5. The EU AI Act defines "automation bias" in binding legislation (Article 14) — human-side framing only [SRC05-E01, High reliability, High relevance].

  6. NIST AI RMF lists "overreliance," "automation bias," "inappropriate anthropomorphizing," and "emotional entanglement" — all human-side terms [SRC06-E01, High reliability, High relevance].

  7. Financial services uses "model risk management" and "model validation" — generic terms that do not specifically address the sycophancy phenomenon [S05 search findings].

  8. Academic integrity borrows "sycophancy" directly from AI safety or uses downstream terms like "grade inflation" [S06 search findings].

  9. Enterprise software evaluation references "agreeableness bias" in LLM evaluators but has no established term for system-side behavior [S03 search findings].

  10. UX/product design uses "confirmation bias amplification" and informal terms ("people-pleasing," "yes-man") without formalized vocabulary [SRC02-E01].

  11. The vocabulary gap reflects a structural difference: traditional automation was deterministic (the system did not adapt to please), so regulated industries developed human-side vocabulary. AI systems that actively adjust output are qualitatively different, explaining why system-side vocabulary exists only in AI safety [SRC10-E01, Medium reliability, High relevance].

Evidence Base Summary

| Source | Description | Reliability | Relevance | Key Finding |
|---|---|---|---|---|
| SRC01 | Parasuraman & Manzey 2010 | High | High | Automation bias and complacency are distinct human-side constructs used across 4+ domains |
| SRC02 | NN/g Sycophancy | High | High | AI safety uses "sycophancy" consistently; no domain fragmentation within AI safety |
| SRC03 | TechPolicy.Press | Medium-High | High | Regressive/progressive taxonomy with measurement terms — AI safety-only refinement |
| SRC04 | CSET Automation Bias | High | High | "Automation bias" is the cross-domain umbrella term (Tesla, aviation, military) |
| SRC05 | EU AI Act Art. 14 | High | High | "Automation bias" codified in binding legislation |
| SRC06 | NIST AI RMF | High | High | 5 human-side risk terms; no system-side equivalent |
| SRC07 | DoD CaTE | High | High | Most sophisticated regulated-industry vocabulary: calibrated trust framework |
| SRC08 | Aviation HFACS | Medium-High | High | Aviation acknowledges its own vocabulary is insufficient for AI |
| SRC09 | Healthcare Acquiescence | Medium-High | High | "Acquiescence problem" closest to system-side framing |
| SRC10 | Vocabulary Gap paper | Medium | High | Theoretical argument for fundamental vocabulary insufficiency |

Collection Synthesis

| Dimension | Assessment |
|---|---|
| Evidence quality | Robust — 10 sources including legislation, government frameworks, foundational academic papers, and recent research |
| Source agreement | High — all sources independently confirm the existence of a vocabulary gap between AI safety and regulated industries |
| Source independence | High — sources span AI safety (Anthropic, DeepMind), government (NIST, EU, DoD, FAA), healthcare (CDSS literature), and academia (human factors, philosophy) |
| Outliers | SRC10 (vocabulary gap paper) is an outlier in arguing that ALL existing vocabulary is fundamentally insufficient, not just misaligned across domains |

Detail

The evidence converges on a clear finding: the vocabulary landscape is structured by a human-side/system-side divide that maps to a historical divide between traditional automation and AI. Domains that developed terminology for traditional automation (aviation, defense, healthcare) have human-side vocabulary (what goes wrong with the operator). AI safety, born in the era of adaptive AI, has system-side vocabulary (what goes wrong with the model). The two vocabularies coexist without bridging.

The most diagnostic finding is that aviation researchers explicitly call for updating their own taxonomies to accommodate AI-specific interactions (SRC08-E01). This is not an external observation — it is domain insiders recognizing the vocabulary gap from within.

Gaps

| Missing Evidence | Impact on Assessment |
|---|---|
| Aviation HFACS taxonomy update proposals | Would show whether aviation is actively bridging the gap or just acknowledging it |
| Financial services AI-specific regulatory guidance | FinServ vocabulary may be richer than found; sector-specific regulators (OCC, Fed) may use terms not in public searches |
| ISO/IEC 42001 full text | Standard may contain more specific terminology than publicly available summaries suggest |
| Non-English regulatory terminology | EU member states, Asian regulators may have different terms |

Researcher Bias Check

Declared biases: No researcher profile provided for this run.

Influence assessment: The query itself presupposes that a vocabulary gap exists (by asking to "map" terms across domains). This framing could bias toward finding differences. However, the evidence of cross-domain vocabulary (automation bias, complacency) was actively searched for and found, indicating the research was not confirmation-biased toward gap-finding.

Cross-References

| Entity | ID | File |
|---|---|---|
| Hypotheses | H1, H2, H3 | hypotheses/ |
| Sources | SRC01-SRC10 | sources/ |
| ACH Matrix | | ach-matrix.md |
| Self-Audit | | self-audit.md |