R0043/2026-03-28/Q001 — Assessment

BLUF

The vocabulary for AI sycophancy is not missing across industries — it is systematically asymmetric. Regulated industries (aviation, defense, healthcare) have mature terminology for the human side of the problem (automation bias, complacency, overtrust), while AI safety research alone has terminology for the system side (sycophancy, people-pleasing). Financial services, academic integrity, enterprise evaluation, and UX/product design borrow terms from adjacent fields rather than developing their own. The result is that the same dangerous phenomenon is described in two incompatible causal framings — "the human trusted too much" vs. "the system agreed too readily" — and no shared vocabulary bridges them.

Probability

Rating: H3 (nuanced/conditional answer), assessed as Very likely (80-95%)

Confidence in assessment: High

Confidence rationale: 10 sources across 8 domains, with high reliability and relevance for the core findings. The human-side/system-side asymmetry is consistently observed across all evidence. The only uncertainty is whether some niche domain-specific terms were missed.

Reasoning Chain

  1. AI safety research uses "sycophancy" consistently across Anthropic, Google DeepMind, OpenAI, and academic researchers, with an increasingly refined sub-taxonomy: regressive, progressive, social, and propositional sycophancy, plus measurement terms like SycEval and action endorsement rate [SRC02-E01, High reliability, High relevance; SRC03-E01, Medium-High reliability, High relevance].

  2. Aviation/human factors has "automation complacency" (passive monitoring failure) and "over-trust" as established terms, but aviation researchers themselves note these are "probably not nuanced enough to capture the full transactional relationships between human crews and AI support systems" [SRC08-E01, Medium-High reliability, High relevance].

  3. Defense/military uses "calibrated trust," "overtrust," and "distrust" as a bidirectional framework, institutionalized through the DoD CaTE center. This is the most sophisticated regulated-industry vocabulary [SRC07-E01, High reliability, High relevance].

  4. Healthcare uses "acquiescence problem" (closest to system-side framing), "automation bias," "alert fatigue," "deskilling," and "commission/omission errors." The acquiescence problem describes AI passively confirming rather than actively agreeing — a meaningful distinction from sycophancy [SRC09-E01, Medium-High reliability, High relevance].

  5. The EU AI Act defines "automation bias" in binding legislation (Article 14) — human-side framing only [SRC05-E01, High reliability, High relevance].

  6. NIST AI RMF lists "overreliance," "automation bias," "inappropriate anthropomorphizing," and "emotional entanglement" — all human-side terms [SRC06-E01, High reliability, High relevance].

  7. Financial services uses "model risk management" and "model validation" — generic terms that do not specifically address the sycophancy phenomenon [S05 search findings].

  8. Academic integrity borrows "sycophancy" directly from AI safety or uses downstream terms like "grade inflation" [S06 search findings].

  9. Enterprise software evaluation references "agreeableness bias" in LLM evaluators but has no established term for system-side behavior [S03 search findings].

  10. UX/product design uses "confirmation bias amplification" and informal terms ("people-pleasing," "yes-man") without formalized vocabulary [SRC02-E01].

  11. The vocabulary gap reflects a structural difference: traditional automation was deterministic (the system did not adapt to please), so regulated industries developed human-side vocabulary. AI systems that actively adjust output are qualitatively different, explaining why system-side vocabulary exists only in AI safety [SRC10-E01, Medium reliability, High relevance].

Evidence Base Summary

| Source | Description | Reliability | Relevance | Key Finding |
|---|---|---|---|---|
| SRC01 | Parasuraman & Manzey 2010 | High | High | Automation bias and complacency are distinct human-side constructs used across 4+ domains |
| SRC02 | NN/g Sycophancy | High | High | AI safety uses "sycophancy" consistently; no domain fragmentation within AI safety |
| SRC03 | TechPolicy.Press | Medium-High | High | Regressive/progressive taxonomy with measurement terms — AI safety-only refinement |
| SRC04 | CSET Automation Bias | High | High | "Automation bias" is the cross-domain umbrella term (Tesla, aviation, military) |
| SRC05 | EU AI Act Art. 14 | High | High | "Automation bias" codified in binding legislation |
| SRC06 | NIST AI RMF | High | High | 5 human-side risk terms; no system-side equivalent |
| SRC07 | DoD CaTE | High | High | Most sophisticated regulated-industry vocabulary: calibrated trust framework |
| SRC08 | Aviation HFACS | Medium-High | High | Aviation acknowledges its own vocabulary is insufficient for AI |
| SRC09 | Healthcare Acquiescence | Medium-High | High | "Acquiescence problem" closest to system-side framing |
| SRC10 | Vocabulary Gap paper | Medium | High | Theoretical argument for fundamental vocabulary insufficiency |

Collection Synthesis

| Dimension | Assessment |
|---|---|
| Evidence quality | Robust — 10 sources including legislation, government frameworks, foundational academic papers, and recent research |
| Source agreement | High — all sources independently confirm the existence of a vocabulary gap between AI safety and regulated industries |
| Source independence | High — sources span AI safety (Anthropic, DeepMind), government (NIST, EU, DoD, FAA), healthcare (CDSS literature), and academia (human factors, philosophy) |
| Outliers | SRC10 (vocabulary gap paper) is an outlier in arguing that ALL existing vocabulary is fundamentally insufficient, not just misaligned across domains |

Detail

The evidence converges on a clear finding: the vocabulary landscape is structured by a human-side/system-side divide that maps to a historical divide between traditional automation and AI. Domains that developed terminology for traditional automation (aviation, defense, healthcare) have human-side vocabulary (what goes wrong with the operator). AI safety, born in the era of adaptive AI, has system-side vocabulary (what goes wrong with the model). The two vocabularies coexist without bridging.

The most diagnostic finding is that aviation researchers explicitly call for updating their own taxonomies to accommodate AI-specific interactions (SRC08-E01). This is not an external observation — it is domain insiders recognizing the vocabulary gap from within.

Gaps

| Missing Evidence | Impact on Assessment |
|---|---|
| Aviation HFACS taxonomy update proposals | Would show whether aviation is actively bridging the gap or just acknowledging it |
| Financial services AI-specific regulatory guidance | FinServ vocabulary may be richer than found; sector-specific regulators (OCC, Fed) may use terms not in public searches |
| ISO/IEC 42001 full text | Standard may contain more specific terminology than publicly available summaries suggest |
| Non-English regulatory terminology | EU member states, Asian regulators may have different terms |

Researcher Bias Check

Declared biases: No researcher profile provided for this run.

Influence assessment: The query itself presupposes that a vocabulary gap exists (by asking to "map" terms across domains). This framing could bias toward finding differences. However, the evidence of cross-domain vocabulary (automation bias, complacency) was actively searched for and found, indicating the research was not confirmation-biased toward gap-finding.

Cross-References

| Entity | ID | File |
|---|---|---|
| Hypotheses | H1, H2, H3 | hypotheses/ |
| Sources | SRC01-SRC10 | sources/ |
| ACH Matrix | | ach-matrix.md |
| Self-Audit | | self-audit.md |