# R0043/2026-04-01/Q001 — Assessment
## BLUF
The phenomenon AI researchers call "sycophancy" maps to a rich but fragmented vocabulary across eight domains. No two fields use the same primary term, and the terms describe different facets of the same system: sycophancy names the model behavior, automation bias names the human cognitive response, overreliance names the behavioral outcome, and domain-specific terms (acquiescence, deference, miscalibrated trust, model risk) name the phenomenon as it manifests in each context. A comprehensive vocabulary map is constructible, but it must be understood as a map of related concepts with different causal framings, not a simple synonym table.
## Probability
Rating: N/A (open-ended query)
Confidence in assessment: Medium
Confidence rationale: Strong evidence coverage across AI safety, defense/military, healthcare, and aviation domains. Moderate coverage of financial services and academic integrity. Thinner coverage of enterprise software evaluation and UX/product design, which appear to lack domain-native vocabulary. The confidence is limited by the inability to access several key PDFs (NIST AI 600-1, CSET automation bias brief) in full text.
## Reasoning Chain
- AI safety research has established "sycophancy" as the primary term, with formal subtypes (regressive and progressive sycophancy) and metrics such as "action endorsement rate" and "attitude extremity." [SRC01-E01, High reliability, High relevance] [SRC02-E01, Medium-High reliability, High relevance]
- Defense/military AI uses "automation bias," "automation complacency," and "overtrust," terms that predate LLM-era sycophancy by two decades. These terms focus on the human side of the interaction (how operators respond to automated outputs) rather than the model side. The DOD has institutionalized this vocabulary through the Center for Calibrated Trust Measurement and Evaluation (CaTE). [SRC03-E01, High reliability, High relevance]
- Healthcare AI uses "acquiescence" and "deference" alongside the borrowed "automation bias." The distinctly healthcare framing is the "acquiescence problem": LLMs passively confirming clinician hypotheses rather than challenging them, reinforcing diagnostic errors such as anchoring and premature closure. [SRC04-E01, High reliability, High relevance]
- Aviation has the most mature vocabulary, built on decades of human factors research, including a four-category taxonomy (use, misuse, disuse, abuse) and NASA ASRS definitions. Critically, aviation researchers themselves acknowledge their existing terms "are probably not nuanced enough" for AI-era interactions. [SRC08-E01, Medium-High reliability, High relevance]
- Financial services frames this through "model risk" and the SR 11-7 "effective challenge" and "challenge function" requirements: the obligation to independently validate model outputs and challenge their recommendations. This addresses the same concern (agreeable-but-wrong output) through a governance mechanism rather than a behavioral description. [JUDGMENT based on SR 11-7 search results]
- A multi-institution research team (Oxford, Cambridge, Princeton, Stanford, OpenAI) formally distinguished overreliance (behavior), automation bias (cognition), sycophancy (model property), and trust (attitude) as related but categorically distinct concepts. [SRC05-E01, High reliability, High relevance]
- Academic integrity addresses sycophancy primarily through borrowed terms such as "confirmation bias," and focuses on detection rather than prevention. No domain-specific term for AI agreement-seeking behavior was identified. [JUDGMENT based on S01/S04 search results]
- Enterprise software evaluation and UX/product design lack domain-native terms, using "hallucination rate," "satisfaction-accuracy tradeoff," and "dark patterns" as the closest approximations. IEEE Spectrum's adoption of "sycophancy" suggests the engineering community is borrowing from AI safety rather than coining its own term. [SRC07-E01, High reliability, Medium-High relevance]
- JUDGMENT: The vocabulary fragmentation is not merely a naming-convention difference; it reflects genuinely different causal framings. AI safety focuses on what the model does; human factors focuses on how humans respond; regulated industries focus on governance mechanisms to prevent harm. A vocabulary map must acknowledge these distinctions.
## Evidence Base Summary
| Source | Description | Reliability | Relevance | Key Finding |
|---|---|---|---|---|
| SRC01 | NN/g sycophancy definition | High | High | Three behavioral categories of sycophancy |
| SRC02 | TechPolicy.Press taxonomy | Medium-High | High | Regressive/progressive sycophancy distinction |
| SRC03 | CSET automation bias brief | High | High | Defense uses automation bias/complacency/overtrust |
| SRC04 | PMC cross-domain mapping | High | High | Psychology: delusion confirmation, AI-induced psychosis |
| SRC05 | Oxford overreliance paper | High | High | Formal taxonomy: overreliance vs automation bias vs sycophancy |
| SRC06 | Braun acquiescence study | Medium-High | Medium | LLMs show opposite of acquiescence bias |
| SRC07 | IEEE Spectrum article | High | Medium-High | Engineering adopts "sycophancy" + DeepMind's "sandbagging" |
| SRC08 | Aviation HF requirements | Medium-High | High | Aviation terms acknowledged as inadequate for AI |
| SRC09 | Yes-Machine Problem article | Medium | High | Explicit vocabulary gap identification |
## Collection Synthesis
| Dimension | Assessment |
|---|---|
| Evidence quality | Medium — mix of peer-reviewed, trade, and journalism sources |
| Source agreement | High — all sources agree the vocabulary is fragmented |
| Source independence | High — sources span academic, government, trade, and journalism |
| Outliers | SRC06 (acquiescence bias reversal) is the only source contradicting the narrative of AI agreement-seeking |
## Detail
The evidence consistently supports the finding that vocabulary is fragmented across domains. The most significant analytical contribution comes from SRC05 (Oxford overreliance paper), which provides the formal taxonomy showing these terms are not synonyms but describe different aspects of the same system. The most practically relevant contribution comes from SRC09 (despite lower reliability), which is one of the few sources to explicitly name the vocabulary gap as a problem.
## Cross-Domain Vocabulary Map
| Domain | Primary Terms | Focus | Maturity |
|---|---|---|---|
| AI Safety Research | Sycophancy, sycophantic behavior, reward hacking, regressive/progressive sycophancy, sandbagging | Model behavior | High (2020s) |
| Defense/Military | Automation bias, automation complacency, overtrust, miscalibrated trust | Human cognition | Very High (2000s) |
| Healthcare | Acquiescence, deference, automation bias (borrowed), acquiescence problem | Clinical decision quality | Medium (2020s) |
| Aviation/FAA | Automation complacency, overtrust/undertrust, use/misuse/disuse/abuse | Human factors | Very High (1990s) |
| Financial Services | Model risk, effective challenge, challenge function, independent validation | Governance mechanism | High (2011) |
| Academic Integrity | Confirmation bias amplification (borrowed) | Detection | Low |
| Enterprise Software | Hallucination rate (borrowed), accuracy metrics | Measurement | Low |
| UX/Product Design | Satisfaction-accuracy tradeoff, dark patterns (borrowed) | Design tension | Low |
| Cross-cutting (emerging) | Overreliance, appropriate reliance, calibrated trust | Human behavior | Medium (2024-2025) |
## Gaps
| Missing Evidence | Impact on Assessment |
|---|---|
| Full text of NIST AI 600-1 (PDF inaccessible) | Could not verify whether NIST uses sycophancy or related terms in the generative AI risk profile |
| Full text of CSET automation bias brief (PDF inaccessible) | Relied on search result summaries rather than primary text |
| Legal/regulatory domain vocabulary | Law was mentioned peripherally but not systematically investigated |
| Survey of actual procurement language | Enterprise requirements are inferred from framework descriptions, not direct procurement documents |
| Non-English terminology | All searches were English-language; other language communities may have different terms |
## Researcher Bias Check
Declared biases: The researcher's anti-sycophancy stance could lead to framing the vocabulary gap as more problematic than it is. The finding that vocabulary is fragmented could be interpreted as either (a) a dangerous gap that needs fixing or (b) a natural consequence of different disciplines studying different aspects of the same system.
Influence assessment: The vocabulary map itself is factual and not influenced by the bias. The interpretation — whether fragmentation is a problem — is addressed in Q003 rather than Q001, which limits bias influence here.
## Cross-References
| Entity | ID | File |
|---|---|---|
| Hypotheses | H1, H2, H3 | hypotheses/ |
| Sources | SRC01, SRC02, SRC03, SRC04, SRC05, SRC06, SRC07, SRC08, SRC09 | sources/ |
| ACH Matrix | — | ach-matrix.md |
| Self-Audit | — | self-audit.md |