R0043/2026-04-01
This run investigates the vocabulary landscape around AI sycophancy across eight industries, the regulatory coverage of the phenomenon under domain-specific names, and whether the vocabulary gap has been recognized as a problem requiring solutions.
Queries
Q001 — Vocabulary Mapping — Medium Confidence
Query: What terms do different industries and disciplines use to describe AI behavior that prioritizes user agreement, comfort, or satisfaction over accuracy, correctness, or safety?
Answer: The phenomenon maps to a rich but fragmented vocabulary. No two fields use the same primary term, and the terms describe different facets of the same system: sycophancy (model behavior), automation bias (human cognition), overreliance (human behavior), and domain-specific terms for manifestations in each context.
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Comprehensive map constructable | Supported | — |
| H2: Some domains lack terminology | Partially Supported | — |
| H3: Terms describe different phenomena | Partially Supported | — |
Confidence: Medium · Sources: 9 · Searches: 5
Q002 — Enterprise Requirements — Medium Confidence
Query: Search for enterprise requirements, procurement specifications, regulatory guidance, or deployment standards that address the sycophancy phenomenon under its domain-specific names.
Answer: No regulatory framework directly addresses "sycophancy" by name. Four indirect mechanisms provide partial coverage: EU AI Act (automation bias), NIST AI 600-1 (confabulation), SR 11-7 (effective challenge), FDA (human factors). The gap is at the intersection of model behavior and regulatory language.
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Direct requirements exist | Eliminated | — |
| H2: No requirements at all | Eliminated | — |
| H3: Indirect coverage only | Supported | — |
Confidence: Medium · Sources: 6 · Searches: 2
Q003 — Vocabulary Gap as Problem — Medium-High Confidence
Query: Has the vocabulary gap itself been identified as a problem in the AI safety or AI governance literature?
Answer: The broader AI terminology gap is well-recognized, with multiple organizations proposing solutions. However, the specific sycophancy vocabulary gap has not been prioritized in any identified taxonomy, glossary, or framework — even the most comprehensive efforts (53-threat taxonomy, 100+ term glossary) exclude sycophancy.
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Gap recognized, active efforts | Supported (broad level) | — |
| H2: Gap not recognized | Eliminated | — |
| H3: Broader gap recognized, sycophancy excluded | Supported (best fit) | — |
Confidence: Medium-High · Sources: 4 · Searches: 1
Collection Analysis
Cross-Cutting Patterns
| Pattern | Queries Affected | Significance |
|---|---|---|
| Sycophancy falls between taxonomy categories | Q001, Q003 | Sycophancy is not a governance concept, a security threat, or a process term; it is a behavioral property of the model that current frameworks miss |
| Human-side vs model-side framing divide | Q001, Q002 | Regulations address human responses (automation bias) but not the model behavior that triggers them |
| Indirect regulatory coverage is the norm | Q002, Q003 | Four distinct regulatory mechanisms provide partial coverage, but none names sycophancy directly |
| Structural research isolation impedes diffusion | Q001, Q003 | 83% homophily within the AI safety and AI ethics communities limits terminology diffusion between them |
Collection Statistics
| Metric | Value |
|---|---|
| Queries investigated | 3 |
| Answered with high confidence | 0 |
| Answered with medium-high confidence | 1 (Q003) |
| Answered with medium confidence | 2 (Q001, Q002) |
Source Independence Assessment
Sources span 5 jurisdictions (US, UK, Australia, EU, Germany), 4 institutional types (academic, government, professional association, journalism), and 8+ domains. No two sources share a common upstream origin. The sources converge on vocabulary fragmentation from independent observations across multiple communities, which strengthens the finding.
The Roytburg & Miller network analysis provides structural evidence for why the sources are independent: the 83% homophily finding means the communities that coined these terms rarely interact, so their vocabularies developed independently.
Collection Gaps
| Gap | Impact | Mitigation |
|---|---|---|
| NIST AI 600-1 full text inaccessible (PDF) | Could not verify sycophancy-related content | Used secondary sources and NIST website summaries |
| DOD-specific AI deployment standards not fully investigated | May contain sycophancy-relevant human-machine teaming requirements | Covered through CSET brief and DOD CaTE references |
| Non-English terminology not searched | EU, Asian, and other language communities may use different terms | English-language search covers the primary research literature |
| Actual procurement RFPs not accessible | Cannot confirm whether organizations require sycophancy testing in practice | Used regulatory frameworks as proxy |
| No sources defending the status quo found | Cannot fully test whether vocabulary fragmentation is viewed as appropriate by some communities | This absence may reflect genuine consensus or search bias |
Collection Self-Audit
| Domain | Rating | Notes |
|---|---|---|
| Eligibility criteria | Low risk | Clear criteria: domain-specific terminology, regulatory provisions, taxonomy efforts |
| Search comprehensiveness | Some concerns | 8 search queries across 16 WebSearch calls; some domains received more attention than others |
| Evaluation consistency | Low risk | All 19 sources across 3 queries scored using identical framework |
| Synthesis fairness | Low risk | Counter-evidence included (SRC06-braun acquiescence reversal); nuanced hypotheses (H3) in all queries |
Resources
Summary
| Metric | Value |
|---|---|
| Queries investigated | 3 |
| Files produced | 176 |
| Sources scored | 19 |
| Evidence extracts | 19 |
| Results dispositioned | 145 total: Q001 13 + 62 = 75; Q002 6 + 44 = 50; Q003 4 + 16 = 20 |
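The collection-level totals in the summary table above can be cross-checked against the per-query figures reported earlier (sources, searches, and dispositioned results). The following is a minimal sketch of that check; the dictionary layout and the function name `collection_totals` are illustrative assumptions, not part of the run's tooling, and the two addends per query are kept as an unlabeled pair because the report does not name them.

```python
# Per-query counts copied from the report. The data layout is an assumption
# made for this sketch; the report does not label the two "dispositioned"
# addends, so they are stored as an anonymous pair.
PER_QUERY = {
    "Q001": {"sources": 9, "searches": 5, "dispositioned": (13, 62)},
    "Q002": {"sources": 6, "searches": 2, "dispositioned": (6, 44)},
    "Q003": {"sources": 4, "searches": 1, "dispositioned": (4, 16)},
}


def collection_totals(per_query):
    """Sum per-query counts into collection-level totals (hypothetical helper)."""
    totals = {"sources": 0, "searches": 0, "dispositioned": 0}
    for counts in per_query.values():
        totals["sources"] += counts["sources"]
        totals["searches"] += counts["searches"]
        totals["dispositioned"] += sum(counts["dispositioned"])
    return totals


totals = collection_totals(PER_QUERY)
print(totals)  # {'sources': 19, 'searches': 8, 'dispositioned': 145}
```

The result agrees with the figures stated elsewhere in the report: 19 sources scored, 8 search queries, and 145 results dispositioned.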
Tool Breakdown
| Tool | Uses | Purpose |
|---|---|---|
| WebSearch | 16 | Search queries across domains |
| WebFetch | 13 | Page content retrieval (3 failed — PDF/403/redirect) |
| Write | 176 | File creation |
| Read | 2 | Methodology and output format reading |
| Edit | 0 | No file modifications |
| Bash | 6 | Directory creation, file generation, validation |
Token Distribution
| Category | Tokens |
|---|---|
| Input (context) | ~250,000 |
| Output (generation) | ~120,000 |
| Total | ~370,000 |