R0044/2026-03-29¶


Research	R0044 — Expanded Vocabulary Research
Mode	Query
Run date	2026-03-29
Queries	4
Prompt	Unified Research Standard v1.0-draft
Model	Claude Opus 4.6

This run investigates whether the expanded vocabulary from human-factors and AI safety research reveals regulatory requirements, documented harms, vocabulary bridges, and institutional knowledge about AI systems that reinforce user assumptions rather than challenging them. The four queries systematically explore the regulatory landscape, consequence evidence, vocabulary gap, and the most sophisticated institutional actor (DoD CaTE).

Queries¶

Q001 — Regulatory requirements constraining AI system behavior — H3: Indirect/nascent

Query: Using the expanded vocabulary, search for enterprise or government requirements that constrain AI system behavior — not just human operator behavior — to prevent the system from reinforcing user assumptions or providing agreeable-but-incorrect output. Focus on defense, healthcare, aviation, and financial services.

Answer: System-side requirements exist across defense, aviation, and general AI governance (NIST, EU AI Act), but they address system design (transparency, oversight enablement) rather than system output content (preventing agreeable-but-incorrect responses). No regulation explicitly prohibits AI sycophancy. Financial services regulation remains entirely human-focused.

Hypothesis	Status	Probability
H1: System-side requirements exist	Partially supported	Likely (55-80%)
H2: No system-side requirements	Eliminated	—
H3: Indirect/nascent requirements	Supported	Very likely (80-95%)

Sources: 6 | Searches: 5

Full analysis

Q002 — Consequences of agreeable AI in professional contexts — H3: Primarily automation bias

Query: Search for research on the consequences of AI systems that agree with users rather than challenge them, specifically in high-stakes professional contexts. Look for case studies, incident reports, or empirical studies where agreeable AI output led to measurable harm or near-misses.

Answer: Documented consequences exist across consumer and professional contexts, but with a critical asymmetry: system-side sycophancy harm is primarily documented in consumer/laboratory settings (OpenAI incident, Science study), while professional-context harm comes predominantly from automation bias (human over-reliance) rather than AI designed to agree. The distinction is narrowing as professional tools adopt RLHF optimization.

Hypothesis	Status	Probability
H1: Documented harm exists	Partially supported	Very likely (80-95%)
H2: No documented harm	Eliminated	—
H3: Primarily automation bias, not sycophancy	Supported	Likely (55-80%)

Sources: 6 | Searches: 3

Full analysis

Q003 — Bridging automation bias and sycophancy vocabularies — H3: Partial/emerging

Query: Has anyone in the regulated industries published research that explicitly connects the human-factors concept of "automation bias" to the AI safety concept of "sycophancy"? Is anyone bridging these two vocabularies?

Answer: Bridging is emerging but not yet systematic. Georgetown CSET's "AI Safety and Automation Bias" paper (November 2024) is the strongest candidate. However, the most comprehensive automation bias systematic review (2025, 35 studies) does not mention sycophancy, and the most sophisticated sycophancy analysis connects to confirmation bias, not automation bias. No publication was found that formally maps the two vocabularies as descriptions of the same underlying phenomenon.

Hypothesis	Status	Probability
H1: Explicit bridging exists	Partially supported	Unlikely (20-45%)
H2: No bridging exists	Eliminated	—
H3: Partial/emerging bridging	Supported	Likely (55-80%)

Sources: 4 | Searches: 3

Full analysis

Q004 — CaTE publications and system-side scope — H3: System properties, not output behavior

Query: What has the DoD CaTE center published about calibrating trust in AI systems, and does their work address the system-side behavior (AI adjusting output to match user expectations) or only the human-side behavior (users trusting AI too much)?

Answer: CaTE has published a Guidebook and companion guides focused on trust measurement and trustworthiness evaluation. CaTE addresses system design properties (trustworthiness dimensions) and human trust calibration, but does NOT address system-side output behavior. The concept of sycophancy is absent from CaTE's vocabulary. CaTE operates on a "measure and inform" paradigm, not a "constrain and prevent" paradigm.

Hypothesis	Status	Probability
H1: Both system-side and human-side	Eliminated	—
H2: Only human-side	Partially supported	—
H3: System properties, not output behavior	Supported	Almost certain (95-99%)

Sources: 3 | Searches: 3

Full analysis

Collection Analysis¶

Cross-Cutting Patterns¶

Pattern	Queries Affected	Significance
Design vs. output gap	Q001, Q004	All regulatory frameworks and institutional approaches address system design properties (transparency, explainability) but not system output behavior (preventing agreeableness/sycophancy). This is the central finding of the run.
Vocabulary siloing	Q001, Q003	Human factors vocabulary (automation bias, overtrust, calibrated trust) and AI safety vocabulary (sycophancy, RLHF alignment) remain largely separate, creating blind spots in both regulation and research.
Human-side paradigm dominance	Q001, Q002, Q004	The dominant regulatory and research paradigm frames the problem as human cognitive vulnerability to be managed, not as system behavior to be constrained. Even the most sophisticated institution (CaTE) operates within this paradigm.
Automation bias vs. sycophancy convergence	Q002	Professional-context harm is currently from automation bias (human over-reliance), but as AI tools adopt RLHF optimization, the distinction between automation bias and sycophancy will blur. The OpenAI incident previews this convergence.

Collection Statistics¶

Metric	Value
Queries investigated	4
H3 (nuanced) supported	4 (Q001, Q002, Q003, Q004)
H2 (negative) eliminated	4 (Q001, Q002, Q003, Q004)
H1 (affirmative) partially supported	3 (Q001, Q002, Q003)
H1 eliminated	1 (Q004)

Source Independence Assessment¶

Sources span a wide range of institutional types: government standards bodies (NIST, FAA, EU Parliament), regulatory agencies (FINRA), military research centers (CaTE/SEI, Sandia), academic publishers (Science, JAMA, ISQ, Springer), policy research centers (Georgetown CSET, ICRC), and technology companies (OpenAI). No single upstream source dominates the evidence base. The convergence on the "design vs. output" gap is independently confirmed across all institutional types.

Collection Gaps¶

Gap	Impact	Mitigation
PDF extraction failures (NIST AI 600-1, CaTE Guidebook, CSET paper, OMB M-26-04)	May miss specific system behavioral requirements or vocabulary bridging within these documents	Human reviewer should obtain and read full texts
Engineering-specific evidence absent	Q002 found no engineering case studies of agreeable AI harm	Search in engineering-specific databases (IEEE, ASME)
Financial services case studies absent	No documented financial losses from AI confirmation reinforcement	Search financial incident databases (SEC enforcement, FINRA arbitrations)
Classified military evidence	Most consequential military AI over-reliance incidents may be classified	Accept as structural limitation

Collection Self-Audit¶

Domain	Rating	Notes
Eligibility criteria	Low risk	Criteria were well-defined by the queries and applied consistently
Search comprehensiveness	Some concerns	PDF extraction failures reduced evidence depth for key sources. 14 searches across 4 queries covered the target space.
Evaluation consistency	Low risk	Same scoring framework applied across all 19 sources
Synthesis fairness	Low risk	All hypotheses received fair hearing; the consistent H3 outcome reflects the evidence pattern, not a methodological bias

Resources¶

Summary¶

Metric	Value
Queries investigated	4
Files produced	~80
Sources scored	19
Evidence extracts	19
Results dispositioned	~50 selected + ~50 rejected = ~100 total
Duration (wall clock)	26m 13s
Tool uses (total)	153

Tool Breakdown¶

Tool	Uses	Purpose
WebSearch	16	Search queries across all sectors and topics
WebFetch	12	Page content retrieval for key sources
Write	~80	File creation for all output files
Read	4	Methodology and output format document reading
Bash	2	Directory creation

Token Distribution¶

Category	Tokens
Input (context)	~450,000
Output (generation)	~120,000
Total	~570,000