
R0044/2026-04-01/Q004

Query: The DoD CaTE (Calibrated AI Trust and Expectations) center was identified as having the most sophisticated regulated-industry vocabulary for this problem. What has CaTE published about calibrating trust in AI systems, and does their work address the system-side behavior (AI adjusting output to match user expectations) or only the human-side behavior (users trusting AI too much)?

BLUF: CaTE has published one primary guidebook, on test, evaluation, verification, and validation (TEVV) of lethal autonomous weapon systems (LAWS), published April 2025. Its scope covers both system trustworthiness evaluation and operator trust measurement, but it emphasizes the human side. CaTE does not address AI systems adjusting output to match user expectations, does not use sycophancy vocabulary, and does not constrain AI output behavior. Its "calibrated trust" concept is a human-side calibration: matching operator trust to system capability.

Probability: N/A (open-ended query) | Confidence: Medium

Correction: The query expands CaTE as "Calibrated AI Trust and Expectations," but CaTE actually stands for "Center for Calibrated Trust Measurement and Evaluation."


Summary

Entity            Description
Query Definition  Query text, scope, status
Assessment        Full analytical product with reasoning chain
ACH Matrix        Evidence × hypotheses diagnosticity analysis
Self-Audit        ROBIS-adapted 5-domain audit (process + source verification)

Hypotheses

ID  Hypothesis                          Status
H1  CaTE addresses AI output behavior   Eliminated
H2  Both sides, human emphasis          Supported
H3  Human-side only                     Eliminated

Searches

ID   Target                Results  Selected
S01  CaTE center overview  10       2
S02  CaTE guidebook        10       1

Sources

Source  Description          Reliability  Relevance
SRC01   CaTE TEVV Guidebook  High         High
SRC02   SEI Annual Review    High         Medium-High
SRC03   DefenseScoop launch  Medium-High  Medium-High

Key Insight

CaTE's "calibrated trust" concept answers the question "Does the operator's trust level match the system's actual capabilities?" It does not answer the question "Is the system actively manipulating the operator's trust?" This is a significant distinction: a system could pass all CaTE trustworthiness evaluations while simultaneously producing sycophantic output that inflates operator confidence beyond warranted levels.
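
To make the distinction concrete, the following minimal Python sketch models human-side trust calibration as the signed gap between an operator's reported trust and the system's measured reliability. All function names, the [0, 1] scales, and the tolerance band are illustrative assumptions, not CaTE's published methodology; the point is that a purely human-side metric registers a trust gap but cannot attribute its cause.

```python
# Illustrative sketch of human-side trust calibration (not CaTE's published
# method): trust is "calibrated" when the operator's reported trust tracks
# the system's measured reliability on the same task.

def calibration_gap(operator_trust: float, system_reliability: float) -> float:
    """Signed gap between reported trust and measured reliability, both in [0, 1].

    Positive -> overtrust (trust exceeds demonstrated capability).
    Negative -> undertrust (trust lags demonstrated capability).
    """
    return operator_trust - system_reliability

def classify(gap: float, tolerance: float = 0.10) -> str:
    """Bucket the gap using an arbitrary, illustrative tolerance band."""
    if gap > tolerance:
        return "overtrust"
    if gap < -tolerance:
        return "undertrust"
    return "calibrated"

# Example: an operator reports 0.9 trust in a system whose measured task
# reliability is 0.7, a +0.2 gap, i.e., overtrust. Note the sycophancy
# blind spot: output tuned to please the operator can inflate
# operator_trust without changing system_reliability, and this human-side
# metric cannot distinguish that cause from ordinary miscalibration.
if __name__ == "__main__":
    gap = calibration_gap(operator_trust=0.9, system_reliability=0.7)
    print(f"gap={gap:+.2f} -> {classify(gap)}")
```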

Revisit Triggers

  • CaTE publication of additional guidebooks or standards beyond the LAWS TEVV guidebook
  • CaTE adoption of AI safety vocabulary (sycophancy, alignment, reward hacking)
  • CaTE expansion into GenAI/LLM trust calibration (current focus is autonomous systems, not language models)
  • Publication of the DEVCOM Armaments Center trust measurement data referenced in secondary sources