R0044/2026-03-29/Q004 — Assessment

BLUF

CaTE has published a Guidebook and companion guides focused on measuring and evaluating trustworthiness in AI systems, with specific application to LAWS (Lethal Autonomous Weapon Systems). CaTE's work addresses both system design properties (trustworthiness dimensions) and human trust calibration (measurement methods). However, it does NOT address the system-side behavior the query asks about — AI systems adjusting their output to match or counteract user expectations. CaTE's paradigm is "measure system trustworthiness, then help humans calibrate trust accordingly," not "constrain system behavior to prevent trust miscalibration." The concept of sycophancy does not appear in CaTE's vocabulary.

Probability

Rating: Almost certain (95-99%) that CaTE does not address system-side output behavior (sycophancy)

Confidence in assessment: Medium

Confidence rationale: The CaTE Guidebook could not be fully extracted. The assessment is based on organizational descriptions, press materials, and secondary sources. If the full guidebook contains sections on AI system output adjustment, this assessment would need revision. However, the consistent human-focused framing across all accessible sources makes this unlikely.

Reasoning Chain

  1. CaTE Guidebook provides recommendations on "trust, trustworthiness, calibrated trust, and ethics" — system properties and human behavior, not system output behavior [SRC01-E01, High reliability, High relevance]
  2. Sablon's statement: "The human has to understand the capabilities and limitations of the AI system to use it responsibly" — explicitly human-focused [SRC02-E01, Medium-High reliability, High relevance]
  3. CaTE defines calibrated trust as "the human places an appropriate amount of trust in machine intelligence based on its strengths and weaknesses" — the calibration is of human trust, not system behavior [SRC02-E01]
  4. Sandia's TCMM, the closest related framework, also focuses on "communicating trustworthiness" not constraining behavior [SRC03-E01, Medium-High reliability, Medium relevance]
  5. JUDGMENT: The entire calibrated trust research community — CaTE, Sandia TCMM, and related work — operates on a paradigm where the system is a static object to be evaluated and the human is the dynamic agent to be calibrated. The question "should the AI system itself detect and counteract user overtrust?" has not been asked in this community.

Evidence Base Summary

| Source | Description | Reliability | Relevance | Key Finding |
|---|---|---|---|---|
| SRC01 | CaTE Guidebook | High | High | Covers trust/trustworthiness measurement, not system output behavior |
| SRC02 | CMU/SEI CaTE overview | Medium-High | High | Human-focused framing throughout |
| SRC03 | Sandia TCMM | Medium-High | Medium | Confirms field-wide human-focused paradigm |

Collection Synthesis

| Dimension | Assessment |
|---|---|
| Evidence quality | Medium — primary source (Guidebook) inaccessible; secondary sources consistent |
| Source agreement | High — all sources agree CaTE does not address system output behavior |
| Source independence | Medium — CaTE and TCMM are from different institutions but the same funding ecosystem (DoD) |
| Outliers | None |

Detail

CaTE's approach is consistent with the broader finding from Q001: regulated industries address automation bias through system design properties and human oversight, not through constraining system output behavior. CaTE represents the most sophisticated version of this approach — it has a dedicated center, a formal vocabulary, and published frameworks — but it still operates within the "measure and inform" paradigm rather than the "constrain and prevent" paradigm.

Gaps

| Missing Evidence | Impact on Assessment |
|---|---|
| Full CaTE Guidebook text | Could reveal sections on system behavior not visible in descriptions |
| CaTE companion guide full texts | Human Machine Teaming Design Framework may address system behavior |
| CaTE publications since 2024 | Center may have expanded scope |

Researcher Bias Check

Declared biases: No researcher profile provided.

Influence assessment: The query embeds an assumption that CaTE has "the most sophisticated regulated-industry vocabulary." This was tested and partially confirmed — CaTE does have a sophisticated vocabulary, but it is a human-focused vocabulary, not a system-behavior vocabulary.

Cross-References

| Entity | ID | File |
|---|---|---|
| Hypotheses | H1, H2, H3 | hypotheses/ |
| Sources | SRC01-SRC03 | sources/ |
| ACH Matrix | | ach-matrix.md |
| Self-Audit | | self-audit.md |