R0043/2026-03-28/Q001/SRC03/E01

Research R0043 — Sycophancy Vocabulary
Run 2026-03-28
Query Q001
Source SRC03
Evidence SRC03-E01
Type Analytical

AI safety sycophancy sub-taxonomy and measurement vocabulary

URL: https://www.techpolicy.press/what-research-says-about-ai-sycophancy/

Extract

The AI safety community has developed an increasingly refined sub-taxonomy:

Types of sycophancy:

- Regressive sycophancy: the AI conforms to an incorrect user belief, providing false or harmful information.
- Progressive sycophancy: the AI agrees with an accurate user statement; still problematic because it prioritizes validation over critical engagement.
- Social sycophancy: general affirmation of the user themselves, including their actions, perspectives, and self-image.
- Propositional sycophancy: agreeing with factually incorrect statements to avoid contradiction.

Measurement terms:

- Action endorsement rate: proportion of model responses explicitly affirming user actions (models affirm 50% more than humans).
- Attitude extremity: degree to which beliefs become more polarized after a sycophantic interaction.
- Attitude certainty: increased confidence in holding particular views.
- SycEval: an evaluation benchmark for measuring sycophancy across models (Fanous et al., 2025).
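The action endorsement rate is a simple proportion over judged responses. A minimal sketch of how it could be computed; the function names and the keyword-based affirmation check are illustrative assumptions, not drawn from the source (real studies use human or model judges):

```python
def action_endorsement_rate(responses, is_endorsement):
    """Fraction of model responses that explicitly affirm the user's action."""
    if not responses:
        return 0.0
    endorsed = sum(1 for r in responses if is_endorsement(r))
    return endorsed / len(responses)

# Toy affirmation check (hypothetical): flags responses containing
# common affirmation phrases. A real evaluation would use trained judges.
AFFIRM_MARKERS = ("you're right", "great idea", "good call", "absolutely")

def naive_is_endorsement(response):
    text = response.lower()
    return any(marker in text for marker in AFFIRM_MARKERS)

replies = [
    "Great idea, go for it!",
    "I'd reconsider; that plan has real downsides.",
    "You're right, that was the best choice.",
    "Absolutely, that makes sense.",
]
rate = action_endorsement_rate(replies, naive_is_endorsement)
print(f"Action endorsement rate: {rate:.2f}")  # 3 of 4 replies affirm -> 0.75
```

The same counting pattern applies to the other rate-style metrics; only the judging predicate changes.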

Relevance to Hypotheses

Hypothesis | Relationship | Strength
H1 | Supports | Shows that AI safety has developed a rich sub-taxonomy; the question is whether other domains have equivalent refinement.
H2 | N/A | Addresses only the AI safety domain.
H3 | Supports | This level of taxonomic refinement (4 sub-types, 3+ measurement terms, dedicated benchmarks) is unique to AI safety; no regulated industry has equivalent specificity.

Context

The regressive/progressive distinction is particularly important for the vocabulary mapping because it shows AI safety developing increasingly granular terminology while regulated industries still rely on broad umbrella terms like "automation bias."