Research R0057 — RLHF Yes-Men Claims v3
Run 2026-04-01
Claim C025
Hypothesis H1

Statement

All three taxonomies omit sycophancy

Status

Current: Supported

Supporting Evidence

Evidence | Summary
SRC01-E01 | Sycophancy does not appear in the MIT AI Risk Repository, AIR 2024, or the Standardized Threat Taxonomy

Contradicting Evidence

Evidence Summary
No contradicting evidence found

Reasoning

Direct verification:
(1) MIT AI Risk Repository lists 7 domains with 24 subdomains; sycophancy is not a named category, though related risks appear under Human-Computer Interaction.
(2) AIR 2024 has 314 risk types in 4 domains; the word "sycophancy" does not appear.
(3) Standardized Threat Taxonomy has 9 domains and 53 sub-threats; the word "sycophancy" does not appear.
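The term-absence check described above can be sketched in a few lines. This is only an illustration of the kind of search involved, not the procedure actually used for this run. The function names `find_term` and `omits_term`, the stem `sycophan`, and all labels in `sample` are hypothetical placeholders; the real taxonomy contents are not reproduced here.

```python
import re


def find_term(labels, term="sycophan"):
    """Return the labels that contain the term stem, case-insensitively."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    return [label for label in labels if pattern.search(label)]


def omits_term(taxonomies, term="sycophan"):
    """Map each taxonomy name to True when none of its labels match the stem."""
    return {
        name: not find_term(labels, term)
        for name, labels in taxonomies.items()
    }


# Hypothetical placeholder labels, NOT the real taxonomy contents.
# taxonomy_c is a positive control showing what a match looks like.
sample = {
    "taxonomy_a": ["Placeholder domain 1", "Placeholder domain 2"],
    "taxonomy_b": ["Placeholder risk type"],
    "taxonomy_c": ["Sycophancy (placeholder positive control)"],
}

print(omits_term(sample))
```

Matching on a stem rather than the exact word also catches variants such as "sycophantic". A search like this only establishes that the term is absent as a label; whether a related risk is covered under a different name, as noted for Human-Computer Interaction, still requires reading the category descriptions.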

Relationship to Other Hypotheses

H1 represents full accuracy. H2 allows for partial correctness. H3 is eliminated by the evidence.