R0043/2026-04-01/Q003/SRC03/E01
9-domain, 53-threat AI taxonomy that excludes sycophancy
URL: https://arxiv.org/html/2511.21901
Extract
The taxonomy organizes AI threats into 9 domains with 53 sub-threats:
- Misuse (prompt injection, jailbreaking, deepfakes)
- Poisoning (backdooring, label flipping)
- Privacy (model inversion, membership inference)
- Adversarial (evasion attacks)
- Biases (representational harm, allocational harm)
- Unreliable Outputs (hallucinations, factual errors)
- Drift (concept drift, data distribution shifts)
- Supply Chain (compromised models)
- IP Threat (model theft)
The taxonomy bridges technical and business language by mapping threats to business loss categories (Confidentiality, Integrity, Availability, Legal, Reputation) and aligning with NIST AI RMF, ISO/IEC 42001, and the EU AI Act.
Sycophancy is NOT included. The closest category is "Unreliable Outputs" (domain 6), but this covers factual errors (hallucinations), not agreement-seeking behavior. A sycophantic model producing factually correct but agreement-biased output would not be captured.
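As a rough illustration of the coverage check described above, the sketch below encodes the nine domains with only the example sub-threats quoted in this extract (not the full 53-threat list) and searches them for a term. The names `TAXONOMY_EXAMPLES` and `find_domains` are hypothetical and do not come from the paper. A miss here only shows that the term is absent from the quoted examples.

```python
# Example sub-threats per domain, copied from the extract above.
# Assumption: this is only the subset quoted here, NOT the full 53-threat list.
TAXONOMY_EXAMPLES = {
    "Misuse": ["prompt injection", "jailbreaking", "deepfakes"],
    "Poisoning": ["backdooring", "label flipping"],
    "Privacy": ["model inversion", "membership inference"],
    "Adversarial": ["evasion attacks"],
    "Biases": ["representational harm", "allocational harm"],
    "Unreliable Outputs": ["hallucinations", "factual errors"],
    "Drift": ["concept drift", "data distribution shifts"],
    "Supply Chain": ["compromised models"],
    "IP Threat": ["model theft"],
}


def find_domains(term):
    """Return domains whose quoted example threats contain `term`.

    Matching is a case-insensitive substring test, so "hallucination"
    matches the listed threat "hallucinations".
    """
    needle = term.lower()
    return [
        domain
        for domain, threats in TAXONOMY_EXAMPLES.items()
        if any(needle in threat.lower() for threat in threats)
    ]


if __name__ == "__main__":
    print(find_domains("hallucination"))
    print(find_domains("sycophancy"))
```

Under these assumptions, a search for "hallucination" lands in "Unreliable Outputs", while "sycophancy" matches no domain, mirroring the gap noted above.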
JUDGMENT: The omission of sycophancy is significant. This is one of the most comprehensive cross-domain AI threat taxonomies available, yet it has a blind spot for behavioral model properties that prioritize agreement over accuracy. The gap persists even in efforts specifically designed to bridge domains.
Relevance to Hypotheses
| Hypothesis | Relationship | Rationale |
|---|---|---|
| H1 | Partially supports | Active taxonomy effort exists |
| H2 | Contradicts | Taxonomy efforts exist |
| H3 | Strongly supports | Even the most comprehensive taxonomy excludes sycophancy |
Context
The taxonomy operationally defines 53 threats yet omits sycophancy. This is strong evidence that sycophancy is not yet integrated into the cross-domain AI risk vocabulary, even among researchers working specifically on taxonomy bridging.