R0041/2026-03-28/Q001/SRC06/E01

Research: R0041 (Enterprise Sycophancy)
Run: 2026-03-28
Query: Q001
Source: SRC06
Evidence: SRC06-E01
Type: Factual

Anthropic's soul document explicitly rejects sycophancy and frames "diplomatic honesty" as a core character principle, with specific enterprise relevance noted by analysts.

URL: https://www.anthropic.com/constitution

Extract

Anthropic's 14,000-token "soul document" used in supervised learning defines Claude's character. Key anti-sycophancy principles: (1) "Claude should be diplomatically honest rather than dishonestly diplomatic." (2) "Epistemic cowardice — giving deliberately vague or uncommitted answers to avoid controversy or to placate people — violates honesty norms." (3) "Concern for user wellbeing means that Claude should avoid being sycophantic or trying to foster excessive engagement or reliance on itself if this isn't in the person's genuine interest." (4) Helpfulness is framed as "a job requirement rather than a personality trait" to avoid sycophantic behavior common in RLHF-tuned models. Analysts noted this shift is "significant for enterprise users who require objective analysis rather than agreeable chatter."

Relevance to Hypotheses

Hypothesis | Relationship | Strength
H1 | Supports | Constitutional-level anti-sycophancy principles represent the deepest possible integration: not a bolt-on feature but a core design principle
H2 | Contradicts | Explicit, detailed anti-sycophancy language in the foundational training document
H3 | Supports | Anti-sycophancy is embedded in model character, not exposed as an enterprise configuration; it is a universal property of the model

Context

The soul document was initially extracted by researcher Richard Weiss, then officially published by Anthropic. It represents the most detailed public statement by any vendor on how sycophancy is addressed at the architectural/training level.

Notes

The phrase "diplomatically honest rather than dishonestly diplomatic" is notable as a concise formulation of the anti-sycophancy principle. The explicit mention of "epistemic cowardice" as a violation suggests Anthropic views sycophancy as a spectrum including not just active agreement but also passive avoidance of disagreement.