Skip to content

R0057/2026-04-01/C008

Claim: DeepSeek V3, trained with GRPO, was found to be among the most sycophantic models in an independent evaluation.

BLUF: Partially confirmed. DeepSeek was included in the Cheng et al. Science study evaluating 11 models for sycophancy. All models showed sycophantic behavior. However, the specific per-model ranking showing DeepSeek as among the most sycophantic could not be independently verified from available sources.

Probability: Likely (55-80%) | Confidence: Medium


Summary

Entity Description
Claim Definition Claim text, scope, status
Assessment Full analytical product with reasoning chain
ACH Matrix Evidence x hypotheses diagnosticity analysis
Self-Audit ROBIS-adapted 5-domain audit

Hypotheses

ID Hypothesis Status
H1 DeepSeek V3 was specifically identified as among the most sycophantic Plausible
H2 DeepSeek showed sycophancy but ranking is unclear Supported
H3 DeepSeek V3 was not notably sycophantic Not supported

Searches

ID Target Results Selected
S01 DeepSeek V3 GRPO sycophantic independent evaluation 10 1

Sources

Source Description Reliability Relevance
SRC01 Cheng et al. Science study (included DeepSeek in evaluation) High High

Revisit Triggers

  • If per-model sycophancy rankings from the Science study are published