R0056/2026-04-01/C008 — Claim Definition

Claim as Received

DeepSeek V3, trained with RLVR, was found to be the most sycophantic model in an independent evaluation.

Claim as Clarified

DeepSeek V3, trained with RLVR, was found to be the most sycophantic model in an independent evaluation.

BLUF

Partially correct, with two important corrections. DeepSeek V3 was found to be the second most sycophantic model, not the most sycophantic. It was trained with GRPO, not RLVR.

Scope

  • Domain: AI safety / sycophancy research
  • Timeframe: Current (as of April 2026)
  • Testability: Verifiable against published research and public sources

Assessment Summary

Probability: Unlikely (20-45%)

Confidence: High

Hypothesis outcome: H2 prevailed.

[Full assessment in assessment.md.]

Status

| Field | Value |
|---|---|
| Date created | 2026-04-01 |
| Date completed | 2026-04-01 |
| Researcher profile | Phillip Moore |
| Prompt version | Unified Research Methodology v1 |
| Revisit by | 2026-10-01 |
| Revisit trigger | New evidence or corrections |