C006 — Assessment


Research	R0057 — RLHF Yes-Men Claims v3
Run	2026-04-01
Claim	C006

BLUF

Confirmed. All six named alternatives are documented in the ML literature: DPO (2023), KTO (2024), GRPO (2024), Constitutional AI (2022), ORPO (2024), and RLVR (2024-2025). Each has multiple implementations and is adopted or cited by major labs.

Probability

Rating: Almost certain (95-99%)

Confidence in assessment: High

Confidence rationale: All six methods are well established in the ML literature, with multiple implementations and adoption by major labs.

Reasoning Chain

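All six alternatives are documented across multiple technical surveys. DPO optimizes the policy directly on preference pairs, eliminating the separate reward model. KTO learns from binary desirable/undesirable labels rather than pairwise comparisons. GRPO computes advantages relative to a group of sampled responses, removing the learned value function. Constitutional AI generates feedback from a written set of principles. ORPO folds preference optimization into the SFT objective in a single stage. RLVR takes its reward from programmatic verifiers rather than a learned reward model. [SRC01-E01, High reliability, High relevance]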
JUDGMENT: Confirmed. All six named alternatives are documented in the ML literature: DPO (2023), KTO (2024), GRPO (2024), Constitutional AI (2022), ORPO (2024), and RLVR (2024-2025). Each has multiple implementations and is adopted or cited by major labs.
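
To make three of the mechanisms in the reasoning chain concrete, the following is a minimal sketch, not taken from SRC01. The function names (`dpo_loss`, `group_relative_advantages`, `exact_match_reward`), the `beta=0.1` default, the use of population standard deviation, and the toy inputs are all assumptions chosen for illustration. The sketch uses only the Python standard library and operates on plain numbers rather than model outputs.

```python
import math
from statistics import mean, pstdev


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair, given sequence log-probabilities.

    No separate reward model appears: the implicit reward is the
    beta-scaled log-ratio between the policy and a frozen reference model.
    beta=0.1 is an illustrative default, not a recommendation.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) written as log(1 + exp(-logits))
    return math.log1p(math.exp(-logits))


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize rewards within one prompt's group.

    Assumption: population standard deviation; implementations may differ.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def exact_match_reward(completion, expected_answer):
    """RLVR-style reward: a programmatic check, not a learned reward model."""
    return 1.0 if completion.strip() == expected_answer.strip() else 0.0


# Hypothetical group of four sampled completions for a single prompt.
completions = ["42", "41", " 42 ", "7"]
rewards = [exact_match_reward(c, "42") for c in completions]
advantages = group_relative_advantages(rewards)

# With identical policy and reference log-probabilities the DPO loss is log(2).
neutral_loss = dpo_loss(-1.0, -1.0, -1.0, -1.0)
# Raising the chosen response's relative likelihood lowers the loss.
improved_loss = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

In this toy run the verifier supplies the rewards and the group statistics turn them into advantages, so neither step needs a learned reward model or value function. The DPO function shows the same idea from the preference-pair side.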

Evidence Base Summary

| Source | Description | Reliability | Relevance | Key Finding |
| --- | --- | --- | --- | --- |
| SRC01 | Multiple survey articles on RLHF alternatives | High | High | All six named alternatives (DPO, KTO, GRPO, Constitutional AI, ORPO, RLVR) are documented in the literature |

Collection Synthesis

| Dimension | Assessment |
| --- | --- |
| Evidence quality | High |
| Source agreement | High |
| Source independence | Medium |
| Outliers | None identified |

Detail

The evidence supports the assessment: all six methods are well established in the ML literature, with multiple implementations and adoption by major labs.

Gaps

| Missing Evidence | Impact on Assessment |
| --- | --- |
| Additional independent verification | Would strengthen confidence |

Researcher Bias Check

Declared biases: Anti-sycophancy bias could influence interpretation toward confirming sycophancy claims.

Influence assessment: Mitigated by reliance on peer-reviewed and primary sources.

Cross-References

| Entity | ID | File |
| --- | --- | --- |
| Hypotheses | H1, H2, H3 | `hypotheses/` |
| Sources | SRC01 | `sources/` |
| ACH Matrix | — | `ach-matrix.md` |
| Self-Audit | — | `self-audit.md` |