C007 — Assessment


Research	R0055 — RLHF Yes-Men Claims
Run	2026-04-01
Claim	C007

BLUF

Substantially correct. All six named methods exist, are documented, and have emerged as alternatives or complements to standard RLHF: DPO (2023), Constitutional AI (2022), GRPO (introduced 2024, widely adopted from 2025), KTO (2024), ORPO (2024), and RLVR (2024-2025). Whether all six count as 'major' is debatable: DPO and GRPO are widely adopted, while KTO and ORPO see narrower use.

Probability

Rating: Very likely (80-95%)

Confidence in assessment: High

Confidence rationale: The one source in the evidence base (SRC01, medium reliability, high relevance) confirms all six methods, and no conflicting evidence was identified for this claim.

Reasoning Chain

The 2026 post-training landscape includes DPO, SimPO, KTO, GRPO, ORPO, IPO, Constitutional AI, RLVR, and DAPO. DPO and GRPO are widely adopted in production pipelines. Constitutional AI is specific to... [SRC01-E01, Medium reliability, High relevance]
JUDGMENT: Substantially correct. All six named methods exist and have emerged as alternatives or complements to standard RLHF. DPO (2023), Constitutional AI (20...

Evidence Base Summary

Source	Description	Reliability	Relevance	Key Finding
SRC01	Post-training survey 2026	Medium	High	All six methods confirmed as post-training alternatives to RLHF, with varying adoption levels

Collection Synthesis

Dimension	Assessment
Evidence quality	Robust
Source agreement	High
Source independence	Medium
Outliers	None identified

Detail

Substantially correct. All six named methods exist, are documented, and have emerged as alternatives or complements to standard RLHF: DPO (2023), Constitutional AI (2022), GRPO (introduced 2024, widely adopted from 2025), KTO (2024), ORPO (2024), and RLVR (2024-2025). Whether all six count as 'major' is debatable: DPO and GRPO are widely adopted, while KTO and ORPO see narrower use.

Gaps

Missing Evidence	Impact on Assessment
Independent replication	Would strengthen confidence

Researcher Bias Check

Declared biases: The researcher's anti-sycophancy stance could bias interpretation toward confirming claims about sycophancy's severity.

Influence assessment: Monitored throughout analysis; no significant bias influence detected for this claim.

Cross-References

Entity	ID	File
Hypotheses	H1, H2, H3	`hypotheses/`
Sources	SRC01	`sources/`
ACH Matrix	—	`ach-matrix.md`
Self-Audit	—	`self-audit.md`