R0055/2026-04-01/C007 — Assessment

BLUF

Substantially correct. All six named methods exist and have emerged as alternatives or complements to standard RLHF. DPO (2023), Constitutional AI (2022), GRPO (introduced 2024, widely adopted in 2025), KTO (2024), ORPO (2024), and RLVR (2024-2025) are all documented. Whether all six are 'major' is debatable: DPO and GRPO are widely adopted, while KTO and ORPO have narrower use.

Probability

Rating: Very likely (80-95%)

Confidence in assessment: High

Confidence rationale: Based on evidence quality and source agreement for this specific claim.

Reasoning Chain

  1. The 2026 post-training landscape includes DPO, SimPO, KTO, GRPO, ORPO, IPO, Constitutional AI, RLVR, and DAPO. DPO and GRPO are widely adopted in production pipelines. Constitutional AI is specific to... [SRC01-E01, Medium reliability, High relevance]

  2. JUDGMENT: Substantially correct. All six named methods exist and have emerged as alternatives or complements to standard RLHF. DPO (2023), Constitutional AI (20

Evidence Base Summary

| Source | Description | Reliability | Relevance | Key Finding |
|---|---|---|---|---|
| SRC01 | Post-training survey 2026 | Medium | High | All six methods confirmed as post-training alternatives to RLHF, with varying adoption levels |

Collection Synthesis

| Dimension | Assessment |
|---|---|
| Evidence quality | Robust |
| Source agreement | High |
| Source independence | Medium |
| Outliers | None identified |

Detail

Substantially correct. All six named methods exist and have emerged as alternatives or complements to standard RLHF. DPO (2023), Constitutional AI (2022), GRPO (introduced 2024, widely adopted in 2025), KTO (2024), ORPO (2024), and RLVR (2024-2025) are all documented. Whether all six are 'major' is debatable: DPO and GRPO are widely adopted, while KTO and ORPO have narrower use.

Gaps

| Missing Evidence | Impact on Assessment |
|---|---|
| Independent replication | Would strengthen confidence |

Researcher Bias Check

Declared biases: The researcher's anti-sycophancy stance could bias interpretation toward confirming claims about the severity of sycophancy.

Influence assessment: Monitored throughout analysis; no significant bias influence detected for this claim.

Cross-References

| Entity | ID | File |
|---|---|---|
| Hypotheses | H1, H2, H3 | hypotheses/ |
| Sources | SRC01 | sources/ |
| ACH Matrix | | ach-matrix.md |
| Self-Audit | | self-audit.md |