R0055/2026-04-01/C007 - Assessment
BLUF
Substantially correct. All six named methods exist and have emerged as alternatives or complements to standard RLHF. DPO (2023), Constitutional AI (2022), GRPO (introduced 2024, widely adopted from 2025), KTO (2024), ORPO (2024), and RLVR (2024-2025) are all documented. Whether all six are 'major' is debatable: DPO and GRPO are widely adopted, while KTO and ORPO have narrower use.
Probability
Rating: Very likely (80-95%)
Confidence in assessment: High
Confidence rationale: Based on the evidence quality (Robust) and source agreement (High) recorded in the Collection Synthesis for this claim.
Reasoning Chain
- The 2026 post-training landscape includes DPO, SimPO, KTO, GRPO, ORPO, IPO, Constitutional AI, RLVR, and DAPO. DPO and GRPO are widely adopted in production pipelines. Constitutional AI is specific to... [SRC01-E01, Medium reliability, High relevance]
- JUDGMENT: Substantially correct. All six named methods exist and have emerged as alternatives or complements to standard RLHF. DPO (2023), Constitutional AI (20...
Evidence Base Summary
| Source | Description | Reliability | Relevance | Key Finding |
|---|---|---|---|---|
| SRC01 | Post-training survey 2026 | Medium | High | All six methods confirmed as post-training alternatives to RLHF, with varying adoption levels |
Collection Synthesis
| Dimension | Assessment |
|---|---|
| Evidence quality | Robust |
| Source agreement | High |
| Source independence | Medium |
| Outliers | None identified |
Detail
Substantially correct. All six named methods exist and have emerged as alternatives or complements to standard RLHF. DPO (2023), Constitutional AI (2022), GRPO (introduced 2024, widely adopted from 2025), KTO (2024), ORPO (2024), and RLVR (2024-2025) are all documented. Whether all six are 'major' is debatable: DPO and GRPO are widely adopted, while KTO and ORPO have narrower use.
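To make concrete what "alternative to standard RLHF" means for the two most widely adopted methods named above, the following is a minimal sketch, not taken from SRC01. It shows the per-pair DPO loss, which trains directly on preference pairs without a separate reward model, and the group-relative advantage used by GRPO, which replaces a learned value function with normalization over a group of sampled responses. The function names, the `beta=0.1` default, the use of population standard deviation, and the `eps` term are illustrative assumptions.

```python
import math
import statistics


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-pair DPO loss from summed sequence log-probabilities.

    beta=0.1 is an assumed illustrative default, not a recommended setting.
    """
    # Implicit reward margin: how much more the policy favors the chosen
    # response over the rejected one, relative to the frozen reference model.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    z = beta * margin
    # -log(sigmoid(z)) == log(1 + exp(-z)), computed in a numerically stable form.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: each reward normalized by its group's mean and std.

    Population std and the eps guard are assumptions made for this sketch.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical log-probabilities for one preference pair.
    print(dpo_loss(-12.0, -15.0, -13.0, -14.0))
    # Hypothetical rewards for four sampled responses to the same prompt.
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

When the policy and the reference model agree exactly, the margin is zero and the DPO loss equals log 2; the loss falls as the policy shifts probability toward the chosen response. In the GRPO sketch, responses that score above their group's mean receive positive advantages and those below receive negative ones.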
Gaps
| Missing Evidence | Impact on Assessment |
|---|---|
| Independent replication | Would strengthen confidence |
Researcher Bias Check
Declared biases: The researcher's anti-sycophancy stance could influence interpretation in the direction of confirming claims about sycophancy's severity.
Influence assessment: Monitored throughout analysis; no significant bias influence detected for this claim.
Cross-References
| Entity | ID | File |
|---|---|---|
| Hypotheses | H1, H2, H3 | hypotheses/ |
| Sources | SRC01 | sources/ |
| ACH Matrix | — | ach-matrix.md |
| Self-Audit | — | self-audit.md |