R0057/2026-04-01/C006
Claim: At least six major alternatives to RLHF have emerged since 2022 (DPO, KTO, GRPO, Constitutional AI, ORPO, RLVR).
BLUF: Confirmed. All six named alternatives are well-documented: DPO (2023), KTO (2024), GRPO (2024), Constitutional AI (2022), ORPO (2024), and RLVR (2024-2025). All are widely adopted or cited.
Probability: Almost certain (95-99%) | Confidence: High
Summary
Hypotheses
| ID | Hypothesis | Status |
|---|---|---|
| H1 | All six alternatives exist and qualify as major | Supported |
| H2 | Most exist but some may not qualify as major alternatives | Not supported |
| H3 | Fewer than six alternatives exist | Eliminated |
Searches
| ID | Target | Results | Selected |
|---|---|---|---|
| S01 | DPO KTO GRPO Constitutional AI ORPO RLVR alternatives RLHF | 10 | 1 |
Sources
| Source | Description | Reliability | Relevance |
|---|---|---|---|
| SRC01 | Multiple survey articles on RLHF alternatives | High | High |
Revisit Triggers
- If any of the six named alternatives is shown not to qualify as a major alternative