R0040/2026-04-01/Q001 -- Self-Audit
ROBIS 4-Domain Audit
Domain 1: Eligibility Criteria
Rating: Low risk
| Criterion | Assessment |
|---|---|
| Evidence criteria defined before searching | Yes -- sought published methods, benchmarks, and production deployments |
| Criteria remained consistent | Yes -- no shifting of inclusion criteria after results |
| Criteria appropriate for the query | Yes -- open-ended survey question appropriately used broad criteria |
Notes: Eligibility was straightforward for this query -- any published or deployed alignment method qualifies.
Domain 2: Search Comprehensiveness
Rating: Low risk
| Criterion | Assessment |
|---|---|
| Multiple search strategies used | Yes -- 4 searches targeting different method families and a general survey |
| Searches designed for coverage | Yes -- vocabulary exploration identified DPO, RLAIF, GRPO, KTO, IPO, ORPO, RLVR, SPIN |
| All results dispositioned | Yes -- 50 results returned across 4 searches, all dispositioned (14 selected, 36 rejected) |
| Source diversity achieved | Yes -- peer-reviewed papers, lab publications, technical analyses, industry overviews |
Notes: All 50 results across the 4 searches were dispositioned, and coverage includes all major method families. Minor gap: ORPO coverage is thin because fewer searches targeted it specifically.
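The disposition counts reported above can be sanity-checked with simple arithmetic. The following is a minimal sketch, not part of the audit tooling: the `DispositionTally` class and its field names are hypothetical, and only the three counts (50 returned, 14 selected, 36 rejected) come from this audit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DispositionTally:
    """Hypothetical container for the search disposition counts."""
    returned: int
    selected: int
    rejected: int

    def undispositioned(self) -> int:
        # Results that were returned but neither selected nor rejected.
        return self.returned - (self.selected + self.rejected)


# Counts taken from the Domain 2 table.
tally = DispositionTally(returned=50, selected=14, rejected=36)

# "All results dispositioned" holds only if nothing is left over.
assert tally.undispositioned() == 0

selection_rate = tally.selected / tally.returned
print(f"Selection rate: {selection_rate:.0%}")
```

A nonzero return from `undispositioned()` would indicate results that were never reviewed, which would weaken the "all results dispositioned" criterion.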
Domain 3: Evaluation Consistency
Rating: Low risk
| Criterion | Assessment |
|---|---|
| All sources scored using same framework | Yes -- all 7 sources have GRADE+Cochrane scorecards |
| Evidence typed consistently | Yes -- Factual, Reported, and Analytical types applied consistently |
| Thematic clustering applied | Yes -- 5 thematic clusters identified from evidence |
Notes: This open-ended query used thematic clustering rather than an ACH matrix, consistent with the methodology for non-enumerable answer spaces.
Domain 4: Synthesis Fairness
Rating: Low risk
| Criterion | Assessment |
|---|---|
| All method families given fair coverage | Yes -- no alternative dismissed without evidence |
| Contradictory evidence surfaced | Yes -- Apple's DPO limitation finding, RLVR search-compression debate |
| Confidence calibrated to evidence | Yes -- High confidence reflects strong convergence of independent sources |
| Gaps acknowledged | Yes -- proprietary training details, head-to-head benchmarks, long-term stability |
Notes: The assessment avoids declaring a single winner, which is appropriate because the evidence shows that method selection depends on task characteristics.
Domain 5: Source-Back Verification
Rating: Low risk
| Source | Claim in Assessment | Source Actually Says | Match? |
|---|---|---|---|
| SRC02 | DPO achieves 40-75% lower compute | Search results report "40-75% lower compute cost compared to RLHF" | Yes |
| SRC03 | GRPO improved GSM8K from 82.9% to 88.2% | Search results report these exact figures | Yes |
| SRC04 | KTO matches or exceeds DPO at 1B-30B | Paper abstract states "matches or exceeds...at scales from 1B to 30B" | Yes |
| SRC05 | RLVR gains mostly from search compression | Article states "Majority: Search compression" | Yes |
| SRC06 | RLAIF more harmless while maintaining helpfulness | Paper states "significantly more harmless...helpfulness remains on par" | Yes |
Discrepancies found: 0
Corrections applied: None needed
Unresolved flags: None
Notes: All five checked claims match their sources; for SRC02 and SRC03 the comparison was made against search-result text. No interpretation drift detected.
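The discrepancy count above follows directly from the Match? column. As an illustration only, the tally could be reproduced as below; the `VerificationRow` type and variable names are hypothetical, and the rows simply transcribe the table.

```python
from typing import NamedTuple


class VerificationRow(NamedTuple):
    """Hypothetical record for one source-back verification row."""
    source: str
    claim: str
    matches: bool


# Rows transcribed from the Domain 5 table.
rows = [
    VerificationRow("SRC02", "DPO achieves 40-75% lower compute", True),
    VerificationRow("SRC03", "GRPO improved GSM8K from 82.9% to 88.2%", True),
    VerificationRow("SRC04", "KTO matches or exceeds DPO at 1B-30B", True),
    VerificationRow("SRC05", "RLVR gains mostly from search compression", True),
    VerificationRow("SRC06", "RLAIF more harmless while maintaining helpfulness", True),
]

# A discrepancy is any row whose claim does not match the source.
discrepancies = [row.source for row in rows if not row.matches]

print(f"Discrepancies found: {len(discrepancies)}")
```

Collecting the source IDs rather than a bare count keeps any mismatch traceable to the row that needs correction.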
Overall Assessment
Overall risk of bias: Low risk
The query was factual and open-ended (what methods exist?), making bias less likely than for evaluative queries. The evidence base is strong, with peer-reviewed papers from independent groups. The main limitation is that proprietary training details from major labs are unavailable.
Researcher Bias Check
- Confirmation bias risk: Low. The researcher's prior work on RLHF and sycophancy could lead to overemphasizing RLHF's limitations, but Q001 asks a neutral survey question (what alternatives exist?) rather than an evaluative one.
- Availability bias risk: Low. Methods that appear frequently in search results (DPO, GRPO) received more detailed coverage, but this reflects genuine adoption rates rather than search bias.
- Anchoring risk: Low. No prior hypothesis anchored the search -- the open-ended approach allowed all methods to emerge from the evidence.