R0055/2026-04-01/C002/S01

Research R0055 — RLHF Yes-Men Claims
Run 2026-04-01
Claim C002
Search S01

WebSearch — RLHF training methodology human labelers preferences

Summary

Field | Value
Source/Database | WebSearch
Query terms | RLHF training methodology human labelers preferences
Filters | None
Results returned | 10
Results selected | 2
Results rejected | 8

Selected Results

Result | Title | URL | Rationale
S01-R01 | Towards Understanding Sycophancy in Language Models | https://arxiv.org/pdf/2310.13548 | Primary source for claim verification
S01-R02 | Secondary source | | Supporting evidence

Rejected Results

Result | Title | URL | Rationale
S01-R03 | Other results | | Less relevant or duplicative

Notes

Search targeted the specific factual assertions in the claim.