R0044/2026-04-01/Q002/S01/R04

Research R0044 — Expanded Vocabulary Research
Run 2026-04-01
Query Q002
Search S01
Result S01-R04

ICLR 2024 paper on understanding sycophancy in language models

Summary

Title: Towards Understanding Sycophancy in Language Models
URL: https://arxiv.org/abs/2310.13548
Date accessed: 2026-04-01
Publication date: 2024 (ICLR conference)
Author(s): Sharma, Tong, Korbak, et al. (19 authors)
Publication: ICLR 2024

Selection Decision

Included in evidence base: No (used as supporting context; not scored as a separate source because its findings are subsumed by SRC01)

Rationale: Foundational technical paper demonstrating that sycophancy is prevalent across five AI assistants. Key finding: both humans and preference models trained on human feedback prefer convincingly written sycophantic responses over correct ones a non-negligible fraction of the time. Overlaps significantly with the later Science paper by the same lead author.