SRC05 — Sycophancy Hides Linearly in the Attention Heads

Source

Title      Sycophancy Hides Linearly in the Attention Heads
Publisher  arXiv
Authors    Rifo Genadi, Munachiso Nwadike, Nurdaulet Mukhituly, Hilal Alquabeh, Tatsuya Hiraoka, Kentaro Inui
Date       January 2026
URL        https://arxiv.org/abs/2601.16644
Type       Pre-print

Summary Ratings

Dimension                 Rating
Reliability               Medium
Relevance                 High
Missing data bias         Low
Measurement bias          Low
Selective reporting bias  Low
Randomization bias        N/A
Protocol deviation bias   Low
COI / Funding bias        Low

Rationale

Dimension    Rationale
Reliability  Recent pre-print that has not yet been peer-reviewed, but it builds on established interpretability methods
Relevance    Provides a mechanistic account of where sycophancy is represented in a model's internals

Evidence Extracts

Evidence   Summary
SRC05-E01  Sycophancy is linearly separable in attention heads and distinct from truthfulness directions
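
To make the term "linearly separable" in SRC05-E01 concrete, the sketch below fits a simple linear probe to toy activation vectors. It is an illustration only, not the paper's method: the data are synthetic, the mean-difference probe is a deliberately simple stand-in for whatever probing technique the authors use, and every name in it (make_synthetic_activations, fit_mean_difference_probe, probe_accuracy, cosine) is hypothetical.

```python
import numpy as np


def make_synthetic_activations(n=400, dim=16, shift=2.0, seed=0):
    """Toy stand-in for one attention head's activations (assumption, not real model data).

    Each sample is Gaussian noise moved along one hidden direction,
    with the sign of the move set by a binary label.
    """
    rng = np.random.default_rng(seed)
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    labels = rng.integers(0, 2, size=n)      # 1 = "sycophantic", 0 = "not" (toy labels)
    signs = 2 * labels - 1                   # map {0, 1} to {-1, +1}
    acts = rng.normal(size=(n, dim)) + shift * signs[:, None] * direction
    return acts, labels, direction


def fit_mean_difference_probe(acts, labels):
    """Linear probe whose weight vector is the normalized difference of class means."""
    mu_pos = acts[labels == 1].mean(axis=0)
    mu_neg = acts[labels == 0].mean(axis=0)
    w = mu_pos - mu_neg
    w /= np.linalg.norm(w)
    bias = -0.5 * (mu_pos + mu_neg) @ w      # threshold halfway between the class means
    return w, bias


def probe_accuracy(w, bias, acts, labels):
    """Fraction of samples on the correct side of the probe's hyperplane."""
    preds = (acts @ w + bias > 0).astype(int)
    return float((preds == labels).mean())


def cosine(u, v):
    """Cosine similarity, usable for comparing two candidate directions."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


acts, labels, true_dir = make_synthetic_activations()
half = len(labels) // 2
w, b = fit_mean_difference_probe(acts[:half], labels[:half])   # fit on the first half
acc = probe_accuracy(w, b, acts[half:], labels[half:])          # evaluate on held-out half
alignment = cosine(w, true_dir)                                 # recovered vs. planted direction
```

High held-out accuracy from a single hyperplane is what "linearly separable" means here. The same cosine helper could, in principle, compare a sycophancy direction with a truthfulness direction; a value near zero would be consistent with the two being distinct, although this toy setup does not test that claim.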