SRC04 — From Yes-Men to Truth-Tellers: Addressing Sycophancy with Pinpoint Tuning

Source

Title: From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning
Publisher: ICML 2024 / arXiv
Authors: Wei Chen, Zhen Huang, Liang Xie, et al.
Date: September 2024 (accepted ICML 2024; revised February 2025)
URL: https://arxiv.org/abs/2409.01658
Type: Peer-reviewed conference paper

Summary Ratings

| Dimension | Rating |
| --- | --- |
| Reliability | High |
| Relevance | High |
| Missing data bias | Low |
| Measurement bias | Low |
| Selective reporting bias | Low |
| Randomization bias | N/A |
| Protocol deviation bias | Low |
| COI / Funding bias | Low |

Rationale

| Dimension | Rationale |
| --- | --- |
| Reliability | Peer-reviewed at ICML 2024; reports quantified results across multiple model sizes |
| Relevance | Directly proposes a method to reduce sycophancy by targeting specific model components |

Evidence Extracts

| Evidence | Summary |
| --- | --- |
| SRC04-E01 | Supervised Pinpoint Tuning reduces sycophancy by targeting <5% of model modules |