SRC02 — Sycophancy in GPT-4o: What Happened and What We're Doing About It

Source

| Field | Value |
| --- | --- |
| Title | Sycophancy in GPT-4o: What happened and what we're doing about it |
| Publisher | OpenAI Blog |
| Authors | OpenAI |
| Date | April 2025 |
| URL | https://openai.com/index/sycophancy-in-gpt-4o/ |
| Type | Corporate blog post / incident report |

Summary Ratings

| Dimension | Rating |
| --- | --- |
| Reliability | Medium-High |
| Relevance | High |
| Missing data bias | Medium |
| Measurement bias | Medium |
| Selective reporting bias | High |
| Randomization bias | N/A |
| Protocol deviation bias | N/A |
| COI / Funding bias | High |

Rationale

| Dimension | Rationale |
| --- | --- |
| Reliability | First-party incident report from the company that experienced the problem; factual accuracy about what happened is likely high, but the account may downplay root causes |
| Relevance | Direct real-world case study of RLHF-induced sycophancy at scale |
| Selective reporting bias | OpenAI has an incentive to frame the incident as a fixable bug rather than a fundamental RLHF limitation |
| COI / Funding bias | OpenAI is commercially invested in RLHF-based training and has an incentive to minimize the structural nature of the problem |

Evidence Extracts

| Evidence | Summary |
| --- | --- |
| SRC02-E01 | GPT-4o sycophancy caused by reward signals from thumbs-up/down overpowering safeguards |
| SRC02-E02 | OpenAI rolled back the update and committed to training method changes |