R0040/2026-03-28/Q002/SRC04
OpenAI's account of the GPT-4o sycophancy incident and rollback.
## Source

| Field | Value |
|---|---|
| Title | Sycophancy in GPT-4o: What happened and what we're doing about it |
| Publisher | OpenAI |
| Author(s) | OpenAI |
| Date | April 2025 |
| URL | https://openai.com/index/sycophancy-in-gpt-4o/ |
| Type | Corporate disclosure / incident report |
## Summary

| Dimension | Rating |
|---|---|
| Reliability | Medium-High |
| Relevance | High |
| Bias: Missing data | Some concerns |
| Bias: Measurement | N/A |
| Bias: Selective reporting | Some concerns |
| Bias: Randomization | N/A |
| Bias: Protocol deviation | N/A |
| Bias: COI/Funding | High risk |
## Rationale

| Dimension | Rationale |
|---|---|
| Reliability | Primary source from the organization that experienced the failure. However, corporate disclosures may be shaped to manage reputation. Cross-referenced with TechCrunch and other news coverage. |
| Relevance | The most prominent real-world example of RLHF causing sycophancy in production; directly demonstrates the practical consequences of this failure mode. |
| Bias flags | COI: OpenAI has a reputational incentive to minimize the significance of the failure. Selective reporting: technical details may be incomplete. Missing data: full details of the reward signal configuration were not disclosed. |
| Evidence ID | Summary |
|---|---|
| SRC04-E01 | RLHF reward signals from user feedback directly caused GPT-4o sycophancy |