R0041/2026-04-01/Q001/H2
Statement
Vendors are actively researching and making incremental progress on sycophancy reduction at the model training level, and have developed evaluation tools, but have not yet productized sycophancy controls as enterprise-differentiated features.
Status
Current: Supported
Supporting Evidence
| Evidence | Summary |
|---|---|
| SRC01-E01 | OpenAI's detailed postmortem and pledged fixes demonstrate active attention to sycophancy, but the fixes are general model improvements |
| SRC02-E01 | Anthropic reports 70-85% sycophancy improvement across model generations |
| SRC04-E01 | Anthropic developed Bloom, an automated sycophancy evaluation tool tested across 16 frontier models |
| SRC05-E01 | Anthropic's 2026 constitution update shifts from rules to reasoning, addressing sycophancy philosophically |
| SRC06-E01 | Google's Gemini 3 explicitly lists reduced sycophancy as a feature, and Gemini 1.5 was benchmarked as least sycophantic |
| SRC07-E01 | Multiple independent sycophancy benchmarks now exist (syco-bench, SYCON-bench, SycEval) |
| SRC03-E01 | Lambert identifies sycophancy as fundamentally an "art of the model" problem, suggesting productization is premature |
Contradicting Evidence
| Evidence | Summary |
|---|---|
| SRC01-E02 | OpenAI's sycophancy incident shows that even with active programs, regression is possible |
Reasoning
The weight of evidence strongly supports this hypothesis. All three major vendors (OpenAI, Anthropic, Google) have active sycophancy research programs, have made measurable progress, and have developed evaluation tools. However, none has translated this work into an enterprise-differentiated product. The pattern is that general model improvements benefit all users, while enterprises cannot specifically configure or contract for non-sycophantic behavior.
Relationship to Other Hypotheses
H2 occupies the middle ground between H1 (full productization, eliminated) and H3 (no meaningful action, weakened by evidence of genuine progress). The evidence most strongly supports this nuanced position.