R0041/2026-03-28/Q002/H1
Statement
Yes, at least some enterprise or government AI deployments have explicitly required sycophancy reduction as a design goal or evaluation criterion, using the term "sycophancy" or a direct functional equivalent.
Status
Current: Partially supported
No deployment was found that uses the specific term "sycophancy" in formal requirements. However, Georgetown CSET's military AI risk analysis explicitly discusses the danger of AI models that "cave to user's expectations" and recommends circumscribed deployment as a mitigation. The Mass General Brigham study demonstrated that LLMs in healthcare "prioritize helpfulness over critical thinking" (a functional description of sycophancy), and the researchers recommended that healthcare AI place "greater emphasis on harmlessness even if it comes at the expense of helpfulness."
Supporting Evidence
| Evidence | Summary |
|---|---|
| SRC01-E01 | Georgetown CSET explicitly identifies AI "caving to user expectations" as a military risk |
| SRC02-E01 | Mass General Brigham study recommends healthcare AI prioritize harmlessness over helpfulness |
Contradicting Evidence
| Evidence | Summary |
|---|---|
| SRC05-E01 | FAA AI safety roadmap does not address sycophancy specifically |
| SRC06-E01 | FINRA AI governance guidance does not mention sycophancy |
Reasoning
H1 is partially supported: the functional concern (AI that agrees with the user rather than remaining accurate) is acknowledged in defense and healthcare contexts, but it has not been formalized as a procurement requirement or regulatory mandate. The closest examples are academic and research recommendations, not binding requirements.
Relationship to Other Hypotheses
H1 and H3 overlap significantly. The distinction is whether "sycophancy" is named explicitly (H1) vs. addressed under different terminology (H3). Evidence more strongly supports H3, with H1 receiving partial support from the Georgetown CSET and Mass General Brigham findings.