R0041/2026-03-28/Q001 — ACH Matrix
Matrix
| Evidence | H1: Enterprise products exist | H2: No vendor attention | H3: General alignment only |
|---|---|---|---|
| SRC01-E01: Anthropic 70-85% sycophancy reduction via RL | + | -- | ++ |
| SRC02-E01: OpenAI GPT-4o rollback, reward signal fix | + | -- | ++ |
| SRC03-E01: ELEPHANT study — all 11 LLMs sycophantic | + | -- | + |
| SRC04-E01: Petri dedicated sycophancy evaluation tool | ++ | -- | + |
| SRC05-E01: OpenAI Model Spec anti-sycophancy principle | + | -- | ++ |
| SRC06-E01: Anthropic soul document anti-sycophancy | + | -- | ++ |
| S04 null result: No enterprise products found | -- | N/A | ++ |
Legend:
- `++` Strongly supports
- `+` Supports
- `-` Contradicts
- `--` Strongly contradicts
- `N/A` Not applicable to this hypothesis
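
The matrix can also be tallied mechanically as a rough cross-check. The following is a minimal sketch, not the method used to produce the ratings: the numeric weights (`++` = 2, `+` = 1, `-` = -1, `--` = -2, `N/A` skipped) and the function names `tally` and `contradictions` are assumptions introduced here for illustration. The ratings themselves are copied from the table.

```python
# Illustrative weights for the legend symbols (an assumption, not from the source).
WEIGHTS = {"++": 2, "+": 1, "-": -1, "--": -2}

HYPOTHESES = ("H1", "H2", "H3")

# Ratings copied from the matrix, in the order (H1, H2, H3).
MATRIX = {
    "SRC01-E01": ("+", "--", "++"),
    "SRC02-E01": ("+", "--", "++"),
    "SRC03-E01": ("+", "--", "+"),
    "SRC04-E01": ("++", "--", "+"),
    "SRC05-E01": ("+", "--", "++"),
    "SRC06-E01": ("+", "--", "++"),
    "S04": ("--", "N/A", "++"),
}


def tally(matrix):
    """Sum the weighted ratings per hypothesis, skipping N/A cells."""
    totals = {h: 0 for h in HYPOTHESES}
    for ratings in matrix.values():
        for hypothesis, rating in zip(HYPOTHESES, ratings):
            if rating != "N/A":
                totals[hypothesis] += WEIGHTS[rating]
    return totals


def contradictions(matrix):
    """Count cells rated '-' or '--' per hypothesis."""
    counts = {h: 0 for h in HYPOTHESES}
    for ratings in matrix.values():
        for hypothesis, rating in zip(HYPOTHESES, ratings):
            if rating in ("-", "--"):
                counts[hypothesis] += 1
    return counts


print(tally(MATRIX))           # {'H1': 5, 'H2': -12, 'H3': 12}
print(contradictions(MATRIX))  # {'H1': 1, 'H2': 6, 'H3': 0}
```

Under these assumed weights, H3 has the highest total and no contradicting evidence, H2 is contradicted by every applicable item, and H1 is contradicted only by the S04 null result. The totals are sensitive to the choice of weights, so they are best read as a sanity check on the qualitative ratings.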
Diagnosticity Analysis
Most Diagnostic Evidence
| Evidence ID | Why Diagnostic |
|---|---|
| S04 (null result) | Vendor comparisons and Gartner analyses list no enterprise sycophancy products. If such products existed, they would appear in procurement guidance, so their absence strongly contradicts H1 and strongly supports H3. |
| SRC01-E01 | Anthropic's 70-85% reduction is achieved via RL training, not enterprise configuration, which separates H1 from H3 on delivery mechanism. |
Least Diagnostic Evidence
| Evidence ID | Why Non-Diagnostic |
|---|---|
| SRC03-E01 | The ELEPHANT study documents the problem but says nothing about how vendors respond to it. It supports H1 and H3 to the same degree, so it cannot separate them. |
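
One rough way to quantify how well each item separates H1 from H3 is the gap between its two ratings. This is a sketch under stated assumptions: the numeric weights and the helper name `h1_h3_gap` are hypothetical, and the H2 column is left out because every applicable H2 cell is `--`.

```python
# Illustrative weights for the legend symbols (an assumption, not from the source).
WEIGHTS = {"++": 2, "+": 1, "-": -1, "--": -2}

# (H1, H3) ratings copied from the matrix.
H1_H3_RATINGS = {
    "SRC01-E01": ("+", "++"),
    "SRC02-E01": ("+", "++"),
    "SRC03-E01": ("+", "+"),
    "SRC04-E01": ("++", "+"),
    "SRC05-E01": ("+", "++"),
    "SRC06-E01": ("+", "++"),
    "S04": ("--", "++"),
}


def h1_h3_gap(ratings):
    """Absolute difference between the H1 and H3 weights for each evidence item."""
    return {
        evidence_id: abs(WEIGHTS[h1] - WEIGHTS[h3])
        for evidence_id, (h1, h3) in ratings.items()
    }


gaps = h1_h3_gap(H1_H3_RATINGS)
# Largest gap first: items that pull H1 and H3 furthest apart.
ranked = sorted(gaps, key=gaps.get, reverse=True)
print(ranked[0], gaps[ranked[0]])    # S04 4
print(ranked[-1], gaps[ranked[-1]])  # SRC03-E01 0
```

With these weights, S04 has the largest gap and SRC03-E01 has none, which matches the tables above. SRC01-E01 scores the same gap as several other items, so its place among the most diagnostic evidence rests on the qualitative delivery-mechanism argument rather than on the ratings alone.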
Outcome
Hypothesis supported: H3. Vendors address sycophancy through general model alignment and training, not as configurable enterprise features. Every vendor investment identified operates at the model level.
Hypotheses eliminated: H2. Multiple vendors have made significant, documented investments in sycophancy reduction, so this hypothesis is conclusively eliminated.
Hypotheses inconclusive: H1. It is partially supported in that vendor investment is real and significant, but the delivery mechanism is model-level, not enterprise-configurable. Anthropic's Petri tool and constitutional principles come closest to dedicated anti-sycophancy infrastructure, but both remain model-level rather than customer-configurable.