R0041/2026-03-28/Q003 — ACH Matrix
Matrix
| Evidence | H1: RLVR eliminates sycophancy broadly | H2: RLVR cannot address sycophancy | H3: RLVR works narrowly, not broadly |
|---|---|---|---|
| SRC01-E01: RLVR mechanism and domain limits | + | -- | ++ |
| SRC02-E01: Preference methods cause sycophancy | ++ | -- | ++ |
| SRC03-E01: Mathematical proof of RLHF amplification | ++ | -- | ++ |
| SRC04-E01: DeepSeek-R1 narrow domain acknowledgment | - | - | ++ |
| SRC05-E01: Modular stack (RLVR + preference) | -- | - | ++ |
Legend:
- `++` Strongly supports
- `+` Supports
- `--` Strongly contradicts
- `-` Contradicts
- `N/A` Not applicable to this hypothesis
Diagnosticity Analysis
Most Diagnostic Evidence
| Evidence ID | Why Diagnostic |
|---|---|
| SRC05-E01 | The modular stack (RLVR + preference methods coexisting) is the most diagnostic: it directly contradicts H1 (which predicts RLVR could replace preference methods) and strongly supports H3 (which predicts coexistence) |
| SRC04-E01 | DeepSeek-R1's own acknowledgment of "limited performance in broader areas" is diagnostic because it comes from the team behind RLVR's most prominent implementation: it contradicts H1 (broad effect) and strongly supports H3 (narrow effect) |
Least Diagnostic Evidence
| Evidence ID | Why Non-Diagnostic |
|---|---|
| SRC02-E01 | The LessWrong analysis of preference-based sycophancy supports H1 and H3 equally (both rated ++), because both hypotheses predict that preference methods cause sycophancy; it therefore cannot discriminate between them |
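
The ranking in the two tables above can be checked mechanically. The following is a minimal sketch, not part of the original analysis: it assumes a numeric mapping of the legend symbols (`++` = 2, `+` = 1, `-` = -1, `--` = -2) that the matrix itself does not define, and it measures diagnosticity as the absolute rating gap between H1 and H3, the two hypotheses the tables compare. The names `SCORE`, `MATRIX`, and `h1_h3_gap` are hypothetical.

```python
# Assumed numeric mapping for the legend symbols (not defined in the source).
SCORE = {"++": 2, "+": 1, "-": -1, "--": -2}

# Ratings copied from the matrix, in the order (H1, H2, H3).
MATRIX = {
    "SRC01-E01": ("+", "--", "++"),
    "SRC02-E01": ("++", "--", "++"),
    "SRC03-E01": ("++", "--", "++"),
    "SRC04-E01": ("-", "-", "++"),
    "SRC05-E01": ("--", "-", "++"),
}


def h1_h3_gap(ratings):
    """Absolute gap between the H1 and H3 ratings for one evidence item."""
    h1, _h2, h3 = ratings
    return abs(SCORE[h3] - SCORE[h1])


gaps = {eid: h1_h3_gap(ratings) for eid, ratings in MATRIX.items()}

# Print evidence from most to least diagnostic between H1 and H3.
for eid, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{eid}: gap {gap}")
```

Under these assumptions SRC05-E01 (gap 4) and SRC04-E01 (gap 3) rank highest and SRC02-E01 scores 0, consistent with the tables above. SRC03-E01 also scores 0, since its ratings are identical to those of SRC02-E01.
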
Outcome
Hypothesis supported: H3. RLVR reduces sycophancy in narrow, verifiable domains but cannot replace preference methods in the subjective domains where sycophancy is most problematic. The modular training stack (SRC05-E01), in which RLVR and preference methods coexist, is the strongest evidence for this.
Hypotheses eliminated: H2. RLVR does structurally avoid the sycophancy mechanism in its applicable domains, and every evidence item in the matrix contradicts H2.
Hypotheses inconclusive: H1. Partially supported on mechanism (RLVR does avoid preference bias) but contradicted on scope (the effect is narrow, not broad).
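
One common way to summarize an ACH matrix is to total the evidence against each hypothesis and prefer the hypothesis with the least. The sketch below applies that idea to the matrix in this document. It is illustrative only: the weights (1 for `-`, 2 for `--`) are an assumption rather than part of the source, and `WEIGHT`, `HYPOTHESES`, and `inconsistency` are hypothetical names.

```python
# Assumed weights for contradicting evidence (not defined in the source).
WEIGHT = {"-": 1, "--": 2}

HYPOTHESES = ("H1", "H2", "H3")

# Ratings copied from the matrix, in the order (H1, H2, H3).
MATRIX = {
    "SRC01-E01": ("+", "--", "++"),
    "SRC02-E01": ("++", "--", "++"),
    "SRC03-E01": ("++", "--", "++"),
    "SRC04-E01": ("-", "-", "++"),
    "SRC05-E01": ("--", "-", "++"),
}


def inconsistency(matrix):
    """Sum the weighted contradictions recorded against each hypothesis."""
    totals = {h: 0 for h in HYPOTHESES}
    for ratings in matrix.values():
        for h, symbol in zip(HYPOTHESES, ratings):
            # Supporting ratings ("+", "++") contribute nothing to the total.
            totals[h] += WEIGHT.get(symbol, 0)
    return totals


totals = inconsistency(MATRIX)

# Print hypotheses from least to most contradicted.
for h in sorted(totals, key=totals.get):
    print(f"{h}: inconsistency {totals[h]}")
```

With these assumed weights the totals are H3 = 0, H1 = 3, and H2 = 8, which matches the ordering in the outcome: H3 supported, H1 inconclusive, H2 eliminated.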