R0041/2026-03-28/Q003 — ACH Matrix

Matrix

| Evidence | H1: RLVR eliminates sycophancy broadly | H2: RLVR cannot address sycophancy | H3: RLVR works narrowly, not broadly |
| --- | --- | --- | --- |
| SRC01-E01: RLVR mechanism and domain limits | + | -- | ++ |
| SRC02-E01: Preference methods cause sycophancy | ++ | -- | ++ |
| SRC03-E01: Mathematical proof of RLHF amplification | ++ | -- | ++ |
| SRC04-E01: DeepSeek-R1 narrow domain acknowledgment | - | - | ++ |
| SRC05-E01: Modular stack (RLVR + preference) | -- | - | ++ |

Legend:

- `++` Strongly supports
- `+` Supports
- `--` Strongly contradicts
- `-` Contradicts
- `N/A` Not applicable to this hypothesis

Diagnosticity Analysis

Most Diagnostic Evidence

| Evidence ID | Why Diagnostic |
| --- | --- |
| SRC05-E01 | The modular stack (RLVR + preference methods coexisting) is the most diagnostic: it directly contradicts H1 (which predicts RLVR could replace preference methods) and strongly supports H3 (which predicts coexistence) |
| SRC04-E01 | DeepSeek-R1's own acknowledgment of "limited performance in broader areas" is diagnostic because it comes from RLVR's most prominent implementation team |

Least Diagnostic Evidence

| Evidence ID | Why Non-Diagnostic |
| --- | --- |
| SRC02-E01 | The LessWrong analysis of preference-based sycophancy supports both H1 and H3 equally, because both predict that preference methods cause sycophancy |
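
The ranking above can be checked with a small sketch. It assumes a numeric weight for each matrix symbol (`++` = 2, `+` = 1, `-` = -1, `--` = -2) and treats diagnosticity as the absolute gap between an item's H1 and H3 ratings. Neither the weights nor the `pairwise_diagnosticity` helper come from the source; they are illustrative choices, and other ACH scoring conventions exist.

```python
# Numeric weights for the matrix symbols (an assumption, not part of the source).
WEIGHTS = {"++": 2, "+": 1, "-": -1, "--": -2}

# Ratings copied from the matrix, in the order (H1, H2, H3).
MATRIX = {
    "SRC01-E01": ("+", "--", "++"),
    "SRC02-E01": ("++", "--", "++"),
    "SRC03-E01": ("++", "--", "++"),
    "SRC04-E01": ("-", "-", "++"),
    "SRC05-E01": ("--", "-", "++"),
}


def pairwise_diagnosticity(ratings, a=0, b=2):
    """Absolute gap between the ratings for hypotheses a and b (default: H1 vs H3)."""
    return abs(WEIGHTS[ratings[a]] - WEIGHTS[ratings[b]])


# Score every evidence item, then rank from most to least diagnostic.
diagnosticity = {eid: pairwise_diagnosticity(r) for eid, r in MATRIX.items()}
ranked = sorted(diagnosticity, key=diagnosticity.get, reverse=True)

for eid in ranked:
    print(eid, diagnosticity[eid])
```

Under these assumed weights, SRC05-E01 and SRC04-E01 separate H1 from H3 most sharply, while SRC02-E01 does not separate them at all. SRC03-E01 has the same rating pattern as SRC02-E01, so it scores the same under this measure.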

Outcome

Hypothesis supported: H3. RLVR reduces sycophancy in narrow verifiable domains but cannot replace preference methods in the subjective domains where sycophancy is most problematic. The modular training stack (SRC05-E01), in which RLVR and preference methods coexist, confirms this.

Hypotheses eliminated: H2. RLVR does structurally avoid the sycophancy mechanism in its applicable domains, which contradicts the claim that it cannot address sycophancy at all.

Hypotheses inconclusive: H1. Partially supported on mechanism (RLVR does avoid preference bias) but contradicted on scope (narrow, not broad).
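
As a rough cross-check of this outcome, the sketch below tallies only the contradicting ratings for each hypothesis, following the usual ACH practice of favoring the hypothesis with the least inconsistent evidence. The numeric weights and the `inconsistency_scores` helper are assumptions made for illustration; the source does not state a scoring formula.

```python
# Numeric weights for the matrix symbols (an assumption, not part of the source).
WEIGHTS = {"++": 2, "+": 1, "-": -1, "--": -2}
HYPOTHESES = ("H1", "H2", "H3")

# Ratings copied from the matrix, in the order (H1, H2, H3).
MATRIX = {
    "SRC01-E01": ("+", "--", "++"),
    "SRC02-E01": ("++", "--", "++"),
    "SRC03-E01": ("++", "--", "++"),
    "SRC04-E01": ("-", "-", "++"),
    "SRC05-E01": ("--", "-", "++"),
}


def inconsistency_scores(matrix):
    """Sum only the negative weights per hypothesis; closer to zero is better."""
    totals = dict.fromkeys(HYPOTHESES, 0)
    for ratings in matrix.values():
        for hyp, symbol in zip(HYPOTHESES, ratings):
            # Supporting ratings contribute nothing; only contradictions count.
            totals[hyp] += min(WEIGHTS[symbol], 0)
    return totals


scores = inconsistency_scores(MATRIX)
for hyp in HYPOTHESES:
    print(hyp, scores[hyp])
```

With these assumed weights, H3 has no contradicting evidence, H1 is contradicted by two items, and H2 is contradicted by all five, which lines up with the supported, inconclusive, and eliminated labels above.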