Q002 — ACH Matrix


Research	R0020 — Prompt Engineering Gaps
Run	2026-03-25
Query	Q002

Matrix

Evidence	H1: Mainstream guides address sycophancy	H2: Not addressed in mainstream	H3: Emerging, inconsistent coverage
SRC01-E01: Four causes, academic techniques	+	--	++
SRC01-E02: Five critical research gaps	-	N/A	++
SRC02-E01: Question reframing (24pp reduction)	+	--	++
SRC03-E01: NNG behavioral mitigations	+	--	+
SRC04-E01: Industry strategies (~29% prompt contribution)	+	--	+

Legend:

- ++ : Strongly supports
- + : Supports
- -- : Strongly contradicts
- - : Contradicts
- N/A : Not applicable to this hypothesis
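One way to read the matrix above is to tally, per hypothesis, how much evidence contradicts it. The following is a minimal sketch of that tally, not the method used to produce this analysis. The numeric weights (2 for a strong rating, 1 for a plain one) and the names `WEIGHTS`, `MATRIX`, and `inconsistency_scores` are assumptions introduced for illustration; N/A cells are skipped.

```python
# Assumed numeric weights for the legend symbols (not defined in the source).
WEIGHTS = {"++": 2, "+": 1, "-": -1, "--": -2, "N/A": None}

HYPOTHESES = ["H1", "H2", "H3"]

# Ratings transcribed from the matrix, one row per evidence item (H1, H2, H3).
MATRIX = {
    "SRC01-E01": ["+", "--", "++"],
    "SRC01-E02": ["-", "N/A", "++"],
    "SRC02-E01": ["+", "--", "++"],
    "SRC03-E01": ["+", "--", "+"],
    "SRC04-E01": ["+", "--", "+"],
}


def inconsistency_scores(matrix):
    """Sum the magnitude of contradicting ratings for each hypothesis."""
    scores = {hypothesis: 0 for hypothesis in HYPOTHESES}
    for ratings in matrix.values():
        for hypothesis, symbol in zip(HYPOTHESES, ratings):
            weight = WEIGHTS[symbol]
            # Only contradicting (negative) ratings count; N/A is ignored.
            if weight is not None and weight < 0:
                scores[hypothesis] += -weight
    return scores


scores = inconsistency_scores(MATRIX)
for hypothesis in HYPOTHESES:
    print(hypothesis, scores[hypothesis])
```

Under these assumed weights, H2 accumulates the most contradicting evidence and H3 none, which is consistent with the outcome reported for this query. A different weighting could change the magnitudes, so the numbers are only a reading aid.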

Diagnosticity Analysis

Most Diagnostic Evidence

Evidence ID	Why Diagnostic
SRC02-E01	Question reframing outperforming direct instruction is highly diagnostic: it shows that effective techniques exist (contradicts H2) but remain academic rather than mainstream (supports H3 over H1)
SRC01-E02	Research gaps (measurement inconsistency, scalability) explain why mainstream guides can't yet provide reliable techniques (supports H3, weakens H1)

Least Diagnostic Evidence

Evidence ID	Why Non-Diagnostic
SRC04-E01	Supports H1 and H3 equally; its unverifiable claims further reduce its discriminating power
SRC03-E01	NNG coverage supports both H1 (mainstream awareness) and H3 (coverage is behavioral rather than technical)
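Because every rated item contradicts H2, the remaining question is how well each item separates H1 from H3. The sketch below computes that separation as the absolute difference between the two ratings. It is a crude numeric proxy under assumed weights, not the procedure behind the tables above, and it ignores qualitative factors such as the verifiability concern noted for SRC04-E01. The names `WEIGHTS`, `H1_H3_RATINGS`, and `separation` are hypothetical.

```python
# Assumed numeric weights for the legend symbols (not defined in the source).
WEIGHTS = {"++": 2, "+": 1, "-": -1, "--": -2}

# (H1, H3) ratings transcribed from the matrix for each evidence item.
H1_H3_RATINGS = {
    "SRC01-E01": ("+", "++"),
    "SRC01-E02": ("-", "++"),
    "SRC02-E01": ("+", "++"),
    "SRC03-E01": ("+", "+"),
    "SRC04-E01": ("+", "+"),
}


def separation(ratings):
    """Return the absolute H3-minus-H1 rating gap for each evidence item."""
    return {
        evidence_id: abs(WEIGHTS[h3] - WEIGHTS[h1])
        for evidence_id, (h1, h3) in ratings.items()
    }


gaps = separation(H1_H3_RATINGS)
# List the items from most to least separating.
for evidence_id, gap in sorted(gaps.items(), key=lambda item: -item[1]):
    print(evidence_id, gap)
```

With these weights, SRC03-E01 and SRC04-E01 show a gap of zero, matching the observation that they support H1 and H3 equally, while SRC01-E02 shows the largest gap. The proxy does not distinguish SRC02-E01 from SRC01-E01, so the qualitative argument for SRC02-E01 still carries that part of the ranking.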

Outcome

Hypothesis supported: H3 — Sycophancy is increasingly discussed in mainstream contexts (post-GPT-4o incident), but coverage is inconsistent, often behavioral rather than technical, and the most effective prompt-level techniques remain in academic literature.

Hypotheses eliminated: H2 — Multiple mainstream sources discuss sycophancy.

Hypotheses inconclusive: H1 — Partially supported by awareness growth but undermined by the depth gap between academic and mainstream coverage.