R0049/2026-03-31/Q003-H3
Statement
AI-assisted research tools implement structured analytical features only in isolation; no tool achieves comprehensive coverage of the five target features. The dominant design pattern prioritizes research efficiency (speed, volume, coverage) over analytical rigor (confidence calibration, bias assessment, self-audit).
Status
Supported. This is the best-supported hypothesis. The evidence reveals a clear landscape pattern: tools optimize for differing value propositions, none of which treats analytical rigor as a primary design goal.
Supporting Evidence
| Evidence | Summary |
|---|---|
| SRC01-E01 | PaperQA2 — most advanced academic agent, optimizes for citation accuracy, lacks analytical rigor features |
| SRC02-E01 | STORM — optimizes for knowledge breadth and multi-perspective coverage |
| SRC03-E01 | Elicit — optimizes for screening efficiency, approaching human-level accuracy |
| SRC04-E01 | scite — implements citation context (one feature) but not a comprehensive framework |
| SRC05-E01 | MS Copilot Critique — cross-model audit (one feature), not full analytical rigor |
| SRC06-E01 | Open Synthesis — implements ACH (one feature); in maintenance mode, no AI |
| SRC07-E01 | GPT Researcher — volume-based quality ("most frequent information"), no formal rigor |
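The single-feature pattern above can be made concrete as a coverage matrix. The sketch below is illustrative only, assuming feature names: Q003's authoritative list of the five target features is not reproduced in this section, so the names used here (and the mapping of cross-model audit onto self-audit) are reconstructed from the Statement and the table rows, not taken from the sources.

```python
# Minimal coverage-matrix sketch. Feature names are assumptions
# reconstructed from the Statement and the evidence table above.
TARGET_FEATURES = {
    "confidence_calibration",  # named in the Statement
    "bias_assessment",         # named in the Statement
    "self_audit",              # named in the Statement
    "citation_context",        # assumed from the scite row (SRC04-E01)
    "ach",                     # assumed from the Open Synthesis row (SRC06-E01)
}

# Mapping mirrors the evidence table; an empty set means the tool
# implements none of the target features.
COVERAGE: dict[str, set[str]] = {
    "PaperQA2": set(),
    "STORM": set(),
    "Elicit": set(),
    "scite": {"citation_context"},
    "MS Copilot Critique": {"self_audit"},  # cross-model audit, treated as self-audit
    "Open Synthesis": {"ach"},
    "GPT Researcher": set(),
}

def missing_features(coverage: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return the target features each tool lacks."""
    return {tool: TARGET_FEATURES - feats for tool, feats in coverage.items()}

if __name__ == "__main__":
    for tool, missing in missing_features(COVERAGE).items():
        print(f"{tool}: {len(missing)}/5 target features missing")
    # Under these assumptions every tool misses at least four of the five
    # features, which is the coverage gap H3 describes.
    assert all(len(m) >= 4 for m in missing_features(COVERAGE).values())
```

Under these assumed names, every tool in the table misses at least four of the five features, which is the comprehensive-coverage gap the Statement claims.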
Contradicting Evidence
| Evidence | Summary |
|---|---|
| — | No contradicting evidence found |
Reasoning
The landscape can be categorized by primary design goal:
- Citation accuracy: PaperQA2, scite
- Knowledge breadth: STORM, Perplexity Deep Research
- Screening efficiency: Elicit, ASReview
- Information aggregation: GPT Researcher, OpenAI Deep Research
- Cross-model verification: Microsoft Copilot Critique/Council
None of these categories prioritizes the analytical rigor features described in Q003. The closest approaches are scite's Smart Citations (partial evidence quality) and Microsoft's Critique (partial audit). Both are single-feature implementations.
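For concreteness, the audit pattern behind Microsoft's Critique can be sketched generically. The code below is a minimal illustration of the cross-model pattern, not Microsoft's implementation; `drafter` and `critic` are placeholder callables standing in for real model backends.

```python
from typing import Callable

# Hypothetical sketch of a cross-model audit (NOT Microsoft's actual
# Critique pipeline): one model drafts an answer, a second model audits it.
Model = Callable[[str], str]

def cross_model_audit(drafter: Model, critic: Model, question: str) -> dict[str, str]:
    """Run one draft/critique round and return both outputs."""
    draft = drafter(question)
    critique = critic(
        "Audit the following answer for factual errors, missing caveats, "
        "and bias.\n\nAnswer:\n" + draft
    )
    return {"draft": draft, "critique": critique}

if __name__ == "__main__":
    # Stub models; real use would wrap two different LLM APIs.
    drafter = lambda q: f"Draft answer to: {q}"
    critic = lambda prompt: "Critique: the draft cites no sources."
    print(cross_model_audit(drafter, critic, "What causes auroras?"))
```

Even in this generic form, the pattern covers only the audit feature; confidence calibration and bias assessment would need separate mechanisms, which is the single-feature limitation noted above.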
Relationship to Other Hypotheses
Subsumes H1 (eliminated) and H2 (eliminated). Consistent with the findings from Q001 (no comprehensive prompts) and Q002 (no unified methodology) — the gap in the tools landscape mirrors the gaps in the prompts and methodology landscapes.