R0049/2026-03-31/Q003-SRC01-E01
Extract
PaperQA2 outperforms human experts at answering questions across the scientific literature, produces summaries that are on average more factual than Wikipedia, and can detect contradictions at scale. Queries cost $1-$3 each. It implements relevance-based scoring (ranked chunk retrieval + reranking + contextual summarization) but not formal bias assessment, calibrated probability estimates, competing-hypotheses analysis, search transparency logging, or self-audit. Its quality strategy relies on retrieval accuracy and citation grounding rather than analytical rigor frameworks.
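To make the pipeline shape concrete, a minimal, self-contained sketch of the retrieve, rerank, and summarize stages follows. This is an illustration of the general pattern, not PaperQA2's code: every name in it (`embed`, `retrieve`, `rerank`, `summarize_for_query`, `gather_evidence`) is a hypothetical stand-in, and the scoring functions are deliberately trivial placeholders for the embedding model and LLM calls a real system would make.

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Stand-in "embedding": bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 5) -> list[str]:
    # Stage 1: ranked chunk retrieval by similarity to the query.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def rerank(query: str, candidates: list[str], k: int = 3) -> list[str]:
    # Stage 2: rerank the shortlist with a finer-grained scorer.
    # A real system would use a cross-encoder or an LLM judge here.
    q_terms = set(query.lower().split())
    return sorted(candidates,
                  key=lambda c: len(q_terms & set(c.lower().split())),
                  reverse=True)[:k]

def summarize_for_query(query: str, chunk: str) -> str:
    # Stage 3: contextual summarization. A real system would prompt an LLM
    # to compress the chunk relative to the query; truncation stands in.
    return chunk[:120]

def gather_evidence(query: str, corpus: list[str]) -> list[str]:
    return [summarize_for_query(query, c)
            for c in rerank(query, retrieve(query, corpus))]

if __name__ == "__main__":
    corpus = [
        "Protein folding is driven by hydrophobic collapse.",
        "Transformer models scale with data and compute.",
        "Chaperone proteins assist folding in the cell.",
    ]
    print(gather_evidence("how do proteins fold", corpus))
```

The design point the extract makes is visible in the sketch: answer quality hinges entirely on how well the retrieve and rerank stages order the evidence; nothing in the pipeline audits the reasoning itself.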
Relevance to Hypotheses
| Hypothesis | Relationship | Strength |
|---|---|---|
| H1 | Contradicts: the most advanced research agent lacks a comprehensive rigor framework | Strong |
| H2 | Supports (feature-specific): none of the five target features is present | Moderate |
| H3 | Supports: high quality without analytical rigor features demonstrates the dominant design pattern | Strong |
Context
PaperQA2 represents the state of the art in AI agents for scientific literature research. Its performance demonstrates that high accuracy is achievable through retrieval engineering alone, which may explain why the field has not invested in analytical rigor frameworks: the accuracy-focused approach "works well enough" for many use cases.
Notes
The contradiction detection feature (ContraCrow) is the closest PaperQA2 comes to a structured analytical methodology, but it operates at the level of individual claim pairs rather than as part of a competing-hypotheses framework.
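To make that distinction concrete, a short sketch contrasting the two levels, assuming a stubbed-out contradiction check (a real system would delegate that judgment to an LLM). The function names and the +1/0/-1 scoring scheme are illustrative assumptions, not PaperQA2 features; the second function follows the Analysis of Competing Hypotheses pattern of weighting disconfirming evidence.

```python
from itertools import combinations

def contradicts(claim_a: str, claim_b: str) -> bool:
    # Stub: a real system would ask an LLM whether two claims conflict.
    a, b = claim_a.lower(), claim_b.lower()
    return (("increases" in a and "decreases" in b) or
            ("decreases" in a and "increases" in b))

def claim_level_check(claims: list[str]) -> list[tuple[str, str]]:
    # Claim-level detection: flag contradictory *pairs* of statements.
    return [(a, b) for a, b in combinations(claims, 2) if contradicts(a, b)]

def least_refuted_hypothesis(evidence: dict[str, dict[str, int]]) -> str:
    # ACH-style analysis: score every piece of evidence against *every*
    # hypothesis (+1 consistent, 0 neutral, -1 inconsistent) and keep the
    # hypothesis with the least disconfirming evidence.
    penalties: dict[str, int] = {}
    for scores in evidence.values():
        for hyp, s in scores.items():
            penalties[hyp] = penalties.get(hyp, 0) + min(s, 0)
    return max(penalties, key=penalties.get)

claims = ["Drug X increases survival", "Drug X decreases survival"]
print(claim_level_check(claims))  # flags the conflicting pair

evidence = {
    "trial A": {"H1: X works": +1, "H2: X is inert": 0},
    "trial B": {"H1: X works": -1, "H2: X is inert": +1},
}
print(least_refuted_hypothesis(evidence))  # -> "H2: X is inert"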