Q003 — Assessment


Research	R0049 — Landscape Scan
Run	2026-03-31-02
Query	Q003

BLUF

A rich ecosystem of AI research tools exists, but none fully implements any of the five analytical rigor dimensions queried (calibrated probability language, formal bias assessment, competing hypotheses, search transparency logging, self-audit mechanisms); the best coverage found is partial. Tools excel at citation and discovery but are analytically thin, so the gap between information gathering and analytical rigor remains wide.

Answer

Confidence: High. The tool landscape is well-documented through multiple independent reviews and academic evaluations.

Tool Landscape Overview

Tool	Type	Key Structured Feature	Queried Dimensions Implemented
Elicit	Commercial	Structured data extraction tables	0.5/5 (partial search transparency)
Scite	Commercial	Smart Citations (support/contrast)	0.5/5 (partial evidence classification)
Semantic Scholar	Free	TLDR + structured tables	0.5/5 (partial search transparency)
STORM	Open-source	Multi-perspective question asking	1/5 (partial competing perspectives + search transparency)
GPT-Researcher	Open-source	Autonomous multi-agent research	0.5/5 (partial search transparency)
Perplexity Deep Research	Commercial	Sentence-level source attribution	0.5/5 (partial search transparency)
OpenAI Deep Research	Commercial	Multi-step reasoning + synthesis	0/5
Khoj	Open-source	Source traceability + self-hosting	0.5/5 (partial search transparency)
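
The fractional scores in the last column appear to follow a simple tally: each partially implemented dimension contributes 0.5, and a fully implemented one would contribute 1. The Python sketch below reproduces that arithmetic for three rows of the table. The 0.5 weighting is an assumption inferred from the STORM row (two partial dimensions, scored 1/5) rather than something the assessment states, and the names `PARTIAL_WEIGHT`, `FULL_WEIGHT`, and `dimension_score` are illustrative.

```python
# Assumed weights, inferred from the table (not stated by the assessment).
PARTIAL_WEIGHT = 0.5
FULL_WEIGHT = 1.0
TOTAL_DIMENSIONS = 5

# Partial implementations as listed in the table; no tool has a full one.
partial_dimensions = {
    "Elicit": ["search transparency"],
    "STORM": ["competing perspectives", "search transparency"],
    "OpenAI Deep Research": [],
}


def dimension_score(partial, full=()):
    """Return the tally of implemented dimensions under the assumed weights."""
    return PARTIAL_WEIGHT * len(partial) + FULL_WEIGHT * len(full)


for tool, partial in partial_dimensions.items():
    print(f"{tool}: {dimension_score(partial):g}/{TOTAL_DIMENSIONS}")
```

Under these assumed weights the loop prints 0.5/5 for Elicit, 1/5 for STORM, and 0/5 for OpenAI Deep Research, matching the table.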

Feature Gap Analysis

Queried Dimension	Tools Implementing	Assessment
Calibrated probability language	None	No tool uses standardized probability expressions
Formal bias assessment	None	No tool includes risk-of-bias scoring or cognitive bias checks
Competing hypotheses	STORM (partial)	STORM's multi-perspective approach is conceptually related but is not a formal Analysis of Competing Hypotheses (ACH)
Search transparency logging	All (partial)	All tools cite sources; none logs complete search methodology (terms, rejected results)
Self-audit mechanisms	None	No tool includes methodological self-checking
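
To make the search transparency gap concrete, the Python sketch below shows one way a tool could log complete search methodology, meaning the search terms used and the results that were rejected along with a reason. It is a hypothetical data structure, not a feature of any surveyed tool; the class names `SearchRecord` and `SearchLog`, the field names, and the example URLs are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class SearchRecord:
    """One executed query, including the results that were not used."""
    terms: str
    engine: str
    accepted: list = field(default_factory=list)  # URLs kept as sources
    rejected: list = field(default_factory=list)  # dicts with url and reason

    def reject(self, url, reason):
        """Record a result that was seen but excluded, and why."""
        self.rejected.append({"url": url, "reason": reason})


@dataclass
class SearchLog:
    """Ordered list of every query run during a research session."""
    records: list = field(default_factory=list)

    def add(self, terms, engine):
        record = SearchRecord(terms=terms, engine=engine)
        self.records.append(record)
        return record

    def to_json(self):
        """Serialize the whole log so it can be published with the report."""
        return json.dumps(asdict(self), indent=2)


# Placeholder query and URLs, for illustration only.
log = SearchLog()
record = log.add("AI research tools bias assessment", engine="example-engine")
record.accepted.append("https://example.org/kept")
record.reject("https://example.org/dropped", "off topic")
print(log.to_json())
```

A log of this shape would cover the part that source citation alone leaves out: a reader could see what was searched for and what was discarded, not only what was cited.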

Reasoning Chain

A survey of 8 tools across commercial, free, and open-source categories (sources SRC01 through SRC08) found that all implement some form of citation transparency but none implements formal analytical rigor features.
The most structured tool, Elicit (SRC01-E01), achieves 99.4% data extraction accuracy and offers systematic review workflows, but does not implement probability calibration, bias assessment, or self-audit.
Scite's Smart Citations (SRC02-E01) classify 1.6B+ citations as supporting or contrasting, the closest feature to evidence weighting, but operate at the citation level rather than the hypothesis or claim level.
STORM (SRC04-E01) is the most methodologically innovative tool: its multi-perspective question asking conceptually parallels competing hypotheses, but it does not formalize this as ACH or an equivalent method.
The most current academic evaluation (JMIR, SRC06-E01, published 2026-03-26) explicitly concludes that deep research agents lack analytical rigor and should be used as assistive tools, not pseudoexperts.
Commercial deep research tools (Perplexity, OpenAI) delegate analytical quality to model training rather than implementing it as structured features in the user experience.
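
For readers unfamiliar with what formalizing competing hypotheses would involve, the following Python sketch shows a minimal ACH-style matrix: each piece of evidence is rated as consistent, inconsistent, or neutral against every hypothesis, and hypotheses are ranked by how much evidence contradicts them. This is a simplified illustration under stated assumptions, not a feature of STORM or any other surveyed tool; the hypotheses, evidence labels, ratings, and function names are invented placeholders.

```python
# Ratings: "C" = consistent, "I" = inconsistent, "N" = neutral or not applicable.
hypotheses = ["H1", "H2", "H3"]

# Hypothetical matrix: one row per evidence item, one rating per hypothesis.
matrix = {
    "E1": {"H1": "C", "H2": "I", "H3": "N"},
    "E2": {"H1": "C", "H2": "C", "H3": "I"},
    "E3": {"H1": "N", "H2": "I", "H3": "I"},
}


def inconsistency_counts(matrix, hypotheses):
    """Count, for each hypothesis, the evidence items rated inconsistent with it."""
    return {
        h: sum(1 for ratings in matrix.values() if ratings[h] == "I")
        for h in hypotheses
    }


def rank_hypotheses(matrix, hypotheses):
    """Order hypotheses from least to most contradicted by the evidence."""
    counts = inconsistency_counts(matrix, hypotheses)
    return sorted(hypotheses, key=lambda h: counts[h])


print(inconsistency_counts(matrix, hypotheses))
print(rank_hypotheses(matrix, hypotheses))
```

The ranking favors the hypothesis with the least contradicting evidence rather than the one with the most support, which is the step that multi-perspective question asking alone does not provide.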

Evidence Base Summary

Source	Reliability	Relevance	Key Finding
SRC01	High	High	Most structured tool; 0.5/5 dimensions
SRC02	High	High	Unique citation classification; 0.5/5
SRC03	High	Medium	Discovery-focused; 0.5/5
SRC04	High	Medium-High	Multi-perspective; 1/5
SRC05	Medium	Medium	Citation quality; 0.5/5
SRC06	High	High	Academic evaluation confirms gap
SRC07	Medium-High	High	Citation standard-setter; 0.5/5
SRC08	Medium	Medium	Self-hostable; 0.5/5

Gaps

Specialized intelligence analysis tools: Palantir AIP and similar enterprise platforms may implement analytical frameworks not visible in public documentation.
Custom GPT ecosystem: Specialized custom GPTs (e.g., Plessas ACH GPT) may implement features not captured in this survey.
Internal enterprise tools: Research organizations may have internal tools with more analytical rigor.
Rapid evolution: The tool landscape changes rapidly; features may have been added between the date of this assessment and the time it is read.

Researcher Bias Check

Tool coverage bias: Commercial tools are better documented than open-source alternatives, potentially over-representing commercial features and under-representing open-source innovation.
Feature framing bias: The five queried dimensions come from the intelligence analysis tradition, so the framework may measure existing tools against goals they were never designed to meet.