Q003 — Self-Audit


Research	R0049 — Landscape Scan
Run	2026-03-31-02
Query	Q003

Domain 1: Study Eligibility Criteria

Criterion	Assessment	Notes
Inclusion criteria clearly defined	Pass	AI-powered research tools with structured analytical features
Exclusion criteria clearly defined	Pass	Simple chatbots, general-purpose LLM interfaces
Criteria applied consistently	Pass	Same 5-dimension framework applied to all tools
Criteria appropriate for the query	Pass	Five dimensions directly from query requirements

Domain 2: Search Comprehensiveness

Criterion	Assessment	Notes
Multiple search strategies used	Pass	3 searches covering different tool categories
Commercial tools searched	Pass	Elicit, Scite, Perplexity, OpenAI, Consensus
Open-source tools searched	Pass	GPT-Researcher, STORM, Khoj
Academic evaluations searched	Pass	JMIR viewpoint, Cochrane evaluations
Search terms varied	Pass	Tool names, feature names, framework names

Domain 3: Evaluation Consistency

Criterion	Assessment	Notes
Same scoring criteria applied	Pass	5-dimension checklist applied to every tool
Commercial/open-source treated equally	Pass	Both categories scored against same framework
Source independence assessed	Pass	Multiple independent evaluations used
Outliers identified	Pass	STORM identified as closest to competing hypotheses

Domain 4: Synthesis Fairness

Criterion	Assessment	Notes
All evidence considered	Pass	8 sources, 8 evidence items
Alternative interpretations considered	Pass	Three hypotheses including H1 (comprehensive tools exist)
Confidence level justified	Pass	High confidence based on comprehensive tool coverage
Gaps acknowledged	Pass	Enterprise tools, custom GPTs, rapid evolution

Domain 5: Source-Back Verification

Source	Claim Verified	Match
SRC01	99.4% extraction accuracy, systematic review workflow	Match
SRC02	1.6B+ citations classified supporting/contrasting	Match
SRC03	220M papers, TLDR summaries	Match
SRC04	Multi-perspective question asking	Match
SRC05	Multi-agent research with citations	Match
SRC06	Incremental progress, citation problems	Match
SRC07	Sentence-level attribution	Match
SRC08	Source traceability, self-hostable	Match

Overall Assessment

Low risk of bias. Coverage of the tool landscape is comprehensive across commercial, free, and open-source categories, and the five-dimension framework provides consistent evaluation criteria. The primary limitation is the IC-derived framing of the queried dimensions, which may not match what these tools were designed to do.
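As a rough illustration of how the tables above roll up into this rating, the sketch below tallies the per-domain results. The counts are transcribed from Domains 1 through 5; the `overall_risk` function and its rule (rate the audit "Low" only when every criterion passes and every source matches) are hypothetical, since the audit does not state an explicit aggregation rule.

```python
# Criterion results transcribed from the Domain 1-4 tables.
domain_results = {
    "Study Eligibility Criteria": ["Pass"] * 4,
    "Search Comprehensiveness": ["Pass"] * 5,
    "Evaluation Consistency": ["Pass"] * 4,
    "Synthesis Fairness": ["Pass"] * 4,
}

# Domain 5: one source-back verification result per source, SRC01 to SRC08.
source_checks = {f"SRC{i:02d}": "Match" for i in range(1, 9)}


def overall_risk(domains, sources):
    """Hypothetical roll-up rule (an assumption, not the audit's stated method):
    return "Low" only if every criterion passed and every source matched."""
    all_pass = all(r == "Pass" for results in domains.values() for r in results)
    all_match = all(m == "Match" for m in sources.values())
    return "Low" if all_pass and all_match else "Needs review"


criteria_total = sum(len(results) for results in domain_results.values())
print(criteria_total, len(source_checks), overall_risk(domain_results, source_checks))
# prints: 17 8 Low
```

Under this assumed rule, a single "Fail" or mismatch would move the rating to "Needs review", so the rating depends on every row of the tables.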