R0049/2026-03-31/Q003-SRC03-E01
Extract
Elicit achieves 81.4% overall accuracy on systematic review tasks, compared with 86.7% for human reviewers — a statistically non-significant difference. Researchers report time savings of up to 80%. Elicit automates screening and data extraction, and partially supports search and report generation. However, it does not implement calibrated probability language, formal bias assessment, competing hypotheses analysis, or self-audit mechanisms. Search transparency logging is partial: search strategies are visible but are not formally documented in a reproducibility-focused format.
Relevance to Hypotheses
| Hypothesis | Relationship | Strength |
|---|---|---|
| H1 | Contradicts — leading SR tool lacks comprehensive analytical framework | Moderate |
| H2 | Largely supports — none of the five target features formally implemented | Moderate |
| H3 | Supports — high efficiency without analytical rigor features | Strong |
Context
Elicit represents the commercial mainstream of AI research tools. Its value proposition rests on efficiency (time savings) and accuracy (approaching human performance) rather than analytical rigor. This aligns with market demand: most researchers want faster systematic reviews, not more rigorous analytical frameworks.
Notes
Elicit's partial search transparency (visible search strategies) is notable as the closest any commercial tool comes to search logging, but it still falls well short of formal search transparency as defined in Q003.