
R0049/2026-03-31-02

Research R0049 — Landscape Scan
Mode: Query
Run date: 2026-03-31
Queries: 3
Prompt: ai-research-methodology:research
Model: claude-opus-4-6 (1M context)

Queries

Q001 — AI Research Prompt Frameworks

Has anyone published a complete, usable AI/LLM system prompt implementing a full analytical rigor framework for research?

Very unlikely (5-20%). No published system prompt implements a complete analytical rigor framework.

Hypothesis Status
H1 — Complete prompts exist: Not supported
H2 — Only narrow-task prompts exist: Partially supported
H3 — Partial implementations exist: Supported
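The probability bands quoted in this report ("very unlikely, 5-20%") follow the IC verbal-probability yardstick. A minimal sketch of mapping a numeric estimate to its verbal label, assuming the standard ICD 203 band boundaries (the boundaries are not stated in this report):

```python
# Verbal-probability yardstick per ICD 203 (band boundaries in percent).
# These (low, high, label) triples are the standard ICD 203 values, an
# assumption here rather than figures taken from this report.
YARDSTICK = [
    (1, 5, "almost no chance"),
    (5, 20, "very unlikely"),
    (20, 45, "unlikely"),
    (45, 55, "roughly even chance"),
    (55, 80, "likely"),
    (80, 95, "very likely"),
    (95, 99, "almost certain"),
]

def label_for(prob_pct: float) -> str:
    """Return the verbal label for a probability given in percent.

    Band edges overlap (e.g. 20% is in two bands); the first match wins.
    """
    for low, high, label in YARDSTICK:
        if low <= prob_pct <= high:
            return label
    raise ValueError(f"{prob_pct}% is outside the yardstick range")

print(label_for(10))  # "very unlikely" -- the band assigned to Q001 and Q002
```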


Q002 — Unified IC + Scientific Methodology

Has anyone published a systematic combination of IC analytical standards with scientific methodology frameworks into a unified methodology?

Very unlikely (5-20%). Multiple scholars have called for bridging the two traditions, but none has produced a unified framework.

Hypothesis Status
H1 — Unified frameworks exist: Not supported
H2 — Domains entirely siloed: Partially supported
H3 — Partial bridges exist: Supported


Q003 — AI Research Tools with Structured Frameworks

What AI research tools implement structured analytical frameworks beyond simple chat?

Confidence: High. A rich tool ecosystem exists, but none of the tools implements the five queried analytical rigor dimensions.

Hypothesis Status
H1 — Comprehensive framework tools exist: Not supported
H2 — No structured features in any tool: Not supported
H3 — Partial structured features exist: Supported


Collection Analysis

Cross-Cutting Patterns

  1. The "partial but not comprehensive" pattern: All three queries converge on H3 (partial implementations exist). The field has produced narrow-task prompts (Q001), parallel-but-separate traditions (Q002), and citation-focused platforms (Q003), but no comprehensive integration in any dimension.

  2. The prompt-vs-code divide: Comprehensive research systems exist (Agent Laboratory, AI-Researcher) but are implemented in code rather than as system prompts (Q001). This suggests that the complexity of full analytical rigor exceeds what current prompt architecture can effectively encode.

  3. The citation-transparency ceiling: All major AI research tools implement some form of citation transparency (Q003), but none goes beyond citing sources to implement analytical rigor — probability calibration, bias assessment, competing hypotheses, or self-audit. Citation is the floor, not the ceiling, of analytical rigor.

  4. The parallel traditions gap: IC analytical standards and scientific methodology frameworks address the same fundamental challenges (evidence quality, uncertainty, bias) through independently developed solutions (Q002). Neither community has adopted the other's specific tools, creating an integration opportunity that remains unfilled.
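The convergence claimed in pattern 1 can be checked mechanically against the hypothesis tables above; a minimal sketch, with statuses transcribed from this report:

```python
# Hypothesis statuses transcribed from the three query tables in this report.
statuses = {
    ("Q001", "H1"): "Not supported",
    ("Q001", "H2"): "Partially supported",
    ("Q001", "H3"): "Supported",
    ("Q002", "H1"): "Not supported",
    ("Q002", "H2"): "Partially supported",
    ("Q002", "H3"): "Supported",
    ("Q003", "H1"): "Not supported",
    ("Q003", "H2"): "Not supported",
    ("Q003", "H3"): "Supported",
}

supported = [key for key, status in statuses.items() if status == "Supported"]

# Every fully supported hypothesis is an H3 variant, exactly one per query.
assert all(h == "H3" for _, h in supported)
assert len(supported) == 3
print(supported)  # [('Q001', 'H3'), ('Q002', 'H3'), ('Q003', 'H3')]
```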

Statistics

Queries investigated: 3
Hypotheses tested: 9
Hypotheses supported: 3 (all H3 variants)
Searches executed: 10
Results dispositioned: 100 total (26 selected, 74 rejected)
Sources scored: 20
Evidence extracts: 20

Source Independence

Sources across the three queries are largely independent:

  • Q001 sources: Academic surveys (Prompt Report), leaked prompts (Perplexity, OpenAI), open-source implementations (sroberts), conference papers (Agent Laboratory, AI-Researcher)
  • Q002 sources: IC methodology (CIA Primer, Heuer & Pherson), academic bridging (Treverton, Prunckun, Tecuci, Marcoci)
  • Q003 sources: Commercial platforms (Elicit, Scite, Perplexity), open-source tools (STORM, GPT-Researcher, Khoj), academic evaluation (JMIR)

Cross-query overlap: Perplexity appears in both Q001 (system prompt analysis) and Q003 (tool evaluation), providing consistent evidence from different angles.

Collection Gaps

  1. Classified IC literature: The intelligence community may have internal frameworks or AI implementations not accessible through open search.
  2. Custom GPT ecosystem: Thousands of custom GPTs with private prompts may implement analytical frameworks not publicly documented.
  3. Enterprise tools: Palantir AIP, IBM Watson, and similar enterprise platforms may implement analytical rigor features not covered in public reviews.
  4. Non-English sources: Research was limited to English-language sources.
  5. Rapid evolution: The AI tool landscape changes faster than any point-in-time assessment can capture.

Collection Self-Audit

Search comprehensiveness: Pass — 10 searches across 3 queries covering academic, commercial, open-source, and leaked sources
Evidence quality: Pass — 20 sources scored, mix of peer-reviewed and primary artifacts
Hypothesis testing: Pass — 9 hypotheses with supporting and contradicting evidence evaluated
Bias management: Pass — researcher bias checks performed for each query
Convergence: Strong — all three queries converge independently on the same meta-finding (partial implementations only)
Independence: Pass — sources across queries are largely independent

Resources

Summary

Queries investigated: 3
Files produced: 101
Sources scored: 20
Evidence extracts: 20
Results dispositioned: 100 total (26 selected, 74 rejected)

Tool Breakdown

WebSearch: 22 uses (search queries)
WebFetch: 5 uses (page content retrieval)
Write: 101 uses (file creation)
Bash: 5 uses (directory creation and management)

Token Distribution

Input (context): ~150,000
Output (generation): ~80,000
Total: ~230,000
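The bookkeeping above can be cross-checked with simple arithmetic; a quick sketch using the figures from the Statistics and Token Distribution tables of this report:

```python
# Figures copied from the Statistics and Token Distribution tables.
selected, rejected = 26, 74
total_dispositioned = selected + rejected
assert total_dispositioned == 100  # matches "100 total"

input_tokens, output_tokens = 150_000, 80_000
total_tokens = input_tokens + output_tokens
assert total_tokens == 230_000  # matches the reported "~230,000"

print(total_dispositioned, total_tokens)  # 100 230000
```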