
Q001 — Assessment

BLUF

Very unlikely (5-20%) that a complete, published AI/LLM system prompt implementing a full analytical rigor framework for research exists. Extensive searches across academic literature, GitHub repositories, prompt collections, leaked commercial prompts, and preprint servers found only partial implementations and code-based research pipelines. The field has bifurcated: commercial tools delegate rigor to model training (not prompts), while research systems implement methodology in code (not prompts).

Probability Assessment

| Hypothesis | Probability | Assessment |
|---|---|---|
| H1 — Complete prompts exist | Very unlikely (5-10%) | No supporting evidence found |
| H2 — No such prompts exist (only narrow tasks) | Unlikely (30-40%) | Too absolute; partial implementations exceed "narrow tasks" |
| H3 — Partial implementations exist | Very likely (85-95%) | Strongly supported by converging evidence |
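The likelihood terms above follow the ICD 203 expression-of-likelihood convention cited later in this assessment. As a hedged illustration, a minimal helper mapping a numeric probability to the corresponding term might look like the sketch below (band boundaries per ICD 203; the function and constant names are hypothetical, not from any source):

```python
# Hypothetical helper: map a probability in [0, 1] to the ICD 203
# expression-of-likelihood term. Upper bounds follow the ICD 203 bands.
ICD203_BANDS = [
    (0.05, "almost no chance"),
    (0.20, "very unlikely"),
    (0.45, "unlikely"),
    (0.55, "roughly even chance"),
    (0.80, "likely"),
    (0.95, "very likely"),
    (1.00, "almost certain"),
]

def likelihood_term(p: float) -> str:
    """Return the ICD 203 likelihood term covering probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    for upper, term in ICD203_BANDS:
        if p <= upper:
            return term
    return "almost certain"  # unreachable; defensive default

print(likelihood_term(0.10))  # -> very unlikely
print(likelihood_term(0.90))  # -> very likely
```

Note that boundary values are assigned to the lower band (e.g., 0.20 reports "very unlikely"), a design choice; ICD 203 itself leaves boundary handling to the analyst.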

Reasoning Chain

  1. The most comprehensive survey of prompting techniques (SRC01-E01) catalogs 58 techniques and finds no category for research methodology framework prompts, establishing that this is not a recognized prompt category.

  2. Leaked commercial system prompts from the two leading AI research tools — Perplexity (SRC04-E01) and OpenAI Deep Research — contain zero analytical framework components. Both delegate analytical quality to model training and retrieval architecture.

  3. The most relevant partial implementation (SRC02-E01) covers only 3 of 66 structured analytic techniques, is implemented in Python/LangChain rather than as a system prompt, and is explicitly described as "basic."

  4. The most comprehensive research agent systems — Agent Laboratory (SRC05-E01) and AI-Researcher (SRC06-E01) — implement complete research pipelines but as code-based multi-agent architectures, not as system prompts, and do not implement named analytical rigor frameworks (ICD 203, GRADE, PRISMA, etc.).

  5. The academic literature focuses on using AI within existing frameworks (PRISMA-trAIce reporting, SRC03-E01) rather than implementing frameworks as AI prompts.

  6. No search produced evidence of any prompt implementing ICD 203, GRADE, Cochrane, or IPCC as an integrated AI system prompt. The intelligence analysis community's focus remains on human training and evaluation.
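To make concrete the distinction in point 3 between a code-based partial implementation and a standing system prompt, the sketch below shows what wrapping a single structured analytic technique (ACH) in per-call prompt-building code typically looks like. This is a hedged illustration, not drawn from SRC02; the function name, rating scale wording, and prompt text are all illustrative:

```python
# Hypothetical sketch of a "partial, code-based" SAT implementation:
# one technique (Analysis of Competing Hypotheses), assembled into a
# prompt at call time by code, rather than encoded as a standing
# system prompt covering a full analytical framework.

def build_ach_prompt(question: str, hypotheses: list, evidence: list) -> str:
    """Assemble an ACH prompt string for a single LLM call."""
    hyp_lines = "\n".join(f"H{i + 1}: {h}" for i, h in enumerate(hypotheses))
    ev_lines = "\n".join(f"E{i + 1}: {e}" for i, e in enumerate(evidence))
    return (
        "Apply Analysis of Competing Hypotheses (ACH) to the question below.\n"
        "For each evidence item, rate its consistency with each hypothesis\n"
        "(CC, C, N, I, II), then rank hypotheses by least inconsistency.\n\n"
        f"Question: {question}\n\n"
        f"Hypotheses:\n{hyp_lines}\n\n"
        f"Evidence:\n{ev_lines}\n"
    )

prompt = build_ach_prompt(
    "Do complete rigor-framework system prompts exist?",
    ["Complete prompts exist", "Partial implementations exist"],
    ["Survey of 58 techniques finds no such category"],
)
print(prompt)
```

The methodology lives in the surrounding code (technique selection, matrix parsing, iteration), which is precisely why such systems fall short of a self-contained system prompt implementing a full framework.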

Evidence Base Summary

| Source | Reliability | Relevance | Key Finding |
|---|---|---|---|
| SRC01 | High | High | 58 techniques cataloged; no research-framework prompt category |
| SRC02 | Medium | High | 3 SATs implemented via LLM; partial, code-based |
| SRC03 | High | Medium | PRISMA extension for reporting, not an operational prompt |
| SRC04 | Medium | High | Leading research tool has no analytical framework in its prompt |
| SRC05 | High | Medium | Complete pipeline in code; no named frameworks |
| SRC06 | High | Medium | NeurIPS spotlight; code-based, no named frameworks |

Collection Synthesis

The evidence converges strongly on H3. Six independent source types (academic surveys, leaked prompts, open-source implementations, conference papers, reporting standards, prompt collections) all point to the same conclusion: partial implementations exist across a spectrum, but no complete analytical rigor framework has been published as a system prompt. The gap is at the specific intersection of "system prompt" (not code) and "full framework" (not single technique).

Gaps

  1. Custom GPTs: The OpenAI GPT Store contains custom GPTs (e.g., Plessas ACH GPT) whose full prompts are not publicly visible. Some may implement more comprehensive frameworks than what is publicly documented.
  2. Proprietary internal prompts: Organizations (intelligence agencies, consultancies, research institutions) may have internal system prompts implementing these frameworks but have not published them.
  3. Non-English sources: Search was limited to English-language sources.
  4. Anthropic Claude: No leaked system prompt for Claude's research capabilities was found for comparison with Perplexity and OpenAI.

Researcher Bias Check

  • Confirmation bias risk: The researcher's own methodology implements a comprehensive framework, potentially creating motivation to confirm that no one else has done it. Mitigated by: systematic search design, inclusion of partial implementations as evidence, and explicit search for confirming evidence (H1 searches).
  • Selection bias: Commercial leaked prompts (Perplexity, OpenAI) were available while others (Claude, Gemini) were not, creating incomplete coverage of the commercial landscape.

Cross-References

  • Q003 (AI research tools) provides additional context on which structured features these tools actually implement.
  • Q002 (unified IC + scientific methodology) establishes the absence of a unified source framework that could serve as the basis for such a prompt.