Reliability: HIGH — Peer-reviewed journal article in a leading journalism
studies journal; institutional affiliation (LSE).
Relevance: DIRECT — Proposes an epistemological framework for
fact-checking; directly addresses the query.
Bias assessment:
Funding: Not disclosed in search results.
Institutional: Academic (LSE); no apparent industry ties.
Ideological: Supportive of fact-checking as practice while acknowledging
critiques.
Selection: Draws on existing epistemological critiques.
Temporal: Current (2025).
Geographic: European perspective.
Key contribution: Defines three epistemological challenges (objectivism,
truth regimes, causal relations) across five aspects of fact-checking. This is
a DESCRIPTIVE framework analyzing existing epistemological tensions, NOT a
PRESCRIPTIVE evidence evaluation methodology comparable to GRADE.
Key contribution: Argues fact-checking methods "fail to stand up to the
rigors of scientific inquiry," implicitly identifying the absence of formal
epistemological standards.
Reliability: HIGH — Peer-reviewed; based on participatory observation,
interviews, and textual analysis.
Relevance: HIGH — Demonstrates how time pressure pushes fact-checking
toward "confirmative epistemology" reliant on predefined source credibility
rather than formal evidence evaluation.
Bias assessment:
Funding: Not disclosed.
Institutional: Academic (OsloMet, Kristiania).
Ideological: Critical of live fact-checking but supportive of fact-checking
in principle.
Selection: Single case study (Faktisk.no, Norway).
Temporal: Based on 2021 data, published 2023-2024.
Geographic: Norwegian context.
Key contribution: Shows that under pressure, fact-checkers default to
source-authority heuristics rather than structured evidence evaluation — no
formal framework existed to fall back on.
Reliability: HIGH — Peer-reviewed; based on 13 in-depth interviews and
participant observation.
Relevance: HIGH — Documents fact-checkers adopting epistemic practices
from OSINT investigators and intelligence actors, introducing "substantiated
verification" that transcends the true/false paradigm.
Bias assessment:
Funding: Not disclosed.
Institutional: Academic.
Ideological: Sympathetic to practice improvement.
Selection: Single case (Faktisk Verifiserbar, Norway).
Temporal: Current (2026).
Geographic: Norwegian context.
Key contribution: IMPORTANT NUANCE — Shows convergence of intelligence
and journalism epistemic practices in visual verification. This is the closest
evidence of fact-checking borrowing formal methodology from adjacent
disciplines, but it is domain-specific (visual verification) and ad hoc
rather than a formal, codified framework.
Reliability: HIGH — Peer-reviewed; large empirical dataset (22,349
articles).
Relevance: MODERATE — Demonstrates rating disagreement between
fact-checkers, implying absence of shared evidence evaluation standards.
Bias assessment:
Funding: Not disclosed in search results.
Institutional: Academic (Penn State).
Ideological: Neutral empirical study.
Selection: Limited to Snopes and PolitiFact (US-centric).
Temporal: Data from 2016-2022.
Geographic: US only.
Key contribution: Found only one conflicting bottom-line verdict among 749
matching claims, but disagreement was higher in the ambiguous middle of the
rating range — exactly where a formal evidence grading system would be most
needed.
Reliability: HIGH — Peer-reviewed survey in top NLP venue; widely cited.
Relevance: HIGH — Comprehensive survey of automated fact-checking
pipelines; describes the three-stage framework (claim detection, evidence
retrieval, claim verification) but no evidence quality grading.
Bias assessment:
Institutional: Academic (Cambridge).
Ideological: Technical/neutral.
Temporal: 2022; field has advanced since.
Geographic: International scope.
Key contribution: The automated fact-checking pipeline focuses on verdict
prediction and justification production, but does NOT include any
GRADE-comparable evidence quality assessment stage. Evidence is retrieved and
used, but its quality/reliability is not formally graded.
Reliability: HIGH — Established shared task series.
Relevance: MODERATE — Shows current state of computational fact-checking
tasks; no evidence quality grading task exists.
Key contribution: The shared task pipeline includes check-worthiness,
previously fact-checked claim detection, evidence retrieval, and claim
verification — but no evidence quality assessment subtask.
Reliability: HIGH — The primary industry standard; 31 assessment criteria
reviewed by independent assessors.
Relevance: DIRECT — The closest thing to a shared fact-checking
methodology standard.
Bias assessment:
Funding: Poynter Institute.
Institutional: Industry body.
Ideological: Pro-fact-checking.
Geographic: International.
Key contribution: CRITICAL FINDING — The IFCN Code addresses process
transparency (5 commitments: nonpartisanship, source transparency, funding
transparency, methodology transparency, corrections policy). It does NOT
include: hierarchical evidence quality scales, calibrated confidence language,
structured bias assessment of sources, or source reliability tiering. It
prescribes WHAT to disclose, not HOW to evaluate evidence quality.
Reliability: HIGH — Peer-reviewed rebuttal in the same journal.
Relevance: HIGH — Defends fact-checking epistemology against Uscinski &
Butler but does not propose a formal evidence evaluation framework.
Key contribution: Argues that dedicated fact-checkers (PolitiFact,
FactCheck.org) use more rigorous methods than Uscinski & Butler's sample
suggests, but the defense is based on professional practice norms, not a
formal evidence grading system.
Relevance: MODERATE — Structured data schema for encoding fact-check
verdicts, not for evaluating evidence quality.
Key contribution: ClaimReview captures WHAT was claimed, WHO claimed it,
and WHAT the verdict is. It does not capture HOW the evidence was evaluated,
what quality the evidence was, or what confidence level applies to the
verdict. Each fact-checker supplies their own rating system.
Key contribution: Confirms that ClaimReview requires a reviewRating (with
bestRating/worstRating bounds and a textual verdict), but each publisher
defines its own scale.
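A minimal ClaimReview object makes the gap concrete: the schema.org vocabulary records the claim, the claimant, and a publisher-defined rating, but has no field for evidence quality, confidence, or evaluation method. All specific values below are invented for illustration:

```python
# Illustrative ClaimReview markup, expressed as a Python dict.
claim_review = {
    "@context": "https://schema.org",
    "@type": "ClaimReview",
    "claimReviewed": "Example claim text",  # WHAT was claimed
    "itemReviewed": {
        "@type": "Claim",
        "author": {"@type": "Person", "name": "Example Claimant"},  # WHO
    },
    "author": {"@type": "Organization", "name": "Example Fact-Checker"},
    "reviewRating": {  # the verdict, on a publisher-defined scale
        "@type": "Rating",
        "ratingValue": 2,
        "bestRating": 6,    # scale bounds vary by publisher
        "worstRating": 1,
        "alternateName": "Mostly False",
    },
    # No standard property exists for: evidence quality grade,
    # confidence level, or how the evidence was evaluated.
}
```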
Reliability: HIGH — Official US Government directive.
Relevance: REFERENCE — The comparison benchmark. Nine tradecraft
standards including source credibility description, uncertainty expression,
assumption identification, alternative analysis, and logic/reasoning.
Key contribution: Establishes the standard that fact-checking lacks: formal
requirements for source credibility description, calibrated uncertainty
language, explicit assumption identification, and consideration of
alternatives.
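As an illustration of what calibrated uncertainty language looks like in practice, the sketch below maps a numeric probability to the intelligence community's standard expressions of likelihood. The bands follow the ICD 203 likelihood table; the helper function itself is hypothetical:

```python
# ICD 203 "expressions of likelihood" bands (upper bound, term).
LIKELIHOOD_BANDS = [
    (0.05, "almost no chance / remote"),
    (0.20, "very unlikely / highly improbable"),
    (0.45, "unlikely / improbable"),
    (0.55, "roughly even chance / roughly even odds"),
    (0.80, "likely / probable"),
    (0.95, "very likely / highly probable"),
    (1.00, "almost certain / nearly certain"),
]

def calibrated_term(p: float) -> str:
    """Map a probability in [0, 1] to its ICD 203 expression of likelihood."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    for upper, term in LIKELIHOOD_BANDS:
        if p <= upper:
            return term
    return LIKELIHOOD_BANDS[-1][1]
```

Fact-checking verdict scales have no analogous mapping from evidence strength to calibrated language, which is the contrast this entry draws.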
Reliability: HIGH — Peer-reviewed (ACL Findings); comprehensive survey.
Relevance: HIGH — Explicitly notes that no datasets account for
"differing levels and strength of evidence."
Key contribution: CRITICAL — Identifies as a gap that existing
fact-checking resources do not consider "disagreeing evidence, or differing
levels and strength of evidence" — exactly what GRADE provides in medicine.
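For contrast, the GRADE logic the entry invokes can be sketched: evidence starts at a certainty level set by study design, moves down per downgrade factor and up per upgrade factor, and ends at one of four levels. This is a simplified illustration of the medical framework, not a proposal for fact-checking:

```python
# Simplified GRADE-style certainty grading (illustrative only).
LEVELS = ["very low", "low", "moderate", "high"]

def grade_certainty(randomized: bool, downgrades: int, upgrades: int) -> str:
    """Start at 'high' for randomized trials, 'low' for observational
    studies; subtract one level per downgrade factor (risk of bias,
    inconsistency, indirectness, imprecision, publication bias) and add
    one per upgrade factor (large effect, dose-response gradient, ...)."""
    level = 3 if randomized else 1
    level = max(0, min(3, level - downgrades + upgrades))
    return LEVELS[level]
```

Nothing comparable exists in fact-checking datasets or workflows, which is precisely the gap the survey identifies.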
Relevance: MODERATE — Documents inter-rater disagreement between
fact-checkers.
Key contribution: Demonstrates that fact-checkers can reach different
conclusions on the same claim, particularly in ambiguous cases — evidence
that no shared evidence evaluation standard exists.
Reliability: HIGH — Peer-reviewed; broad sample (40 organizations, 50+
countries).
Relevance: DIRECT — Examines the actual epistemology fact-checkers employ.
Bias assessment:
Selection: Large cross-national sample.
Ideological: Neutral/descriptive.
Key contribution: Found "isomorphic norms, practices, and structures" but
these are INFORMAL shared beliefs (confidence in objective truth, transparent
process, reproducibility), not a FORMAL codified framework. Fact-checkers
share an epistemology through institutional isomorphism, not through an
explicit standard.
Reliability: HIGH — Established academic research center.
Relevance: MODERATE — Documents 417+ active fact-checkers globally with
diverse rating scales.
Key contribution: The diversity of rating scales across 417+ organizations
(from PolitiFact's six-point scale to India Today's three-crow system)
demonstrates the absence of any standardized evidence evaluation framework.