

Research R0031 — Plural Voice Claims (Blind)
Mode: Claim
Run date: 2026-03-29
Claims: 14
Prompt: Unified Research Standard 1.0-draft
Model: claude-opus-4-6 (1M context)

Blind rerun of 14 claims from the Plural Voice article. 11 claims were confirmed (Almost certain or Very likely) and 3 were confirmed with caveats (Likely, due to attribution errors or the NeurIPS exception). No claims were found to be materially wrong.

Claims

C001 — KPMG AI trust statistics — Almost certain (95-99%)

Claim: Public trust in AI sits at 46% globally, with 39% in advanced economies versus 57% in emerging ones (attributed to KPMG/University of Melbourne, 48,340 respondents across 47 countries).

Verdict: All four sub-assertions confirmed by primary sources.

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 2 · Searches: 1

C002 — Ipsos use despite trust — Likely (55-80%)

Claim: 66% of people use AI despite the majority not trusting it (attributed to Ipsos, 31-country survey).

Verdict: The phenomenon is real, but the attribution is wrong: the 66% figure comes from the KPMG/Melbourne study (47 countries), not an Ipsos 31-country survey.

Hypothesis Status Probability
H1: Accurate as stated Eliminated
H2: Partially correct Supported 55-80%
H3: Materially wrong Eliminated

Confidence: Medium · Sources: 3 · Searches: 2

C003 — Workers hide AI use — Almost certain (95-99%)

Claim: 57% of workers hide their AI use at work (attributed to KPMG/University of Melbourne study of over 48,000 workers across 47 countries).

Verdict: Confirmed. KPMG press release states "over half (57%) of employees say they hide their use of AI."

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 2 · Searches: 1

C004 — Students AI submissions — Very likely (80-95%)

Claim: 22% of students admit to submitting AI-generated content as their own work.

Verdict: Confirmed from a BestColleges March 2023 survey (1,000 students). The figure is likely outdated: by October 2023, 56% of students reported AI use on assignments.

Hypothesis Status Probability
H1: Accurate as stated Supported 80-95%
H2: Partially correct Inconclusive
H3: Materially wrong Eliminated

Confidence: Medium · Sources: 1 · Searches: 1

C005 — UK misconduct cases — Almost certain (95-99%)

Claim: UK universities reported over 7,000 formal academic misconduct cases involving AI in a single year (2023-2024).

Verdict: Confirmed. Guardian FOI investigation of 131 universities found ~7,000 AI-related cases, 5.1 per 1,000 students.

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 2 · Searches: 1

C006 — Journals ban AI author — Likely (55-80%)

Claim: Every major journal and conference — Nature, Science, ACM, IEEE, NeurIPS, and all five major academic publishers — has issued a formal policy prohibiting AI as an author.

Verdict: Mostly correct, but NeurIPS does NOT prohibit AI authorship; it takes a permissive approach. All other named entities do prohibit it.

Hypothesis Status Probability
H1: Accurate as stated Eliminated
H2: Partially correct Supported 55-80%
H3: Materially wrong Eliminated

Confidence: High · Sources: 3 · Searches: 2

C007 — NeurIPS methodology only — Very likely (80-95%)

Claim: NeurIPS requires AI disclosure only when AI is part of the methodology.

Verdict: Substantially accurate. NeurIPS requires disclosure when LLMs are important to methodology, not for writing/editing. Slightly broader scope than "only methodology."

Hypothesis Status Probability
H1: Accurate as stated Supported 80-95%
H2: Partially correct Inconclusive
H3: Materially wrong Eliminated

Confidence: High · Sources: 1 · Searches: 1

C008 — Science full prompts — Almost certain (95-99%)

Claim: Science requires full prompts to be included in the methods section.

Verdict: Confirmed. Science/AAAS editorial policy requires "the full prompt used in the production of the work, as well as the AI tool and its version."

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 1 · Searches: 1

C009 — ACM/IEEE acknowledgment — Almost certain (95-99%)

Claim: ACM and IEEE require acknowledgment of AI use with varying specificity.

Verdict: Confirmed. ACM requires Acknowledgements section disclosure. IEEE requires more specific disclosure (AI system, sections, application method).

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 2 · Searches: 1

C010 — Kurosawa Shakespeare — Almost certain (95-99%)

Claim: Kurosawa's Throne of Blood (1957) is based on Macbeth, The Bad Sleep Well (1960) is based on Hamlet, and Ran (1985) is based on King Lear.

Verdict: All three attributions confirmed. Well-established in film scholarship.

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 2 · Searches: 1

C011 — IBM Attribution Toolkit — Almost certain (95-99%)

Claim: IBM published an AI Attribution Toolkit in 2025 that captures contribution type, amount, and review process, and describes itself as "a first pass" at a voluntary standard.

Verdict: Confirmed in all respects. Published May 2025. Captures three dimensions. "A first pass" language confirmed verbatim.

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 1 · Searches: 1

C012 — AIA icon system — Likely (55-80%)

Claim: The AIA icon system was proposed by Avery, Abril, and del Riego with graduated visual indicators for AI involvement levels: Generated, Edited, Suggested (published in Journal of Technology, Innovation & Practice, 2024).

Verdict: Authors, year, and general concept confirmed. The journal name is WRONG: it should be "Northwestern Journal of Technology and Intellectual Property." The specific icon labels (Generated, Edited, Suggested) could not be confirmed.

Hypothesis Status Probability
H1: Accurate as stated Eliminated
H2: Partially correct Supported 55-80%
H3: Materially wrong Eliminated

Confidence: Medium · Sources: 1 · Searches: 1

C013 — CHI 2025 AI credit — Almost certain (95-99%)

Claim: Research presented at CHI 2025 (He et al.) found that AI receives less credit than humans for equivalent work.

Verdict: Confirmed. He, Houde, and Weisz (IBM Research) found participants assigned less authorship credit to AI than humans for equivalent contributions.

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 1 · Searches: 1

C014 — CRediT taxonomy — Almost certain (95-99%)

Claim: The CRediT taxonomy (NISO Z39.104-2022) covers 14 types of human contribution but has no provision for AI.

Verdict: Confirmed. Exactly 14 roles. No mention of AI in the official documentation.

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 1 · Searches: 1

Collection Analysis

Cross-Cutting Patterns

Pattern · Claims affected · Significance
Attribution errors · C002, C012 · Two claims have incorrect source attribution: one misattributes a statistic to the wrong survey organization, the other gets the journal name wrong
NeurIPS exception pattern · C006, C007 · NeurIPS consistently takes a more permissive approach to AI than other venues; this is relevant across multiple claims
KPMG/Melbourne as primary source · C001, C002, C003 · Three claims draw from the same KPMG/Melbourne 2025 study (source concentration risk)
Temporal decay · C004 · The 22% student figure from March 2023 is likely severely outdated

Collection Statistics

Metric Value
Claims investigated 14
Fully confirmed (Almost certain) 9 (C001, C003, C005, C008, C009, C010, C011, C013, C014)
Confirmed with nuance (Very likely) 2 (C004, C007)
Confirmed with caveats (Likely) 3 (C002, C006, C012)
Materially wrong 0
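The verdict counts above follow mechanically from each claim's supported-hypothesis probability range, using the bands stated in the verdicts (Likely 55-80%, Very likely 80-95%, Almost certain 95-99%). A minimal Python sketch of that banding, assuming the lower bound of each range is the deciding value; the function and variable names are illustrative, not from the report:

```python
# Illustrative sketch: map each claim's supported-hypothesis probability
# (lower bound of its reported range) to a verdict band, then tally.
# Band cutoffs are taken from the verdict lines above.
from collections import Counter

def verdict_band(lower_bound: int) -> str:
    """Return the report's verdict label for a probability lower bound (%)."""
    if lower_bound >= 95:
        return "Almost certain"   # 95-99%
    if lower_bound >= 80:
        return "Very likely"      # 80-95%
    if lower_bound >= 55:
        return "Likely"           # 55-80%
    return "Below threshold"

# Lower bounds of the supported hypothesis for C001..C014, as reported above.
lower_bounds = [95, 55, 95, 80, 95, 55, 80, 95, 95, 95, 95, 55, 95, 95]

tally = Counter(verdict_band(p) for p in lower_bounds)
print(tally.most_common())
# [('Almost certain', 9), ('Likely', 3), ('Very likely', 2)]
```

This reproduces the 9 / 2 / 3 split in the statistics table.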

Source Independence Assessment

The evidence base shows moderate source concentration. Three claims (C001, C002, C003) rely heavily on the KPMG/University of Melbourne 2025 study. This is appropriate because the claims explicitly cite this study, but it means a retraction or correction of the KPMG study would affect multiple claims simultaneously.

The academic publishing policy claims (C006-C009) are well-diversified — each venue's policy was verified independently against its own official documentation.

The IBM Research connection spans two claims (C011, C013) — both the AI Attribution Toolkit and the CHI 2025 credit study are IBM Research products. This is not a bias concern but represents the same research group working across related topics.

Collection Gaps

Gap · Impact · Mitigation
Full PDF of KPMG/Melbourne report not reviewed · Low · Key statistics confirmed via HTML pages
Ipsos AI Monitor editions not exhaustively reviewed · Medium · Could not conclusively rule out a 31-country Ipsos report with the 66% figure
AIA paper full text not accessed · Medium · Journal name confirmed but specific icon labels unverified
Temporal decay of student AI usage data · High · The 22% figure (March 2023) is likely severely outdated
Science editorial policy page returned 403 · Low · Policy confirmed via multiple secondary sources

Collection Self-Audit

Domain · Rating · Notes
Eligibility criteria · Low risk · Consistent criteria applied across all 14 claims
Search comprehensiveness · Some concerns · Some claims could benefit from additional searches; constrained by tool access
Evaluation consistency · Low risk · Same scoring framework applied to all sources
Synthesis fairness · Low risk · Attribution errors surfaced and prominently reported despite the researcher's interest in the claims being correct

Resources

Summary

Metric Value
Claims investigated 14
Files produced 335
Sources scored 24
Evidence extracts 24
Results dispositioned 32 selected + 128 rejected = 160 total
Duration (wall clock) 21m 14s
Tool uses (total) 115

Tool Breakdown

Tool Uses Purpose
WebSearch 16 Search queries across all claims
WebFetch 10 Page content retrieval for key sources
Write ~50 File creation (core analytical files)
Read 2 Reading methodology and output format specs
Edit 0 No edits needed
Bash ~15 Directory creation, batch file generation

Token Distribution

Category Tokens
Input (context) ~150,000
Output (generation) ~80,000
Total ~230,000