R0031/2026-03-29¶
Blind rerun of 14 claims from the Plural Voice article. 11 claims confirmed (Almost certain or Very likely), 3 confirmed with caveats (Likely — attribution errors or NeurIPS exception). No claims found to be materially wrong.
Claims¶
C001 — KPMG AI trust statistics — Almost certain (95-99%)
Claim: Public trust in AI sits at 46% globally, with 39% in advanced economies versus 57% in emerging ones (attributed to KPMG/University of Melbourne, 48,340 respondents across 47 countries).
Verdict: All four sub-assertions confirmed by primary sources.
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Accurate as stated | — | — |
| H1: Accurate as stated | Supported | 95-99% |
| H2: Partially correct | Eliminated | — |
| H3: Materially wrong | Eliminated | — |
Confidence: High · Sources: 2 · Searches: 1
C002 — Ipsos use despite trust — Likely (55-80%)
Claim: 66% of people use AI despite the majority not trusting it (attributed to Ipsos, 31-country survey).
Verdict: Phenomenon real but attribution wrong. The 66% figure comes from KPMG/Melbourne (47 countries), not an Ipsos 31-country survey.
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Accurate as stated | Eliminated | — |
| H2: Partially correct | Supported | 55-80% |
| H3: Materially wrong | Eliminated | — |
Confidence: Medium · Sources: 3 · Searches: 2
C003 — Workers hide AI use — Almost certain (95-99%)
Claim: 57% of workers hide their AI use at work (attributed to KPMG/University of Melbourne study of over 48,000 workers across 47 countries).
Verdict: Confirmed. KPMG press release states "over half (57%) of employees say they hide their use of AI."
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Accurate as stated | Supported | 95-99% |
| H2: Partially correct | Eliminated | — |
| H3: Materially wrong | Eliminated | — |
Confidence: High · Sources: 2 · Searches: 1
C004 — Students AI submissions — Very likely (80-95%)
Claim: 22% of students admit to submitting AI-generated content as their own work.
Verdict: Confirmed from BestColleges March 2023 survey (1,000 students). Figure likely outdated — by October 2023, 56% reported AI use on assignments.
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Accurate as stated | Supported | 80-95% |
| H2: Partially correct | Inconclusive | — |
| H3: Materially wrong | Eliminated | — |
Confidence: Medium · Sources: 1 · Searches: 1
C005 — UK misconduct cases — Almost certain (95-99%)
Claim: UK universities reported over 7,000 formal academic misconduct cases involving AI in a single year (2023-2024).
Verdict: Confirmed. Guardian FOI investigation of 131 universities found ~7,000 AI-related cases, 5.1 per 1,000 students.
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Accurate as stated | Supported | 95-99% |
| H2: Partially correct | Eliminated | — |
| H3: Materially wrong | Eliminated | — |
Confidence: High · Sources: 2 · Searches: 1
C006 — Journals ban AI author — Likely (55-80%)
Claim: Every major journal and conference — Nature, Science, ACM, IEEE, NeurIPS, and all five major academic publishers — has issued a formal policy prohibiting AI as an author.
Verdict: Mostly correct but NeurIPS does NOT prohibit AI authorship. NeurIPS takes a permissive approach. All other named entities do prohibit it.
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Accurate as stated | Eliminated | — |
| H2: Partially correct | Supported | 55-80% |
| H3: Materially wrong | Eliminated | — |
Confidence: High · Sources: 3 · Searches: 2
C007 — NeurIPS methodology only — Very likely (80-95%)
Claim: NeurIPS requires AI disclosure only when AI is part of the methodology.
Verdict: Substantially accurate. NeurIPS requires disclosure when LLMs are important to methodology, not for writing/editing. Slightly broader scope than "only methodology."
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Accurate as stated | Supported | 80-95% |
| H2: Partially correct | Inconclusive | — |
| H3: Materially wrong | Eliminated | — |
Confidence: High · Sources: 1 · Searches: 1
C008 — Science full prompts — Almost certain (95-99%)
Claim: Science requires full prompts to be included in the methods section.
Verdict: Confirmed. Science/AAAS editorial policy requires "the full prompt used in the production of the work, as well as the AI tool and its version."
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Accurate as stated | Supported | 95-99% |
| H2: Partially correct | Eliminated | — |
| H3: Materially wrong | Eliminated | — |
Confidence: High · Sources: 1 · Searches: 1
C009 — ACM/IEEE acknowledgment — Almost certain (95-99%)
Claim: ACM and IEEE require acknowledgment of AI use with varying specificity.
Verdict: Confirmed. ACM requires Acknowledgements section disclosure. IEEE requires more specific disclosure (AI system, sections, application method).
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Accurate as stated | Supported | 95-99% |
| H2: Partially correct | Eliminated | — |
| H3: Materially wrong | Eliminated | — |
Confidence: High · Sources: 2 · Searches: 1
C010 — Kurosawa Shakespeare — Almost certain (95-99%)
Claim: Kurosawa's Throne of Blood (1957) is based on Macbeth, The Bad Sleep Well (1960) is based on Hamlet, and Ran (1985) is based on King Lear.
Verdict: All three attributions confirmed. Well-established in film scholarship.
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Accurate as stated | Supported | 95-99% |
| H2: Partially correct | Eliminated | — |
| H3: Materially wrong | Eliminated | — |
Confidence: High · Sources: 2 · Searches: 1
C011 — IBM Attribution Toolkit — Almost certain (95-99%)
Claim: IBM published an AI Attribution Toolkit in 2025 that captures contribution type, amount, and review process, and describes itself as "a first pass" at a voluntary standard.
Verdict: Confirmed in all respects. Published May 2025. Captures three dimensions. "A first pass" language confirmed verbatim.
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Accurate as stated | Supported | 95-99% |
| H2: Partially correct | Eliminated | — |
| H3: Materially wrong | Eliminated | — |
Confidence: High · Sources: 1 · Searches: 1
C012 — AIA icon system — Likely (55-80%)
Claim: The AIA icon system was proposed by Avery, Abril, and del Riego with graduated visual indicators for AI involvement levels: Generated, Edited, Suggested (published in Journal of Technology, Innovation & Practice, 2024).
Verdict: Authors, year, and general concept confirmed. Journal name is WRONG — should be "Northwestern Journal of Technology and Intellectual Property." Specific icon labels (Generated, Edited, Suggested) could not be confirmed.
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Accurate as stated | Eliminated | — |
| H2: Partially correct | Supported | 55-80% |
| H3: Materially wrong | Eliminated | — |
Confidence: Medium · Sources: 1 · Searches: 1
C013 — CHI 2025 AI credit — Almost certain (95-99%)
Claim: Research presented at CHI 2025 (He et al.) found that AI receives less credit than humans for equivalent work.
Verdict: Confirmed. He, Houde, and Weisz (IBM Research) found participants assigned less authorship credit to AI than humans for equivalent contributions.
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Accurate as stated | Supported | 95-99% |
| H2: Partially correct | Eliminated | — |
| H3: Materially wrong | Eliminated | — |
Confidence: High · Sources: 1 · Searches: 1
C014 — CRediT taxonomy — Almost certain (95-99%)
Claim: The CRediT taxonomy (NISO Z39.104-2022) covers 14 types of human contribution but has no provision for AI.
Verdict: Confirmed. Exactly 14 roles. No mention of AI in the official documentation.
| Hypothesis | Status | Probability |
|---|---|---|
| H1: Accurate as stated | Supported | 95-99% |
| H2: Partially correct | Eliminated | — |
| H3: Materially wrong | Eliminated | — |
Confidence: High · Sources: 1 · Searches: 1
Collection Analysis¶
Cross-Cutting Patterns¶
| Pattern | Claims Affected | Significance |
|---|---|---|
| Attribution errors | C002, C012 | Two claims have incorrect source attribution — one misattributes a statistic to the wrong survey organization, another gets the journal name wrong |
| NeurIPS exception pattern | C006, C007 | NeurIPS consistently takes a more permissive approach to AI than other venues — this is relevant across multiple claims |
| KPMG/Melbourne as primary source | C001, C002, C003 | Three claims draw from the same KPMG/Melbourne 2025 study — source concentration risk |
| Temporal decay | C004 | The 22% student figure from March 2023 is likely severely outdated |
Collection Statistics¶
| Metric | Value |
|---|---|
| Claims investigated | 14 |
| Fully confirmed (Almost certain) | 9 (C001, C003, C005, C008, C009, C010, C011, C013, C014) |
| Confirmed with nuance (Very likely) | 2 (C004, C007) |
| Confirmed with caveats (Likely) | 3 (C002, C006, C012) |
| Materially wrong | 0 |
Source Independence Assessment¶
The evidence base shows moderate source concentration. Three claims (C001, C002, C003) rely heavily on the KPMG/University of Melbourne 2025 study. This is appropriate because the claims explicitly cite this study, but it means a retraction or correction of the KPMG study would affect multiple claims simultaneously.
The academic publishing policy claims (C006-C009) are well-diversified — each venue's policy was verified independently against its own official documentation.
The IBM Research connection spans two claims (C011, C013) — both the AI Attribution Toolkit and the CHI 2025 credit study are IBM Research products. This is not a bias concern but represents the same research group working across related topics.
Collection Gaps¶
| Gap | Impact | Mitigation |
|---|---|---|
| Full PDF of KPMG/Melbourne report not reviewed | Low | Key statistics confirmed via HTML pages |
| Ipsos AI Monitor editions not exhaustively reviewed | Medium | Could not conclusively rule out a 31-country Ipsos report with 66% figure |
| AIA paper full text not accessed | Medium | Journal name confirmed but specific icon labels unverified |
| Temporal decay of student AI usage data | High | The 22% figure (March 2023) is likely severely outdated |
| Science editorial policy page returned 403 | Low | Policy confirmed via multiple secondary sources |
Collection Self-Audit¶
| Domain | Rating | Notes |
|---|---|---|
| Eligibility criteria | Low risk | Consistent criteria applied across all 14 claims |
| Search comprehensiveness | Some concerns | Some claims could benefit from additional searches; constrained by tool access |
| Evaluation consistency | Low risk | Same scoring framework applied to all sources |
| Synthesis fairness | Low risk | Attribution errors surfaced and prominently reported despite researcher's interest in claims being correct |
Resources¶
Summary¶
| Metric | Value |
|---|---|
| Claims investigated | 14 |
| Files produced | 335 |
| Sources scored | 24 |
| Evidence extracts | 24 |
| Results dispositioned | 32 selected + 128 rejected = 160 total |
| Duration (wall clock) | 21m 14s |
| Tool uses (total) | 115 |
Tool Breakdown¶
| Tool | Uses | Purpose |
|---|---|---|
| WebSearch | 16 | Search queries across all claims |
| WebFetch | 10 | Page content retrieval for key sources |
| Write | ~50 | File creation (core analytical files) |
| Read | 2 | Reading methodology and output format specs |
| Edit | 0 | No edits needed |
| Bash | ~15 | Directory creation, batch file generation |
Token Distribution¶
| Category | Tokens |
|---|---|
| Input (context) | ~150,000 |
| Output (generation) | ~80,000 |
| Total | ~230,000 |