

Research R0031 — Plural Voice Claims (Blind)
Mode: Claim
Run date: 2026-03-29
Claims: 14
Prompt: Unified Research Standard 1.0-draft
Model: claude-opus-4-6 (1M context)

Blind rerun of 14 claims from the Plural Voice article. 11 claims were confirmed (Almost certain or Very likely) and 3 were confirmed with caveats (Likely, due to attribution errors or the NeurIPS exception). No claims were found to be materially wrong.

Claims

C001 — KPMG AI trust statistics — Almost certain (95-99%)

Claim: Public trust in AI sits at 46% globally, with 39% in advanced economies versus 57% in emerging ones (attributed to KPMG/University of Melbourne, 48,340 respondents across 47 countries).

Verdict: All four sub-assertions confirmed by primary sources.

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 2 · Searches: 1

C002 — Ipsos use despite trust — Likely (55-80%)

Claim: 66% of people use AI despite the majority not trusting it (attributed to Ipsos, 31-country survey).

Verdict: The phenomenon is real, but the attribution is wrong: the 66% figure comes from the KPMG/Melbourne study (47 countries), not an Ipsos 31-country survey.

Hypothesis Status Probability
H1: Accurate as stated Eliminated
H2: Partially correct Supported 55-80%
H3: Materially wrong Eliminated

Confidence: Medium · Sources: 3 · Searches: 2

C003 — Workers hide AI use — Almost certain (95-99%)

Claim: 57% of workers hide their AI use at work (attributed to KPMG/University of Melbourne study of over 48,000 workers across 47 countries).

Verdict: Confirmed. KPMG press release states "over half (57%) of employees say they hide their use of AI."

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 2 · Searches: 1

C004 — Students AI submissions — Very likely (80-95%)

Claim: 22% of students admit to submitting AI-generated content as their own work.

Verdict: Confirmed from a BestColleges March 2023 survey (1,000 students). The figure is likely outdated: by October 2023, 56% of students reported AI use on assignments.

Hypothesis Status Probability
H1: Accurate as stated Supported 80-95%
H2: Partially correct Inconclusive
H3: Materially wrong Eliminated

Confidence: Medium · Sources: 1 · Searches: 1

C005 — UK misconduct cases — Almost certain (95-99%)

Claim: UK universities reported over 7,000 formal academic misconduct cases involving AI in a single year (2023-2024).

Verdict: Confirmed. Guardian FOI investigation of 131 universities found ~7,000 AI-related cases, 5.1 per 1,000 students.

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 2 · Searches: 1

C006 — Journals ban AI author — Likely (55-80%)

Claim: Every major journal and conference — Nature, Science, ACM, IEEE, NeurIPS, and all five major academic publishers — has issued a formal policy prohibiting AI as an author.

Verdict: Mostly correct, but NeurIPS does NOT prohibit AI authorship; it takes a permissive approach. All other named entities do prohibit it.

Hypothesis Status Probability
H1: Accurate as stated Eliminated
H2: Partially correct Supported 55-80%
H3: Materially wrong Eliminated

Confidence: High · Sources: 3 · Searches: 2

C007 — NeurIPS methodology only — Very likely (80-95%)

Claim: NeurIPS requires AI disclosure only when AI is part of the methodology.

Verdict: Substantially accurate. NeurIPS requires disclosure when LLMs are important to methodology, not for writing/editing. Slightly broader scope than "only methodology."

Hypothesis Status Probability
H1: Accurate as stated Supported 80-95%
H2: Partially correct Inconclusive
H3: Materially wrong Eliminated

Confidence: High · Sources: 1 · Searches: 1

C008 — Science full prompts — Almost certain (95-99%)

Claim: Science requires full prompts to be included in the methods section.

Verdict: Confirmed. Science/AAAS editorial policy requires "the full prompt used in the production of the work, as well as the AI tool and its version."

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 1 · Searches: 1

C009 — ACM/IEEE acknowledgment — Almost certain (95-99%)

Claim: ACM and IEEE require acknowledgment of AI use with varying specificity.

Verdict: Confirmed. ACM requires Acknowledgements section disclosure. IEEE requires more specific disclosure (AI system, sections, application method).

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 2 · Searches: 1

C010 — Kurosawa Shakespeare — Almost certain (95-99%)

Claim: Kurosawa's Throne of Blood (1957) is based on Macbeth, The Bad Sleep Well (1960) is based on Hamlet, and Ran (1985) is based on King Lear.

Verdict: All three attributions confirmed. Well-established in film scholarship.

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 2 · Searches: 1

C011 — IBM Attribution Toolkit — Almost certain (95-99%)

Claim: IBM published an AI Attribution Toolkit in 2025 that captures contribution type, amount, and review process, and describes itself as "a first pass" at a voluntary standard.

Verdict: Confirmed in all respects. Published May 2025. Captures three dimensions. "A first pass" language confirmed verbatim.

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 1 · Searches: 1

C012 — AIA icon system — Likely (55-80%)

Claim: The AIA icon system was proposed by Avery, Abril, and del Riego with graduated visual indicators for AI involvement levels: Generated, Edited, Suggested (published in Journal of Technology, Innovation & Practice, 2024).

Verdict: Authors, year, and general concept confirmed. The journal name is WRONG: it should be "Northwestern Journal of Technology and Intellectual Property." The specific icon labels (Generated, Edited, Suggested) could not be confirmed.

Hypothesis Status Probability
H1: Accurate as stated Eliminated
H2: Partially correct Supported 55-80%
H3: Materially wrong Eliminated

Confidence: Medium · Sources: 1 · Searches: 1

C013 — CHI 2025 AI credit — Almost certain (95-99%)

Claim: Research presented at CHI 2025 (He et al.) found that AI receives less credit than humans for equivalent work.

Verdict: Confirmed. He, Houde, and Weisz (IBM Research) found participants assigned less authorship credit to AI than humans for equivalent contributions.

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 1 · Searches: 1

C014 — CRediT taxonomy — Almost certain (95-99%)

Claim: The CRediT taxonomy (NISO Z39.104-2022) covers 14 types of human contribution but has no provision for AI.

Verdict: Confirmed. Exactly 14 roles. No mention of AI in the official documentation.

Hypothesis Status Probability
H1: Accurate as stated Supported 95-99%
H2: Partially correct Eliminated
H3: Materially wrong Eliminated

Confidence: High · Sources: 1 · Searches: 1

Collection Analysis

Cross-Cutting Patterns

Pattern · Claims affected · Significance
Attribution errors · C002, C012 · Two claims have incorrect source attribution: one misattributes a statistic to the wrong survey organization, the other gets the journal name wrong
NeurIPS exception pattern · C006, C007 · NeurIPS consistently takes a more permissive approach to AI than other venues; this is relevant across multiple claims
KPMG/Melbourne as primary source · C001, C002, C003 · Three claims draw from the same KPMG/Melbourne 2025 study (source concentration risk)
Temporal decay · C004 · The 22% student figure from March 2023 is likely severely outdated

Collection Statistics

Metric Value
Claims investigated 14
Fully confirmed (Almost certain) 9 (C001, C003, C005, C008, C009, C010, C011, C013, C014)
Confirmed with nuance (Very likely) 2 (C004, C007)
Confirmed with caveats (Likely) 3 (C002, C006, C012)
Materially wrong 0
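The verdict counts above follow mechanically from each claim's supported-hypothesis probability range, using the bands stated in the verdicts (Likely 55-80%, Very likely 80-95%, Almost certain 95-99%). A minimal Python sketch of that banding, assuming the lower bound of each range is the deciding value; the function and variable names are illustrative, not from the report:

```python
# Illustrative sketch: map each claim's supported-hypothesis probability
# (lower bound of its reported range) to a verdict band, then tally.
# Band cutoffs are taken from the verdict lines above.
from collections import Counter

def verdict_band(lower_bound: int) -> str:
    """Return the report's verdict label for a probability lower bound (%)."""
    if lower_bound >= 95:
        return "Almost certain"   # 95-99%
    if lower_bound >= 80:
        return "Very likely"      # 80-95%
    if lower_bound >= 55:
        return "Likely"           # 55-80%
    return "Below threshold"

# Lower bounds of the supported hypothesis for C001..C014, as reported above.
lower_bounds = [95, 55, 95, 80, 95, 55, 80, 95, 95, 95, 95, 55, 95, 95]

tally = Counter(verdict_band(p) for p in lower_bounds)
print(tally.most_common())
# [('Almost certain', 9), ('Likely', 3), ('Very likely', 2)]
```

This reproduces the 9 / 2 / 3 split in the statistics table.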

Source Independence Assessment

The evidence base shows moderate source concentration. Three claims (C001, C002, C003) rely heavily on the KPMG/University of Melbourne 2025 study. This is appropriate because the claims explicitly cite this study, but it means a retraction or correction of the KPMG study would affect multiple claims simultaneously.

The academic publishing policy claims (C006-C009) are well-diversified — each venue's policy was verified independently against its own official documentation.

The IBM Research connection spans two claims (C011, C013) — both the AI Attribution Toolkit and the CHI 2025 credit study are IBM Research products. This is not a bias concern but represents the same research group working across related topics.

Collection Gaps

Gap · Impact · Mitigation
Full PDF of KPMG/Melbourne report not reviewed · Low · Key statistics confirmed via HTML pages
Ipsos AI Monitor editions not exhaustively reviewed · Medium · Could not conclusively rule out a 31-country Ipsos report with the 66% figure
AIA paper full text not accessed · Medium · Journal name confirmed but specific icon labels unverified
Temporal decay of student AI usage data · High · The 22% figure (March 2023) is likely severely outdated
Science editorial policy page returned 403 · Low · Policy confirmed via multiple secondary sources

Collection Self-Audit

Domain · Rating · Notes
Eligibility criteria · Low risk · Consistent criteria applied across all 14 claims
Search comprehensiveness · Some concerns · Some claims could benefit from additional searches; constrained by tool access
Evaluation consistency · Low risk · Same scoring framework applied to all sources
Synthesis fairness · Low risk · Attribution errors surfaced and prominently reported despite the researcher's interest in the claims being correct

Resources

Summary

Metric Value
Claims investigated 14
Files produced 335
Sources scored 24
Evidence extracts 24
Results dispositioned 32 selected + 128 rejected = 160 total
Duration (wall clock) 21m 14s
Tool uses (total) 115

Tool Breakdown

Tool Uses Purpose
WebSearch 16 Search queries across all claims
WebFetch 10 Page content retrieval for key sources
Write ~50 File creation (core analytical files)
Read 2 Reading methodology and output format specs
Edit 0 No edits needed
Bash ~15 Directory creation, batch file generation

Token Distribution

Category Tokens
Input (context) ~150,000
Output (generation) ~80,000
Total ~230,000