R0027/2026-03-26/Q001/SRC07/E01
Per-language accuracy data for Hindi, Mandarin, and Arabic — all below English
URL: https://arxiv.org/html/2504.17720v2
Extract
Average performance across educational tasks: English 70.9%, Arabic 67.4%, German 66.8%, Farsi 66.2%, Mandarin 64.6%, Hindi 63.1%, Czech 55.3%, Telugu 49.7%. Performance correlates with CommonCrawl representation: Telugu (0.02% of CommonCrawl) showed the poorest results.
Relevance to Hypotheses
| Hypothesis | Relationship | Rationale |
|---|---|---|
| H1 | Supports | Quantifies the gap for the specific languages asked about in Q001 |
| H2 | Contradicts | Clear numerical differences across all tested languages |
| H3 | Supports | Gap magnitude varies: Arabic is closer to English (3.5pp) than Hindi (7.8pp) or Telugu (21.2pp) |
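The percentage-point gaps cited for H3 can be re-derived from the averages quoted in the extract. The following is a minimal sketch, assuming the eight figures are directly comparable averages over the same task set; the function name `gaps_to_english` is hypothetical and does not come from the source.

```python
# Average accuracy (%) across educational tasks, as quoted in the extract.
accuracy = {
    "English": 70.9,
    "Arabic": 67.4,
    "German": 66.8,
    "Farsi": 66.2,
    "Mandarin": 64.6,
    "Hindi": 63.1,
    "Czech": 55.3,
    "Telugu": 49.7,
}


def gaps_to_english(scores, baseline="English"):
    """Return each language's shortfall versus the baseline, in percentage points.

    Hypothetical helper for illustration; not part of the cited study.
    """
    base = scores[baseline]
    return {
        lang: round(base - value, 1)
        for lang, value in scores.items()
        if lang != baseline
    }


gaps = gaps_to_english(accuracy)

# List languages from the smallest gap to the largest.
for lang, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{lang}: {gap:.1f} pp below English")
```

Under these assumptions the computed gaps for Arabic, Hindi, and Telugu match the 3.5 pp, 7.8 pp, and 21.2 pp quoted in the table.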
Context
This study is particularly valuable because it tests Hindi, Mandarin, and Arabic — three of the four languages specifically named in Q001.