R0028/2026-03-26/C022/SRC01/E01
Primary evidence supporting the claim assessment.
URL: https://lilt.com/blog/multilingual-llm-performance-gap-analysis
Extract
Partially correct. Research confirms significant performance gaps between English and non-English languages in LLMs, and the LILT analysis found that model limitations drive 72-87% of errors. However, the specific claim that Arabic shows the smallest gap (3 points) is contradicted by evidence that Arabic requires 3x more tokens than English and that its performance can collapse to much lower levels. The 3-30 point range is broadly consistent with documented gaps.
Relevance to Hypotheses
| Hypothesis | Relationship | Strength |
|---|---|---|
| H1 | Partially supports | Direct evidence |
| H2 | Supports | Direct evidence |
| H3 | Contradicts | Evidence contradicts the claim being materially wrong |
Context
Evidence gathered 2026-03-26.