E01


Research	R0028 — Prompt Engineering Claims
Run	2026-03-26
Claim	C022
Source	SRC01
Evidence	SRC01-E01
Type	Factual

Primary evidence supporting the claim assessment.

URL: https://lilt.com/blog/multilingual-llm-performance-gap-analysis

Extract

Partially correct. Research confirms significant performance gaps between English and non-English languages in LLMs. The LILT analysis found that model limitations drive 72-87% of errors. However, the specific claim that Arabic shows the smallest gap (3 points) is contradicted by evidence that Arabic requires 3x more tokens than English and that its performance can collapse to much lower levels. The 3-30 point range is broadly consistent with documented gaps.

Relevance to Hypotheses

Hypothesis	Relationship	Strength
H1	Partially supports	Direct evidence
H2	Supports	Direct evidence
H3	Contradicts	Evidence contradicts the claim being materially wrong

Context

Evidence gathered 2026-03-26.