E01


Research	R0027 — Multilingual prompt engineering challenges
Run	2026-03-26
Query	Q001
Source	SRC03
Evidence	SRC03-E01
Type	Statistical

XLT prompting achieves a 10+ point average improvement and reduces cross-language performance gaps

URL: https://aclanthology.org/2023.findings-emnlp.826/

Extract

XLT (cross-lingual-thought prompting) "brings over 10 points of average improvement in arithmetic reasoning and open-domain question-answering tasks" and "significantly reduces the gap between the average performance and the best performance of each task in different languages." The paper's title, "Not All Languages Are Created Equal in LLMs", frames language inequality as its starting premise; the evidence for that inequality is the reported cross-language performance gap, not the title itself.
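The "gap between the average performance and the best performance" quoted above can be made concrete with a little arithmetic. The following is a minimal sketch, not the paper's own metric definition: the function name `average_to_best_gap` and every score in it are hypothetical, invented purely for illustration.

```python
from statistics import mean


def average_to_best_gap(scores_by_language):
    """Return the best per-language score minus the mean score (hypothetical helper)."""
    scores = list(scores_by_language.values())
    return max(scores) - mean(scores)


# Hypothetical accuracies in percent; these are NOT results from the paper.
baseline = {"en": 80.0, "de": 70.0, "sw": 45.0}
with_xlt = {"en": 84.0, "de": 80.0, "sw": 70.0}

gap_before = average_to_best_gap(baseline)   # best minus mean without XLT
gap_after = average_to_best_gap(with_xlt)    # best minus mean with XLT
avg_gain = mean(with_xlt.values()) - mean(baseline.values())

print(f"average gain: {avg_gain:.1f} points")
print(f"average-to-best gap: {gap_before:.1f} -> {gap_after:.1f} points")
```

Under these made-up numbers, the average rises while the distance between the best language and the average shrinks, which is the pattern the extract describes.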

Relevance to Hypotheses

Hypothesis	Relationship	Rationale
H1	Supports	A 10+ point average improvement from prompting technique alone, together with a narrowed average-to-best gap across languages, indicates that a cross-language performance gap exists
H2	Contradicts	The paper reports clear, quantified performance differences across languages
H3	Supports	The gap narrows depending on the prompting technique used, which is consistent with a conditional effect

Context

This is a foundational, peer-reviewed paper in the field, published in Findings of EMNLP 2023 (per the ACL Anthology URL above). It establishes both the problem (language inequality) and a mitigation (XLT prompting), and it is highly cited in subsequent multilingual LLM research.