R0027/2026-03-26/Q001/S02/R01
17-language multilingual evaluation suite for LLMs
Summary
| Field | Value |
|---|---|
| Title | BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models |
| URL | https://arxiv.org/html/2502.07346v1 |
| Date accessed | 2026-03-26 |
| Publication date | 2025-02 |
| Author(s) | Xu Huang, Wenhao Zhu, Hanxu Hu, Conghui He, Lei Li, Shujian Huang, Fei Yuan |
| Publication | ICML 2025 |
Selection Decision
Included in evidence base: Yes
Rationale: Rigorous 17-language benchmark with human post-editing for quality assurance. Its coverage includes Japanese, Arabic, Korean, Bengali, and Chinese, which makes it directly relevant to the named languages.