R0027/2026-03-26/Q002 - ACH Matrix
Matrix
| Evidence | H1: Structure is primary challenge | H2: Computation is primary, structure secondary | H3: Structure causes challenges via computational mediation |
|---|---|---|---|
| SRC01-E01: Linguistic features influence effectiveness | ++ | - | + |
| SRC01-E02: Japanese subjects, Arabic gender, Finnish cases | ++ | - | + |
| SRC02-E01: Arabic complexity defeats Arabic-centric models | + | + | ++ |
| SRC03-E01: Tokenization fertility predicts accuracy (8-18pp) | + | ++ | ++ |
| SRC04-E01: ~2% language nuances, 72-87% model limitations | -- | ++ | ++ |
| SRC05-E01: Agglutinative languages break tokenization | + | + | ++ |
Legend:
| Symbol | Meaning |
|---|---|
| ++ | Strongly supports |
| + | Supports |
| -- | Strongly contradicts |
| - | Contradicts |
| N/A | Not applicable to this hypothesis |
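
The matrix can also be tallied mechanically. The following is a minimal sketch, not part of the original analysis: the numeric weights (`++` = 2, `+` = 1, `-` = -1, `--` = -2, `N/A` = 0) are an assumed scale, and the names `WEIGHTS`, `MATRIX`, and `tally` are hypothetical. For each hypothesis it computes a net score and an inconsistency score (the summed weight of contradicting ratings only).

```python
# Assumed numeric scale for the legend symbols (not defined in the source).
WEIGHTS = {"++": 2, "+": 1, "-": -1, "--": -2, "N/A": 0}

HYPOTHESES = ("H1", "H2", "H3")

# Ratings copied from the matrix above, in H1, H2, H3 order.
MATRIX = {
    "SRC01-E01": ("++", "-", "+"),
    "SRC01-E02": ("++", "-", "+"),
    "SRC02-E01": ("+", "+", "++"),
    "SRC03-E01": ("+", "++", "++"),
    "SRC04-E01": ("--", "++", "++"),
    "SRC05-E01": ("+", "+", "++"),
}


def tally(matrix):
    """Return {hypothesis: (net_score, inconsistency)}."""
    result = {}
    for col, name in enumerate(HYPOTHESES):
        scores = [WEIGHTS[row[col]] for row in matrix.values()]
        net = sum(scores)
        # Inconsistency counts only contradicting ratings, as a positive number.
        inconsistency = -sum(s for s in scores if s < 0)
        result[name] = (net, inconsistency)
    return result


for name, (net, inconsistency) in tally(MATRIX).items():
    print(f"{name}: net={net:+d}, inconsistency={inconsistency}")
```

Under this assumed scale, H3 is the only hypothesis with no contradicting evidence, while H1 and H2 each carry some. A different weighting would change the net scores but not which cells are contradictions.
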
Diagnosticity Analysis
Most Diagnostic Evidence
| Evidence ID | Why Diagnostic |
|---|---|
| SRC04-E01 | Strongly contradicts H1 while strongly supporting H2 and H3 — the ~2% vs 72-87% split is the most discriminating finding |
| SRC02-E01 | Supports H3 over H1 — if the challenge were structural, Arabic-centric models should handle Arabic better, but they do not |
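
One rough way to quantify diagnosticity is the spread between the highest and lowest rating a piece of evidence receives across the hypotheses. The sketch below assumes that measure and the same hypothetical numeric scale (`++` = 2 down to `--` = -2); neither comes from the original analysis, and `spread` and `rank_by_spread` are illustrative names. It is a crude proxy: qualitative arguments, such as the reasoning given for SRC02-E01, are not captured by it.

```python
# Assumed numeric scale for the legend symbols (not defined in the source).
WEIGHTS = {"++": 2, "+": 1, "-": -1, "--": -2}

# Ratings copied from the ACH matrix, in H1, H2, H3 order.
MATRIX = {
    "SRC01-E01": ("++", "-", "+"),
    "SRC01-E02": ("++", "-", "+"),
    "SRC02-E01": ("+", "+", "++"),
    "SRC03-E01": ("+", "++", "++"),
    "SRC04-E01": ("--", "++", "++"),
    "SRC05-E01": ("+", "+", "++"),
}


def spread(ratings):
    """Difference between the highest and lowest numeric rating, skipping N/A."""
    scores = [WEIGHTS[r] for r in ratings if r != "N/A"]
    return max(scores) - min(scores)


def rank_by_spread(matrix):
    """Evidence IDs ordered from widest to narrowest spread."""
    return sorted(matrix, key=lambda eid: spread(matrix[eid]), reverse=True)


most_discriminating = rank_by_spread(MATRIX)[0]
print(most_discriminating, spread(MATRIX[most_discriminating]))
```

With these assumed weights, SRC04-E01 has the widest spread, which agrees with its position at the top of the table above.
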
Least Diagnostic Evidence
| Evidence ID | Why Non-Diagnostic |
|---|---|
| SRC01-E01 | Rated as supporting H1 and H3 and only mildly contradicting H2; it confirms that challenges exist but does not identify the mechanism |
Outcome
Hypothesis supported: H3 (linguistic structure causes the challenges, but mainly as mediated by tokenization and training data)
Hypotheses eliminated: None fully eliminated — H1 and H2 each capture part of the picture
Hypotheses inconclusive: H1 (partially supported — challenges exist but mechanism is computational); H2 (partially supported — computation dominates but linguistic structure is the upstream cause)