SEA Performance
Overall SEA Average
Each score is the average of 30 bootstraps, shown with its 95% confidence interval (CI).
Model Size: ≤200B
Open instruct models only
Model | Size | SEA average |
---|---|---|
SEA-LION v4 (Qwen) | 32B | 60.63 ± 0.06 |
Qwen 3 Next | 80B MoE | 60.55 ± 0.05 |
SEA-LION v4 (Gemma) | 27B | 59.74 ± 0.06 |
Gemma 3 | 27B | 59.63 ± 0.06 |
Qwen 3 | 32B | 58.40 ± 0.06 |
SEA-LION v3 (Llama) | 70B | 57.44 ± 0.07 |
Gemma 3 | 12B | 56.55 ± 0.06 |
Llama 4 Scout | 109B MoE | 55.67 ± 0.04 |
Qwen 3 | 30B MoE | 55.54 ± 0.05 |
Tulu 3 | 70B | 54.47 ± 0.08 |
Llama 3.3 | 70B | 53.09 ± 0.05 |
Qwen 3 | 14B | 53.02 ± 0.05 |
Qwen 2.5 | 72B | 52.81 ± 0.05 |
Gemma 2 | 27B | 50.91 ± 0.07 |
Qwen 2.5 | 32B | 49.89 ± 0.06 |
SEA-LION v3 (Gemma 2) | 9B | 49.75 ± 0.07 |
Llama 3.1 | 70B | 49.41 ± 0.07 |
Mistral Large 2411 | 123B | 48.86 ± 0.09 |
Command A 03-2025 | 111B | 48.02 ± 0.08 |
Qwen 3 | 8B | 46.67 ± 0.05 |
Gemma 2 | 9B | 44.52 ± 0.07 |
Qwen 2.5 | 14B | 43.58 ± 0.07 |
SEA-LION v3 (Llama) | 8B | 43.42 ± 0.09 |
MERaLiON 2 | 10B | 43.32 ± 0.08 |
Aya Expanse | 32B | 41.03 ± 0.07 |
ERNIE 4.5 | 21B MoE | 40.30 ± 0.10 |
Llama 3 | 70B | 39.22 ± 0.05 |
Sailor2 | 20B | 36.93 ± 0.07 |
Command R+ 08-2024 | 104B | 36.63 ± 0.09 |
Qwen 2.5 | 7B | 36.20 ± 0.05 |
Sailor2 | 8B | 35.81 ± 0.07 |
Olmo 2 0325 | 32B | 35.08 ± 0.09 |
Tulu 3 | 8B | 33.87 ± 0.06 |
Command R 08-2024 | 32B | 33.84 ± 0.08 |
Mistral Small 3.1 2503 | 24B | 33.71 ± 0.10 |
Llama 3.1 | 8B | 32.78 ± 0.08 |
Apertus | 70B | 31.25 ± 0.11 |
phi-4 | 14B | 30.21 ± 0.11 |
Aya Expanse | 8B | 29.53 ± 0.06 |
Babel | 83B | 27.53 ± 0.12 |
Apertus | 8B | 26.45 ± 0.11 |
Babel | 9B | 25.93 ± 0.07 |
Llama 3 | 8B | 25.41 ± 0.04 |
SeaLLMs V3 | 7B | 24.90 ± 0.08 |
Olmo 2 1124 | 13B | 22.61 ± 0.06 |
Command R7B 12-2024 | 7B | 20.29 ± 0.10 |
Ministral 2410 | 8B | 18.35 ± 0.10 |
Olmo 2 1124 | 7B | 15.11 ± 0.06 |
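
The ± values above are described as 95% confidence intervals over 30 bootstraps, but the page does not say how the resampling or the interval is computed. The following is only a minimal sketch of one common approach, under stated assumptions: per-item scores are resampled with replacement, and the interval half-width is taken as 1.96 times the standard deviation of the bootstrap means (a normal approximation). The function name `bootstrap_mean_ci` and the toy data are hypothetical and are not taken from the leaderboard.

```python
import random
import statistics


def bootstrap_mean_ci(item_scores, n_boot=30, z=1.96, seed=0):
    """Return (mean, half_width) of the bootstrap distribution of the mean.

    Assumption: the half-width is z * stdev of the bootstrap means.
    The leaderboard's actual procedure is not documented in this page.
    """
    rng = random.Random(seed)
    n = len(item_scores)
    boot_means = []
    for _ in range(n_boot):
        # Draw n items with replacement and record the resample's mean.
        resample = rng.choices(item_scores, k=n)
        boot_means.append(statistics.fmean(resample))
    centre = statistics.fmean(boot_means)
    half_width = z * statistics.stdev(boot_means)
    return centre, half_width


if __name__ == "__main__":
    # Toy data: 60 correct and 40 incorrect items (purely illustrative).
    toy_scores = [1.0] * 60 + [0.0] * 40
    mean, half_width = bootstrap_mean_ci(toy_scores)
    print(f"{100 * mean:.2f} ± {100 * half_width:.2f}")
```

With only 30 resamples, a percentile interval would be coarse, which is why this sketch uses a normal approximation instead; either choice is an assumption here.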
Language Performance by Model
Each score is the average of 30 bootstraps, shown with its 95% confidence interval (CI).
Model Size: ≤200B
Open instruct models only
Model | SEA average | MY (Burmese) | TL (Tagalog) | ID (Indonesian) | MS (Malay) | TA (Tamil) | TH (Thai) | VI (Vietnamese) | EN (English) |
---|---|---|---|---|---|---|---|---|---|
SEA-LION v4 (Qwen) 32B AISG | 60.63 ± 0.06 | 48.28 ± 0.14 | 65.35 ± 0.14 | 66.59 ± 0.10 | 61.36 ± 0.14 | 62.30 ± 0.15 | 57.91 ± 0.13 | 62.63 ± 0.14 | 67.10 ± 0.17 |
Qwen 3 Next 80B MoE Alibaba | 60.55 ± 0.05 | 43.68 ± 0.16 | 66.48 ± 0.13 | 67.11 ± 0.10 | 62.80 ± 0.12 | 60.05 ± 0.13 | 58.09 ± 0.09 | 65.68 ± 0.10 | 67.31 ± 0.10 |
SEA-LION v4 (Gemma) 27B AISG | 59.74 ± 0.06 | 46.50 ± 0.16 | 68.10 ± 0.14 | 64.33 ± 0.14 | 61.10 ± 0.16 | 64.43 ± 0.16 | 53.46 ± 0.14 | 60.26 ± 0.17 | 63.68 ± 0.21 |
Gemma 3 27B | 59.63 ± 0.06 | 47.36 ± 0.17 | 67.70 ± 0.12 | 64.12 ± 0.15 | 60.92 ± 0.17 | 64.36 ± 0.22 | 52.88 ± 0.12 | 60.06 ± 0.17 | 63.55 ± 0.17 |
Qwen 3 32B Alibaba | 58.40 ± 0.06 | 43.18 ± 0.19 | 62.23 ± 0.13 | 65.67 ± 0.11 | 59.67 ± 0.17 | 58.88 ± 0.24 | 56.98 ± 0.15 | 62.19 ± 0.12 | 68.02 ± 0.15 |
SEA-LION v3 (Llama) 70B AISG | 57.44 ± 0.07 | 36.78 ± 0.28 | 66.38 ± 0.17 | 64.04 ± 0.18 | 59.94 ± 0.17 | 57.99 ± 0.21 | 54.60 ± 0.15 | 62.37 ± 0.23 | 65.20 ± 0.18 |
Gemma 3 12B | 56.55 ± 0.06 | 41.42 ± 0.15 | 65.00 ± 0.11 | 61.80 ± 0.11 | 57.96 ± 0.12 | 59.86 ± 0.21 | 50.75 ± 0.13 | 59.07 ± 0.15 | 58.06 ± 0.21 |
Llama 4 Scout 109B MoE Meta | 55.67 ± 0.04 | 44.27 ± 0.17 | 61.84 ± 0.11 | 61.27 ± 0.11 | 57.38 ± 0.11 | 58.69 ± 0.15 | 48.44 ± 0.08 | 57.78 ± 0.09 | 63.86 ± 0.14 |
Qwen 3 30B MoE Alibaba | 55.54 ± 0.05 | 25.55 ± 0.12 | 61.39 ± 0.12 | 63.29 ± 0.10 | 61.13 ± 0.14 | 56.06 ± 0.16 | 55.77 ± 0.11 | 65.56 ± 0.14 | 62.49 ± 0.15 |
Tulu 3 70B AI2 | 54.47 ± 0.08 | 33.71 ± 0.17 | 62.96 ± 0.24 | 62.66 ± 0.17 | 57.39 ± 0.19 | 50.61 ± 0.23 | 54.25 ± 0.16 | 59.74 ± 0.20 | 58.73 ± 0.19 |
Llama 3.3 70B Meta | 53.09 ± 0.05 | 21.72 ± 0.15 | 63.21 ± 0.11 | 63.17 ± 0.09 | 58.80 ± 0.15 | 54.74 ± 0.15 | 50.05 ± 0.08 | 59.92 ± 0.11 | 66.50 ± 0.16 |
Qwen 3 14B Alibaba | 53.02 ± 0.05 | 31.19 ± 0.14 | 57.37 ± 0.13 | 61.37 ± 0.12 | 55.04 ± 0.11 | 52.30 ± 0.17 | 54.82 ± 0.15 | 59.03 ± 0.14 | 65.00 ± 0.16 |
Qwen 2.5 72B Alibaba | 52.81 ± 0.05 | 26.39 ± 0.20 | 61.94 ± 0.13 | 64.82 ± 0.08 | 59.48 ± 0.14 | 42.51 ± 0.15 | 53.80 ± 0.16 | 60.70 ± 0.13 | 63.38 ± 0.18 |
Gemma 2 27B | 50.91 ± 0.07 | 22.81 ± 0.24 | 61.19 ± 0.16 | 59.79 ± 0.15 | 54.06 ± 0.22 | 53.44 ± 0.23 | 49.78 ± 0.17 | 55.33 ± 0.18 | 42.75 ± 0.24 |
Qwen 2.5 32B Alibaba | 49.89 ± 0.06 | 25.74 ± 0.17 | 56.49 ± 0.16 | 61.82 ± 0.10 | 53.71 ± 0.13 | 44.29 ± 0.22 | 50.00 ± 0.12 | 57.15 ± 0.15 | 58.61 ± 0.16 |
SEA-LION v3 (Gemma 2) 9B AISG | 49.75 ± 0.07 | 14.66 ± 0.22 | 60.75 ± 0.22 | 59.14 ± 0.13 | 54.70 ± 0.18 | 53.89 ± 0.21 | 48.95 ± 0.18 | 56.15 ± 0.20 | 45.14 ± 0.19 |
Llama 3.1 70B Meta | 49.41 ± 0.07 | 18.55 ± 0.24 | 61.82 ± 0.17 | 60.48 ± 0.14 | 55.67 ± 0.18 | 46.08 ± 0.21 | 47.14 ± 0.20 | 56.10 ± 0.17 | 59.59 ± 0.18 |
Mistral Large 2411 123B Mistral AI | 48.86 ± 0.09 | 25.10 ± 0.29 | 58.98 ± 0.15 | 58.23 ± 0.24 | 52.73 ± 0.21 | 49.44 ± 0.29 | 45.34 ± 0.17 | 52.21 ± 0.20 | 64.31 ± 0.15 |
Command A 03-2025 111B CohereLabs | 48.02 ± 0.08 | 16.20 ± 0.22 | 49.22 ± 0.21 | 66.73 ± 0.17 | 54.76 ± 0.17 | 53.29 ± 0.20 | 34.58 ± 0.21 | 61.39 ± 0.19 | 63.92 ± 0.18 |
Qwen 3 8B Alibaba | 46.67 ± 0.05 | 25.16 ± 0.20 | 50.42 ± 0.16 | 58.58 ± 0.14 | 53.83 ± 0.15 | 33.23 ± 0.16 | 50.66 ± 0.14 | 54.80 ± 0.15 | 60.57 ± 0.18 |
Gemma 2 9B | 44.52 ± 0.07 | 8.95 ± 0.17 | 53.79 ± 0.17 | 54.92 ± 0.18 | 49.72 ± 0.21 | 48.17 ± 0.23 | 44.65 ± 0.17 | 51.44 ± 0.17 | 32.65 ± 0.19 |
Qwen 2.5 14B Alibaba | 43.58 ± 0.07 | 12.45 ± 0.19 | 51.62 ± 0.12 | 58.49 ± 0.12 | 51.49 ± 0.13 | 32.49 ± 0.22 | 46.32 ± 0.13 | 52.22 ± 0.15 | 55.84 ± 0.16 |
SEA-LION v3 (Llama) 8B AISG | 43.42 ± 0.09 | 17.54 ± 0.26 | 49.07 ± 0.22 | 52.51 ± 0.20 | 51.32 ± 0.28 | 42.13 ± 0.32 | 42.37 ± 0.19 | 49.00 ± 0.24 | 46.31 ± 0.28 |
MERaLiON 2 10B A*STAR | 43.32 ± 0.08 | 9.71 ± 0.16 | 52.26 ± 0.20 | 55.06 ± 0.15 | 49.12 ± 0.22 | 45.29 ± 0.25 | 41.93 ± 0.15 | 49.85 ± 0.20 | 31.41 ± 0.19 |
Aya Expanse 32B CohereLabs | 41.03 ± 0.07 | 6.44 ± 0.15 | 47.65 ± 0.13 | 59.65 ± 0.15 | 48.35 ± 0.20 | 40.58 ± 0.15 | 30.47 ± 0.18 | 54.04 ± 0.15 | 37.21 ± 0.26 |
ERNIE 4.5 21B MoE Baidu | 40.30 ± 0.10 | 17.16 ± 0.15 | 45.78 ± 0.35 | 49.00 ± 0.16 | 45.83 ± 0.18 | 40.75 ± 0.18 | 42.75 ± 0.18 | 40.86 ± 0.22 | 61.47 ± 0.16 |
Llama 3 70B Meta | 39.22 ± 0.05 | 13.09 ± 0.17 | 52.32 ± 0.14 | 49.96 ± 0.14 | 43.42 ± 0.10 | 31.13 ± 0.16 | 39.51 ± 0.09 | 45.09 ± 0.11 | 51.23 ± 0.14 |
Sailor2 20B SAIL | 36.93 ± 0.07 | 8.55 ± 0.11 | 51.82 ± 0.15 | 51.53 ± 0.16 | 45.19 ± 0.17 | 35.23 ± 0.20 | 37.12 ± 0.15 | 29.04 ± 0.18 | 32.57 ± 0.14 |
Command R+ 08-2024 104B CohereLabs | 36.63 ± 0.09 | 5.67 ± 0.18 | 45.26 ± 0.29 | 50.74 ± 0.22 | 44.12 ± 0.26 | 29.43 ± 0.39 | 32.19 ± 0.18 | 48.97 ± 0.21 | 31.53 ± 0.15 |
Qwen 2.5 7B Alibaba | 36.20 ± 0.05 | 7.01 ± 0.15 | 38.20 ± 0.16 | 52.56 ± 0.13 | 46.94 ± 0.13 | 21.63 ± 0.13 | 39.89 ± 0.16 | 47.19 ± 0.13 | 43.29 ± 0.18 |
Sailor2 8B SAIL | 35.81 ± 0.07 | 11.65 ± 0.13 | 51.41 ± 0.16 | 46.68 ± 0.18 | 43.90 ± 0.19 | 20.59 ± 0.16 | 35.75 ± 0.15 | 40.67 ± 0.17 | 15.26 ± 0.11 |
Olmo 2 0325 32B AI2 | 35.08 ± 0.09 | 4.38 ± 0.15 | 53.00 ± 0.24 | 50.75 ± 0.21 | 48.28 ± 0.23 | 19.42 ± 0.31 | 32.99 ± 0.22 | 36.77 ± 0.27 | 34.44 ± 0.09 |
Tulu 3 8B AI2 | 33.87 ± 0.06 | 9.78 ± 0.13 | 33.94 ± 0.24 | 42.96 ± 0.13 | 42.86 ± 0.14 | 24.58 ± 0.25 | 39.85 ± 0.17 | 43.13 ± 0.23 | 38.12 ± 0.17 |
Command R 08-2024 32B CohereLabs | 33.84 ± 0.08 | 3.91 ± 0.15 | 38.24 ± 0.25 | 49.15 ± 0.20 | 39.00 ± 0.21 | 36.43 ± 0.27 | 27.84 ± 0.22 | 42.34 ± 0.25 | 29.20 ± 0.24 |
Mistral Small 3.1 2503 24B Mistral AI | 33.71 ± 0.10 | 2.02 ± 0.11 | 47.51 ± 0.33 | 53.67 ± 0.22 | 44.39 ± 0.22 | 10.39 ± 0.28 | 31.49 ± 0.25 | 46.52 ± 0.24 | 49.53 ± 0.17 |
Llama 3.1 8B Meta | 32.78 ± 0.08 | 7.07 ± 0.19 | 39.63 ± 0.22 | 47.23 ± 0.21 | 45.49 ± 0.22 | 17.53 ± 0.26 | 33.65 ± 0.18 | 38.83 ± 0.20 | 39.62 ± 0.20 |
Apertus 70B Swiss AI | 31.25 ± 0.11 | 12.28 ± 0.25 | 34.68 ± 0.29 | 33.31 ± 0.22 | 42.69 ± 0.24 | 28.48 ± 0.30 | 29.18 ± 0.23 | 38.12 ± 0.25 | 26.92 ± 0.17 |
phi-4 14B Microsoft | 30.21 ± 0.11 | 5.85 ± 0.18 | 29.72 ± 0.23 | 49.14 ± 0.28 | 39.71 ± 0.24 | 22.74 ± 0.23 | 25.91 ± 0.26 | 38.42 ± 0.23 | 57.87 ± 0.20 |
Aya Expanse 8B CohereLabs | 29.53 ± 0.06 | 3.03 ± 0.13 | 29.70 ± 0.16 | 49.78 ± 0.15 | 44.12 ± 0.18 | 17.05 ± 0.23 | 17.22 ± 0.13 | 45.78 ± 0.17 | 24.06 ± 0.21 |
Babel 83B Alibaba-DAMO | 27.53 ± 0.12 | 8.95 ± 0.22 | 29.79 ± 0.41 | 36.39 ± 0.32 | 28.79 ± 0.27 | 25.82 ± 0.33 | 25.76 ± 0.30 | 37.19 ± 0.32 | 29.20 ± 0.20 |
Apertus 8B Swiss AI | 26.45 ± 0.11 | 8.58 ± 0.21 | 23.09 ± 0.39 | 33.71 ± 0.29 | 38.91 ± 0.31 | 17.21 ± 0.27 | 29.35 ± 0.30 | 34.32 ± 0.34 | 22.00 ± 0.22 |
Babel 9B Alibaba-DAMO | 25.93 ± 0.07 | 7.00 ± 0.18 | 27.56 ± 0.27 | 32.20 ± 0.26 | 31.66 ± 0.22 | 16.70 ± 0.21 | 32.17 ± 0.22 | 34.19 ± 0.20 | 20.99 ± 0.13 |
Llama 3 8B Meta | 25.41 ± 0.04 | 3.20 ± 0.06 | 30.16 ± 0.15 | 38.80 ± 0.15 | 35.77 ± 0.16 | 8.95 ± 0.16 | 25.16 ± 0.15 | 35.81 ± 0.15 | 29.88 ± 0.17 |
SeaLLMs V3 7B Alibaba-DAMO | 24.90 ± 0.08 | 6.21 ± 0.16 | 26.67 ± 0.34 | 32.57 ± 0.22 | 36.82 ± 0.28 | 11.92 ± 0.18 | 29.69 ± 0.27 | 30.44 ± 0.23 | 21.87 ± 0.11 |
Olmo 2 1124 13B AI2 | 22.61 ± 0.06 | 1.84 ± 0.09 | 35.13 ± 0.29 | 36.46 ± 0.26 | 38.54 ± 0.19 | 8.48 ± 0.18 | 14.08 ± 0.18 | 23.71 ± 0.25 | 31.12 ± 0.17 |
Command R7B 12-2024 7B CohereLabs | 20.29 ± 0.10 | 2.65 ± 0.13 | 25.28 ± 0.33 | 33.07 ± 0.25 | 28.08 ± 0.26 | 13.32 ± 0.18 | 14.64 ± 0.20 | 24.98 ± 0.31 | 31.63 ± 0.15 |
Ministral 2410 8B Mistral AI | 18.35 ± 0.10 | 3.66 ± 0.13 | 19.24 ± 0.35 | 28.04 ± 0.21 | 23.78 ± 0.30 | 11.11 ± 0.19 | 17.12 ± 0.22 | 25.51 ± 0.29 | 26.03 ± 0.26 |
Olmo 2 1124 7B AI2 | 15.11 ± 0.06 | 2.18 ± 0.12 | 14.84 ± 0.19 | 25.27 ± 0.27 | 29.81 ± 0.22 | 6.98 ± 0.14 | 11.14 ± 0.19 | 15.57 ± 0.22 | 23.09 ± 0.13 |
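
The page does not define how the SEA column is aggregated. In the rows above it appears to match the unweighted mean of the seven columns MY through VI, with EN left out; this is inferred from the numbers, not stated by the source. The sketch below checks that reading on three rows copied from the table. The names `SEA_LANGS`, `ROWS`, and `sea_average` are hypothetical, and the 0.01 tolerance is an assumption that allows for the table's two-decimal rounding.

```python
# Columns assumed to feed the SEA average (EN is assumed to be excluded).
SEA_LANGS = ("MY", "TL", "ID", "MS", "TA", "TH", "VI")

# Point estimates copied from the table: (per-language scores, reported SEA).
ROWS = {
    "SEA-LION v4 (Qwen) 32B": (
        {"MY": 48.28, "TL": 65.35, "ID": 66.59, "MS": 61.36,
         "TA": 62.30, "TH": 57.91, "VI": 62.63, "EN": 67.10},
        60.63,
    ),
    "Qwen 3 Next 80B MoE": (
        {"MY": 43.68, "TL": 66.48, "ID": 67.11, "MS": 62.80,
         "TA": 60.05, "TH": 58.09, "VI": 65.68, "EN": 67.31},
        60.55,
    ),
    "Llama 3 8B": (
        {"MY": 3.20, "TL": 30.16, "ID": 38.80, "MS": 35.77,
         "TA": 8.95, "TH": 25.16, "VI": 35.81, "EN": 29.88},
        25.41,
    ),
}


def sea_average(scores):
    """Unweighted mean over the seven assumed SEA language columns."""
    return sum(scores[lang] for lang in SEA_LANGS) / len(SEA_LANGS)


if __name__ == "__main__":
    for name, (scores, reported) in ROWS.items():
        computed = sea_average(scores)
        # Agreement within 0.01 is treated as consistent with rounding.
        status = "consistent" if abs(computed - reported) <= 0.01 else "differs"
        print(f"{name}: computed {computed:.3f}, reported {reported:.2f} ({status})")
```

Agreement on a few rows does not rule out other schemes, such as averaging before rounding or weighting by task, so the check is a sanity test rather than a description of the leaderboard's method.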