SEA Performance
Overall SEA Average
Scores are averages over 8 runs, shown with 95% confidence intervals (CI).
Model Size: ≤200B
Open instruct models only
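The caption reports each score as a mean over 8 runs with a 95% CI, but does not say how the interval is computed. As a minimal sketch, assuming a Student-t interval on the per-run scores (the leaderboard may use a different method, such as bootstrapping), the "mean ± half-width" format could be produced as follows. The function name `mean_ci95` and the per-run values are hypothetical and for illustration only.

```python
import math
import statistics

# Two-sided 95% Student-t critical value for 7 degrees of freedom (8 runs),
# taken from a standard t-table. Assumption: a t-interval is used.
T_CRIT_95_DF7 = 2.365


def mean_ci95(scores):
    """Return (mean, half_width) of a 95% t-interval for exactly 8 run scores."""
    if len(scores) != 8:
        raise ValueError("this sketch hardcodes the critical value for 8 runs")
    mean = statistics.fmean(scores)
    # Standard error of the mean, using the sample standard deviation.
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, T_CRIT_95_DF7 * sem


# Hypothetical per-run scores, not taken from the leaderboard.
runs = [67.4, 67.6, 67.5, 67.3, 67.7, 67.5, 67.6, 67.4]
mean, half_width = mean_ci95(runs)
print(f"{mean:.2f} ± {half_width:.2f}")
```

A narrow interval (for example ± 0.1) indicates that the 8 runs agree closely; the wide intervals on some rows indicate large run-to-run variation.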
[Chart: overall SEA average with 95% CI for each model, ranked from highest (67.52 ± 0.11) to lowest (26.93 ± 0.83); the same values appear in the SEA column of the table under Language Performance by Model.]
Language Performance by Model
Scores are averages over 8 runs, shown with 95% confidence intervals (CI).
Model Size: ≤200B
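Open instruct models only
Column codes, assuming they follow ISO 639-1: MY = Burmese, TL = Filipino (Tagalog), ID = Indonesian, MS = Malay, TA = Tamil, TH = Thai, VI = Vietnamese, EN = English. SEA is the overall SEA average.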
Model | SEA | MY | TL | ID | MS | TA | TH | VI | EN |
---|---|---|---|---|---|---|---|---|---|
SEA-LION v4 27B AISG | 67.52 ± 0.11 | 57.18 ± 0.42 | 74.53 ± 0.28 | 71.89 ± 0.33 | 71.31 ± 0.43 | 68.47 ± 0.30 | 63.18 ± 0.16 | 66.10 ± 0.32 | 70.89 ± 0.29 |
Gemma 3 27B | 67.35 ± 0.08 | 57.78 ± 0.43 | 74.09 ± 0.12 | 71.52 ± 0.26 | 71.20 ± 0.34 | 68.45 ± 0.47 | 62.79 ± 0.32 | 65.64 ± 0.42 | 70.90 ± 0.24 |
Qwen 3 32B Alibaba | 65.00 ± 0.16 | 43.03 ± 0.63 | 69.72 ± 0.18 | 72.81 ± 0.18 | 70.01 ± 0.23 | 64.10 ± 0.39 | 65.36 ± 0.33 | 69.94 ± 0.38 | 73.82 ± 0.29 |
Gemma 3 12B | 64.88 ± 0.10 | 52.82 ± 0.22 | 72.02 ± 0.31 | 70.17 ± 0.23 | 68.68 ± 0.52 | 65.83 ± 0.63 | 60.27 ± 0.22 | 64.39 ± 0.49 | 65.95 ± 0.15 |
SEA-LION v3 (Llama) 70B AISG | 64.44 ± 0.38 | 40.16 ± 2.57 | 72.84 ± 0.46 | 72.15 ± 0.48 | 69.82 ± 0.43 | 63.77 ± 0.69 | 62.67 ± 0.31 | 69.65 ± 0.55 | 71.35 ± 0.45 |
Llama 4 Scout 109B MoE Meta | 64.07 ± 0.11 | 54.76 ± 0.23 | 69.94 ± 0.21 | 69.72 ± 0.24 | 67.58 ± 0.22 | 64.22 ± 0.28 | 58.52 ± 0.06 | 63.73 ± 0.19 | 70.38 ± 0.18 |
Qwen 3 30B MoE Alibaba | 62.68 ± 0.09 | 25.88 ± 0.27 | 70.06 ± 0.17 | 72.36 ± 0.28 | 71.55 ± 0.28 | 61.89 ± 0.30 | 64.57 ± 0.21 | 72.49 ± 0.20 | 69.49 ± 0.13 |
Tulu 3 70B AI2 | 62.37 ± 0.09 | 40.52 ± 0.30 | 69.96 ± 0.54 | 71.00 ± 0.19 | 67.83 ± 0.44 | 57.35 ± 0.43 | 61.09 ± 0.40 | 68.85 ± 0.24 | 65.81 ± 0.19 |
Qwen 2.5 72B Alibaba | 61.04 ± 0.11 | 33.37 ± 0.40 | 69.65 ± 0.35 | 71.09 ± 0.44 | 69.35 ± 0.34 | 52.63 ± 0.34 | 62.91 ± 0.44 | 68.27 ± 0.29 | 70.11 ± 0.35 |
Qwen 3 14B Alibaba | 61.03 ± 0.10 | 35.03 ± 0.51 | 65.76 ± 0.21 | 70.55 ± 0.09 | 66.09 ± 0.18 | 59.14 ± 0.29 | 63.01 ± 0.32 | 67.62 ± 0.15 | 71.66 ± 0.24 |
Llama 3.3 70B Meta | 60.19 ± 0.13 | 23.82 ± 0.59 | 70.26 ± 0.29 | 70.90 ± 0.16 | 68.37 ± 0.23 | 60.90 ± 0.25 | 59.73 ± 0.23 | 67.36 ± 0.19 | 72.16 ± 0.15 |
Gemma 2 27B | 59.77 ± 1.05 | 38.95 ± 6.64 | 68.03 ± 0.32 | 67.16 ± 0.26 | 64.42 ± 0.45 | 59.04 ± 0.68 | 57.80 ± 0.36 | 62.98 ± 0.47 | 52.83 ± 0.40 |
Qwen 2.5 32B Alibaba | 58.55 ± 0.12 | 32.15 ± 0.32 | 64.68 ± 0.40 | 69.47 ± 0.25 | 64.64 ± 0.18 | 53.81 ± 0.25 | 60.10 ± 0.27 | 64.97 ± 0.30 | 65.77 ± 0.39 |
SEA-LION v3 (Gemma 2) 9B AISG | 57.90 ± 0.55 | 21.69 ± 3.72 | 68.43 ± 0.50 | 67.80 ± 0.35 | 65.76 ± 0.37 | 60.04 ± 0.65 | 57.24 ± 0.65 | 64.34 ± 0.42 | 54.98 ± 0.47 |
Mistral Large 2411 123B Mistral AI | 57.78 ± 0.45 | 31.23 ± 2.11 | 66.60 ± 0.44 | 67.08 ± 1.36 | 63.95 ± 1.33 | 57.64 ± 1.61 | 55.62 ± 2.15 | 62.36 ± 2.29 | 69.92 ± 0.31 |
Llama 3.1 70B Meta | 57.33 ± 0.53 | 24.14 ± 3.01 | 69.03 ± 0.36 | 68.33 ± 0.40 | 65.44 ± 0.31 | 53.13 ± 0.93 | 57.28 ± 0.33 | 63.96 ± 0.34 | 66.39 ± 0.24 |
Command A 03-2025 111B CohereLabs | 56.94 ± 1.38 | 25.41 ± 2.48 | 58.17 ± 1.66 | 74.75 ± 0.66 | 66.34 ± 3.38 | 59.92 ± 1.67 | 44.92 ± 4.59 | 69.10 ± 0.58 | 70.32 ± 0.35 |
Qwen 3 8B Alibaba | 56.12 ± 0.18 | 30.49 ± 0.76 | 60.81 ± 0.12 | 67.80 ± 0.33 | 64.72 ± 0.37 | 45.75 ± 0.73 | 58.57 ± 0.25 | 64.68 ± 0.24 | 68.26 ± 0.34 |
SEA-LION v3 (Llama) 8B AISG | 54.04 ± 0.38 | 27.35 ± 1.82 | 60.38 ± 0.38 | 63.97 ± 0.44 | 62.70 ± 0.36 | 52.98 ± 1.40 | 53.05 ± 0.53 | 57.87 ± 0.63 | 55.86 ± 0.16 |
Gemma 2 9B | 53.23 ± 0.72 | 16.18 ± 3.65 | 63.06 ± 0.65 | 64.17 ± 0.46 | 61.02 ± 0.45 | 54.87 ± 0.74 | 53.68 ± 0.51 | 59.64 ± 0.51 | 44.79 ± 0.85 |
Qwen 2.5 14B Alibaba | 52.97 ± 0.14 | 21.05 ± 0.48 | 60.86 ± 0.34 | 66.67 ± 0.29 | 62.32 ± 0.29 | 43.64 ± 0.32 | 56.23 ± 0.39 | 60.02 ± 0.25 | 64.21 ± 0.28 |
MERaLiON 2 10B A*STAR | 52.82 ± 0.73 | 20.20 ± 3.33 | 61.98 ± 0.76 | 64.29 ± 0.27 | 60.46 ± 0.61 | 52.54 ± 0.92 | 51.78 ± 0.81 | 58.52 ± 0.61 | 43.43 ± 0.74 |
ERNIE 4.5 21B MoE Baidu | 51.70 ± 0.73 | 27.33 ± 1.13 | 57.79 ± 2.05 | 61.33 ± 1.19 | 58.03 ± 1.75 | 50.54 ± 1.30 | 52.11 ± 0.37 | 54.78 ± 3.78 | 68.63 ± 0.35 |
Aya Expanse 32B CohereLabs | 50.86 ± 0.40 | 16.82 ± 1.67 | 57.63 ± 0.35 | 67.84 ± 0.33 | 60.29 ± 0.84 | 50.67 ± 0.39 | 40.66 ± 0.34 | 62.14 ± 0.31 | 47.94 ± 0.32 |
Llama 3 70B Meta | 48.46 ± 0.37 | 21.48 ± 2.59 | 60.08 ± 0.27 | 59.62 ± 0.28 | 53.99 ± 0.35 | 40.11 ± 0.45 | 48.80 ± 0.13 | 55.14 ± 0.22 | 59.21 ± 0.28 |
Command R+ 08-2024 104B CohereLabs | 47.99 ± 0.76 | 13.26 ± 2.31 | 55.84 ± 0.68 | 61.61 ± 0.90 | 56.48 ± 0.67 | 43.75 ± 3.06 | 44.05 ± 0.96 | 60.93 ± 1.24 | 41.67 ± 0.49 |
Qwen 2.5 7B Alibaba | 46.63 ± 0.43 | 12.01 ± 1.50 | 50.88 ± 0.27 | 63.06 ± 0.22 | 58.57 ± 0.33 | 34.29 ± 0.76 | 49.64 ± 2.70 | 57.96 ± 0.27 | 54.24 ± 0.36 |
Olmo 2 0325 32B AI2 | 45.74 ± 0.27 | 7.17 ± 1.41 | 61.97 ± 0.50 | 62.21 ± 0.39 | 59.66 ± 0.40 | 35.76 ± 1.18 | 44.09 ± 0.97 | 49.35 ± 0.89 | 42.66 ± 0.53 |
Tulu 3 8B AI2 | 45.35 ± 0.33 | 18.87 ± 0.59 | 48.02 ± 0.94 | 56.55 ± 0.38 | 54.99 ± 0.85 | 35.87 ± 0.78 | 48.34 ± 0.63 | 54.80 ± 0.61 | 49.54 ± 0.11 |
Command R 08-2024 32B CohereLabs | 45.06 ± 0.99 | 13.44 ± 1.87 | 50.45 ± 1.54 | 59.42 ± 0.38 | 51.15 ± 1.97 | 47.02 ± 1.01 | 40.20 ± 0.79 | 53.76 ± 3.73 | 41.50 ± 0.66 |
Llama 3.1 8B Meta | 44.81 ± 0.36 | 14.92 ± 1.18 | 52.90 ± 0.58 | 59.92 ± 0.42 | 57.14 ± 0.43 | 30.63 ± 2.24 | 46.93 ± 0.57 | 51.21 ± 0.62 | 50.37 ± 0.25 |
Sailor2 8B SAIL | 44.51 ± 0.35 | 15.82 ± 2.50 | 60.13 ± 0.17 | 56.72 ± 0.64 | 55.28 ± 0.22 | 27.63 ± 2.40 | 47.38 ± 0.34 | 48.59 ± 0.86 | 27.50 ± 0.32 |
Sailor2 20B SAIL | 44.01 ± 0.16 | 8.56 ± 0.30 | 59.46 ± 0.23 | 60.28 ± 0.27 | 56.09 ± 0.35 | 42.22 ± 0.54 | 47.52 ± 0.48 | 33.92 ± 1.13 | 44.04 ± 0.14 |
Mistral Small 3.1 2503 24B Mistral AI | 42.75 ± 0.72 | 4.57 ± 2.19 | 58.60 ± 1.46 | 63.76 ± 0.55 | 54.73 ± 3.70 | 19.50 ± 2.79 | 43.94 ± 0.73 | 54.16 ± 1.61 | 58.03 ± 1.82 |
Aya Expanse 8B CohereLabs | 40.68 ± 0.23 | 2.93 ± 0.24 | 44.54 ± 0.42 | 61.16 ± 0.31 | 55.77 ± 0.29 | 30.14 ± 0.92 | 33.07 ± 0.56 | 57.18 ± 0.42 | 37.28 ± 0.51 |
Babel 83B Alibaba-DAMO | 40.11 ± 2.70 | 19.42 ± 6.02 | 44.11 ± 6.04 | 49.21 ± 3.64 | 40.21 ± 6.22 | 38.57 ± 3.77 | 38.88 ± 4.47 | 50.39 ± 5.33 | 40.23 ± 0.58 |
phi-4 14B Microsoft | 39.86 ± 0.99 | 9.69 ± 1.63 | 35.82 ± 1.58 | 60.52 ± 1.30 | 51.05 ± 3.48 | 33.31 ± 4.06 | 39.41 ± 2.54 | 49.25 ± 2.75 | 64.86 ± 0.39 |
Babel 9B Alibaba-DAMO | 37.86 ± 0.90 | 12.14 ± 2.23 | 42.18 ± 2.86 | 48.08 ± 1.45 | 45.27 ± 0.58 | 29.64 ± 4.45 | 42.78 ± 0.98 | 44.97 ± 2.17 | 33.47 ± 0.75 |
Llama 3 8B Meta | 36.02 ± 0.32 | 5.91 ± 0.45 | 44.85 ± 0.46 | 51.32 ± 0.52 | 48.13 ± 0.38 | 14.98 ± 1.96 | 39.28 ± 0.54 | 47.65 ± 0.63 | 41.46 ± 0.39 |
SeaLLMs V3 7B Alibaba-DAMO | 35.69 ± 0.76 | 9.97 ± 1.81 | 42.29 ± 2.33 | 46.71 ± 1.61 | 50.26 ± 0.53 | 17.68 ± 1.76 | 41.00 ± 2.35 | 41.91 ± 3.76 | 32.62 ± 1.04 |
Olmo 2 1124 13B AI2 | 34.56 ± 0.29 | 2.40 ± 0.73 | 48.53 ± 0.91 | 50.37 ± 0.30 | 50.95 ± 0.81 | 18.91 ± 1.84 | 25.67 ± 1.26 | 45.08 ± 1.53 | 43.29 ± 0.46 |
Ministral 2410 8B Mistral AI | 31.65 ± 0.82 | 13.22 ± 2.97 | 36.21 ± 1.39 | 42.77 ± 0.70 | 37.96 ± 2.25 | 21.58 ± 3.26 | 32.06 ± 1.64 | 37.71 ± 2.67 | 38.63 ± 0.55 |
Command R7B 12-2024 7B CohereLabs | 31.22 ± 1.79 | 3.87 ± 1.49 | 40.99 ± 1.25 | 47.11 ± 2.13 | 39.17 ± 2.51 | 21.16 ± 3.91 | 30.14 ± 2.73 | 36.11 ± 6.91 | 43.51 ± 0.50 |
Olmo 2 1124 7B AI2 | 26.93 ± 0.83 | 2.28 ± 0.35 | 28.95 ± 2.15 | 41.90 ± 1.18 | 43.61 ± 0.95 | 15.06 ± 3.63 | 26.04 ± 1.30 | 30.65 ± 2.83 | 36.68 ± 0.34 |
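
The table does not state how the SEA column is derived. The published numbers are consistent with it being the unweighted mean of the seven Southeast Asian language columns (MY, TL, ID, MS, TA, TH, VI), with EN excluded. This is an inference from the rows checked here, not a documented definition. The sketch below recomputes it for two rows; the helper name `sea_average` is hypothetical, and because the displayed values are rounded to two decimals, other rows may differ in the last digit.

```python
# Columns assumed to contribute to the SEA average (EN is assumed excluded).
SEA_LANGS = ["MY", "TL", "ID", "MS", "TA", "TH", "VI"]


def sea_average(scores):
    """Unweighted mean of the seven SEA language scores in a row."""
    return sum(scores[lang] for lang in SEA_LANGS) / len(SEA_LANGS)


# Per-language means copied from two rows of the table above.
sea_lion_v4_27b = {"MY": 57.18, "TL": 74.53, "ID": 71.89, "MS": 71.31,
                   "TA": 68.47, "TH": 63.18, "VI": 66.10, "EN": 70.89}
qwen3_32b = {"MY": 43.03, "TL": 69.72, "ID": 72.81, "MS": 70.01,
             "TA": 64.10, "TH": 65.36, "VI": 69.94, "EN": 73.82}

for name, row in [("SEA-LION v4 27B", sea_lion_v4_27b), ("Qwen 3 32B", qwen3_32b)]:
    print(f"{name}: {sea_average(row):.2f}")
```

Under this reading, a model's English score has no effect on its ranking, which is why Qwen 3 32B ranks below SEA-LION v4 27B despite a higher EN score.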