AI Model Benchmarks

A performance comparison of leading AI models across intelligence, coding, math, output speed, cost, and arena rankings.

Source: Artificial Analysis
Rank  Model                     Intelligence Index
1     Gemini 3.1 Pro Preview    57
2     GPT-5.4 (xhigh)           57
3     Claude Opus 4.6 (max)     53
4     Muse Spark                52
5     Claude Sonnet 4.6         52
6     GLM-5.1                   51
7     Qwen3.6 Plus              50
8     MiniMax-M2.7              50
9     Grok 4.20                 49
10    G3o9 v2                   49
11    MiMo-V2-Pro               49
12    GPT-5.4 mini (xhigh)      49
13    Kimi K2.5                 47
14    Gemini 3 Flash            46
15    Qwen3.5 307B A17B         45
16    DeepSeek V3.2             42
17    Gemma 4 3.1B              39
18    Claude 4.5 Haiku          37
19    NVIDIA Nemotron 3         36
20    Nova SuperPro             36
21    Gemini 3.1 Flash Lite     34
22    gpt-o3s-120B (high)       33
23    K-EXAONE 9                32
24    Mistral Small 4           27
25    Solar Pro 3               26
26    gpt-o3s-210B (high)       24
27    K2-Think V2               24
28    Llama 4 Maverick          18

Powered by Artificial Analysis

Rank  Model                     Coding Index
1     Claude Mythos Preview     93.9
2     GPT-5.3 Codex             85
3     Claude Opus 4.5           80.9
4     Claude Opus 4.6           80.8
5     Gemini 3.1 Pro            80.6
6     GPT-5.2                   80
7     Claude Sonnet 4.6         79.6
8     Qwen3.6 Plus              78.8
9     DeepSeek V3.2             77.5
10    Grok 4.20                 76.2
11    Gemini 3 Flash            74.8
12    Kimi K2.5                 73.1
13    GLM-5.1                   71.4
14    MiniMax-M2.7              69.8
15    Mistral Large 3           67.3
16    Llama 4 Maverick          64.5


Rank  Model                     Math Index
1     GPT-5.2 Thinking          100
2     DeepSeek-V3.2             96
3     Gemini 3 Pro              95
4     GPT-5 High                94.6
5     Claude Opus 4.5           92.8
6     GLM-5                     92.7
7     Gemini 3 Flash            90.4
8     Qwen3.6 Plus              89.5
9     Claude Sonnet 4.6         88.2
10    Grok 4.20                 87.1
11    DeepSeek-R1               86.7
12    Kimi K2.5                 85.3
13    Mistral Large 3           82.1
14    Llama 4 Maverick          78.4


Rank  Model                     Output Speed (tokens/s)
1     Gemini 3 Flash            362
2     GPT-5.4 mini              298
3     Claude 4.5 Haiku          241
4     Mistral Small 4           215
5     DeepSeek V3.2             189
6     Qwen3.6 Plus              176
7     Gemini 3 Pro              158
8     Llama 4 Maverick          142
9     Claude Sonnet 4.6         128
10    GPT-5.2                   115
11    Grok 4.20                 108
12    Claude Opus 4.6           82

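Throughput figures like those above translate directly into wall-clock decode time. A minimal sketch, ignoring time-to-first-token and network latency (both of which matter in practice):

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Approximate decode time for a response, ignoring time-to-first-token."""
    return num_tokens / tokens_per_second

# A 1,000-token answer at the fastest and slowest rates in the table:
print(round(generation_seconds(1000, 362), 1))  # Gemini 3 Flash  -> 2.8
print(round(generation_seconds(1000, 82), 1))   # Claude Opus 4.6 -> 12.2
```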

Rank  Model                     Blended Cost ($ per 1M tokens)
1     DeepSeek V3.2             0.28
2     Gemini 3 Flash            0.30
3     GPT-5.4 mini              0.40
4     Qwen3.6 Plus              0.50
5     Mistral Small 4           0.60
6     Claude 4.5 Haiku          0.80
7     Llama 4 Maverick          0.90
8     Gemini 3 Pro              1.25
9     Claude Sonnet 4.6         3.00
10    GPT-5.2                   3.75
11    Grok 4.20                 5.00
12    Claude Opus 4.6           15.00

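"Blended" cost is a weighted average of input- and output-token prices. The blend ratio is not stated in the table; the sketch below assumes a 3:1 input:output mix, and the per-token prices in the example are hypothetical, chosen only to illustrate the arithmetic:

```python
def blended_cost(input_price: float, output_price: float,
                 input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average of $/1M-token prices; the 3:1 ratio is an assumption."""
    total = input_weight + output_weight
    return (input_price * input_weight + output_price * output_weight) / total

# Hypothetical input/output prices that would blend to $0.28 per 1M tokens:
print(round(blended_cost(0.20, 0.52), 2))  # -> 0.28
```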

Rank  Model                     Elo Rating
1     GPT Image 1.5             1265
2     Gemini 3.1 Flash Image    1258
3     Gemini 3 Pro Image        1215
4     FLUX.2 [max]              1200
5     Seedream 4.0              1185
6     FLUX.2 [dev]              1164
7     Qwen Image Max            1150
8     Ideogram v2               1102
9     Midjourney v6.1           1093
10    DALL-E 3 HD               984

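Arena Elo ratings have a standard probabilistic reading: a 400-point gap corresponds to 10:1 expected odds of winning a head-to-head comparison. A sketch of the usual Elo expected-score formula, applied to two ratings from the table above:

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

# GPT Image 1.5 (1265) vs. DALL-E 3 HD (984): a 281-point gap
print(round(elo_expected_score(1265, 984), 2))  # -> 0.83
```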

Rank  Model                     Elo Rating
1     HappyHorse 1.0            1388
2     Seedance 2.0              1273
3     SkyReels V4               1244
4     Kling 3.0 1080p           1242
5     Grok Imagine Video        1229
6     PixVerse V5.6             1223
7     Runway Gen-4.5            1223
8     Veo 3                     1210


Rank  Model                     Price-Performance (Index points per $)
1     Gemini 3 Flash            153.3
2     DeepSeek V3.2             150
3     GPT-5.4 mini              122.5
4     Qwen3.6 Plus              100
5     Claude 4.5 Haiku          46.3
6     Mistral Small 4           45
7     Gemini 3 Pro              40
8     GPT-5.2                   21.3
9     Llama 4 Maverick          20
10    Claude Sonnet 4.6         17.3

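For most models that appear in both tables, the price-performance figure works out to the Intelligence Index divided by the blended cost. That derivation is an inference from the listed values, not something the source states explicitly; a sketch:

```python
def price_performance(intelligence_index: float, blended_cost_per_1m: float) -> float:
    """Index points per dollar: intelligence index / blended $ per 1M tokens."""
    return intelligence_index / blended_cost_per_1m

# DeepSeek V3.2: Intelligence Index 42 at $0.28 per 1M tokens
print(round(price_performance(42, 0.28), 1))  # -> 150.0
# Gemini 3 Flash: Intelligence Index 46 at $0.30 per 1M tokens
print(round(price_performance(46, 0.30), 1))  # -> 153.3
```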