AI Stats
290 LLMs tracked with pricing and performance benchmarks.
Data sourced from ArtificialAnalysis.com & Epoch.ai
Leaderboard
Each list below is the top-5 ranking for one metric.
- 1. Gemini 3 Pro Preview (high) | Google | 89.8%
- 2. Claude Opus 4.5 (Reasoning) | Anthropic | 89.5%
- 3. Gemini 3 Pro Preview (low) | Google | 89.5%
- 4. Gemini 3 Flash Preview (Reasoning) | Google | 89.0%
- 5. Claude Opus 4.5 (Non-reasoning) | Anthropic | 88.9%

- 1. Gemini 3 Pro Preview (high) | Google | 90.8%
- 2. GPT-5.2 (xhigh) | OpenAI | 90.3%
- 3. Gemini 3 Flash Preview (Reasoning) | Google | 89.8%
- 4. Gemini 3 Pro Preview (low) | Google | 88.7%
- 5. Grok 4 | xAI | 87.7%

- 1. Gemini 3 Pro Preview (high) | Google | 37.2%
- 2. GPT-5.2 (xhigh) | OpenAI | 35.4%
- 3. Gemini 3 Flash Preview (Reasoning) | Google | 34.7%
- 4. KAT-Coder-Pro V1 | KwaiKAT | 33.4%
- 5. Claude Opus 4.5 (Reasoning) | Anthropic | 28.4%

- 1. GPT-5 (high) | OpenAI | 95.7%
- 2. Grok 4 | xAI | 94.3%
- 3. o4-mini (high) | OpenAI | 94.0%
- 4. Qwen3 235B A22B 2507 (Reasoning) | Alibaba | 94.0%
- 5. Grok 3 mini Reasoning (high) | xAI | 93.3%

- 1. Gemini 3 Pro Preview (high) | Google | 56.1%
- 2. GPT-5.2 (xhigh) | OpenAI | 52.1%
- 3. Gemini 3 Flash Preview (Reasoning) | Google | 50.6%
- 4. Gemini 3 Pro Preview (low) | Google | 49.9%
- 5. Gemini 3 Flash Preview (Non-reasoning) | Google | 49.9%

- 1. GPT-5 (high) | OpenAI | 99.4%
- 2. o3 | OpenAI | 99.2%
- 3. Grok 3 mini Reasoning (high) | xAI | 99.2%
- 4. GPT-5 (medium) | OpenAI | 99.1%
- 5. Claude 4 Sonnet (Reasoning) | Anthropic | 99.1%

- 1. Gemini 3 Pro Preview (high) | Google | 91.7%
- 2. Gemini 3 Flash Preview (Reasoning) | Google | 90.8%
- 3. DeepSeek V3.2 Speciale | DeepSeek | 89.6%
- 4. GPT-5.2 (medium) | OpenAI | 89.4%
- 5. GLM-4.7 (Reasoning) | Z AI | 89.4%

- 1. GPT-5.2 (xhigh) | OpenAI | 99.0%
- 2. GPT-5 Codex (high) | OpenAI | 98.7%
- 3. Gemini 3 Flash Preview (Reasoning) | Google | 97.0%
- 4. GPT-5.2 (medium) | OpenAI | 96.7%
- 5. DeepSeek V3.2 Speciale | DeepSeek | 96.7%

- 1. GPT-5.2 (xhigh) | OpenAI | 48.7%
- 2. Claude Opus 4.5 (Reasoning) | Anthropic | 47.8%
- 3. Gemini 2.5 Pro Preview (Mar '25) | Google | 46.7%
- 4. Gemini 3 Pro Preview (high) | Google | 46.5%
- 5. GPT-5.1 (high) | OpenAI | 44.7%

- 1. GPT-5.2 (xhigh) | OpenAI | 51.1%
- 2. Claude Opus 4.5 (Reasoning) | Anthropic | 49.6%
- 3. Gemini 3 Pro Preview (high) | Google | 48.4%
- 4. GPT-5.1 (high) | OpenAI | 47.5%
- 5. Gemini 3 Flash Preview (Reasoning) | Google | 46.2%

- 1. gemini 3 pro preview | Google DeepMind | 92.6%
- 2. gpt 5.1 high | OpenAI | 87.6%
- 3. gpt 5 2025 08 07 high | OpenAI | 86.2%
- 4. claude opus 4 5 20251101 32K | Anthropic | 86.1%
- 5. claude opus 4 5 20251101 16K | Anthropic | 85.5%

- 1. o3 2025 04 16 high | OpenAI | 31.8%
- 2. gpt 5.1 high | OpenAI | 31.2%
- 3. gemini 2.5 pro | Google DeepMind | 25.6%
- 4. o4 mini 2025 04 16 high | OpenAI | 21.4%
- 5. claude sonnet 4 5 20250929 32K | Anthropic | 18.5%

- 1. claude sonnet 4 20250514 | Anthropic | 72.8%
- 2. claude 3 7 sonnet 20250219 | Anthropic | 62.3%
- 3. gpt 5 mini 2025 08 07 high | OpenAI | 61.6%
- 4. claude haiku 4 5 20251001 | Anthropic | 60.6%
- 5. DeepSeek V3 0324 | DeepSeek | 54.8%

- 1. o3 2025 04 16 high | OpenAI | 96.8%
- 2. gpt 5.1 high | OpenAI | 95.8%
- 3. DeepSeek R1 0528 | DeepSeek | 94.5%
- 4. gemini 2.5 pro | Google DeepMind | 91.2%
- 5. o3 mini 2025 01 31 high | OpenAI | 91.2%

- 1. o3 2025 04 16 high | OpenAI | 87.6%
- 2. gpt 5.1 high | OpenAI | 84.2%
- 3. o3 2025 04 16 medium | OpenAI | 75.2%
- 4. gemini 2.5 pro | Google DeepMind | 32.5%
- 5. claude sonnet 4 5 20250929 32K | Anthropic | 21.8%

- 1. gpt 5.1 high | OpenAI | 93.2%
- 2. gemini 3 pro preview | Google DeepMind | 92.8%
- 3. gpt 5 2025 08 07 high | OpenAI | 92.1%
- 4. gemini 2.5 pro | Google DeepMind | 91.5%
- 5. o3 2025 04 16 high | OpenAI | 90.8%

- 1. gpt 5.1 high | OpenAI | 97.8%
- 2. o3 2025 04 16 high | OpenAI | 97.5%
- 3. DeepSeek R1 0528 | DeepSeek | 97.2%
- 4. gemini 2.5 pro | Google DeepMind | 96.8%
- 5. claude sonnet 4 5 20250929 32K | Anthropic | 96.5%

- 1. claude sonnet 4 5 20250929 | Anthropic | 85.6%
- 2. gpt 5.1 high | OpenAI | 84.2%
- 3. gemini 2.5 pro | Google DeepMind | 82.1%
- 4. DeepSeek V3 0324 | DeepSeek | 78.8%
- 5. gpt 4.1 2025 04 14 | OpenAI | 72.4%
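A top-5 list like the ones above is a sort on one metric, truncated to five entries. The following is a minimal sketch of that idea, not the site's actual implementation: the `MODELS` rows are copied from the table on this page, while the dict keys and the `top_n` helper are hypothetical names chosen for illustration.

```python
# A few rows copied from the table on this page; None marks a missing score.
# The dict keys are illustrative, not the site's real schema.
MODELS = [
    {"model": "GPT-5.2 (xhigh)", "provider": "OpenAI", "mmlu": 87.4, "gpqa": 90.3},
    {"model": "Claude Opus 4.5 (Reasoning)", "provider": "Anthropic", "mmlu": 89.5, "gpqa": 86.6},
    {"model": "Gemini 3 Pro Preview (high)", "provider": "Google", "mmlu": 89.8, "gpqa": 90.8},
    {"model": "o3-pro", "provider": "OpenAI", "mmlu": None, "gpqa": 84.5},
]


def top_n(rows, metric, n=5):
    """Return the n highest-scoring rows for `metric`, skipping rows without a score."""
    scored = [row for row in rows if row.get(metric) is not None]
    return sorted(scored, key=lambda row: row[metric], reverse=True)[:n]


if __name__ == "__main__":
    for rank, row in enumerate(top_n(MODELS, "gpqa"), start=1):
        print(f"{rank}. {row['model']} | {row['provider']} | {row[metric_name]}%"
              if (metric_name := "gpqa") else "")
```

Changing the `metric` argument (for example from `"gpqa"` to `"mmlu"`) reorders the same rows, which is why the same model can appear at different ranks in different lists.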
All models
Search and scan every tracked model.
| Model | Provider | Input ($/1M tokens) | Output ($/1M tokens) | MMLU | GPQA | TPS (tokens/s) |
|---|---|---|---|---|---|---|
| GPT-5.2 (xhigh) | OpenAI | $1.750 | $14.000 | 87.4% | 90.3% | 101.7 |
| Claude Opus 4.5 (Reasoning) | Anthropic | $5.000 | $25.000 | 89.5% | 86.6% | 83.5 |
| Gemini 3 Pro Preview (high) | Google | $2.000 | $12.000 | 89.8% | 90.8% | 116.3 |
| GPT-5.1 (high) | OpenAI | $1.250 | $10.000 | 87.0% | 87.3% | 93.4 |
| Gemini 3 Flash Preview (Reasoning) | Google | $0.500 | $3.000 | 89.0% | 89.8% | 200.6 |
| GPT-5.2 (medium) | OpenAI | $1.750 | $14.000 | 85.9% | 86.4% | - |
| GPT-5 (high) | OpenAI | $1.250 | $10.000 | 87.1% | 85.4% | 112.4 |
| GPT-5 Codex (high) | OpenAI | $1.250 | $10.000 | 86.5% | 83.7% | 156.2 |
| Claude Opus 4.5 (Non-reasoning) | Anthropic | $5.000 | $25.000 | 88.9% | 81.0% | 79.0 |
| Claude 4.5 Sonnet (Reasoning) | Anthropic | $3.000 | $15.000 | 87.5% | 83.4% | 77.6 |
| GPT-5 (medium) | OpenAI | $1.250 | $10.000 | 86.7% | 84.2% | 103.1 |
| GPT-5.1 Codex (high) | OpenAI | $1.250 | $10.000 | 86.0% | 86.0% | 202.5 |
| GLM-4.7 (Reasoning) | Z AI | $0.550 | $2.150 | 85.6% | 85.9% | 123.4 |
| DeepSeek V3.2 (Reasoning) | DeepSeek | $0.280 | $0.420 | 86.2% | 84.0% | 29.3 |
| Grok 4 | xAI | $3.000 | $15.000 | 86.6% | 87.7% | 39.7 |
| GPT-5 mini (high) | OpenAI | $0.250 | $2.000 | 83.7% | 82.8% | 75.5 |
| Gemini 3 Pro Preview (low) | Google | $2.000 | $12.000 | 89.5% | 88.7% | 126.5 |
| o3 | OpenAI | $2.000 | $8.000 | 85.3% | 82.7% | 217.4 |
| o3-pro | OpenAI | $20.000 | $80.000 | — | 84.5% | 29.4 |
| Kimi K2 Thinking | Kimi | $0.600 | $2.500 | 84.8% | 83.8% | 81.2 |
| MiniMax-M2.1 | MiniMax | $0.300 | $1.200 | 87.5% | 83.0% | 70.4 |
| MiMo-V2-Flash (Reasoning) | Xiaomi | $0.100 | $0.300 | 84.3% | 84.6% | 102.1 |
| GPT-5 mini (medium) | OpenAI | $0.250 | $2.000 | 82.8% | 80.3% | 74.8 |
| GPT-5 (low) | OpenAI | $1.250 | $10.000 | 86.0% | 80.8% | 98.9 |
| Claude 4 Sonnet (Reasoning) | Anthropic | $3.000 | $15.000 | 84.2% | 77.7% | 78.9 |
| GPT-5.1 Codex mini (high) | OpenAI | $0.250 | $2.000 | 82.0% | 81.3% | 116.5 |
| Grok 4.1 Fast (Reasoning) | xAI | $0.200 | $0.500 | 85.4% | 85.3% | 121.1 |
| Claude 4.5 Sonnet (Non-reasoning) | Anthropic | $3.000 | $15.000 | 86.0% | 72.7% | 54.1 |
| Claude 4.5 Haiku (Reasoning) | Anthropic | $1.000 | $5.000 | 76.0% | 67.2% | 101.2 |
| KAT-Coder-Pro V1 | KwaiKAT | $0.300 | $1.200 | 81.3% | 76.4% | 67.0 |
| MiniMax-M2 | MiniMax | $0.300 | $1.200 | 82.0% | 77.7% | 85.6 |
| Nova 2.0 Pro Preview (medium) | Amazon | $1.250 | $10.000 | 83.0% | 78.5% | 133.7 |
| Gemini 3 Flash Preview (Non-reasoning) | Google | $0.500 | $3.000 | 88.2% | 81.2% | 167.8 |
| Grok 4 Fast (Reasoning) | xAI | $0.200 | $0.500 | 85.0% | 84.7% | 137.7 |
| Claude 3.7 Sonnet (Reasoning) | Anthropic | $3.000 | $15.000 | 83.7% | 77.2% | - |
| Gemini 2.5 Pro | Google | $1.250 | $10.000 | 86.2% | 84.4% | 152.1 |
| GLM-4.7 (Non-reasoning) | Z AI | $0.550 | $2.150 | 79.4% | 66.4% | 116.0 |
| DeepSeek V3.1 Terminus (Reasoning) | DeepSeek | $0.400 | $2.000 | 85.1% | 79.2% | - |
| Doubao Seed Code | ByteDance Seed | $0.170 | $1.120 | 85.4% | 76.4% | 37.0 |
| GPT-5.2 (Non-reasoning) | OpenAI | $1.750 | $14.000 | 81.4% | 71.2% | 70.1 |
| gpt-oss-120B (high) | OpenAI | $0.150 | $0.600 | 80.8% | 78.2% | 327.4 |
| o4-mini (high) | OpenAI | $1.100 | $4.400 | 83.2% | 78.4% | 132.1 |
| Claude 4 Sonnet (Non-reasoning) | Anthropic | $3.000 | $15.000 | 83.7% | 68.3% | 72.5 |
| DeepSeek V3.2 Exp (Reasoning) | DeepSeek | $0.280 | $0.420 | 85.0% | 79.7% | 29.1 |
| Grok 3 mini Reasoning (high) | xAI | $0.300 | $0.500 | 82.8% | 79.1% | 193.2 |
| GLM-4.6 (Reasoning) | Z AI | $0.575 | $2.200 | 82.9% | 78.0% | 87.6 |
| Qwen3 Max Thinking | Alibaba | $1.200 | $6.000 | 82.4% | 77.6% | 36.8 |
| Nova 2.0 Pro Preview (low) | Amazon | $1.250 | $10.000 | 82.2% | 75.1% | 131.0 |
| DeepSeek V3.2 (Non-reasoning) | DeepSeek | $0.280 | $0.420 | 83.7% | 75.1% | 28.8 |
| Claude 4.1 Opus (Reasoning) | Anthropic | $15.000 | $75.000 | 88.0% | 80.9% | 48.4 |
| Qwen3 Max | Alibaba | $1.200 | $6.000 | 84.1% | 76.4% | 32.9 |
| Gemini 2.5 Flash Preview (Sep '25) (Reasoning) | Google | $0.300 | $2.500 | 84.2% | 79.3% | 290.0 |
| Claude 3.7 Sonnet (Non-reasoning) | Anthropic | $3.000 | $15.000 | 80.3% | 65.6% | - |
| Claude 4.5 Haiku (Non-reasoning) | Anthropic | $1.000 | $5.000 | 80.0% | 64.6% | 92.4 |
| o1 | OpenAI | $15.000 | $60.000 | 84.1% | 74.7% | 169.7 |
| MiMo-V2-Flash (Non-reasoning) | Xiaomi | $0.100 | $0.300 | 74.4% | 65.6% | 92.0 |
| Gemini 2.5 Pro Preview (Mar '25) | Google | $1.250 | $10.000 | 85.8% | 83.6% | - |
| GLM-4.6 (Non-reasoning) | Z AI | $0.600 | $2.200 | 78.4% | 63.2% | 87.2 |
| Nova 2.0 Lite (medium) | Amazon | $0.300 | $2.500 | 81.3% | 76.8% | 240.2 |
| Gemini 2.5 Pro Preview (May '25) | Google | $1.250 | $10.000 | 83.7% | 82.2% | - |
| Qwen3 235B A22B 2507 (Reasoning) | Alibaba | $0.700 | $8.400 | 84.3% | 79.0% | 75.9 |
| DeepSeek V3.2 Speciale | DeepSeek | $0.400 | $0.500 | 86.3% | 87.1% | - |
| ERNIE 5.0 Thinking Preview | Baidu | $0.840 | $3.370 | 83.0% | 77.7% | 25.6 |
| Qwen3 VL 32B (Reasoning) | Alibaba | $0.700 | $8.400 | 81.8% | 73.3% | 50.5 |
| DeepSeek V3.1 Terminus (Non-reasoning) | DeepSeek | $0.400 | $1.680 | 83.6% | 75.1% | - |
| DeepSeek V3.2 Exp (Non-reasoning) | DeepSeek | $0.280 | $0.420 | 83.6% | 73.8% | 29.4 |
| MiniMax-Text-01 | MiniMax | $0.200 | $1.100 | 75.9% | 57.8% | 28.5 |
| Kimi K2 0905 | Kimi | $0.990 | $2.500 | 81.9% | 76.7% | 47.2 |
| DeepSeek V3.1 (Reasoning) | DeepSeek | $0.590 | $1.690 | 85.1% | 77.9% | - |
| DeepSeek V3.1 (Non-reasoning) | DeepSeek | $0.560 | $1.680 | 83.3% | 73.5% | - |
| Nova 2.0 Omni (medium) | Amazon | $0.300 | $2.500 | 80.9% | 76.0% | - |
| Qwen3 VL 235B A22B (Reasoning) | Alibaba | $0.700 | $8.400 | 83.6% | 77.2% | 43.6 |
| Magistral Medium 1.2 | Mistral | $2.000 | $5.000 | 81.5% | 73.9% | 35.8 |
| Claude 4 Opus (Reasoning) | Anthropic | $15.000 | $75.000 | 87.3% | 79.6% | 48.8 |
| Gemini 2.5 Flash (Reasoning) | Google | $0.300 | $2.500 | 83.2% | 79.0% | 262.3 |
| GPT-5.1 (Non-reasoning) | OpenAI | $1.250 | $10.000 | 80.1% | 64.3% | 70.4 |
| DeepSeek R1 0528 (May '25) | DeepSeek | $1.350 | $4.200 | 84.9% | 81.3% | - |
| GLM-4.5 (Reasoning) | Z AI | $0.600 | $2.200 | 83.5% | 78.2% | 50.2 |
| GPT-5 nano (high) | OpenAI | $0.050 | $0.400 | 78.0% | 67.6% | 115.1 |
| Qwen3 Next 80B A3B (Reasoning) | Alibaba | $0.500 | $6.000 | 82.4% | 75.9% | 158.4 |
| Grok Code Fast 1 | xAI | $0.200 | $1.500 | 79.3% | 72.7% | 229.5 |
| Qwen3 Max (Preview) | Alibaba | $1.200 | $6.000 | 83.8% | 76.4% | 32.2 |
| GPT-5 nano (medium) | OpenAI | $0.050 | $0.400 | 77.2% | 67.0% | 126.3 |
| o3-mini | OpenAI | $1.100 | $4.400 | 79.1% | 74.8% | 142.7 |
| Kimi K2 | Kimi | $0.600 | $2.500 | 82.4% | 76.6% | 37.3 |
| GPT-4.1 | OpenAI | $2.000 | $8.000 | 80.6% | 66.6% | 76.9 |
| o1-pro | OpenAI | $150.000 | $600.000 | — | — | - |
| Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning) | Google | $0.300 | $2.500 | 83.6% | 76.6% | 249.2 |
| Grok 3 | xAI | $3.000 | $15.000 | 79.9% | 69.3% | 41.9 |
| o3-mini (high) | OpenAI | $1.100 | $4.400 | 80.2% | 77.3% | 159.1 |
| Seed-OSS-36B-Instruct | ByteDance Seed | $0.210 | $0.570 | 81.5% | 72.6% | 29.4 |
| Nova 2.0 Lite (low) | Amazon | $0.300 | $2.500 | 78.8% | 69.8% | 250.8 |
| Qwen3 Coder 480B A35B Instruct | Alibaba | $1.500 | $7.500 | 78.8% | 61.8% | 40.5 |
| NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) | NVIDIA | $0.060 | $0.240 | 79.4% | 75.7% | 191.4 |
| gpt-oss-20B (high) | OpenAI | $0.070 | $0.200 | 74.8% | 68.8% | 307.3 |
| Qwen3 235B A22B 2507 Instruct | Alibaba | $0.700 | $2.800 | 82.8% | 75.3% | 55.4 |
| GPT-5 (minimal) | OpenAI | $1.250 | $10.000 | 80.6% | 67.3% | 75.0 |
| MiniMax M1 80k | MiniMax | $0.550 | $2.200 | 81.6% | 69.7% | - |
| Nova 2.0 Omni (low) | Amazon | $0.300 | $2.500 | 79.8% | 69.9% | - |
| GLM-4.6V (Reasoning) | Z AI | $0.300 | $0.900 | 79.9% | 71.9% | 68.0 |
| GLM-4.5-Air | Z AI | $0.200 | $1.100 | 81.5% | 73.3% | 98.9 |
| gpt-oss-120B (low) | OpenAI | $0.150 | $0.595 | 77.5% | 67.2% | 299.1 |
| Grok 4.1 Fast (Non-reasoning) | xAI | $0.200 | $0.500 | 74.3% | 63.7% | 81.1 |
| Nova 2.0 Pro Preview (Non-reasoning) | Amazon | $1.250 | $10.000 | 77.2% | 63.6% | 159.0 |
| o1-preview | OpenAI | $16.500 | $66.000 | — | — | - |
| Claude 4.1 Opus (Non-reasoning) | Anthropic | $15.000 | $75.000 | — | — | 37.9 |
| GPT-4.1 mini | OpenAI | $0.400 | $1.600 | 78.1% | 66.4% | 67.2 |
| Qwen3 30B A3B 2507 (Reasoning) | Alibaba | $0.200 | $2.400 | 80.5% | 70.7% | 165.3 |
| Grok 4 Fast (Non-reasoning) | xAI | $0.200 | $0.500 | 73.0% | 60.6% | 127.6 |
| DeepSeek V3 0324 | DeepSeek | $1.195 | $1.350 | 81.9% | 65.5% | - |
| Ring-1T | InclusionAI | $0.560 | $2.240 | 80.6% | 77.4% | 57.9 |
| Mistral Large 3 | Mistral | $0.500 | $1.500 | 80.7% | 68.0% | 50.9 |
| Magistral Small 1.2 | Mistral | $0.500 | $1.500 | 76.8% | 66.3% | 192.7 |
| Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) | Google | $0.100 | $0.400 | 80.8% | 70.9% | 576.3 |
| INTELLECT-3 | Prime Intellect | $0.200 | $1.100 | 82.2% | 76.1% | 84.8 |
| Claude 4 Opus (Non-reasoning) | Anthropic | $15.000 | $75.000 | 86.0% | 70.1% | 40.2 |
| CompactifAI Llama 4 Scout Slim | Multiverse Computing | $0.070 | $0.100 | 70.3% | 42.6% | 116.7 |
| GPT-5 (ChatGPT) | OpenAI | $1.250 | $10.000 | 82.0% | 68.6% | 169.6 |
| Hermes 4 - Llama-3.1 405B (Reasoning) | Nous Research | $1.000 | $3.000 | 82.9% | 72.7% | 33.8 |
| GPT-5 mini (minimal) | OpenAI | $0.250 | $2.000 | 77.5% | 68.7% | 72.1 |
| gpt-oss-20B (low) | OpenAI | $0.070 | $0.200 | 71.8% | 61.1% | 246.9 |
| Mistral Medium 3.1 | Mistral | $0.400 | $2.000 | 68.3% | 58.8% | 90.6 |
| Gemini 2.5 Flash (Non-reasoning) | Google | $0.300 | $2.500 | 80.9% | 68.3% | 224.2 |
| Qwen3 VL 235B A22B Instruct | Alibaba | $0.700 | $2.800 | 82.3% | 71.2% | 35.3 |
| Ring-flash-2.0 | InclusionAI | $0.140 | $0.570 | 79.3% | 72.5% | 87.3 |
| Hermes 4 - Llama-3.1 70B (Reasoning) | Nous Research | $0.130 | $0.400 | 81.1% | 69.9% | 81.0 |
| Qwen3 Coder 30B A3B Instruct | Alibaba | $0.450 | $2.250 | 70.6% | 51.6% | 98.6 |
| Qwen3 Next 80B A3B Instruct | Alibaba | $0.500 | $2.000 | 81.9% | 73.8% | 145.9 |
| Codestral (Jan '25) | Mistral | $0.300 | $0.900 | 44.6% | 31.2% | 212.0 |
| Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning) | Google | $0.100 | $0.400 | 79.6% | 65.1% | 450.6 |
| Qwen3 235B A22B (Reasoning) | Alibaba | $0.700 | $8.400 | 82.8% | 70.0% | 54.2 |
| Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | NVIDIA | $0.600 | $1.800 | 82.5% | 72.8% | 37.2 |
| Ling-flash-2.0 | InclusionAI | $0.140 | $0.570 | 77.7% | 65.7% | 54.6 |
| Qwen3 VL 30B A3B (Reasoning) | Alibaba | $0.200 | $2.400 | 80.7% | 72.0% | 106.6 |
| QwQ 32B | Alibaba | $0.430 | $0.600 | 76.4% | 59.3% | 29.6 |
| Llama Nemotron Super 49B v1.5 (Reasoning) | NVIDIA | $0.100 | $0.400 | 81.4% | 74.8% | 74.1 |
| Reka Core | Reka AI | $2.000 | $2.000 | — | — | 40.9 |
| GLM-4.5V (Reasoning) | Z AI | $0.600 | $1.800 | 78.8% | 68.4% | 31.4 |
| Nova Premier | Amazon | $2.500 | $12.500 | 73.3% | 56.9% | 75.1 |
| Magistral Medium 1 | Mistral | $2.000 | $5.000 | 75.3% | 67.9% | 60.2 |
| GPT-4o (Aug '24) | OpenAI | $2.500 | $10.000 | — | 52.1% | 79.0 |
| Llama 4 Maverick | Meta | $0.310 | $0.850 | 80.9% | 67.1% | 127.7 |
| Gemini 2.0 Flash (Feb '25) | Google | $0.100 | $0.400 | 77.9% | 62.3% | - |
| Devstral Medium | Mistral | $0.400 | $2.000 | 70.8% | 49.2% | 112.4 |
| Nova 2.0 Lite (Non-reasoning) | Amazon | $0.300 | $2.500 | 74.3% | 60.3% | 224.9 |
| Claude 3.5 Haiku | Anthropic | $0.800 | $4.000 | 63.4% | 40.8% | 48.4 |
| DeepSeek R1 (Jan '25) | DeepSeek | $1.350 | $4.000 | 84.4% | 70.8% | - |
| Reka Flash (Feb '24) | Reka AI | $0.200 | $0.800 | — | — | 69.1 |
| GPT-4o (March 2025, chatgpt-4o-latest) | OpenAI | $5.000 | $15.000 | 80.3% | 65.5% | 215.5 |
| Reka Edge | Reka AI | $0.100 | $0.100 | — | — | 63.8 |
| Gemini 2.5 Flash-Lite (Reasoning) | Google | $0.100 | $0.400 | 75.9% | 62.5% | 369.9 |
| Sonar Reasoning | Perplexity | $1.000 | $5.000 | — | 62.3% | - |
| Devstral Small (May '25) | Mistral | $0.100 | $0.300 | 63.2% | 43.4% | 200.0 |
| Llama 3.1 Instruct 405B | Meta | $3.750 | $6.750 | 73.2% | 51.5% | 24.9 |
| Mistral Medium 3 | Mistral | $0.400 | $2.000 | 76.0% | 57.8% | 95.7 |
| GLM-4.6V (Non-reasoning) | Z AI | $0.300 | $0.900 | 75.2% | 56.6% | 46.6 |
| Nova 2.0 Omni (Non-reasoning) | Amazon | $0.300 | $2.500 | 71.9% | 55.5% | 226.3 |
| Qwen3 235B A22B (Non-reasoning) | Alibaba | $0.700 | $2.800 | 76.2% | 61.3% | 45.7 |
| ERNIE 4.5 300B A47B | Baidu | $0.280 | $1.100 | 77.6% | 81.1% | 26.4 |
| Qwen3 VL 32B Instruct | Alibaba | $0.700 | $2.800 | 79.1% | 67.1% | 45.6 |
| DeepSeek R1 Distill Qwen 32B | DeepSeek | $0.285 | $0.285 | 73.9% | 61.5% | 39.4 |
| Qwen3 VL 8B (Reasoning) | Alibaba | $0.180 | $2.100 | 74.9% | 57.9% | 63.6 |
| Qwen3 32B (Reasoning) | Alibaba | $0.700 | $8.400 | 79.8% | 66.8% | 88.7 |
| DeepSeek V3 (Dec '24) | DeepSeek | $0.400 | $0.890 | 75.2% | 55.7% | - |
| Hermes 4 - Llama-3.1 405B (Non-reasoning) | Nous Research | $1.000 | $3.000 | 72.9% | 53.6% | 32.9 |
| EXAONE 4.0 32B (Reasoning) | LG AI Research | $0.600 | $1.000 | 81.8% | 73.9% | 96.2 |
| Qwen3 14B (Reasoning) | Alibaba | $0.350 | $4.200 | 77.4% | 60.4% | 58.8 |
| Magistral Small 1 | Mistral | $0.500 | $1.500 | 74.6% | 64.1% | 118.7 |
| Olmo 3 7B Think | Allen Institute for AI | $0.120 | $0.200 | 65.5% | 51.6% | 170.3 |
| CompactifAI Llama 3.3 70B Slim | Multiverse Computing | $0.160 | $0.310 | 57.1% | 35.5% | 130.7 |
| CompactifAI Mistral Small 3.1 Slim | Multiverse Computing | $0.050 | $0.080 | 53.8% | 32.9% | 121.6 |
| Qwen3 VL 30B A3B Instruct | Alibaba | $0.200 | $0.800 | 76.4% | 69.5% | 99.2 |
| DeepSeek R1 0528 Qwen3 8B | DeepSeek | $0.060 | $0.090 | 73.9% | 61.2% | - |
| Ministral 3 14B | Mistral | $0.200 | $0.200 | 69.3% | 57.2% | 126.2 |
| Qwen2.5 Max | Alibaba | $1.600 | $6.400 | 76.2% | 58.7% | 41.4 |
| DeepSeek R1 Distill Llama 70B | DeepSeek | $0.875 | $1.300 | 79.5% | 40.2% | 38.9 |
| Claude 3.5 Sonnet (Oct '24) | Anthropic | $3.000 | $15.000 | 77.2% | 59.9% | - |
| DeepSeek R1 Distill Qwen 14B | DeepSeek | $0.150 | $0.150 | 74.0% | 48.4% | - |
| Qwen3 30B A3B (Reasoning) | Alibaba | $0.200 | $2.400 | 77.7% | 61.6% | 66.5 |
| Qwen3 Omni 30B A3B (Reasoning) | Alibaba | $0.250 | $0.970 | 79.2% | 72.6% | 97.1 |
| Devstral Small (Jul '25) | Mistral | $0.100 | $0.300 | 62.2% | 41.4% | 237.5 |
| Qwen3 30B A3B 2507 Instruct | Alibaba | $0.200 | $0.800 | 77.7% | 65.9% | 59.8 |
| Llama Nemotron Super 49B v1.5 (Non-reasoning) | NVIDIA | $0.100 | $0.400 | 69.2% | 48.1% | 69.4 |
| NVIDIA Nemotron Nano 9B V2 (Reasoning) | NVIDIA | $0.040 | $0.160 | 74.2% | 57.0% | 118.2 |
| Sonar | Perplexity | $1.000 | $1.000 | 68.9% | 47.1% | 78.3 |
| Mistral Small 3.2 | Mistral | $0.100 | $0.300 | 68.1% | 50.5% | 111.3 |
| Qwen3 8B (Reasoning) | Alibaba | $0.180 | $2.100 | 74.3% | 58.9% | 85.1 |
| Ministral 3 8B | Mistral | $0.150 | $0.150 | 64.2% | 47.1% | 196.9 |
| NVIDIA Nemotron Nano 12B v2 VL (Reasoning) | NVIDIA | $0.200 | $0.600 | 75.9% | 57.2% | 128.9 |
| QwQ 32B-Preview | Alibaba | $0.120 | $0.180 | 64.8% | 55.7% | 40.8 |
| Sonar Pro | Perplexity | $3.000 | $15.000 | 75.5% | 57.8% | 86.2 |
| Llama 3.3 Instruct 70B | Meta | $0.585 | $0.715 | 71.3% | 49.8% | 105.5 |
| Ling-mini-2.0 | InclusionAI | $0.070 | $0.280 | 67.1% | 56.2% | 156.9 |
| GPT-4o (Nov '24) | OpenAI | $2.500 | $10.000 | 74.8% | 54.3% | 116.4 |
| Qwen3 VL 8B Instruct | Alibaba | $0.180 | $0.700 | 68.6% | 42.7% | 98.3 |
| Gemini 2.0 Flash-Lite (Feb '25) | Google | $0.075 | $0.300 | 72.4% | 53.5% | - |
| Mistral Large 2 (Nov '24) | Mistral | $2.000 | $6.000 | 69.7% | 48.6% | 45.9 |
| Gemini 2.0 Flash-Lite (Preview) | Google | $0.075 | $0.300 | - | 54.2% | - |
| GPT-4o (May '24) | OpenAI | $5.000 | $15.000 | 74.0% | 52.6% | 78.4 |
| Qwen3 32B (Non-reasoning) | Alibaba | $0.700 | $2.800 | 72.7% | 53.5% | 85.6 |
| Phi-3 Medium Instruct 14B | Microsoft Azure | $0.170 | $0.680 | 54.3% | 32.6% | - |
| Reka Flash 3 | Reka AI | $0.200 | $0.800 | 66.9% | 52.9% | 49.3 |
| Claude 3.5 Sonnet (June '24) | Anthropic | $3.000 | $15.000 | 75.1% | 56.0% | - |
| Qwen3 4B (Reasoning) | Alibaba | $0.110 | $1.260 | 69.6% | 52.2% | 91.0 |
| Llama 3.1 Nemotron Instruct 70B | NVIDIA | $1.200 | $1.200 | 69.0% | 46.5% | 39.6 |
| GPT-4o (ChatGPT) | OpenAI | $5.000 | $15.000 | 77.3% | 51.1% | 202.7 |
| GPT-5 nano (minimal) | OpenAI | $0.050 | $0.400 | 55.6% | 42.8% | 116.8 |
| Llama 4 Scout | Meta | $0.180 | $0.625 | 75.2% | 58.7% | 141.7 |
| NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning) | NVIDIA | $0.060 | $0.240 | 57.9% | 39.9% | 174.9 |
| Mistral Small 3.1 | Mistral | $0.100 | $0.300 | 65.9% | 45.4% | 106.0 |
| Pixtral Large | Mistral | $2.000 | $6.000 | 70.1% | 50.5% | 30.6 |
| NVIDIA Nemotron Nano 9B V2 (Non-reasoning) | NVIDIA | $0.060 | $0.230 | 73.9% | 55.7% | 111.8 |
| Command A | Cohere | $2.500 | $10.000 | 71.2% | 52.7% | 52.8 |
| GPT-4 Turbo | OpenAI | $10.000 | $30.000 | 69.4% | — | 27.0 |
| Hermes 4 - Llama-3.1 70B (Non-reasoning) | Nous Research | $0.130 | $0.400 | 66.4% | 49.1% | 73.0 |
| Aya Expanse 32B | Cohere | $0.500 | $1.500 | 37.7% | 23.0% | 41.9 |
| Nova Pro | Amazon | $0.800 | $3.200 | 69.1% | 49.9% | - |
| Qwen3 14B (Non-reasoning) | Alibaba | $0.350 | $1.400 | 67.5% | 47.0% | 54.8 |
| GPT-4.1 nano | OpenAI | $0.100 | $0.400 | 65.7% | 51.2% | 130.1 |
| Phi-4 | Microsoft Azure | $0.125 | $0.500 | 71.4% | 57.5% | 10.9 |
| GLM-4.5V (Non-reasoning) | Z AI | $0.600 | $1.800 | 75.1% | 57.3% | 28.3 |
| Gemini 2.5 Flash-Lite (Non-reasoning) | Google | $0.100 | $0.400 | 72.4% | 47.4% | 256.1 |
| Llama 3.1 Instruct 70B | Meta | $0.560 | $0.560 | 67.6% | 40.9% | 61.5 |
| Qwen3 1.7B (Reasoning) | Alibaba | $0.110 | $1.260 | 57.0% | 35.6% | 124.8 |
| Mistral Large 2 (Jul '24) | Mistral | $2.000 | $6.000 | 68.3% | 47.2% | - |
| Qwen2.5 Coder Instruct 32B | Alibaba | $0.130 | $0.175 | 63.5% | 41.7% | 35.7 |
| GPT-4 | OpenAI | $30.000 | $60.000 | — | — | 28.3 |
| Mistral Small 3 | Mistral | $0.100 | $0.300 | 65.2% | 46.2% | 231.4 |
| GPT-4o mini | OpenAI | $0.150 | $0.600 | 64.8% | 42.6% | 49.0 |
| Qwen3 4B (Non-reasoning) | Alibaba | $0.110 | $0.420 | 58.6% | 39.8% | 85.1 |
| Claude 3 Opus | Anthropic | $15.000 | $75.000 | 69.6% | 48.9% | - |
| Nova Lite | Amazon | $0.060 | $0.240 | 59.0% | 43.3% | 238.3 |
| Qwen3 30B A3B (Non-reasoning) | Alibaba | $0.200 | $0.800 | 71.0% | 51.5% | 61.7 |
| Ministral 8B | Mistral | $0.100 | $0.100 | 38.9% | 27.6% | 195.9 |
| Llama 3.1 Instruct 8B | Meta | $0.100 | $0.100 | 47.6% | 25.9% | 162.8 |
| Ministral 3 3B | Mistral | $0.100 | $0.100 | 52.4% | 35.8% | 293.0 |
| Olmo 3.1 32B Instruct | Allen Institute for AI | $0.200 | $0.600 | — | 53.9% | 47.6 |
| Reka Flash (Sep '24) | Reka AI | $0.200 | $0.800 | — | — | 71.1 |
| EXAONE 4.0 32B (Non-reasoning) | LG AI Research | $0.600 | $1.000 | 76.8% | 62.8% | 88.3 |
| Qwen2.5 Turbo | Alibaba | $0.050 | $0.200 | 63.3% | 41.0% | 66.4 |
| Llama 3.2 Instruct 90B (Vision) | Meta | $0.720 | $0.720 | 67.1% | 43.2% | 36.6 |
| Solar Mini | Upstage | $0.150 | $0.150 | — | — | 84.5 |
| Granite 4.0 H Small | IBM | $0.060 | $0.250 | 62.4% | 41.6% | 370.8 |
| CompactifAI Llama 3.1 8B Slim | Multiverse Computing | $0.050 | $0.070 | 32.1% | 22.1% | 228.4 |
| Llama 3.2 Instruct 11B (Vision) | Meta | $0.160 | $0.160 | 46.4% | 22.1% | 48.8 |
| Ministral 3B | Mistral | $0.040 | $0.040 | 33.9% | 26.0% | 272.6 |
| Granite 3.3 8B (Non-reasoning) | IBM | $0.030 | $0.250 | 46.8% | 33.8% | 488.2 |
| Qwen3 8B (Non-reasoning) | Alibaba | $0.180 | $0.700 | 64.3% | 45.2% | 79.7 |
| Qwen3 Omni 30B A3B Instruct | Alibaba | $0.250 | $0.970 | 72.5% | 62.0% | 88.3 |
| Jamba 1.5 Large | AI21 Labs | $2.000 | $8.000 | 57.2% | 42.7% | - |
| Jamba 1.6 Large | AI21 Labs | $2.000 | $8.000 | 56.5% | 38.7% | 44.9 |
| Hermes 3 - Llama-3.1 70B | Nous Research | $0.300 | $0.300 | 57.1% | 40.1% | 41.3 |
| NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) | NVIDIA | $0.200 | $0.600 | 64.9% | 43.9% | 131.2 |
| Nova Micro | Amazon | $0.035 | $0.140 | 53.1% | 35.8% | 421.4 |
| Claude 3 Sonnet | Anthropic | $3.000 | $15.000 | 57.9% | 40.0% | - |
| Mistral Small (Sep '24) | Mistral | $0.200 | $0.600 | 52.9% | 38.1% | 110.9 |
| Llama 3 Instruct 70B | Meta | $0.510 | $0.740 | 57.4% | 37.9% | 38.0 |
| Phi-3 Mini Instruct 3.8B | Microsoft Azure | $0.130 | $0.520 | 43.5% | 31.9% | - |
| Aya Expanse 8B | Cohere | $0.500 | $1.500 | 31.2% | 24.7% | 79.5 |
| Mistral Large (Feb '24) | Mistral | $4.000 | $12.000 | 51.5% | 35.1% | - |
| Llama 3.2 Instruct 3B | Meta | $0.060 | $0.060 | 34.7% | 25.5% | 46.5 |
| Llama 2 Chat 7B | Meta | $0.050 | $0.250 | 16.4% | 22.7% | 109.5 |
| Jamba 1.7 Large | AI21 Labs | $2.000 | $8.000 | 57.7% | 39.0% | 41.9 |
| Claude 3 Haiku | Anthropic | $0.250 | $1.250 | — | — | 111.8 |
| Llama 3.2 Instruct 1B | Meta | $0.053 | $0.055 | 20.0% | 19.6% | 74.5 |
| Mistral Small (Feb '24) | Mistral | $1.000 | $3.000 | 41.9% | 30.2% | 108.1 |
| GPT-3.5 Turbo | OpenAI | $0.500 | $1.500 | 46.2% | 29.7% | 112.3 |
| Mistral Medium | Mistral | $2.750 | $8.100 | 49.1% | 34.9% | 92.0 |
| Pixtral 12B (2409) | Mistral | $0.150 | $0.150 | 47.3% | 34.3% | 136.6 |
| Llama 3 Instruct 8B | Meta | $0.045 | $0.155 | 40.5% | 29.6% | 67.1 |
| Command-R+ (Apr '24) | Cohere | $3.000 | $15.000 | 43.2% | 32.3% | - |
| Olmo 3 7B Instruct | Allen Institute for AI | $0.100 | $0.200 | 52.2% | 40.0% | 37.0 |
| Jamba 1.5 Mini | AI21 Labs | $0.200 | $0.400 | 37.1% | 30.2% | - |
| Jamba 1.6 Mini | AI21 Labs | $0.200 | $0.400 | 36.7% | 30.0% | 127.1 |
| Jamba 1.7 Mini | AI21 Labs | $0.200 | $0.400 | 38.8% | 32.2% | 158.1 |
| Gemma 2 9B | Google | $0.030 | $0.090 | 49.5% | 31.1% | - |
| Mixtral 8x7B Instruct | Mistral | $0.540 | $0.600 | 38.7% | 29.2% | - |
| Command-R (Mar '24) | Cohere | $0.500 | $1.500 | 33.8% | 28.4% | - |
| Mistral 7B Instruct | Mistral | $0.250 | $0.250 | 24.5% | 17.7% | 125.6 |
| Command-R+ (Aug '24) | Cohere | $2.500 | $10.000 | 42.7% | 33.7% | 21.0 |
| Qwen3 1.7B (Non-reasoning) | Alibaba | $0.110 | $0.420 | 41.1% | 28.3% | 116.5 |
| Qwen3 0.6B (Reasoning) | Alibaba | $0.110 | $1.260 | 34.7% | 23.9% | 199.0 |
| Gemma 3n E4B Instruct | Google | $0.020 | $0.040 | 48.8% | 29.6% | 60.5 |
| Qwen3 0.6B (Non-reasoning) | Alibaba | $0.110 | $0.420 | 23.1% | 23.1% | 190.1 |
| Mistral NeMo | Mistral | $0.150 | $0.150 | 39.9% | 31.4% | 190.2 |
| Command-R (Aug '24) | Cohere | $0.150 | $0.600 | 33.7% | 28.9% | 59.8 |
| Cogito v2.1 (Reasoning) | Deep Cogito | $1.250 | $1.250 | 84.9% | 76.8% | 73.9 |
| DeepSeek-OCR | DeepSeek | $0.030 | $0.100 | — | — | 306.8 |
| Grok 3 mini Reasoning (Low) | xAI | $0.300 | $0.500 | — | — | 110.8 |
| Doubao-Seed-1.8 | ByteDance Seed | $0.110 | $0.280 | — | — | 43.1 |
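
The Input and Output columns can be turned into a per-request cost estimate. The sketch below assumes both prices are USD per 1M tokens; that unit is an assumption here, as is the `request_cost_usd` helper name and the example token counts. The prices used are the GPT-5.2 (xhigh) row from the table.

```python
TOKENS_PER_PRICE_UNIT = 1_000_000  # assumption: prices are quoted per 1M tokens


def parse_price(cell: str) -> float:
    """Convert a table cell such as '$1.750' into a float."""
    return float(cell.strip().lstrip("$"))


def request_cost_usd(input_price: float, output_price: float,
                     input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request given per-1M-token prices."""
    total = input_tokens * input_price + output_tokens * output_price
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # GPT-5.2 (xhigh): Input $1.750, Output $14.000 (from the table above).
    cost = request_cost_usd(parse_price("$1.750"), parse_price("$14.000"),
                            input_tokens=10_000, output_tokens=2_000)
    print(f"${cost:.4f}")
```

Because output tokens are priced several times higher than input tokens for most rows, the output length usually dominates the estimate for reasoning-heavy models.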