Live AI model pricing & benchmarks

AI Stats

290 LLMs tracked with pricing and performance benchmarks.
Data sourced from ArtificialAnalysis.com and Epoch.ai.

Top MMLU Pro model: Gemini 3 Pro Preview (high), 89.8%
Cheapest blended price (3:1): Gemma 3n E4B Instruct, $0.025
Top ECI score: Gemini 3 Pro Preview, 154.4 ECI
Epoch.ai models: 203 (84 benchmark runs)
Total models: 290

From ArtificialAnalysis.com

Avg input price: $1.841 per 1M input tokens

Avg output price: $7.799 per 1M output tokens
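
The "Blended (3:1)" figure shown on this page is not defined here. A minimal sketch, assuming it is a weighted average of three parts input price to one part output price, rounded half-up to three decimals. The function name `blended_price` is hypothetical, and the prices are copied from model cards on this page.

```python
from decimal import Decimal, ROUND_HALF_UP


def blended_price(input_per_m: str, output_per_m: str, ratio: int = 3) -> Decimal:
    """Blend input and output prices (USD per 1M tokens).

    Assumption: `ratio` input tokens are consumed for every output token,
    so the blend is (ratio * input + output) / (ratio + 1).
    """
    inp = Decimal(input_per_m)
    out = Decimal(output_per_m)
    blended = (ratio * inp + out) / (ratio + 1)
    # Round half-up to three decimals, matching how prices are displayed.
    return blended.quantize(Decimal("0.001"), rounding=ROUND_HALF_UP)


# (input, output) prices copied from model cards on this page.
cards = {
    "GPT-5.2 (xhigh)": ("1.750", "14.000"),
    "Claude Opus 4.5 (Reasoning)": ("5.000", "25.000"),
    "DeepSeek V3.2 (Reasoning)": ("0.280", "0.420"),
}

for name, (inp, out) in cards.items():
    print(f"{name}: ${blended_price(inp, out)}")
```

Under this assumption the three cards above come out at $4.813, $10.000 and $0.315, which matches their listed blended prices. A few cards list a blended figure that this formula does not reproduce (Kimi K2 0905, for example, lists $1.200), so the page may derive some values differently.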

Leaderboard

Top 5 by MMLU Pro score. Other metrics give different top-5 rankings.

  • 1
    Gemini 3 Pro Preview (high)
    Google
    89.8%
  • 2
    Claude Opus 4.5 (Reasoning)
    Anthropic
    89.5%
  • 3
    Gemini 3 Pro Preview (low)
    Google
    89.5%
  • 4
    Gemini 3 Flash Preview (Reasoning)
    Google
    89.0%
  • 5
    Claude Opus 4.5 (Non-reasoning)
    Anthropic
    88.9%
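
A ranking like the one above can be rebuilt from the per-model scores. A minimal sketch, assuming the leaderboard is a plain descending sort on the chosen metric that skips models with no listed value. The helper `top_n` is hypothetical, and the MMLU Pro scores are copied from a handful of model cards on this page.

```python
from typing import Optional

# MMLU Pro scores (%) copied from model cards on this page.
# None marks a card that lists no value for the metric.
mmlu_pro: dict[str, Optional[float]] = {
    "GPT-5.2 (xhigh)": 87.4,
    "Claude Opus 4.5 (Reasoning)": 89.5,
    "Gemini 3 Pro Preview (high)": 89.8,
    "Gemini 3 Flash Preview (Reasoning)": 89.0,
    "Claude Opus 4.5 (Non-reasoning)": 88.9,
    "Gemini 3 Pro Preview (low)": 89.5,
    "o3-pro": None,
}


def top_n(metric: dict[str, Optional[float]], n: int = 5) -> list[tuple[str, float]]:
    """Return the n highest-scoring models, skipping missing values."""
    scored = [(name, value) for name, value in metric.items() if value is not None]
    # sorted() is stable, so tied models keep their input order.
    return sorted(scored, key=lambda item: item[1], reverse=True)[:n]


for rank, (name, score) in enumerate(top_n(mmlu_pro), start=1):
    print(f"{rank}. {name}: {score:.1f}%")
```

How the page breaks ties (two models share 89.5%) is not stated. This sketch simply keeps the input order for equal scores.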

All models


GPT-5.2 (xhigh)

OpenAI

Input: $1.750
Output: $14.000
Pricing
Input (1M)
$1.750
Output (1M)
$14.000
Blended (3:1)
$4.813
Performance
Tokens / s
101.7
TTFT (token)
36.40s
TTFT (answer)
36.40s
Benchmarks
MMLU Pro 87.4%
GPQA 90.3%
HLE 35.4%
AIME
LiveCodeBench 88.9%
SciCode 52.1%
Math 500
AA Indexes
Intelligence
51.1%
Coding
48.7%
Math
99.0%

Claude Opus 4.5 (Reasoning)

Anthropic

Input: $5.000
Output: $25.000
Pricing
Input (1M)
$5.000
Output (1M)
$25.000
Blended (3:1)
$10.000
Performance
Tokens / s
83.5
TTFT (token)
1.66s
TTFT (answer)
25.60s
Benchmarks
MMLU Pro 89.5%
GPQA 86.6%
HLE 28.4%
AIME
LiveCodeBench 87.1%
SciCode 49.5%
Math 500
AA Indexes
Intelligence
49.6%
Coding
47.8%
Math
91.3%

Gemini 3 Pro Preview (high)

Google

Input: $2.000
Output: $12.000
Pricing
Input (1M)
$2.000
Output (1M)
$12.000
Blended (3:1)
$4.500
Performance
Tokens / s
116.3
TTFT (token)
31.67s
TTFT (answer)
31.67s
Benchmarks
MMLU Pro 89.8%
GPQA 90.8%
HLE 37.2%
AIME
LiveCodeBench 91.7%
SciCode 56.1%
Math 500
AA Indexes
Intelligence
48.4%
Coding
46.5%
Math
95.7%

GPT-5.1 (high)

OpenAI

Input: $1.250
Output: $10.000
Pricing
Input (1M)
$1.250
Output (1M)
$10.000
Blended (3:1)
$3.438
Performance
Tokens / s
93.4
TTFT (token)
42.56s
TTFT (answer)
42.56s
Benchmarks
MMLU Pro 87.0%
GPQA 87.3%
HLE 26.5%
AIME
LiveCodeBench 86.8%
SciCode 43.3%
Math 500
AA Indexes
Intelligence
47.5%
Coding
44.7%
Math
94.0%

Gemini 3 Flash Preview (Reasoning)

Google

Input: $0.500
Output: $3.000
Pricing
Input (1M)
$0.500
Output (1M)
$3.000
Blended (3:1)
$1.125
Performance
Tokens / s
200.6
TTFT (token)
12.96s
TTFT (answer)
12.96s
Benchmarks
MMLU Pro 89.0%
GPQA 89.8%
HLE 34.7%
AIME
LiveCodeBench 90.8%
SciCode 50.6%
Math 500
AA Indexes
Intelligence
46.2%
Coding
42.6%
Math
97.0%

GPT-5.2 (medium)

OpenAI

Input: $1.750
Output: $14.000
Pricing
Input (1M)
$1.750
Output (1M)
$14.000
Blended (3:1)
$4.813
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 85.9%
GPQA 86.4%
HLE 24.9%
AIME
LiveCodeBench 89.4%
SciCode 46.2%
Math 500
AA Indexes
Intelligence
45.8%
Coding
44.2%
Math
96.7%

GPT-5 (high)

OpenAI

Input: $1.250
Output: $10.000
Pricing
Input (1M)
$1.250
Output (1M)
$10.000
Blended (3:1)
$3.438
Performance
Tokens / s
112.4
TTFT (token)
102.36s
TTFT (answer)
102.36s
Benchmarks
MMLU Pro 87.1%
GPQA 85.4%
HLE 26.5%
AIME 95.7%
LiveCodeBench 84.6%
SciCode 42.9%
Math 500 99.4%
AA Indexes
Intelligence
44.5%
Coding
36.0%
Math
94.3%

GPT-5 Codex (high)

OpenAI

Input: $1.250
Output: $10.000
Pricing
Input (1M)
$1.250
Output (1M)
$10.000
Blended (3:1)
$3.438
Performance
Tokens / s
156.2
TTFT (token)
22.61s
TTFT (answer)
22.61s
Benchmarks
MMLU Pro 86.5%
GPQA 83.7%
HLE 25.6%
AIME
LiveCodeBench 84.0%
SciCode 40.9%
Math 500
AA Indexes
Intelligence
44.4%
Coding
38.9%
Math
98.7%

Claude Opus 4.5 (Non-reasoning)

Anthropic

Input: $5.000
Output: $25.000
Pricing
Input (1M)
$5.000
Output (1M)
$25.000
Blended (3:1)
$10.000
Performance
Tokens / s
79.0
TTFT (token)
1.74s
TTFT (answer)
1.74s
Benchmarks
MMLU Pro 88.9%
GPQA 81.0%
HLE 12.9%
AIME
LiveCodeBench 73.8%
SciCode 47.0%
Math 500
AA Indexes
Intelligence
43.0%
Coding
42.9%
Math
62.7%

Claude 4.5 Sonnet (Reasoning)

Anthropic

Input: $3.000
Output: $15.000
Pricing
Input (1M)
$3.000
Output (1M)
$15.000
Blended (3:1)
$6.000
Performance
Tokens / s
77.6
TTFT (token)
1.48s
TTFT (answer)
27.24s
Benchmarks
MMLU Pro 87.5%
GPQA 83.4%
HLE 17.3%
AIME
LiveCodeBench 71.4%
SciCode 44.7%
Math 500
AA Indexes
Intelligence
42.8%
Coding
38.6%
Math
88.0%

GPT-5 (medium)

OpenAI

Input: $1.250
Output: $10.000
Pricing
Input (1M)
$1.250
Output (1M)
$10.000
Blended (3:1)
$3.438
Performance
Tokens / s
103.1
TTFT (token)
47.16s
TTFT (answer)
47.16s
Benchmarks
MMLU Pro 86.7%
GPQA 84.2%
HLE 23.5%
AIME 91.7%
LiveCodeBench 70.3%
SciCode 41.1%
Math 500 99.1%
AA Indexes
Intelligence
42.2%
Coding
39.0%
Math
91.7%

GPT-5.1 Codex (high)

OpenAI

Input: $1.250
Output: $10.000
Pricing
Input (1M)
$1.250
Output (1M)
$10.000
Blended (3:1)
$3.438
Performance
Tokens / s
202.5
TTFT (token)
11.86s
TTFT (answer)
11.86s
Benchmarks
MMLU Pro 86.0%
GPQA 86.0%
HLE 23.4%
AIME
LiveCodeBench 84.9%
SciCode 40.2%
Math 500
AA Indexes
Intelligence
42.0%
Coding
36.6%
Math
95.7%

GLM-4.7 (Reasoning)

Z AI

Input: $0.550
Output: $2.150
Pricing
Input (1M)
$0.550
Output (1M)
$2.150
Blended (3:1)
$0.938
Performance
Tokens / s
123.4
TTFT (token)
0.68s
TTFT (answer)
16.89s
Benchmarks
MMLU Pro 85.6%
GPQA 85.9%
HLE 25.1%
AIME
LiveCodeBench 89.4%
SciCode 45.1%
Math 500
AA Indexes
Intelligence
42.0%
Coding
36.3%
Math
95.0%

DeepSeek V3.2 (Reasoning)

DeepSeek

Input: $0.280
Output: $0.420
Pricing
Input (1M)
$0.280
Output (1M)
$0.420
Blended (3:1)
$0.315
Performance
Tokens / s
29.3
TTFT (token)
1.17s
TTFT (answer)
69.36s
Benchmarks
MMLU Pro 86.2%
GPQA 84.0%
HLE 22.2%
AIME
LiveCodeBench 86.2%
SciCode 38.9%
Math 500
AA Indexes
Intelligence
41.6%
Coding
36.7%
Math
92.0%

Grok 4

xAI

Input: $3.000
Output: $15.000
Pricing
Input (1M)
$3.000
Output (1M)
$15.000
Blended (3:1)
$6.000
Performance
Tokens / s
39.7
TTFT (token)
7.94s
TTFT (answer)
7.94s
Benchmarks
MMLU Pro 86.6%
GPQA 87.7%
HLE 23.9%
AIME 94.3%
LiveCodeBench 81.9%
SciCode 45.7%
Math 500 99.0%
AA Indexes
Intelligence
41.4%
Coding
40.5%
Math
92.7%

GPT-5 mini (high)

OpenAI

Input: $0.250
Output: $2.000
Pricing
Input (1M)
$0.250
Output (1M)
$2.000
Blended (3:1)
$0.688
Performance
Tokens / s
75.5
TTFT (token)
109.46s
TTFT (answer)
109.46s
Benchmarks
MMLU Pro 83.7%
GPQA 82.8%
HLE 19.7%
AIME
LiveCodeBench 83.8%
SciCode 39.2%
Math 500
AA Indexes
Intelligence
41.0%
Coding
35.3%
Math
90.7%

Gemini 3 Pro Preview (low)

Google

Input: $2.000
Output: $12.000
Pricing
Input (1M)
$2.000
Output (1M)
$12.000
Blended (3:1)
$4.500
Performance
Tokens / s
126.5
TTFT (token)
4.12s
TTFT (answer)
4.12s
Benchmarks
MMLU Pro 89.5%
GPQA 88.7%
HLE 27.6%
AIME
LiveCodeBench 85.7%
SciCode 49.9%
Math 500
AA Indexes
Intelligence
40.9%
Coding
39.4%
Math
86.7%

o3

OpenAI

Input: $2.000
Output: $8.000
Pricing
Input (1M)
$2.000
Output (1M)
$8.000
Blended (3:1)
$3.500
Performance
Tokens / s
217.4
TTFT (token)
12.90s
TTFT (answer)
12.90s
Benchmarks
MMLU Pro 85.3%
GPQA 82.7%
HLE 20.0%
AIME 90.3%
LiveCodeBench 80.8%
SciCode 41.0%
Math 500 99.2%
AA Indexes
Intelligence
40.9%
Coding
38.4%
Math
88.3%

o3-pro

OpenAI

Input: $20.000
Output: $80.000
Pricing
Input (1M)
$20.000
Output (1M)
$80.000
Blended (3:1)
$35.000
Performance
Tokens / s
29.4
TTFT (token)
92.48s
TTFT (answer)
92.48s
Benchmarks
MMLU Pro
GPQA 84.5%
HLE
AIME
LiveCodeBench
SciCode
Math 500
AA Indexes
Intelligence
40.7%
Coding
Math

Kimi K2 Thinking

Kimi

Input: $0.600
Output: $2.500
Pricing
Input (1M)
$0.600
Output (1M)
$2.500
Blended (3:1)
$1.075
Performance
Tokens / s
81.2
TTFT (token)
0.66s
TTFT (answer)
25.29s
Benchmarks
MMLU Pro 84.8%
GPQA 83.8%
HLE 22.3%
AIME
LiveCodeBench 85.3%
SciCode 42.4%
Math 500
AA Indexes
Intelligence
40.6%
Coding
34.8%
Math
94.7%

MiniMax-M2.1

MiniMax

Input: $0.300
Output: $1.200
Pricing
Input (1M)
$0.300
Output (1M)
$1.200
Blended (3:1)
$0.525
Performance
Tokens / s
70.4
TTFT (token)
1.89s
TTFT (answer)
30.29s
Benchmarks
MMLU Pro 87.5%
GPQA 83.0%
HLE 22.2%
AIME
LiveCodeBench 81.0%
SciCode 40.7%
Math 500
AA Indexes
Intelligence
39.7%
Coding
32.8%
Math
82.7%

MiMo-V2-Flash (Reasoning)

Xiaomi

Input: $0.100
Output: $0.300
Pricing
Input (1M)
$0.100
Output (1M)
$0.300
Blended (3:1)
$0.150
Performance
Tokens / s
102.1
TTFT (token)
2.76s
TTFT (answer)
22.35s
Benchmarks
MMLU Pro 84.3%
GPQA 84.6%
HLE 21.1%
AIME
LiveCodeBench 86.8%
SciCode 39.4%
Math 500
AA Indexes
Intelligence
39.4%
Coding
31.8%
Math
96.3%

GPT-5 mini (medium)

OpenAI

Input: $0.250
Output: $2.000
Pricing
Input (1M)
$0.250
Output (1M)
$2.000
Blended (3:1)
$0.688
Performance
Tokens / s
74.8
TTFT (token)
30.71s
TTFT (answer)
30.71s
Benchmarks
MMLU Pro 82.8%
GPQA 80.3%
HLE 14.6%
AIME
LiveCodeBench 69.2%
SciCode 41.0%
Math 500
AA Indexes
Intelligence
38.9%
Coding
32.9%
Math
85.0%

GPT-5 (low)

OpenAI

Input: $1.250
Output: $10.000
Pricing
Input (1M)
$1.250
Output (1M)
$10.000
Blended (3:1)
$3.438
Performance
Tokens / s
98.9
TTFT (token)
27.90s
TTFT (answer)
27.90s
Benchmarks
MMLU Pro 86.0%
GPQA 80.8%
HLE 18.4%
AIME 83.0%
LiveCodeBench 76.3%
SciCode 39.1%
Math 500 98.7%
AA Indexes
Intelligence
38.9%
Coding
30.7%
Math
83.0%

Claude 4 Sonnet (Reasoning)

Anthropic

Input: $3.000
Output: $15.000
Pricing
Input (1M)
$3.000
Output (1M)
$15.000
Blended (3:1)
$6.000
Performance
Tokens / s
78.9
TTFT (token)
1.55s
TTFT (answer)
26.90s
Benchmarks
MMLU Pro 84.2%
GPQA 77.7%
HLE 9.6%
AIME 77.3%
LiveCodeBench 65.5%
SciCode 40.0%
Math 500 99.1%
AA Indexes
Intelligence
38.5%
Coding
34.1%
Math
74.3%

GPT-5.1 Codex mini (high)

OpenAI

Input: $0.250
Output: $2.000
Pricing
Input (1M)
$0.250
Output (1M)
$2.000
Blended (3:1)
$0.688
Performance
Tokens / s
116.5
TTFT (token)
12.27s
TTFT (answer)
12.27s
Benchmarks
MMLU Pro 82.0%
GPQA 81.3%
HLE 16.9%
AIME
LiveCodeBench 83.6%
SciCode 42.6%
Math 500
AA Indexes
Intelligence
38.4%
Coding
36.4%
Math
91.7%

Grok 4.1 Fast (Reasoning)

xAI

Input: $0.200
Output: $0.500
Pricing
Input (1M)
$0.200
Output (1M)
$0.500
Blended (3:1)
$0.275
Performance
Tokens / s
121.1
TTFT (token)
14.68s
TTFT (answer)
14.68s
Benchmarks
MMLU Pro 85.4%
GPQA 85.3%
HLE 17.6%
AIME
LiveCodeBench 82.2%
SciCode 44.2%
Math 500
AA Indexes
Intelligence
38.4%
Coding
30.9%
Math
89.3%

Claude 4.5 Sonnet (Non-reasoning)

Anthropic

Input: $3.000
Output: $15.000
Pricing
Input (1M)
$3.000
Output (1M)
$15.000
Blended (3:1)
$6.000
Performance
Tokens / s
54.1
TTFT (token)
1.14s
TTFT (answer)
1.14s
Benchmarks
MMLU Pro 86.0%
GPQA 72.7%
HLE 7.1%
AIME
LiveCodeBench 59.0%
SciCode 42.8%
Math 500
AA Indexes
Intelligence
36.9%
Coding
33.5%
Math
37.0%

Claude 4.5 Haiku (Reasoning)

Anthropic

Input: $1.000
Output: $5.000
Pricing
Input (1M)
$1.000
Output (1M)
$5.000
Blended (3:1)
$2.000
Performance
Tokens / s
101.2
TTFT (token)
0.72s
TTFT (answer)
20.48s
Benchmarks
MMLU Pro 76.0%
GPQA 67.2%
HLE 9.7%
AIME
LiveCodeBench 61.5%
SciCode 43.3%
Math 500
AA Indexes
Intelligence
36.9%
Coding
32.6%
Math
83.7%

KAT-Coder-Pro V1

KwaiKAT

Input: $0.300
Output: $1.200
Pricing
Input (1M)
$0.300
Output (1M)
$1.200
Blended (3:1)
$0.525
Performance
Tokens / s
67.0
TTFT (token)
2.07s
TTFT (answer)
2.07s
Benchmarks
MMLU Pro 81.3%
GPQA 76.4%
HLE 33.4%
AIME
LiveCodeBench 74.7%
SciCode 36.6%
Math 500
AA Indexes
Intelligence
36.2%
Coding
18.3%
Math
94.7%

MiniMax-M2

MiniMax

Input: $0.300
Output: $1.200
Pricing
Input (1M)
$0.300
Output (1M)
$1.200
Blended (3:1)
$0.525
Performance
Tokens / s
85.6
TTFT (token)
1.37s
TTFT (answer)
24.74s
Benchmarks
MMLU Pro 82.0%
GPQA 77.7%
HLE 12.5%
AIME
LiveCodeBench 82.6%
SciCode 36.1%
Math 500
AA Indexes
Intelligence
36.0%
Coding
29.2%
Math
78.3%

Nova 2.0 Pro Preview (medium)

Amazon

Input: $1.250
Output: $10.000
Pricing
Input (1M)
$1.250
Output (1M)
$10.000
Blended (3:1)
$3.438
Performance
Tokens / s
133.7
TTFT (token)
17.30s
TTFT (answer)
32.27s
Benchmarks
MMLU Pro 83.0%
GPQA 78.5%
HLE 8.9%
AIME
LiveCodeBench 73.0%
SciCode 42.7%
Math 500
AA Indexes
Intelligence
35.6%
Coding
30.4%
Math
89.0%

Gemini 3 Flash Preview (Non-reasoning)

Google

Input: $0.500
Output: $3.000
Pricing
Input (1M)
$0.500
Output (1M)
$3.000
Blended (3:1)
$1.125
Performance
Tokens / s
167.8
TTFT (token)
0.82s
TTFT (answer)
0.82s
Benchmarks
MMLU Pro 88.2%
GPQA 81.2%
HLE 14.1%
AIME
LiveCodeBench 79.7%
SciCode 49.9%
Math 500
AA Indexes
Intelligence
35.1%
Coding
37.8%
Math
55.7%

Grok 4 Fast (Reasoning)

xAI

Input: $0.200
Output: $0.500
Pricing
Input (1M)
$0.200
Output (1M)
$0.500
Blended (3:1)
$0.275
Performance
Tokens / s
137.7
TTFT (token)
4.56s
TTFT (answer)
4.56s
Benchmarks
MMLU Pro 85.0%
GPQA 84.7%
HLE 17.0%
AIME
LiveCodeBench 83.2%
SciCode 44.2%
Math 500
AA Indexes
Intelligence
34.8%
Coding
27.4%
Math
89.7%

Claude 3.7 Sonnet (Reasoning)

Anthropic

Input: $3.000
Output: $15.000
Pricing
Input (1M)
$3.000
Output (1M)
$15.000
Blended (3:1)
$6.000
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 83.7%
GPQA 77.2%
HLE 10.3%
AIME 48.7%
LiveCodeBench 47.3%
SciCode 40.3%
Math 500 94.7%
AA Indexes
Intelligence
34.6%
Coding
27.6%
Math
56.3%

Gemini 2.5 Pro

Google

Input: $1.250
Output: $10.000
Pricing
Input (1M)
$1.250
Output (1M)
$10.000
Blended (3:1)
$3.438
Performance
Tokens / s
152.1
TTFT (token)
36.55s
TTFT (answer)
36.55s
Benchmarks
MMLU Pro 86.2%
GPQA 84.4%
HLE 21.1%
AIME 88.7%
LiveCodeBench 80.1%
SciCode 42.8%
Math 500 96.7%
AA Indexes
Intelligence
34.4%
Coding
31.9%
Math
87.7%

GLM-4.7 (Non-reasoning)

Z AI

Input: $0.550
Output: $2.150
Pricing
Input (1M)
$0.550
Output (1M)
$2.150
Blended (3:1)
$0.938
Performance
Tokens / s
116.0
TTFT (token)
0.55s
TTFT (answer)
0.55s
Benchmarks
MMLU Pro 79.4%
GPQA 66.4%
HLE 6.1%
AIME
LiveCodeBench 56.2%
SciCode 35.4%
Math 500
AA Indexes
Intelligence
34.1%
Coding
32.0%
Math
48.0%

DeepSeek V3.1 Terminus (Reasoning)

DeepSeek

Input: $0.400
Output: $2.000
Pricing
Input (1M)
$0.400
Output (1M)
$2.000
Blended (3:1)
$0.800
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 85.1%
GPQA 79.2%
HLE 15.2%
AIME
LiveCodeBench 79.8%
SciCode 40.6%
Math 500
AA Indexes
Intelligence
33.8%
Coding
33.7%
Math
89.7%

Doubao Seed Code

ByteDance Seed

Input: $0.170
Output: $1.120
Pricing
Input (1M)
$0.170
Output (1M)
$1.120
Blended (3:1)
$0.407
Performance
Tokens / s
37.0
TTFT (token)
2.40s
TTFT (answer)
56.40s
Benchmarks
MMLU Pro 85.4%
GPQA 76.4%
HLE 13.3%
AIME
LiveCodeBench 76.6%
SciCode 40.7%
Math 500
AA Indexes
Intelligence
33.5%
Coding
31.3%
Math
79.3%

GPT-5.2 (Non-reasoning)

OpenAI

Input: $1.750
Output: $14.000
Pricing
Input (1M)
$1.750
Output (1M)
$14.000
Blended (3:1)
$4.813
Performance
Tokens / s
70.1
TTFT (token)
0.50s
TTFT (answer)
0.50s
Benchmarks
MMLU Pro 81.4%
GPQA 71.2%
HLE 7.3%
AIME
LiveCodeBench 66.9%
SciCode 40.4%
Math 500
AA Indexes
Intelligence
33.4%
Coding
34.7%
Math
51.0%

gpt-oss-120B (high)

OpenAI

Input: $0.150
Output: $0.600
Pricing
Input (1M)
$0.150
Output (1M)
$0.600
Blended (3:1)
$0.263
Performance
Tokens / s
327.4
TTFT (token)
0.49s
TTFT (answer)
6.60s
Benchmarks
MMLU Pro 80.8%
GPQA 78.2%
HLE 18.5%
AIME
LiveCodeBench 87.8%
SciCode 38.9%
Math 500
AA Indexes
Intelligence
33.2%
Coding
28.6%
Math
93.4%

o4-mini (high)

OpenAI

Input: $1.100
Output: $4.400
Pricing
Input (1M)
$1.100
Output (1M)
$4.400
Blended (3:1)
$1.925
Performance
Tokens / s
132.1
TTFT (token)
50.46s
TTFT (answer)
50.46s
Benchmarks
MMLU Pro 83.2%
GPQA 78.4%
HLE 17.5%
AIME 94.0%
LiveCodeBench 85.9%
SciCode 46.5%
Math 500 98.9%
AA Indexes
Intelligence
33.1%
Coding
25.6%
Math
90.7%

Claude 4 Sonnet (Non-reasoning)

Anthropic

Input: $3.000
Output: $15.000
Pricing
Input (1M)
$3.000
Output (1M)
$15.000
Blended (3:1)
$6.000
Performance
Tokens / s
72.5
TTFT (token)
1.77s
TTFT (answer)
1.77s
Benchmarks
MMLU Pro 83.7%
GPQA 68.3%
HLE 4.0%
AIME 40.7%
LiveCodeBench 44.9%
SciCode 37.3%
Math 500 93.4%
AA Indexes
Intelligence
33.0%
Coding
30.6%
Math
38.0%

DeepSeek V3.2 Exp (Reasoning)

DeepSeek

Input: $0.280
Output: $0.420
Pricing
Input (1M)
$0.280
Output (1M)
$0.420
Blended (3:1)
$0.315
Performance
Tokens / s
29.1
TTFT (token)
1.24s
TTFT (answer)
69.95s
Benchmarks
MMLU Pro 85.0%
GPQA 79.7%
HLE 13.8%
AIME
LiveCodeBench 78.9%
SciCode 37.7%
Math 500
AA Indexes
Intelligence
32.8%
Coding
33.3%
Math
87.7%

Grok 3 mini Reasoning (high)

xAI

Input: $0.300
Output: $0.500
Pricing
Input (1M)
$0.300
Output (1M)
$0.500
Blended (3:1)
$0.350
Performance
Tokens / s
193.2
TTFT (token)
0.66s
TTFT (answer)
11.01s
Benchmarks
MMLU Pro 82.8%
GPQA 79.1%
HLE 11.1%
AIME 93.3%
LiveCodeBench 69.6%
SciCode 40.6%
Math 500 99.2%
AA Indexes
Intelligence
32.6%
Coding
25.2%
Math
84.7%

GLM-4.6 (Reasoning)

Z AI

Input: $0.575
Output: $2.200
Pricing
Input (1M)
$0.575
Output (1M)
$2.200
Blended (3:1)
$0.981
Performance
Tokens / s
87.6
TTFT (token)
0.69s
TTFT (answer)
23.53s
Benchmarks
MMLU Pro 82.9%
GPQA 78.0%
HLE 13.3%
AIME
LiveCodeBench 69.5%
SciCode 38.4%
Math 500
AA Indexes
Intelligence
32.5%
Coding
29.5%
Math
86.0%

Qwen3 Max Thinking

Alibaba

Input: $1.200
Output: $6.000
Pricing
Input (1M)
$1.200
Output (1M)
$6.000
Blended (3:1)
$2.400
Performance
Tokens / s
36.8
TTFT (token)
1.73s
TTFT (answer)
56.02s
Benchmarks
MMLU Pro 82.4%
GPQA 77.6%
HLE 12.0%
AIME
LiveCodeBench 53.5%
SciCode 38.7%
Math 500
AA Indexes
Intelligence
32.5%
Coding
24.5%
Math
82.3%

Nova 2.0 Pro Preview (low)

Amazon

Input: $1.250
Output: $10.000
Pricing
Input (1M)
$1.250
Output (1M)
$10.000
Blended (3:1)
$3.438
Performance
Tokens / s
131.0
TTFT (token)
13.06s
TTFT (answer)
28.33s
Benchmarks
MMLU Pro 82.2%
GPQA 75.1%
HLE 5.2%
AIME
LiveCodeBench 63.8%
SciCode 38.7%
Math 500
AA Indexes
Intelligence
32.4%
Coding
24.5%
Math
63.3%

DeepSeek V3.2 (Non-reasoning)

DeepSeek

Input: $0.280
Output: $0.420
Pricing
Input (1M)
$0.280
Output (1M)
$0.420
Blended (3:1)
$0.315
Performance
Tokens / s
28.8
TTFT (token)
1.22s
TTFT (answer)
1.22s
Benchmarks
MMLU Pro 83.7%
GPQA 75.1%
HLE 10.5%
AIME
LiveCodeBench 59.3%
SciCode 38.7%
Math 500
AA Indexes
Intelligence
32.2%
Coding
34.6%
Math
59.0%

Claude 4.1 Opus (Reasoning)

Anthropic

Input: $15.000
Output: $75.000
Pricing
Input (1M)
$15.000
Output (1M)
$75.000
Blended (3:1)
$30.000
Performance
Tokens / s
48.4
TTFT (token)
1.22s
TTFT (answer)
42.55s
Benchmarks
MMLU Pro 88.0%
GPQA 80.9%
HLE 11.9%
AIME
LiveCodeBench 65.4%
SciCode 40.9%
Math 500
AA Indexes
Intelligence
31.9%
Coding
36.5%
Math
80.3%

Qwen3 Max

Alibaba

Input: $1.200
Output: $6.000
Pricing
Input (1M)
$1.200
Output (1M)
$6.000
Blended (3:1)
$2.400
Performance
Tokens / s
32.9
TTFT (token)
1.97s
TTFT (answer)
1.97s
Benchmarks
MMLU Pro 84.1%
GPQA 76.4%
HLE 11.1%
AIME
LiveCodeBench 76.7%
SciCode 38.3%
Math 500
AA Indexes
Intelligence
31.2%
Coding
26.4%
Math
80.7%

Gemini 2.5 Flash Preview (Sep '25) (Reasoning)

Google

Input: $0.300
Output: $2.500
Pricing
Input (1M)
$0.300
Output (1M)
$2.500
Blended (3:1)
$0.850
Performance
Tokens / s
290.0
TTFT (token)
12.33s
TTFT (answer)
12.33s
Benchmarks
MMLU Pro 84.2%
GPQA 79.3%
HLE 12.7%
AIME
LiveCodeBench 71.3%
SciCode 40.5%
Math 500
AA Indexes
Intelligence
31.0%
Coding
24.6%
Math
78.3%

Claude 3.7 Sonnet (Non-reasoning)

Anthropic

Input: $3.000
Output: $15.000
Pricing
Input (1M)
$3.000
Output (1M)
$15.000
Blended (3:1)
$6.000
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 80.3%
GPQA 65.6%
HLE 4.8%
AIME 22.3%
LiveCodeBench 39.4%
SciCode 37.6%
Math 500 85.0%
AA Indexes
Intelligence
30.9%
Coding
26.7%
Math
21.0%

Claude 4.5 Haiku (Non-reasoning)

Anthropic

Input: $1.000
Output: $5.000
Pricing
Input (1M)
$1.000
Output (1M)
$5.000
Blended (3:1)
$2.000
Performance
Tokens / s
92.4
TTFT (token)
0.61s
TTFT (answer)
0.61s
Benchmarks
MMLU Pro 80.0%
GPQA 64.6%
HLE 4.3%
AIME
LiveCodeBench 51.1%
SciCode 34.4%
Math 500
AA Indexes
Intelligence
30.8%
Coding
29.6%
Math
39.0%

o1

OpenAI

Input: $15.000
Output: $60.000
Pricing
Input (1M)
$15.000
Output (1M)
$60.000
Blended (3:1)
$26.250
Performance
Tokens / s
169.7
TTFT (token)
18.55s
TTFT (answer)
18.55s
Benchmarks
MMLU Pro 84.1%
GPQA 74.7%
HLE 7.7%
AIME 72.3%
LiveCodeBench 67.9%
SciCode 35.8%
Math 500 97.0%
AA Indexes
Intelligence
30.8%
Coding
20.5%
Math

MiMo-V2-Flash (Non-reasoning)

Xiaomi

Input: $0.100
Output: $0.300
Pricing
Input (1M)
$0.100
Output (1M)
$0.300
Blended (3:1)
$0.150
Performance
Tokens / s
92.0
TTFT (token)
2.01s
TTFT (answer)
2.01s
Benchmarks
MMLU Pro 74.4%
GPQA 65.6%
HLE 8.0%
AIME
LiveCodeBench 40.2%
SciCode 25.9%
Math 500
AA Indexes
Intelligence
30.4%
Coding
25.8%
Math
67.7%

Gemini 2.5 Pro Preview (Mar' 25)

Google

Input: $1.250
Output: $10.000
Pricing
Input (1M)
$1.250
Output (1M)
$10.000
Blended (3:1)
$3.438
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 85.8%
GPQA 83.6%
HLE 17.1%
AIME 87.0%
LiveCodeBench 77.8%
SciCode 39.5%
Math 500 98.0%
AA Indexes
Intelligence
30.3%
Coding
46.7%
Math

GLM-4.6 (Non-reasoning)

Z AI

Input: $0.600
Output: $2.200
Pricing
Input (1M)
$0.600
Output (1M)
$2.200
Blended (3:1)
$1.000
Performance
Tokens / s
87.2
TTFT (token)
1.63s
TTFT (answer)
1.63s
Benchmarks
MMLU Pro 78.4%
GPQA 63.2%
HLE 5.2%
AIME
LiveCodeBench 56.1%
SciCode 33.1%
Math 500
AA Indexes
Intelligence
30.1%
Coding
30.2%
Math
44.3%

Nova 2.0 Lite (medium)

Amazon

Input: $0.300
Output: $2.500
Pricing
Input (1M)
$0.300
Output (1M)
$2.500
Blended (3:1)
$0.850
Performance
Tokens / s
240.2
TTFT (token)
12.65s
TTFT (answer)
20.97s
Benchmarks
MMLU Pro 81.3%
GPQA 76.8%
HLE 8.6%
AIME
LiveCodeBench 66.3%
SciCode 36.8%
Math 500
AA Indexes
Intelligence
30.1%
Coding
23.9%
Math
88.7%

Gemini 2.5 Pro Preview (May' 25)

Google

Input: $1.250
Output: $10.000
Pricing
Input (1M)
$1.250
Output (1M)
$10.000
Blended (3:1)
$3.438
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 83.7%
GPQA 82.2%
HLE 15.4%
AIME 84.3%
LiveCodeBench 77.0%
SciCode 41.6%
Math 500 98.6%
AA Indexes
Intelligence
29.5%
Coding
Math

Qwen3 235B A22B 2507 (Reasoning)

Alibaba

Input: $0.700
Output: $8.400
Pricing
Input (1M)
$0.700
Output (1M)
$8.400
Blended (3:1)
$2.625
Performance
Tokens / s
75.9
TTFT (token)
1.08s
TTFT (answer)
27.43s
Benchmarks
MMLU Pro 84.3%
GPQA 79.0%
HLE 15.0%
AIME 94.0%
LiveCodeBench 78.8%
SciCode 42.4%
Math 500 98.4%
AA Indexes
Intelligence
29.5%
Coding
23.2%
Math
91.0%

DeepSeek V3.2 Speciale

DeepSeek

Input: $0.400
Output: $0.500
Pricing
Input (1M)
$0.400
Output (1M)
$0.500
Blended (3:1)
$0.425
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 86.3%
GPQA 87.1%
HLE 26.1%
AIME
LiveCodeBench 89.6%
SciCode 44.0%
Math 500
AA Indexes
Intelligence
29.4%
Coding
37.9%
Math
96.7%

ERNIE 5.0 Thinking Preview

Baidu

Input: $0.840
Output: $3.370
Pricing
Input (1M)
$0.840
Output (1M)
$3.370
Blended (3:1)
$1.472
Performance
Tokens / s
25.6
TTFT (token)
4.07s
TTFT (answer)
82.15s
Benchmarks
MMLU Pro 83.0%
GPQA 77.7%
HLE 12.7%
AIME
LiveCodeBench 81.2%
SciCode 37.5%
Math 500
AA Indexes
Intelligence
29.2%
Coding
29.2%
Math
85.0%

Qwen3 VL 32B (Reasoning)

Alibaba

Input: $0.700
Output: $8.400
Pricing
Input (1M)
$0.700
Output (1M)
$8.400
Blended (3:1)
$2.625
Performance
Tokens / s
50.5
TTFT (token)
0.98s
TTFT (answer)
40.55s
Benchmarks
MMLU Pro 81.8%
GPQA 73.3%
HLE 9.6%
AIME
LiveCodeBench 73.8%
SciCode 28.5%
Math 500
AA Indexes
Intelligence
28.6%
Coding
14.5%
Math
84.7%

DeepSeek V3.1 Terminus (Non-reasoning)

DeepSeek

Input: $0.400
Output: $1.680
Pricing
Input (1M)
$0.400
Output (1M)
$1.680
Blended (3:1)
$0.800
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 83.6%
GPQA 75.1%
HLE 8.4%
AIME
LiveCodeBench 52.9%
SciCode 32.1%
Math 500
AA Indexes
Intelligence
28.3%
Coding
31.9%
Math
53.7%

DeepSeek V3.2 Exp (Non-reasoning)

DeepSeek

Input: $0.280
Output: $0.420
Pricing
Input (1M)
$0.280
Output (1M)
$0.420
Blended (3:1)
$0.315
Performance
Tokens / s
29.4
TTFT (token)
1.20s
TTFT (answer)
1.20s
Benchmarks
MMLU Pro 83.6%
GPQA 73.8%
HLE 8.6%
AIME
LiveCodeBench 55.4%
SciCode 39.9%
Math 500
AA Indexes
Intelligence
28.3%
Coding
30.0%
Math
57.7%

MiniMax-Text-01

MiniMax

Input: $0.200
Output: $1.100
Pricing
Input (1M)
$0.200
Output (1M)
$1.100
Blended (3:1)
$0.425
Performance
Tokens / s
28.5
TTFT (token)
1.32s
TTFT (answer)
1.32s
Benchmarks
MMLU Pro 75.9%
GPQA 57.8%
HLE 4.2%
AIME 13.0%
LiveCodeBench 24.7%
SciCode 25.0%
Math 500 75.3%
AA Indexes
Intelligence
28.3%
Coding
17.3%
Math
12.3%

Kimi K2 0905

Kimi

Input: $0.990
Output: $2.500
Pricing
Input (1M)
$0.990
Output (1M)
$2.500
Blended (3:1)
$1.200
Performance
Tokens / s
47.2
TTFT (token)
0.54s
TTFT (answer)
0.54s
Benchmarks
MMLU Pro 81.9%
GPQA 76.7%
HLE 6.3%
AIME
LiveCodeBench 61.0%
SciCode 30.7%
Math 500
AA Indexes
Intelligence
28.1%
Coding
25.9%
Math
57.3%

DeepSeek V3.1 (Reasoning)

DeepSeek

Input: $0.590
Output: $1.690
Pricing
Input (1M)
$0.590
Output (1M)
$1.690
Blended (3:1)
$0.865
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 85.1%
GPQA 77.9%
HLE 13.0%
AIME
LiveCodeBench 78.4%
SciCode 39.1%
Math 500
AA Indexes
Intelligence
28.1%
Coding
29.7%
Math
89.7%

DeepSeek V3.1 (Non-reasoning)

DeepSeek

Input: $0.560
Output: $1.680
Pricing
Input (1M)
$0.560
Output (1M)
$1.680
Blended (3:1)
$0.840
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 83.3%
GPQA 73.5%
HLE 6.3%
AIME
LiveCodeBench 57.7%
SciCode 36.7%
Math 500
AA Indexes
Intelligence
27.9%
Coding
28.4%
Math
49.7%

Nova 2.0 Omni (medium)

Amazon

Input: $0.300
Output: $2.500
Pricing
Input (1M)
$0.300
Output (1M)
$2.500
Blended (3:1)
$0.850
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 80.9%
GPQA 76.0%
HLE 6.8%
AIME
LiveCodeBench 66.0%
SciCode 36.2%
Math 500
AA Indexes
Intelligence
27.9%
Coding
15.1%
Math
89.7%

Qwen3 VL 235B A22B (Reasoning)

Alibaba

Input: $0.700
Output: $8.400
Pricing
Input (1M)
$0.700
Output (1M)
$8.400
Blended (3:1)
$2.625
Performance
Tokens / s
43.6
TTFT (token)
1.09s
TTFT (answer)
46.98s
Benchmarks
MMLU Pro 83.6%
GPQA 77.2%
HLE 10.1%
AIME
LiveCodeBench 64.6%
SciCode 39.9%
Math 500
AA Indexes
Intelligence
27.6%
Coding
20.9%
Math
88.3%

Magistral Medium 1.2

Mistral

Input: $2.000
Output: $5.000
Pricing
Input (1M)
$2.000
Output (1M)
$5.000
Blended (3:1)
$2.750
Performance
Tokens / s
35.8
TTFT (token)
0.52s
TTFT (answer)
56.37s
Benchmarks
MMLU Pro 81.5%
GPQA 73.9%
HLE 9.6%
AIME
LiveCodeBench 75.0%
SciCode 39.2%
Math 500
AA Indexes
Intelligence
27.5%
Coding
21.7%
Math
82.0%

Claude 4 Opus (Reasoning)

Anthropic

Input: $15.000
Output: $75.000
Pricing
Input (1M)
$15.000
Output (1M)
$75.000
Blended (3:1)
$30.000
Performance
Tokens / s
48.8
TTFT (token)
1.13s
TTFT (answer)
42.13s
Benchmarks
MMLU Pro 87.3%
GPQA 79.6%
HLE 11.7%
AIME 75.7%
LiveCodeBench 63.6%
SciCode 39.8%
Math 500 98.2%
AA Indexes
Intelligence
27.4%
Coding
34.0%
Math
73.3%

Gemini 2.5 Flash (Reasoning)

Google

Input: $0.300
Output: $2.500
Pricing
Input (1M)
$0.300
Output (1M)
$2.500
Blended (3:1)
$0.850
Performance
Tokens / s
262.3
TTFT (token)
17.46s
TTFT (answer)
17.46s
Benchmarks
MMLU Pro 83.2%
GPQA 79.0%
HLE 11.1%
AIME 82.3%
LiveCodeBench 69.5%
SciCode 39.4%
Math 500 98.1%
AA Indexes
Intelligence
27.4%
Coding
22.2%
Math
73.3%

GPT-5.1 (Non-reasoning)

OpenAI

Input: $1.250
Output: $10.000
Pricing
Input (1M)
$1.250
Output (1M)
$10.000
Blended (3:1)
$3.438
Performance
Tokens / s
70.4
TTFT (token)
0.76s
TTFT (answer)
0.76s
Benchmarks
MMLU Pro 80.1%
GPQA 64.3%
HLE 5.2%
AIME
LiveCodeBench 49.4%
SciCode 36.5%
Math 500
AA Indexes
Intelligence
27.4%
Coding
27.3%
Math
38.0%

DeepSeek R1 0528 (May '25)

DeepSeek

Input: $1.350
Output: $4.200
Pricing
Input (1M)
$1.350
Output (1M)
$4.200
Blended (3:1)
$2.362
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 84.9%
GPQA 81.3%
HLE 14.9%
AIME 89.3%
LiveCodeBench 77.0%
SciCode 40.3%
Math 500 98.3%
AA Indexes
Intelligence
27.2%
Coding
24.0%
Math
76.0%

GLM-4.5 (Reasoning)

Z AI

Input: $0.600
Output: $2.200
Pricing
Input (1M)
$0.600
Output (1M)
$2.200
Blended (3:1)
$1.000
Performance
Tokens / s
50.2
TTFT (token)
0.60s
TTFT (answer)
40.48s
Benchmarks
MMLU Pro 83.5%
GPQA 78.2%
HLE 12.2%
AIME 87.3%
LiveCodeBench 73.8%
SciCode 34.8%
Math 500 97.9%
AA Indexes
Intelligence
26.9%
Coding
26.3%
Math
73.7%

GPT-5 nano (high)

OpenAI

Input: $0.050
Output: $0.400
Pricing
Input (1M)
$0.050
Output (1M)
$0.400
Blended (3:1)
$0.138
Performance
Tokens / s
115.1
TTFT (token)
138.34s
TTFT (answer)
138.34s
Benchmarks
MMLU Pro 78.0%
GPQA 67.6%
HLE 8.2%
AIME
LiveCodeBench 78.9%
SciCode 36.6%
Math 500
AA Indexes
Intelligence
26.9%
Coding
20.3%
Math
83.7%

Qwen3 Next 80B A3B (Reasoning)

Alibaba

Input: $0.500
Output: $6.000
Pricing
Input (1M)
$0.500
Output (1M)
$6.000
Blended (3:1)
$1.875
Performance
Tokens / s
158.4
TTFT (token)
1.06s
TTFT (answer)
13.69s
Benchmarks
MMLU Pro 82.4%
GPQA 75.9%
HLE 11.7%
AIME
LiveCodeBench 78.4%
SciCode 38.8%
Math 500
AA Indexes
Intelligence
26.6%
Coding
19.5%
Math
84.3%

Grok Code Fast 1

xAI

Input: $0.200
Output: $1.500
Pricing
Input (1M)
$0.200
Output (1M)
$1.500
Blended (3:1)
$0.525
Performance
Tokens / s
229.5
TTFT (token)
9.06s
TTFT (answer)
9.06s
Benchmarks
MMLU Pro 79.3%
GPQA 72.7%
HLE 7.5%
AIME
LiveCodeBench 65.7%
SciCode 36.2%
Math 500
AA Indexes
Intelligence
26.2%
Coding
23.7%
Math
43.3%

Qwen3 Max (Preview)

Alibaba

Input: $1.200
Output: $6.000
Pricing
Input (1M)
$1.200
Output (1M)
$6.000
Blended (3:1)
$2.400
Performance
Tokens / s
32.2
TTFT (token)
1.73s
TTFT (answer)
1.73s
Benchmarks
MMLU Pro 83.8%
GPQA 76.4%
HLE 9.3%
AIME
LiveCodeBench 65.1%
SciCode 37.0%
Math 500
AA Indexes
Intelligence
26.1%
Coding
25.5%
Math
75.0%

GPT-5 nano (medium)

OpenAI

Input: $0.050
Output: $0.400
Pricing
Input (1M)
$0.050
Output (1M)
$0.400
Blended (3:1)
$0.138
Performance
Tokens / s
126.3
TTFT (token)
59.41s
TTFT (answer)
59.41s
Benchmarks
MMLU Pro 77.2%
GPQA 67.0%
HLE 7.6%
AIME
LiveCodeBench 76.3%
SciCode 33.8%
Math 500
AA Indexes
Intelligence
26.0%
Coding
22.9%
Math
78.3%

o3-mini

OpenAI

Input: $1.100
Output: $4.400
Pricing
Input (1M)
$1.100
Output (1M)
$4.400
Blended (3:1)
$1.925
Performance
Tokens / s
142.7
TTFT (token)
17.29s
TTFT (answer)
17.29s
Benchmarks
MMLU Pro 79.1%
GPQA 74.8%
HLE 8.7%
AIME 77.0%
LiveCodeBench 71.7%
SciCode 39.9%
Math 500 97.3%
AA Indexes
Intelligence
25.9%
Coding
17.9%
Math

Kimi K2

Kimi

Input: $0.600
Output: $2.500
Pricing
Input (1M)
$0.600
Output (1M)
$2.500
Blended (3:1)
$1.075
Performance
Tokens / s
37.3
TTFT (token)
0.68s
TTFT (answer)
0.68s
Benchmarks
MMLU Pro 82.4%
GPQA 76.6%
HLE 7.0%
AIME 69.3%
LiveCodeBench 55.6%
SciCode 34.5%
Math 500 97.1%
AA Indexes
Intelligence
25.9%
Coding
22.1%
Math
57.0%

GPT-4.1

OpenAI

Input: $2.000
Output: $8.000
Pricing
Input (1M)
$2.000
Output (1M)
$8.000
Blended (3:1)
$3.500
Performance
Tokens / s
76.9
TTFT (token)
0.52s
TTFT (answer)
0.52s
Benchmarks
MMLU Pro 80.6%
GPQA 66.6%
HLE 4.6%
AIME 43.7%
LiveCodeBench 45.7%
SciCode 38.1%
Math 500 91.3%
AA Indexes
Intelligence
25.9%
Coding
21.8%
Math
34.7%

o1-pro

OpenAI

Input: $150.000
Output: $600.000
Pricing
Input (1M)
$150.000
Output (1M)
$600.000
Blended (3:1)
$262.500
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro
GPQA
HLE
AIME
LiveCodeBench
SciCode
Math 500
AA Indexes
Intelligence
25.8%
Coding
Math

Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning)

Google

Input: $0.300
Output: $2.500
Pricing
Input (1M)
$0.300
Output (1M)
$2.500
Blended (3:1)
$0.850
Performance
Tokens / s
249.2
TTFT (token)
0.30s
TTFT (answer)
0.30s
Benchmarks
MMLU Pro 83.6%
GPQA 76.6%
HLE 7.8%
AIME
LiveCodeBench 62.5%
SciCode 37.5%
Math 500
AA Indexes
Intelligence
25.7%
Coding
22.1%
Math
56.7%

Grok 3

xAI

Input: $3.000
Output: $15.000
Pricing
Input (1M)
$3.000
Output (1M)
$15.000
Blended (3:1)
$6.000
Performance
Tokens / s
41.9
TTFT (token)
0.77s
TTFT (answer)
0.77s
Benchmarks
MMLU Pro 79.9%
GPQA 69.3%
HLE 5.1%
AIME 33.0%
LiveCodeBench 42.5%
SciCode 36.8%
Math 500 87.0%
AA Indexes
Intelligence
25.4%
Coding
19.8%
Math
58.0%

o3-mini (high)

OpenAI

Input: $1.100
Output: $4.400
Pricing
Input (1M)
$1.100
Output (1M)
$4.400
Blended (3:1)
$1.925
Performance
Tokens / s
159.1
TTFT (token)
56.06s
TTFT (answer)
56.06s
Benchmarks
MMLU Pro 80.2%
GPQA 77.3%
HLE 12.3%
AIME 86.0%
LiveCodeBench 73.4%
SciCode 39.8%
Math 500 98.5%
AA Indexes
Intelligence
25.1%
Coding
17.3%
Math

Seed-OSS-36B-Instruct

ByteDance Seed

Input: $0.210
Output: $0.570
Pricing
Input (1M)
$0.210
Output (1M)
$0.570
Blended (3:1)
$0.300
Performance
Tokens / s
29.4
TTFT (token)
1.73s
TTFT (answer)
69.67s
Benchmarks
MMLU Pro 81.5%
GPQA 72.6%
HLE 9.1%
AIME
LiveCodeBench 76.5%
SciCode 36.5%
Math 500
AA Indexes
Intelligence
25.1%
Coding
16.7%
Math
84.7%

Nova 2.0 Lite (low)

Amazon

Input: $0.300
Output: $2.500
Pricing
Input (1M)
$0.300
Output (1M)
$2.500
Blended (3:1)
$0.850
Performance
Tokens / s
250.8
TTFT (token)
6.82s
TTFT (answer)
14.80s
Benchmarks
MMLU Pro 78.8%
GPQA 69.8%
HLE 4.2%
AIME
LiveCodeBench 46.9%
SciCode 33.3%
Math 500
AA Indexes
Intelligence
25.0%
Coding
13.6%
Math
46.7%

Qwen3 Coder 480B A35B Instruct

Alibaba

Input: $1.500
Output: $7.500
Pricing
Input (1M)
$1.500
Output (1M)
$7.500
Blended (3:1)
$3.000
Performance
Tokens / s
40.5
TTFT (token)
1.69s
TTFT (answer)
1.69s
Benchmarks
MMLU Pro 78.8%
GPQA 61.8%
HLE 4.4%
AIME 47.7%
LiveCodeBench 58.5%
SciCode 35.9%
Math 500 94.2%
AA Indexes
Intelligence
25.0%
Coding
24.6%
Math
39.3%

NVIDIA Nemotron 3 Nano 30B A3B (Reasoning)

NVIDIA

Input: $0.060
Output: $0.240
Pricing
Input (1M)
$0.060
Output (1M)
$0.240
Blended (3:1)
$0.105
Performance
Tokens / s
191.4
TTFT (token)
0.39s
TTFT (answer)
10.83s
Benchmarks
MMLU Pro 79.4%
GPQA 75.7%
HLE 10.2%
AIME
LiveCodeBench 74.1%
SciCode 29.6%
Math 500
AA Indexes
Intelligence
24.9%
Coding
19.0%
Math
91.0%

gpt-oss-20B (high)

OpenAI

Input: $0.070
Output: $0.200
Pricing
Input (1M)
$0.070
Output (1M)
$0.200
Blended (3:1)
$0.100
Performance
Tokens / s
307.3
TTFT (token)
0.53s
TTFT (answer)
7.04s
Benchmarks
MMLU Pro 74.8%
GPQA 68.8%
HLE 9.8%
AIME
LiveCodeBench 77.7%
SciCode 34.4%
Math 500
AA Indexes
Intelligence
24.8%
Coding
18.5%
Math
89.3%

Qwen3 235B A22B 2507 Instruct

Alibaba

Input: $0.700
Output: $2.800
Pricing
Input (1M)
$0.700
Output (1M)
$2.800
Blended (3:1)
$1.225
Performance
Tokens / s
55.4
TTFT (token)
1.04s
TTFT (answer)
1.04s
Benchmarks
MMLU Pro 82.8%
GPQA 75.3%
HLE 10.6%
AIME 71.7%
LiveCodeBench 52.4%
SciCode 36.0%
Math 500 98.0%
AA Indexes
Intelligence
24.7%
Coding
22.1%
Math
71.7%

GPT-5 (minimal)

OpenAI

Input: $1.250
Output: $10.000
Pricing
Input (1M)
$1.250
Output (1M)
$10.000
Blended (3:1)
$3.438
Performance
Tokens / s
75.0
TTFT (token)
0.82s
TTFT (answer)
0.82s
Benchmarks
MMLU Pro 80.6%
GPQA 67.3%
HLE 5.4%
AIME 36.7%
LiveCodeBench 55.8%
SciCode 38.8%
Math 500 86.1%
AA Indexes
Intelligence
24.6%
Coding
25.1%
Math
31.7%

MiniMax M1 80k

MiniMax

Input: $0.550
Output: $2.200
Pricing
Input (1M)
$0.550
Output (1M)
$2.200
Blended (3:1)
$0.963
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 81.6%
GPQA 69.7%
HLE 8.2%
AIME 84.7%
LiveCodeBench 71.1%
SciCode 37.4%
Math 500 98.0%
AA Indexes
Intelligence
24.4%
Coding
14.5%
Math
61.0%

Nova 2.0 Omni (low)

Amazon

Input: $0.300
Output: $2.500
Pricing
Input (1M)
$0.300
Output (1M)
$2.500
Blended (3:1)
$0.850
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 79.8%
GPQA 69.9%
HLE 4.0%
AIME
LiveCodeBench 59.2%
SciCode 34.3%
Math 500
AA Indexes
Intelligence
24.1%
Coding
13.9%
Math
56.0%

GLM-4.6V (Reasoning)

Z AI

Input: $0.300
Output: $0.900
Pricing
Input (1M)
$0.300
Output (1M)
$0.900
Blended (3:1)
$0.450
Performance
Tokens / s
68.0
TTFT (token)
0.72s
TTFT (answer)
30.13s
Benchmarks
MMLU Pro 79.9%
GPQA 71.9%
HLE 8.9%
AIME
LiveCodeBench 16.0%
SciCode 30.4%
Math 500
AA Indexes
Intelligence
24.0%
Coding
19.7%
Math
85.3%

GLM-4.5-Air

Z AI

Input: $0.200
Output: $1.100
Pricing
Input (1M)
$0.200
Output (1M)
$1.100
Blended (3:1)
$0.425
Performance
Tokens / s
98.9
TTFT (token)
0.61s
TTFT (answer)
20.84s
Benchmarks
MMLU Pro 81.5%
GPQA 73.3%
HLE 6.8%
AIME 67.3%
LiveCodeBench 68.4%
SciCode 30.6%
Math 500 96.5%
AA Indexes
Intelligence
23.8%
Coding
23.8%
Math
80.7%

gpt-oss-120B (low)

OpenAI

Input: $0.150
Output: $0.595
Pricing
Input (1M)
$0.150
Output (1M)
$0.595
Blended (3:1)
$0.263
Performance
Tokens / s
299.1
TTFT (token)
0.48s
TTFT (answer)
7.17s
Benchmarks
MMLU Pro 77.5%
GPQA 67.2%
HLE 5.2%
AIME
LiveCodeBench 70.7%
SciCode 36.0%
Math 500
AA Indexes
Intelligence
23.8%
Coding
15.5%
Math
66.7%

Grok 4.1 Fast (Non-reasoning)

xAI

Input: $0.200
Output: $0.500
Pricing
Input (1M)
$0.200
Output (1M)
$0.500
Blended (3:1)
$0.275
Performance
Tokens / s
81.1
TTFT (token)
0.69s
TTFT (answer)
0.69s
Benchmarks
MMLU Pro 74.3%
GPQA 63.7%
HLE 5.0%
AIME
LiveCodeBench 39.9%
SciCode 29.6%
Math 500
AA Indexes
Intelligence
23.8%
Coding
19.5%
Math
34.3%

Nova 2.0 Pro Preview (Non-reasoning)

Amazon

Input: $1.250
Output: $10.000
Pricing
Input (1M)
$1.250
Output (1M)
$10.000
Blended (3:1)
$3.438
Performance
Tokens / s
159.0
TTFT (token)
0.46s
TTFT (answer)
0.46s
Benchmarks
MMLU Pro 77.2%
GPQA 63.6%
HLE 4.0%
AIME
LiveCodeBench 47.3%
SciCode 28.1%
Math 500
AA Indexes
Intelligence
23.7%
Coding
20.5%
Math
30.7%

o1-preview

OpenAI

Input: $16.500
Output: $66.000
Pricing
Input (1M)
$16.500
Output (1M)
$66.000
Blended (3:1)
$28.875
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro
GPQA
HLE
AIME
LiveCodeBench
SciCode
Math 500 92.4%
AA Indexes
Intelligence
23.7%
Coding
34.0%
Math

Claude 4.1 Opus (Non-reasoning)

Anthropic

Input: $15.000
Output: $75.000
Pricing
Input (1M)
$15.000
Output (1M)
$75.000
Blended (3:1)
$30.000
Performance
Tokens / s
37.9
TTFT (token)
1.30s
TTFT (answer)
1.30s
Benchmarks
MMLU Pro
GPQA
HLE
AIME
LiveCodeBench
SciCode
Math 500
AA Indexes
Intelligence
23.6%
Coding
Math

GPT-4.1 mini

OpenAI

Input: $0.400
Output: $1.600
Pricing
Input (1M)
$0.400
Output (1M)
$1.600
Blended (3:1)
$0.700
Performance
Tokens / s
67.2
TTFT (token)
0.45s
TTFT (answer)
0.45s
Benchmarks
MMLU Pro 78.1%
GPQA 66.4%
HLE 4.6%
AIME 43.0%
LiveCodeBench 48.3%
SciCode 40.4%
Math 500 92.5%
AA Indexes
Intelligence
23.0%
Coding
18.5%
Math
46.3%

Qwen3 30B A3B 2507 (Reasoning)

Alibaba

Input: $0.200
Output: $2.400
Pricing
Input (1M)
$0.200
Output (1M)
$2.400
Blended (3:1)
$0.750
Performance
Tokens / s
165.3
TTFT (token)
1.02s
TTFT (answer)
13.11s
Benchmarks
MMLU Pro 80.5%
GPQA 70.7%
HLE 9.8%
AIME 90.7%
LiveCodeBench 70.7%
SciCode 33.3%
Math 500 97.6%
AA Indexes
Intelligence
22.9%
Coding
14.7%
Math
56.3%

Grok 4 Fast (Non-reasoning)

xAI

Input: $0.200
Output: $0.500
Pricing
Input (1M)
$0.200
Output (1M)
$0.500
Blended (3:1)
$0.275
Performance
Tokens / s
127.6
TTFT (token)
0.54s
TTFT (answer)
0.54s
Benchmarks
MMLU Pro 73.0%
GPQA 60.6%
HLE 5.0%
AIME
LiveCodeBench 40.1%
SciCode 32.9%
Math 500
AA Indexes
Intelligence
22.9%
Coding
19.0%
Math
41.3%

DeepSeek V3 0324

DeepSeek

Input: $1.195
Output: $1.350
Pricing
Input (1M)
$1.195
Output (1M)
$1.350
Blended (3:1)
$1.250
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 81.9%
GPQA 65.5%
HLE 5.2%
AIME 52.0%
LiveCodeBench 40.5%
SciCode 35.8%
Math 500 94.2%
AA Indexes
Intelligence
22.8%
Coding
22.0%
Math
41.0%

Ring-1T

InclusionAI

Input: $0.560
Output: $2.240
Pricing
Input (1M)
$0.560
Output (1M)
$2.240
Blended (3:1)
$0.980
Performance
Tokens / s
57.9
TTFT (token)
1.74s
TTFT (answer)
36.31s
Benchmarks
MMLU Pro 80.6%
GPQA 77.4%
HLE 10.2%
AIME
LiveCodeBench 64.3%
SciCode 36.7%
Math 500
AA Indexes
Intelligence
22.7%
Coding
16.8%
Math
89.3%

Mistral Large 3

Mistral

Input: $0.500
Output: $1.500
Pricing
Input (1M)
$0.500
Output (1M)
$1.500
Blended (3:1)
$0.750
Performance
Tokens / s
50.9
TTFT (token)
0.55s
TTFT (answer)
0.55s
Benchmarks
MMLU Pro 80.7%
GPQA 68.0%
HLE 4.1%
AIME
LiveCodeBench 46.5%
SciCode 36.2%
Math 500
AA Indexes
Intelligence
22.7%
Coding
22.7%
Math
38.0%

Magistral Small 1.2

Mistral

Input: $0.500
Output: $1.500
Pricing
Input (1M)
$0.500
Output (1M)
$1.500
Blended (3:1)
$0.750
Performance
Tokens / s
192.7
TTFT (token)
0.35s
TTFT (answer)
10.73s
Benchmarks
MMLU Pro 76.8%
GPQA 66.3%
HLE 6.1%
AIME
LiveCodeBench 72.3%
SciCode 35.2%
Math 500
AA Indexes
Intelligence
22.5%
Coding
14.8%
Math
80.3%

Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning)

Google

Input: $0.100
Output: $0.400
Pricing
Input (1M)
$0.100
Output (1M)
$0.400
Blended (3:1)
$0.175
Performance
Tokens / s
576.3
TTFT (token)
6.52s
TTFT (answer)
6.52s
Benchmarks
MMLU Pro 80.8%
GPQA 70.9%
HLE 6.6%
AIME
LiveCodeBench 68.8%
SciCode 28.7%
Math 500
AA Indexes
Intelligence
22.5%
Coding
18.1%
Math
68.7%

INTELLECT-3

Prime Intellect

Input: $0.200
Output: $1.100
Pricing
Input (1M)
$0.200
Output (1M)
$1.100
Blended (3:1)
$0.425
Performance
Tokens / s
84.8
TTFT (token)
0.59s
TTFT (answer)
24.18s
Benchmarks
MMLU Pro 82.2%
GPQA 76.1%
HLE 12.1%
AIME
LiveCodeBench 77.7%
SciCode 39.1%
Math 500
AA Indexes
Intelligence
22.3%
Coding
19.1%
Math
88.0%

Claude 4 Opus (Non-reasoning)

Anthropic

Input: $15.000
Output: $75.000
Pricing
Input (1M)
$15.000
Output (1M)
$75.000
Blended (3:1)
$30.000
Performance
Tokens / s
40.2
TTFT (token)
1.18s
TTFT (answer)
1.18s
Benchmarks
MMLU Pro 86.0%
GPQA 70.1%
HLE 5.9%
AIME 56.3%
LiveCodeBench 54.2%
SciCode 40.9%
Math 500 94.1%
AA Indexes
Intelligence
22.2%
Coding
Math
36.3%

CompactifAI Llama 4 Scout Slim

Multiverse Computing

Input: $0.070
Output: $0.100
Pricing
Input (1M)
$0.070
Output (1M)
$0.100
Blended (3:1)
$0.077
Performance
Tokens / s
116.7
TTFT (token)
0.31s
TTFT (answer)
0.31s
Benchmarks
MMLU Pro 70.3%
GPQA 42.6%
HLE 4.3%
AIME
LiveCodeBench 18.6%
SciCode 17.9%
Math 500
AA Indexes
Intelligence
22.1%
Coding
12.9%
Math
10.0%

GPT-5 (ChatGPT)

OpenAI

Input: $1.250
Output: $10.000
Pricing
Input (1M)
$1.250
Output (1M)
$10.000
Blended (3:1)
$3.438
Performance
Tokens / s
169.6
TTFT (token)
0.67s
TTFT (answer)
0.67s
Benchmarks
MMLU Pro 82.0%
GPQA 68.6%
HLE 5.8%
AIME
LiveCodeBench 54.3%
SciCode 37.8%
Math 500
AA Indexes
Intelligence
21.8%
Coding
21.2%
Math
48.3%

Hermes 4 - Llama-3.1 405B (Reasoning)

Nous Research

Input: $1.000
Output: $3.000
Pricing
Input (1M)
$1.000
Output (1M)
$3.000
Blended (3:1)
$1.500
Performance
Tokens / s
33.8
TTFT (token)
0.72s
TTFT (answer)
59.97s
Benchmarks
MMLU Pro 82.9%
GPQA 72.7%
HLE 10.3%
AIME
LiveCodeBench 68.6%
SciCode 25.2%
Math 500
AA Indexes
Intelligence
21.7%
Coding
16.0%
Math
69.7%

GPT-5 mini (minimal)

OpenAI

Input: $0.250
Output: $2.000
Pricing
Input (1M)
$0.250
Output (1M)
$2.000
Blended (3:1)
$0.688
Performance
Tokens / s
72.1
TTFT (token)
0.67s
TTFT (answer)
0.67s
Benchmarks
MMLU Pro 77.5%
GPQA 68.7%
HLE 5.0%
AIME
LiveCodeBench 54.5%
SciCode 36.9%
Math 500
AA Indexes
Intelligence
21.6%
Coding
21.9%
Math
46.7%

gpt-oss-20B (low)

OpenAI

Input: $0.070
Output: $0.200
Pricing
Input (1M)
$0.070
Output (1M)
$0.200
Blended (3:1)
$0.100
Performance
Tokens / s
246.9
TTFT (token)
0.51s
TTFT (answer)
8.61s
Benchmarks
MMLU Pro 71.8%
GPQA 61.1%
HLE 5.1%
AIME
LiveCodeBench 65.2%
SciCode 34.0%
Math 500
AA Indexes
Intelligence
21.2%
Coding
14.4%
Math
62.3%

Mistral Medium 3.1

Mistral

Input: $0.400
Output: $2.000
Pricing
Input (1M)
$0.400
Output (1M)
$2.000
Blended (3:1)
$0.800
Performance
Tokens / s
90.6
TTFT (token)
0.39s
TTFT (answer)
0.39s
Benchmarks
MMLU Pro 68.3%
GPQA 58.8%
HLE 4.4%
AIME
LiveCodeBench 40.6%
SciCode 33.8%
Math 500
AA Indexes
Intelligence
21.2%
Coding
18.3%
Math
38.3%

Gemini 2.5 Flash (Non-reasoning)

Google

Input: $0.300
Output: $2.500
Pricing
Input (1M)
$0.300
Output (1M)
$2.500
Blended (3:1)
$0.850
Performance
Tokens / s
224.2
TTFT (token)
0.41s
TTFT (answer)
0.41s
Benchmarks
MMLU Pro 80.9%
GPQA 68.3%
HLE 5.1%
AIME 50.0%
LiveCodeBench 49.5%
SciCode 29.1%
Math 500 93.2%
AA Indexes
Intelligence
21.0%
Coding
17.8%
Math
60.3%

Qwen3 VL 235B A22B Instruct

Alibaba

Input: $0.700
Output: $2.800
Pricing
Input (1M)
$0.700
Output (1M)
$2.800
Blended (3:1)
$1.225
Performance
Tokens / s
35.3
TTFT (token)
1.09s
TTFT (answer)
1.09s
Benchmarks
MMLU Pro 82.3%
GPQA 71.2%
HLE 6.3%
AIME
LiveCodeBench 59.4%
SciCode 35.9%
Math 500
AA Indexes
Intelligence
20.9%
Coding
16.5%
Math
70.7%

Ring-flash-2.0

InclusionAI

Input: $0.140
Output: $0.570
Pricing
Input (1M)
$0.140
Output (1M)
$0.570
Blended (3:1)
$0.247
Performance
Tokens / s
87.3
TTFT (token)
1.50s
TTFT (answer)
24.41s
Benchmarks
MMLU Pro 79.3%
GPQA 72.5%
HLE 8.9%
AIME
LiveCodeBench 62.8%
SciCode 16.8%
Math 500
AA Indexes
Intelligence
20.6%
Coding
10.6%
Math
83.7%

Hermes 4 - Llama-3.1 70B (Reasoning)

Nous Research

Input: $0.130
Output: $0.400
Pricing
Input (1M)
$0.130
Output (1M)
$0.400
Blended (3:1)
$0.198
Performance
Tokens / s
81.0
TTFT (token)
0.57s
TTFT (answer)
25.27s
Benchmarks
MMLU Pro 81.1%
GPQA 69.9%
HLE 7.9%
AIME
LiveCodeBench 65.3%
SciCode 34.1%
Math 500
AA Indexes
Intelligence
20.4%
Coding
14.4%
Math
68.7%

Qwen3 Coder 30B A3B Instruct

Alibaba

Input: $0.450
Output: $2.250
Pricing
Input (1M)
$0.450
Output (1M)
$2.250
Blended (3:1)
$0.900
Performance
Tokens / s
98.6
TTFT (token)
1.57s
TTFT (answer)
1.57s
Benchmarks
MMLU Pro 70.6%
GPQA 51.6%
HLE 4.0%
AIME 29.7%
LiveCodeBench 40.3%
SciCode 27.8%
Math 500 89.3%
AA Indexes
Intelligence
20.3%
Coding
19.4%
Math
29.0%

Qwen3 Next 80B A3B Instruct

Alibaba

Input: $0.500
Output: $2.000
Pricing
Input (1M)
$0.500
Output (1M)
$2.000
Blended (3:1)
$0.875
Performance
Tokens / s
145.9
TTFT (token)
1.23s
TTFT (answer)
1.23s
Benchmarks
MMLU Pro 81.9%
GPQA 73.8%
HLE 7.3%
AIME
LiveCodeBench 68.4%
SciCode 30.7%
Math 500
AA Indexes
Intelligence
20.3%
Coding
15.3%
Math
66.3%

Codestral (Jan '25)

Mistral

Input: $0.300
Output: $0.900
Pricing
Input (1M)
$0.300
Output (1M)
$0.900
Blended (3:1)
$0.450
Performance
Tokens / s
212.0
TTFT (token)
0.39s
TTFT (answer)
0.39s
Benchmarks
MMLU Pro 44.6%
GPQA 31.2%
HLE 4.5%
AIME 4.3%
LiveCodeBench 24.3%
SciCode 24.7%
Math 500 60.7%
AA Indexes
Intelligence
20.1%
Coding
16.3%
Math
6.0%

Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning)

Google

Input: $0.100
Output: $0.400
Pricing
Input (1M)
$0.100
Output (1M)
$0.400
Blended (3:1)
$0.175
Performance
Tokens / s
450.6
TTFT (token)
0.34s
TTFT (answer)
0.34s
Benchmarks
MMLU Pro 79.6%
GPQA 65.1%
HLE 4.6%
AIME
LiveCodeBench 64.1%
SciCode 28.5%
Math 500
AA Indexes
Intelligence
20.1%
Coding
14.5%
Math
46.7%

Qwen3 235B A22B (Reasoning)

Alibaba

Input: $0.700
Output: $8.400
Pricing
Input (1M)
$0.700
Output (1M)
$8.400
Blended (3:1)
$2.625
Performance
Tokens / s
54.2
TTFT (token)
1.22s
TTFT (answer)
38.14s
Benchmarks
MMLU Pro 82.8%
GPQA 70.0%
HLE 11.7%
AIME 84.0%
LiveCodeBench 62.2%
SciCode 39.9%
Math 500 93.0%
AA Indexes
Intelligence
20.0%
Coding
17.4%
Math
82.0%

Llama 3.1 Nemotron Ultra 253B v1 (Reasoning)

NVIDIA

Input: $0.600
Output: $1.800
Pricing
Input (1M)
$0.600
Output (1M)
$1.800
Blended (3:1)
$0.900
Performance
Tokens / s
37.2
TTFT (token)
0.68s
TTFT (answer)
54.46s
Benchmarks
MMLU Pro 82.5%
GPQA 72.8%
HLE 8.1%
AIME 74.7%
LiveCodeBench 64.1%
SciCode 34.7%
Math 500 95.2%
AA Indexes
Intelligence
20.0%
Coding
13.1%
Math
63.7%

Ling-flash-2.0

InclusionAI

Input: $0.140
Output: $0.570
Pricing
Input (1M)
$0.140
Output (1M)
$0.570
Blended (3:1)
$0.247
Performance
Tokens / s
54.6
TTFT (token)
1.56s
TTFT (answer)
1.56s
Benchmarks
MMLU Pro 77.7%
GPQA 65.7%
HLE 6.3%
AIME
LiveCodeBench 58.9%
SciCode 28.9%
Math 500
AA Indexes
Intelligence
19.9%
Coding
16.7%
Math
65.3%

Qwen3 VL 30B A3B (Reasoning)

Alibaba

Input: $0.200
Output: $2.400
Pricing
Input (1M)
$0.200
Output (1M)
$2.400
Blended (3:1)
$0.750
Performance
Tokens / s
106.6
TTFT (token)
0.97s
TTFT (answer)
19.74s
Benchmarks
MMLU Pro 80.7%
GPQA 72.0%
HLE 8.7%
AIME
LiveCodeBench 69.7%
SciCode 28.8%
Math 500
AA Indexes
Intelligence
19.8%
Coding
13.1%
Math
82.3%

QwQ 32B

Alibaba

Input: $0.430
Output: $0.600
Pricing
Input (1M)
$0.430
Output (1M)
$0.600
Blended (3:1)
$0.473
Performance
Tokens / s
29.6
TTFT (token)
1.14s
TTFT (answer)
85.17s
Benchmarks
MMLU Pro 76.4%
GPQA 59.3%
HLE 8.2%
AIME 78.0%
LiveCodeBench 63.1%
SciCode 35.8%
Math 500 95.7%
AA Indexes
Intelligence
19.7%
Coding
Math
29.0%

Llama Nemotron Super 49B v1.5 (Reasoning)

NVIDIA

Input: $0.100
Output: $0.400
Pricing
Input (1M)
$0.100
Output (1M)
$0.400
Blended (3:1)
$0.175
Performance
Tokens / s
74.1
TTFT (token)
0.42s
TTFT (answer)
27.39s
Benchmarks
MMLU Pro 81.4%
GPQA 74.8%
HLE 6.8%
AIME 86.0%
LiveCodeBench 73.7%
SciCode 34.8%
Math 500 98.3%
AA Indexes
Intelligence
19.4%
Coding
15.2%
Math
76.7%

Reka Core

Reka AI

Input: $2.000
Output: $2.000
Pricing
Input (1M)
$2.000
Output (1M)
$2.000
Blended (3:1)
$2.000
Performance
Tokens / s
40.9
TTFT (token)
1.42s
TTFT (answer)
1.42s
Benchmarks
MMLU Pro
GPQA
HLE
AIME
LiveCodeBench
SciCode
Math 500 55.8%
AA Indexes
Intelligence
19.4%
Coding
Math

GLM-4.5V (Reasoning)

Z AI

Input: $0.600
Output: $1.800
Pricing
Input (1M)
$0.600
Output (1M)
$1.800
Blended (3:1)
$0.900
Performance
Tokens / s
31.4
TTFT (token)
0.77s
TTFT (answer)
64.44s
Benchmarks
MMLU Pro 78.8%
GPQA 68.4%
HLE 5.9%
AIME
LiveCodeBench 60.4%
SciCode 22.1%
Math 500
AA Indexes
Intelligence
19.3%
Coding
10.9%
Math
73.0%

Nova Premier

Amazon

Input: $2.500
Output: $12.500
Pricing
Input (1M)
$2.500
Output (1M)
$12.500
Blended (3:1)
$5.000
Performance
Tokens / s
75.1
TTFT (token)
0.81s
TTFT (answer)
0.81s
Benchmarks
MMLU Pro 73.3%
GPQA 56.9%
HLE 4.7%
AIME 17.0%
LiveCodeBench 31.7%
SciCode 27.9%
Math 500 83.9%
AA Indexes
Intelligence
19.3%
Coding
13.8%
Math
17.3%

Magistral Medium 1

Mistral

Input: $2.000
Output: $5.000
Pricing
Input (1M)
$2.000
Output (1M)
$5.000
Blended (3:1)
$2.750
Performance
Tokens / s
60.2
TTFT (token)
0.49s
TTFT (answer)
33.71s
Benchmarks
MMLU Pro 75.3%
GPQA 67.9%
HLE 9.5%
AIME 70.0%
LiveCodeBench 52.7%
SciCode 29.7%
Math 500 91.7%
AA Indexes
Intelligence
19.1%
Coding
16.0%
Math
40.3%

GPT-4o (Aug '24)

OpenAI

Input: $2.500
Output: $10.000
Pricing
Input (1M)
$2.500
Output (1M)
$10.000
Blended (3:1)
$4.375
Performance
Tokens / s
79.0
TTFT (token)
0.53s
TTFT (answer)
0.53s
Benchmarks
MMLU Pro
GPQA 52.1%
HLE 2.9%
AIME 11.7%
LiveCodeBench 31.7%
SciCode 33.1%
Math 500 79.5%
AA Indexes
Intelligence
19.0%
Coding
16.6%
Math

Llama 4 Maverick

Meta

Input: $0.310
Output: $0.850
Pricing
Input (1M)
$0.310
Output (1M)
$0.850
Blended (3:1)
$0.461
Performance
Tokens / s
127.7
TTFT (token)
0.46s
TTFT (answer)
0.46s
Benchmarks
MMLU Pro 80.9%
GPQA 67.1%
HLE 4.8%
AIME 39.0%
LiveCodeBench 39.7%
SciCode 33.1%
Math 500 88.9%
AA Indexes
Intelligence
19.0%
Coding
15.6%
Math
19.3%

Gemini 2.0 Flash (Feb '25)

Google

Input: $0.100
Output: $0.400
Pricing
Input (1M)
$0.100
Output (1M)
$0.400
Blended (3:1)
$0.175
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 77.9%
GPQA 62.3%
HLE 5.3%
AIME 33.0%
LiveCodeBench 33.4%
SciCode 33.3%
Math 500 93.0%
AA Indexes
Intelligence
19.0%
Coding
13.6%
Math
21.7%

Devstral Medium

Mistral

Input: $0.400
Output: $2.000
Pricing
Input (1M)
$0.400
Output (1M)
$2.000
Blended (3:1)
$0.800
Performance
Tokens / s
112.4
TTFT (token)
0.43s
TTFT (answer)
0.43s
Benchmarks
MMLU Pro 70.8%
GPQA 49.2%
HLE 3.8%
AIME 6.7%
LiveCodeBench 33.7%
SciCode 29.4%
Math 500 70.7%
AA Indexes
Intelligence
18.9%
Coding
15.9%
Math
4.7%

Nova 2.0 Lite (Non-reasoning)

Amazon

Input: $0.300
Output: $2.500
Pricing
Input (1M)
$0.300
Output (1M)
$2.500
Blended (3:1)
$0.850
Performance
Tokens / s
224.9
TTFT (token)
0.51s
TTFT (answer)
0.51s
Benchmarks
MMLU Pro 74.3%
GPQA 60.3%
HLE 3.0%
AIME
LiveCodeBench 34.6%
SciCode 24.0%
Math 500
AA Indexes
Intelligence
18.9%
Coding
12.5%
Math
33.7%

Claude 3.5 Haiku

Anthropic

Input: $0.800
Output: $4.000
Pricing
Input (1M)
$0.800
Output (1M)
$4.000
Blended (3:1)
$1.600
Performance
Tokens / s
48.4
TTFT (token)
0.70s
TTFT (answer)
0.70s
Benchmarks
MMLU Pro 63.4%
GPQA 40.8%
HLE 3.5%
AIME 3.3%
LiveCodeBench 31.4%
SciCode 27.4%
Math 500 72.1%
AA Indexes
Intelligence
18.8%
Coding
10.7%
Math

DeepSeek R1 (Jan '25)

DeepSeek

Input: $1.350
Output: $4.000
Pricing
Input (1M)
$1.350
Output (1M)
$4.000
Blended (3:1)
$2.362
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 84.4%
GPQA 70.8%
HLE 9.3%
AIME 68.3%
LiveCodeBench 61.7%
SciCode 35.7%
Math 500 96.6%
AA Indexes
Intelligence
18.8%
Coding
15.9%
Math
68.0%

Reka Flash (Feb '24)

Reka AI

Input: $0.200
Output: $0.800
Pricing
Input (1M)
$0.200
Output (1M)
$0.800
Blended (3:1)
$0.350
Performance
Tokens / s
69.1
TTFT (token)
1.32s
TTFT (answer)
1.32s
Benchmarks
MMLU Pro
GPQA
HLE
AIME
LiveCodeBench
SciCode
Math 500 32.6%
AA Indexes
Intelligence
18.7%
Coding
Math

GPT-4o (March 2025, chatgpt-4o-latest)

OpenAI

Input: $5.000
Output: $15.000
Pricing
Input (1M)
$5.000
Output (1M)
$15.000
Blended (3:1)
$7.500
Performance
Tokens / s
215.5
TTFT (token)
0.51s
TTFT (answer)
0.51s
Benchmarks
MMLU Pro 80.3%
GPQA 65.5%
HLE 5.0%
AIME 32.7%
LiveCodeBench 42.5%
SciCode 36.6%
Math 500 89.3%
AA Indexes
Intelligence
18.6%
Coding
Math
25.7%

Reka Edge

Reka AI

Input: $0.100
Output: $0.100
Pricing
Input (1M)
$0.100
Output (1M)
$0.100
Blended (3:1)
$0.100
Performance
Tokens / s
63.8
TTFT (token)
1.29s
TTFT (answer)
1.29s
Benchmarks
MMLU Pro
GPQA
HLE
AIME
LiveCodeBench
SciCode
Math 500 21.6%
AA Indexes
Intelligence
18.5%
Coding
Math

Gemini 2.5 Flash-Lite (Reasoning)

Google

Input: $0.100
Output: $0.400
Pricing
Input (1M)
$0.100
Output (1M)
$0.400
Blended (3:1)
$0.175
Performance
Tokens / s
369.9
TTFT (token)
15.73s
TTFT (answer)
15.73s
Benchmarks
MMLU Pro 75.9%
GPQA 62.5%
HLE 6.4%
AIME 70.3%
LiveCodeBench 59.3%
SciCode 19.3%
Math 500 96.9%
AA Indexes
Intelligence
18.0%
Coding
9.5%
Math
53.3%

Sonar Reasoning

Perplexity

Input: $1.000
Output: $5.000
Pricing
Input (1M)
$1.000
Output (1M)
$5.000
Blended (3:1)
$2.000
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro
GPQA 62.3%
HLE
AIME 77.0%
LiveCodeBench
SciCode
Math 500 92.1%
AA Indexes
Intelligence
17.9%
Coding
Math

Devstral Small (May '25)

Mistral

Input: $0.100
Output: $0.300
Pricing
Input (1M)
$0.100
Output (1M)
$0.300
Blended (3:1)
$0.150
Performance
Tokens / s
200.0
TTFT (token)
0.36s
TTFT (answer)
0.36s
Benchmarks
MMLU Pro 63.2%
GPQA 43.4%
HLE 4.0%
AIME 6.7%
LiveCodeBench 25.8%
SciCode 24.5%
Math 500 68.4%
AA Indexes
Intelligence
17.9%
Coding
12.2%
Math

Llama 3.1 Instruct 405B

Meta

Input: $3.750
Output: $6.750
Pricing
Input (1M)
$3.750
Output (1M)
$6.750
Blended (3:1)
$4.188
Performance
Tokens / s
24.9
TTFT (token)
0.77s
TTFT (answer)
0.77s
Benchmarks
MMLU Pro 73.2%
GPQA 51.5%
HLE 4.2%
AIME 21.3%
LiveCodeBench 30.5%
SciCode 29.9%
Math 500 70.3%
AA Indexes
Intelligence
17.6%
Coding
14.5%
Math
3.0%

Mistral Medium 3

Mistral

Input: $0.400
Output: $2.000
Pricing
Input (1M)
$0.400
Output (1M)
$2.000
Blended (3:1)
$0.800
Performance
Tokens / s
95.7
TTFT (token)
0.38s
TTFT (answer)
0.38s
Benchmarks
MMLU Pro 76.0%
GPQA 57.8%
HLE 4.3%
AIME 44.0%
LiveCodeBench 40.0%
SciCode 33.1%
Math 500 90.7%
AA Indexes
Intelligence
17.6%
Coding
13.6%
Math
30.3%

GLM-4.6V (Non-reasoning)

Z AI

Input: $0.300
Output: $0.900
Pricing
Input (1M)
$0.300
Output (1M)
$0.900
Blended (3:1)
$0.450
Performance
Tokens / s
46.6
TTFT (token)
1.01s
TTFT (answer)
1.01s
Benchmarks
MMLU Pro 75.2%
GPQA 56.6%
HLE 3.7%
AIME
LiveCodeBench 41.1%
SciCode 27.2%
Math 500
AA Indexes
Intelligence
17.4%
Coding
11.1%
Math
26.3%

Nova 2.0 Omni (Non-reasoning)

Amazon

Input: $0.300
Output: $2.500
Pricing
Input (1M)
$0.300
Output (1M)
$2.500
Blended (3:1)
$0.850
Performance
Tokens / s
226.3
TTFT (token)
0.72s
TTFT (answer)
0.72s
Benchmarks
MMLU Pro 71.9%
GPQA 55.5%
HLE 3.9%
AIME
LiveCodeBench 30.5%
SciCode 27.9%
Math 500
AA Indexes
Intelligence
17.3%
Coding
13.8%
Math
37.0%

Qwen3 235B A22B (Non-reasoning)

Alibaba

Input: $0.700
Output: $2.800
Pricing
Input (1M)
$0.700
Output (1M)
$2.800
Blended (3:1)
$1.225
Performance
Tokens / s
45.7
TTFT (token)
1.10s
TTFT (answer)
1.10s
Benchmarks
MMLU Pro 76.2%
GPQA 61.3%
HLE 4.7%
AIME 32.7%
LiveCodeBench 34.3%
SciCode 29.9%
Math 500 90.2%
AA Indexes
Intelligence
17.3%
Coding
14.0%
Math
23.7%

ERNIE 4.5 300B A47B

Baidu

Input: $0.280
Output: $1.100
Pricing
Input (1M)
$0.280
Output (1M)
$1.100
Blended (3:1)
$0.485
Performance
Tokens / s
26.4
TTFT (token)
2.03s
TTFT (answer)
2.03s
Benchmarks
MMLU Pro 77.6%
GPQA 81.1%
HLE 3.5%
AIME 49.3%
LiveCodeBench 46.7%
SciCode 31.5%
Math 500 93.1%
AA Indexes
Intelligence
17.3%
Coding
14.5%
Math
41.3%

Qwen3 VL 32B Instruct

Alibaba

Input: $0.700
Output: $2.800
Pricing
Input (1M)
$0.700
Output (1M)
$2.800
Blended (3:1)
$1.225
Performance
Tokens / s
45.6
TTFT (token)
0.95s
TTFT (answer)
0.95s
Benchmarks
MMLU Pro 79.1%
GPQA 67.1%
HLE 6.3%
AIME
LiveCodeBench 51.4%
SciCode 30.1%
Math 500
AA Indexes
Intelligence
17.2%
Coding
15.6%
Math
68.3%

DeepSeek R1 Distill Qwen 32B

DeepSeek

Input: $0.285
Output: $0.285
Pricing
Input (1M)
$0.285
Output (1M)
$0.285
Blended (3:1)
$0.285
Performance
Tokens / s
39.4
TTFT (token)
0.38s
TTFT (answer)
51.14s
Benchmarks
MMLU Pro 73.9%
GPQA 61.5%
HLE 5.5%
AIME 68.7%
LiveCodeBench 27.0%
SciCode 37.6%
Math 500 94.1%
AA Indexes
Intelligence
17.2%
Coding
Math
63.0%

Qwen3 VL 8B (Reasoning)

Alibaba

Input: $0.180
Output: $2.100
Pricing
Input (1M)
$0.180
Output (1M)
$2.100
Blended (3:1)
$0.660
Performance
Tokens / s
63.6
TTFT (token)
0.93s
TTFT (answer)
32.39s
Benchmarks
MMLU Pro 74.9%
GPQA 57.9%
HLE 3.3%
AIME
LiveCodeBench 35.3%
SciCode 21.9%
Math 500
AA Indexes
Intelligence
17.1%
Coding
9.8%
Math
30.7%

Qwen3 32B (Reasoning)

Alibaba

Input: $0.700
Output: $8.400
Pricing
Input (1M)
$0.700
Output (1M)
$8.400
Blended (3:1)
$2.625
Performance
Tokens / s
88.7
TTFT (token)
1.03s
TTFT (answer)
23.57s
Benchmarks
MMLU Pro 79.8%
GPQA 66.8%
HLE 8.3%
AIME 80.7%
LiveCodeBench 54.6%
SciCode 35.4%
Math 500 96.1%
AA Indexes
Intelligence
17.1%
Coding
13.8%
Math
73.0%

DeepSeek V3 (Dec '24)

DeepSeek

Input: $0.400
Output: $0.890
Pricing
Input (1M)
$0.400
Output (1M)
$0.890
Blended (3:1)
$0.625
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 75.2%
GPQA 55.7%
HLE 3.6%
AIME 25.3%
LiveCodeBench 35.9%
SciCode 35.4%
Math 500 88.7%
AA Indexes
Intelligence
17.1%
Coding
16.4%
Math
26.0%

Hermes 4 - Llama-3.1 405B (Non-reasoning)

Nous Research

Input: $1.000
Output: $3.000
Pricing
Input (1M)
$1.000
Output (1M)
$3.000
Blended (3:1)
$1.500
Performance
Tokens / s
32.9
TTFT (token)
0.72s
TTFT (answer)
0.72s
Benchmarks
MMLU Pro 72.9%
GPQA 53.6%
HLE 4.2%
AIME
LiveCodeBench 54.6%
SciCode 34.6%
Math 500
AA Indexes
Intelligence
17.1%
Coding
18.1%
Math
15.3%

EXAONE 4.0 32B (Reasoning)

LG AI Research

Input: $0.600
Output: $1.000
Pricing
Input (1M)
$0.600
Output (1M)
$1.000
Blended (3:1)
$0.700
Performance
Tokens / s
96.2
TTFT (token)
0.30s
TTFT (answer)
21.08s
Benchmarks
MMLU Pro 81.8%
GPQA 73.9%
HLE 10.5%
AIME 84.3%
LiveCodeBench 74.7%
SciCode 34.4%
Math 500 97.7%
AA Indexes
Intelligence
17.0%
Coding
14.0%
Math
80.0%

Qwen3 14B (Reasoning)

Alibaba

Input: $0.350
Output: $4.200
Pricing
Input (1M)
$0.350
Output (1M)
$4.200
Blended (3:1)
$1.313
Performance
Tokens / s
58.8
TTFT (token)
1.04s
TTFT (answer)
35.08s
Benchmarks
MMLU Pro 77.4%
GPQA 60.4%
HLE 4.3%
AIME 76.3%
LiveCodeBench 52.3%
SciCode 31.6%
Math 500 96.1%
AA Indexes
Intelligence
16.8%
Coding
13.1%
Math
55.7%

Magistral Small 1

Mistral

Input: $0.500
Output: $1.500
Pricing
Input (1M)
$0.500
Output (1M)
$1.500
Blended (3:1)
$0.750
Performance
Tokens / s
118.7
TTFT (token)
0.35s
TTFT (answer)
17.20s
Benchmarks
MMLU Pro 74.6%
GPQA 64.1%
HLE 7.2%
AIME 71.3%
LiveCodeBench 51.4%
SciCode 24.1%
Math 500 96.3%
AA Indexes
Intelligence
16.8%
Coding
11.1%
Math
41.3%

Olmo 3 7B Think

Allen Institute for AI

Input: $0.120
Output: $0.200
Pricing
Input (1M)
$0.120
Output (1M)
$0.200
Blended (3:1)
$0.140
Performance
Tokens / s
170.3
TTFT (token)
0.36s
TTFT (answer)
12.11s
Benchmarks
MMLU Pro 65.5%
GPQA 51.6%
HLE 5.7%
AIME
LiveCodeBench 61.7%
SciCode 21.2%
Math 500
AA Indexes
Intelligence
16.8%
Coding
7.6%
Math
70.7%

CompactifAI Llama 3.3 70B Slim

Multiverse Computing

Input: $0.160
Output: $0.310
Pricing
Input (1M)
$0.160
Output (1M)
$0.310
Blended (3:1)
$0.198
Performance
Tokens / s
130.7
TTFT (token)
0.30s
TTFT (answer)
0.30s
Benchmarks
MMLU Pro 57.1%
GPQA 35.5%
HLE 2.6%
AIME
LiveCodeBench 21.0%
SciCode 3.2%
Math 500
AA Indexes
Intelligence
16.5%
Coding
8.1%
Math
3.3%

CompactifAI Mistral Small 3.1 Slim

Multiverse Computing

Input: $0.050
Output: $0.080
Pricing
Input (1M)
$0.050
Output (1M)
$0.080
Blended (3:1)
$0.058
Performance
Tokens / s
121.6
TTFT (token)
0.31s
TTFT (answer)
0.31s
Benchmarks
MMLU Pro 53.8%
GPQA 32.9%
HLE 5.1%
AIME
LiveCodeBench 16.8%
SciCode 12.8%
Math 500
AA Indexes
Intelligence
16.5%
Coding
10.1%
Math
100.0%

Qwen3 VL 30B A3B Instruct

Alibaba

Input: $0.200
Output: $0.800
Pricing
Input (1M)
$0.200
Output (1M)
$0.800
Blended (3:1)
$0.350
Performance
Tokens / s
99.2
TTFT (token)
0.90s
TTFT (answer)
0.90s
Benchmarks
MMLU Pro 76.4%
GPQA 69.5%
HLE 6.4%
AIME
LiveCodeBench 47.6%
SciCode 30.8%
Math 500
AA Indexes
Intelligence
16.4%
Coding
14.3%
Math
72.3%

DeepSeek R1 0528 Qwen3 8B

DeepSeek

Input: $0.060
Output: $0.090
Pricing
Input (1M)
$0.060
Output (1M)
$0.090
Blended (3:1)
$0.068
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 73.9%
GPQA 61.2%
HLE 5.6%
AIME 65.0%
LiveCodeBench 51.3%
SciCode 20.4%
Math 500 93.2%
AA Indexes
Intelligence
16.4%
Coding
7.8%
Math
63.7%

Ministral 3 14B

Mistral

Input: $0.200
Output: $0.200
Pricing
Input (1M)
$0.200
Output (1M)
$0.200
Blended (3:1)
$0.200
Performance
Tokens / s
126.2
TTFT (token)
0.30s
TTFT (answer)
0.30s
Benchmarks
MMLU Pro 69.3%
GPQA 57.2%
HLE 4.6%
AIME
LiveCodeBench 35.1%
SciCode 23.6%
Math 500
AA Indexes
Intelligence
16.3%
Coding
10.9%
Math
30.0%

Qwen2.5 Max

Alibaba

Input: $1.600
Output: $6.400
Pricing
Input (1M)
$1.600
Output (1M)
$6.400
Blended (3:1)
$2.800
Performance
Tokens / s
41.4
TTFT (token)
1.13s
TTFT (answer)
1.13s
Benchmarks
MMLU Pro 76.2%
GPQA 58.7%
HLE 4.5%
AIME 23.3%
LiveCodeBench 35.9%
SciCode 33.7%
Math 500 83.5%
AA Indexes
Intelligence
16.3%
Coding
Math

DeepSeek R1 Distill Llama 70B

DeepSeek

Input: $0.875
Output: $1.300
Pricing
Input (1M)
$0.875
Output (1M)
$1.300
Blended (3:1)
$0.963
Performance
Tokens / s
38.9
TTFT (token)
0.95s
TTFT (answer)
52.36s
Benchmarks
MMLU Pro 79.5%
GPQA 40.2%
HLE 6.1%
AIME 67.0%
LiveCodeBench 26.6%
SciCode 31.2%
Math 500 93.5%
AA Indexes
Intelligence
16.0%
Coding
11.4%
Math
53.7%

Claude 3.5 Sonnet (Oct '24)

Anthropic

Input: $3.000
Output: $15.000
Pricing
Input (1M)
$3.000
Output (1M)
$15.000
Blended (3:1)
$6.000
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 77.2%
GPQA 59.9%
HLE 3.9%
AIME 15.7%
LiveCodeBench 38.1%
SciCode 36.6%
Math 500 77.1%
AA Indexes
Intelligence
15.9%
Coding
30.2%
Math

DeepSeek R1 Distill Qwen 14B

DeepSeek

Input: $0.150
Output: $0.150
Pricing
Input (1M)
$0.150
Output (1M)
$0.150
Blended (3:1)
$0.150
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 74.0%
GPQA 48.4%
HLE 4.4%
AIME 66.7%
LiveCodeBench 37.6%
SciCode 23.9%
Math 500 94.9%
AA Indexes
Intelligence
15.8%
Coding
Math
55.7%

Qwen3 30B A3B (Reasoning)

Alibaba

Input: $0.200
Output: $2.400
Pricing
Input (1M)
$0.200
Output (1M)
$2.400
Blended (3:1)
$0.750
Performance
Tokens / s
66.5
TTFT (token)
1.09s
TTFT (answer)
31.19s
Benchmarks
MMLU Pro 77.7%
GPQA 61.6%
HLE 6.6%
AIME 75.3%
LiveCodeBench 50.6%
SciCode 28.5%
Math 500 95.9%
AA Indexes
Intelligence
15.8%
Coding
11.0%
Math
72.3%

Qwen3 Omni 30B A3B (Reasoning)

Alibaba

Input: $0.250
Output: $0.970
Pricing
Input (1M)
$0.250
Output (1M)
$0.970
Blended (3:1)
$0.430
Performance
Tokens / s
97.1
TTFT (token)
0.93s
TTFT (answer)
21.52s
Benchmarks
MMLU Pro 79.2%
GPQA 72.6%
HLE 7.3%
AIME
LiveCodeBench 67.9%
SciCode 30.6%
Math 500
AA Indexes
Intelligence
15.8%
Coding
12.7%
Math
74.0%

Devstral Small (Jul '25)

Mistral

Input: $0.100
Output: $0.300
Pricing
Input (1M)
$0.100
Output (1M)
$0.300
Blended (3:1)
$0.150
Performance
Tokens / s
237.5
TTFT (token)
0.37s
TTFT (answer)
0.37s
Benchmarks
MMLU Pro 62.2%
GPQA 41.4%
HLE 3.7%
AIME 0.3%
LiveCodeBench 25.4%
SciCode 24.3%
Math 500 63.5%
AA Indexes
Intelligence
15.7%
Coding
12.1%
Math
29.3%

Qwen3 30B A3B 2507 Instruct

Alibaba

Input: $0.200
Output: $0.800
Pricing
Input (1M)
$0.200
Output (1M)
$0.800
Blended (3:1)
$0.350
Performance
Tokens / s
59.8
TTFT (token)
1.00s
TTFT (answer)
1.00s
Benchmarks
MMLU Pro 77.7%
GPQA 65.9%
HLE 6.8%
AIME 72.7%
LiveCodeBench 51.5%
SciCode 30.4%
Math 500 97.5%
AA Indexes
Intelligence
15.5%
Coding
14.2%
Math
66.3%

Llama Nemotron Super 49B v1.5 (Non-reasoning)

NVIDIA

Input: $0.100
Output: $0.400
Pricing
Input (1M)
$0.100
Output (1M)
$0.400
Blended (3:1)
$0.175
Performance
Tokens / s
69.4
TTFT (token)
0.42s
TTFT (answer)
0.42s
Benchmarks
MMLU Pro 69.2%
GPQA 48.1%
HLE 4.3%
AIME 13.7%
LiveCodeBench 29.0%
SciCode 23.8%
Math 500 77.0%
AA Indexes
Intelligence
15.5%
Coding
10.5%
Math
8.0%

NVIDIA Nemotron Nano 9B V2 (Reasoning)

NVIDIA

Input: $0.040
Output: $0.160
Pricing
Input (1M)
$0.040
Output (1M)
$0.160
Blended (3:1)
$0.070
Performance
Tokens / s
118.2
TTFT (token)
0.39s
TTFT (answer)
17.31s
Benchmarks
MMLU Pro 74.2%
GPQA 57.0%
HLE 4.6%
AIME
LiveCodeBench 72.4%
SciCode 22.0%
Math 500
AA Indexes
Intelligence
15.5%
Coding
8.3%
Math
69.7%

Sonar

Perplexity

Input: $1.000
Output: $1.000
Pricing
Input (1M)
$1.000
Output (1M)
$1.000
Blended (3:1)
$1.000
Performance
Tokens / s
78.3
TTFT (token)
1.56s
TTFT (answer)
1.56s
Benchmarks
MMLU Pro 68.9%
GPQA 47.1%
HLE 7.3%
AIME 48.7%
LiveCodeBench 29.5%
SciCode 22.9%
Math 500 81.7%
AA Indexes
Intelligence
15.5%
Coding
Math

Mistral Small 3.2

Mistral

Input: $0.100
Output: $0.300
Pricing
Input (1M)
$0.100
Output (1M)
$0.300
Blended (3:1)
$0.150
Performance
Tokens / s
111.3
TTFT (token)
0.29s
TTFT (answer)
0.29s
Benchmarks
MMLU Pro 68.1%
GPQA 50.5%
HLE 4.3%
AIME 32.3%
LiveCodeBench 27.5%
SciCode 26.4%
Math 500 88.3%
AA Indexes
Intelligence
15.4%
Coding
13.3%
Math
27.0%

Qwen3 8B (Reasoning)

Alibaba

Input: $0.180
Output: $2.100
Pricing
Input (1M)
$0.180
Output (1M)
$2.100
Blended (3:1)
$0.660
Performance
Tokens / s
85.1
TTFT (token)
0.95s
TTFT (answer)
24.46s
Benchmarks
MMLU Pro 74.3%
GPQA 58.9%
HLE 4.2%
AIME 74.7%
LiveCodeBench 40.6%
SciCode 22.6%
Math 500 90.4%
AA Indexes
Intelligence
15.3%
Coding
9.0%
Math
19.0%

Ministral 3 8B

Mistral

Input: $0.150
Output: $0.150
Pricing
Input (1M)
$0.150
Output (1M)
$0.150
Blended (3:1)
$0.150
Performance
Tokens / s
196.9
TTFT (token)
0.28s
TTFT (answer)
0.28s
Benchmarks
MMLU Pro 64.2%
GPQA 47.1%
HLE 4.3%
AIME
LiveCodeBench 30.3%
SciCode 20.8%
Math 500
AA Indexes
Intelligence
15.2%
Coding
10.0%
Math
31.7%

NVIDIA Nemotron Nano 12B v2 VL (Reasoning)

NVIDIA

Input: $0.200
Output: $0.600
Pricing
Input (1M)
$0.200
Output (1M)
$0.600
Blended (3:1)
$0.300
Performance
Tokens / s
128.9
TTFT (token)
0.21s
TTFT (answer)
15.72s
Benchmarks
MMLU Pro 75.9%
GPQA 57.2%
HLE 5.3%
AIME
LiveCodeBench 69.4%
SciCode 26.2%
Math 500
AA Indexes
Intelligence
15.2%
Coding
11.8%
Math
75.0%

QwQ 32B-Preview

Alibaba

Input: $0.120
Output: $0.180
Pricing
Input (1M)
$0.120
Output (1M)
$0.180
Blended (3:1)
$0.135
Performance
Tokens / s
40.8
TTFT (token)
0.42s
TTFT (answer)
49.44s
Benchmarks
MMLU Pro 64.8%
GPQA 55.7%
HLE 4.8%
AIME 45.3%
LiveCodeBench 33.7%
SciCode 3.8%
Math 500 91.0%
AA Indexes
Intelligence
15.2%
Coding
Math

Sonar Pro

Perplexity

Input: $3.000
Output: $15.000
Pricing
Input (1M)
$3.000
Output (1M)
$15.000
Blended (3:1)
$6.000
Performance
Tokens / s
86.2
TTFT (token)
1.56s
TTFT (answer)
1.56s
Benchmarks
MMLU Pro 75.5%
GPQA 57.8%
HLE 7.9%
AIME 29.0%
LiveCodeBench 27.5%
SciCode 22.6%
Math 500 74.5%
AA Indexes
Intelligence
15.2%
Coding
Math

Llama 3.3 Instruct 70B

Meta

Input: $0.585
Output: $0.715
Pricing
Input (1M)
$0.585
Output (1M)
$0.715
Blended (3:1)
$0.675
Performance
Tokens / s
105.5
TTFT (token)
0.49s
TTFT (answer)
0.49s
Benchmarks
MMLU Pro 71.3%
GPQA 49.8%
HLE 4.0%
AIME 30.0%
LiveCodeBench 28.8%
SciCode 26.0%
Math 500 77.3%
AA Indexes
Intelligence
15.1%
Coding
10.7%
Math
7.7%

Ling-mini-2.0

InclusionAI

Input: $0.070
Output: $0.280
Pricing
Input (1M)
$0.070
Output (1M)
$0.280
Blended (3:1)
$0.122
Performance
Tokens / s
156.9
TTFT (token)
2.12s
TTFT (answer)
2.12s
Benchmarks
MMLU Pro 67.1%
GPQA 56.2%
HLE 5.0%
AIME
LiveCodeBench 42.9%
SciCode 13.5%
Math 500
AA Indexes
Intelligence
15.1%
Coding
5.0%
Math
49.3%

GPT-4o (Nov '24)

OpenAI

Input: $2.500
Output: $10.000
Pricing
Input (1M)
$2.500
Output (1M)
$10.000
Blended (3:1)
$4.375
Performance
Tokens / s
116.4
TTFT (token)
0.46s
TTFT (answer)
0.46s
Benchmarks
MMLU Pro 74.8%
GPQA 54.3%
HLE 3.3%
AIME 15.0%
LiveCodeBench 30.9%
SciCode 33.3%
Math 500 75.9%
AA Indexes
Intelligence
14.8%
Coding
16.7%
Math
6.0%

Qwen3 VL 8B Instruct

Alibaba

Input: $0.180
Output: $0.700
Pricing
Input (1M)
$0.180
Output (1M)
$0.700
Blended (3:1)
$0.310
Performance
Tokens / s
98.3
TTFT (token)
0.90s
TTFT (answer)
0.90s
Benchmarks
MMLU Pro 68.6%
GPQA 42.7%
HLE 2.9%
AIME
LiveCodeBench 33.2%
SciCode 17.4%
Math 500
AA Indexes
Intelligence
14.7%
Coding
7.3%
Math
27.3%

Gemini 2.0 Flash-Lite (Feb '25)

Google

Input: $0.075
Output: $0.300
Pricing
Input (1M)
$0.075
Output (1M)
$0.300
Blended (3:1)
$0.131
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 72.4%
GPQA 53.5%
HLE 3.6%
AIME 27.7%
LiveCodeBench 18.5%
SciCode 25.0%
Math 500 87.3%
AA Indexes
Intelligence
14.7%
Coding
Math

Mistral Large 2 (Nov '24)

Mistral

Input: $2.000
Output: $6.000
Pricing
Input (1M)
$2.000
Output (1M)
$6.000
Blended (3:1)
$3.000
Performance
Tokens / s
45.9
TTFT (token)
0.44s
TTFT (answer)
0.44s
Benchmarks
MMLU Pro 69.7%
GPQA 48.6%
HLE 4.0%
AIME 11.0%
LiveCodeBench 29.3%
SciCode 29.2%
Math 500 73.6%
AA Indexes
Intelligence
14.7%
Coding
13.8%
Math
14.0%

Gemini 2.0 Flash-Lite (Preview)

Google

Input: $0.075
Output: $0.300
Pricing
Input (1M)
$0.075
Output (1M)
$0.300
Blended (3:1)
$0.131
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro
GPQA 54.2%
HLE 4.4%
AIME 30.3%
LiveCodeBench 17.9%
SciCode 24.7%
Math 500 87.3%
AA Indexes
Intelligence
14.5%
Coding
Math

GPT-4o (May '24)

OpenAI

Input: $5.000
Output: $15.000
Pricing
Input (1M)
$5.000
Output (1M)
$15.000
Blended (3:1)
$7.500
Performance
Tokens / s
78.4
TTFT (token)
0.51s
TTFT (answer)
0.51s
Benchmarks
MMLU Pro 74.0%
GPQA 52.6%
HLE 2.8%
AIME 11.0%
LiveCodeBench 33.4%
SciCode 30.9%
Math 500 79.1%
AA Indexes
Intelligence
14.5%
Coding
24.2%
Math

Qwen3 32B (Non-reasoning)

Alibaba

Input: $0.700
Output: $2.800
Pricing
Input (1M)
$0.700
Output (1M)
$2.800
Blended (3:1)
$1.225
Performance
Tokens / s
85.6
TTFT (token)
0.98s
TTFT (answer)
0.98s
Benchmarks
MMLU Pro 72.7%
GPQA 53.5%
HLE 4.3%
AIME 30.3%
LiveCodeBench 28.8%
SciCode 28.0%
Math 500 86.9%
AA Indexes
Intelligence
14.5%
Coding
Math
19.7%

Phi-3 Medium Instruct 14B

Microsoft Azure

Input: $0.170
Output: $0.680
Pricing
Input (1M)
$0.170
Output (1M)
$0.680
Blended (3:1)
$0.297
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 54.3%
GPQA 32.6%
HLE 4.5%
AIME 1.3%
LiveCodeBench 15.0%
SciCode 11.8%
Math 500 46.3%
AA Indexes
Intelligence
14.4%
Coding
8.9%
Math
1.3%

Reka Flash 3

Reka AI

Input: $0.200
Output: $0.800
Pricing
Input (1M)
$0.200
Output (1M)
$0.800
Blended (3:1)
$0.350
Performance
Tokens / s
49.3
TTFT (token)
1.31s
TTFT (answer)
41.91s
Benchmarks
MMLU Pro 66.9%
GPQA 52.9%
HLE 5.1%
AIME 51.0%
LiveCodeBench 43.5%
SciCode 26.7%
Math 500 89.3%
AA Indexes
Intelligence
14.3%
Coding
8.9%
Math
33.7%

Claude 3.5 Sonnet (June '24)

Anthropic

Input: $3.000
Output: $15.000
Pricing
Input (1M)
$3.000
Output (1M)
$15.000
Blended (3:1)
$6.000
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 75.1%
GPQA 56.0%
HLE 3.7%
AIME 9.7%
LiveCodeBench
SciCode 31.6%
Math 500 69.5%
AA Indexes
Intelligence
14.2%
Coding
26.0%
Math

Qwen3 4B (Reasoning)

Alibaba

Input: $0.110
Output: $1.260
Pricing
Input (1M)
$0.110
Output (1M)
$1.260
Blended (3:1)
$0.398
Performance
Tokens / s
91.0
TTFT (token)
0.96s
TTFT (answer)
22.94s
Benchmarks
MMLU Pro 69.6%
GPQA 52.2%
HLE 5.1%
AIME 65.7%
LiveCodeBench 46.5%
SciCode 3.5%
Math 500 93.3%
AA Indexes
Intelligence
14.2%
Coding
Math
22.3%

Llama 3.1 Nemotron Instruct 70B

NVIDIA

Input: $1.200
Output: $1.200
Pricing
Input (1M)
$1.200
Output (1M)
$1.200
Blended (3:1)
$1.200
Performance
Tokens / s
39.6
TTFT (token)
0.34s
TTFT (answer)
0.34s
Benchmarks
MMLU Pro 69.0%
GPQA 46.5%
HLE 4.6%
AIME 24.7%
LiveCodeBench 16.9%
SciCode 23.3%
Math 500 73.3%
AA Indexes
Intelligence
14.1%
Coding
10.8%
Math
11.0%

GPT-4o (ChatGPT)

OpenAI

Input: $5.000
Output: $15.000
Pricing
Input (1M)
$5.000
Output (1M)
$15.000
Blended (3:1)
$7.500
Performance
Tokens / s
202.7
TTFT (token)
0.54s
TTFT (answer)
0.54s
Benchmarks
MMLU Pro 77.3%
GPQA 51.1%
HLE 3.7%
AIME 10.3%
LiveCodeBench
SciCode 33.4%
Math 500 79.7%
AA Indexes
Intelligence
14.1%
Coding
Math

GPT-5 nano (minimal)

OpenAI

Input: $0.050
Output: $0.400
Pricing
Input (1M)
$0.050
Output (1M)
$0.400
Blended (3:1)
$0.138
Performance
Tokens / s
116.8
TTFT (token)
0.64s
TTFT (answer)
0.64s
Benchmarks
MMLU Pro 55.6%
GPQA 42.8%
HLE 4.1%
AIME
LiveCodeBench 47.0%
SciCode 29.1%
Math 500
AA Indexes
Intelligence
14.1%
Coding
14.2%
Math
27.3%

Llama 4 Scout

Meta

Input: $0.180
Output: $0.625
Pricing
Input (1M)
$0.180
Output (1M)
$0.625
Blended (3:1)
$0.287
Performance
Tokens / s
141.7
TTFT (token)
0.51s
TTFT (answer)
0.51s
Benchmarks
MMLU Pro 75.2%
GPQA 58.7%
HLE 4.3%
AIME 28.3%
LiveCodeBench 29.9%
SciCode 17.0%
Math 500 84.4%
AA Indexes
Intelligence
14.0%
Coding
6.7%
Math
14.0%

NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning)

NVIDIA

Input: $0.060
Output: $0.240
Pricing
Input (1M)
$0.060
Output (1M)
$0.240
Blended (3:1)
$0.105
Performance
Tokens / s
174.9
TTFT (token)
0.23s
TTFT (answer)
0.23s
Benchmarks
MMLU Pro 57.9%
GPQA 39.9%
HLE 4.6%
AIME
LiveCodeBench 36.0%
SciCode 23.0%
Math 500
AA Indexes
Intelligence
14.0%
Coding
15.8%
Math
13.3%

Mistral Small 3.1

Mistral

Input: $0.100
Output: $0.300
Pricing
Input (1M)
$0.100
Output (1M)
$0.300
Blended (3:1)
$0.150
Performance
Tokens / s
106.0
TTFT (token)
0.29s
TTFT (answer)
0.29s
Benchmarks
MMLU Pro 65.9%
GPQA 45.4%
HLE 4.8%
AIME 9.3%
LiveCodeBench 21.2%
SciCode 26.5%
Math 500 70.7%
AA Indexes
Intelligence
14.0%
Coding
13.9%
Math
3.7%

Pixtral Large

Mistral

Input: $2.000
Output: $6.000
Pricing
Input (1M)
$2.000
Output (1M)
$6.000
Blended (3:1)
$3.000
Performance
Tokens / s
30.6
TTFT (token)
0.62s
TTFT (answer)
0.62s
Benchmarks
MMLU Pro 70.1%
GPQA 50.5%
HLE 3.6%
AIME 7.0%
LiveCodeBench 26.1%
SciCode 29.2%
Math 500 71.4%
AA Indexes
Intelligence
14.0%
Coding
Math
2.3%

NVIDIA Nemotron Nano 9B V2 (Non-reasoning)

NVIDIA

Input: $0.060
Output: $0.230
Pricing
Input (1M)
$0.060
Output (1M)
$0.230
Blended (3:1)
$0.102
Performance
Tokens / s
111.8
TTFT (token)
0.54s
TTFT (answer)
0.54s
Benchmarks
MMLU Pro 73.9%
GPQA 55.7%
HLE 4.0%
AIME
LiveCodeBench 70.1%
SciCode 20.9%
Math 500
AA Indexes
Intelligence
13.8%
Coding
7.5%
Math
62.3%

Command A

Cohere

Input: $2.500
Output: $10.000
Pricing
Input (1M)
$2.500
Output (1M)
$10.000
Blended (3:1)
$4.375
Performance
Tokens / s
52.8
TTFT (token)
0.29s
TTFT (answer)
0.29s
Benchmarks
MMLU Pro 71.2%
GPQA 52.7%
HLE 4.6%
AIME 9.7%
LiveCodeBench 28.7%
SciCode 28.1%
Math 500 81.9%
AA Indexes
Intelligence
13.7%
Coding
9.9%
Math
13.0%

GPT-4 Turbo

OpenAI

Input: $10.000
Output: $30.000
Pricing
Input (1M)
$10.000
Output (1M)
$30.000
Blended (3:1)
$15.000
Performance
Tokens / s
27.0
TTFT (token)
1.07s
TTFT (answer)
1.07s
Benchmarks
MMLU Pro 69.4%
GPQA
HLE 3.3%
AIME 15.0%
LiveCodeBench 29.1%
SciCode 31.9%
Math 500 73.7%
AA Indexes
Intelligence
13.7%
Coding
21.5%
Math

Hermes 4 - Llama-3.1 70B (Non-reasoning)

Nous Research

Input: $0.130
Output: $0.400
Pricing
Input (1M)
$0.130
Output (1M)
$0.400
Blended (3:1)
$0.198
Performance
Tokens / s
73.0
TTFT (token)
0.56s
TTFT (answer)
0.56s
Benchmarks
MMLU Pro 66.4%
GPQA 49.1%
HLE 3.6%
AIME
LiveCodeBench 26.9%
SciCode 27.7%
Math 500
AA Indexes
Intelligence
13.6%
Coding
9.2%
Math
11.3%

Aya Expanse 32B

Cohere

Input: $0.500
Output: $1.500
Pricing
Input (1M)
$0.500
Output (1M)
$1.500
Blended (3:1)
$0.750
Performance
Tokens / s
41.9
TTFT (token)
0.30s
TTFT (answer)
0.30s
Benchmarks
MMLU Pro 37.7%
GPQA 23.0%
HLE 4.5%
AIME
LiveCodeBench 13.7%
SciCode 14.9%
Math 500 44.9%
AA Indexes
Intelligence
13.6%
Coding
9.8%
Math
2.3%

Nova Pro

Amazon

Input: $0.800
Output: $3.200
Pricing
Input (1M)
$0.800
Output (1M)
$3.200
Blended (3:1)
$1.400
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 69.1%
GPQA 49.9%
HLE 3.4%
AIME 10.7%
LiveCodeBench 23.3%
SciCode 20.8%
Math 500 78.6%
AA Indexes
Intelligence
13.5%
Coding
11.0%
Math
7.0%

Qwen3 14B (Non-reasoning)

Alibaba

Input: $0.350
Output: $1.400
Pricing
Input (1M)
$0.350
Output (1M)
$1.400
Blended (3:1)
$0.613
Performance
Tokens / s
54.8
TTFT (token)
1.15s
TTFT (answer)
1.15s
Benchmarks
MMLU Pro 67.5%
GPQA 47.0%
HLE 4.2%
AIME 28.0%
LiveCodeBench 28.0%
SciCode 26.5%
Math 500 87.1%
AA Indexes
Intelligence
13.4%
Coding
12.4%
Math
58.0%

GPT-4.1 nano

OpenAI

Input: $0.100
Output: $0.400
Pricing
Input (1M)
$0.100
Output (1M)
$0.400
Blended (3:1)
$0.175
Performance
Tokens / s
130.1
TTFT (token)
0.38s
TTFT (answer)
0.38s
Benchmarks
MMLU Pro 65.7%
GPQA 51.2%
HLE 3.9%
AIME 23.7%
LiveCodeBench 32.6%
SciCode 25.9%
Math 500 84.8%
AA Indexes
Intelligence
13.4%
Coding
11.2%
Math
24.0%

Phi-4

Microsoft Azure

Input: $0.125
Output: $0.500
Pricing
Input (1M)
$0.125
Output (1M)
$0.500
Blended (3:1)
$0.219
Performance
Tokens / s
10.9
TTFT (token)
0.58s
TTFT (answer)
0.58s
Benchmarks
MMLU Pro 71.4%
GPQA 57.5%
HLE 4.1%
AIME 14.3%
LiveCodeBench 23.1%
SciCode 26.0%
Math 500 81.0%
AA Indexes
Intelligence
13.2%
Coding
11.2%
Math
18.0%

GLM-4.5V (Non-reasoning)

Z AI

Input: $0.600
Output: $1.800
Pricing
Input (1M)
$0.600
Output (1M)
$1.800
Blended (3:1)
$0.900
Performance
Tokens / s
28.3
TTFT (token)
0.76s
TTFT (answer)
0.76s
Benchmarks
MMLU Pro 75.1%
GPQA 57.3%
HLE 3.6%
AIME
LiveCodeBench 35.2%
SciCode 18.8%
Math 500
AA Indexes
Intelligence
13.2%
Coding
10.8%
Math
15.3%

Gemini 2.5 Flash-Lite (Non-reasoning)

Google

Input: $0.100
Output: $0.400
Pricing
Input (1M)
$0.100
Output (1M)
$0.400
Blended (3:1)
$0.175
Performance
Tokens / s
256.1
TTFT (token)
0.36s
TTFT (answer)
0.36s
Benchmarks
MMLU Pro 72.4%
GPQA 47.4%
HLE 3.7%
AIME 50.0%
LiveCodeBench 40.0%
SciCode 17.7%
Math 500 92.6%
AA Indexes
Intelligence
13.1%
Coding
7.4%
Math
35.3%

Llama 3.1 Instruct 70B

Meta

Input: $0.560
Output: $0.560
Pricing
Input (1M)
$0.560
Output (1M)
$0.560
Blended (3:1)
$0.560
Performance
Tokens / s
61.5
TTFT (token)
0.40s
TTFT (answer)
0.40s
Benchmarks
MMLU Pro 67.6%
GPQA 40.9%
HLE 4.6%
AIME 17.3%
LiveCodeBench 23.2%
SciCode 26.7%
Math 500 64.9%
AA Indexes
Intelligence
13.1%
Coding
10.9%
Math
4.0%

Qwen3 1.7B (Reasoning)

Alibaba

Input: $0.110
Output: $1.260
Pricing
Input (1M)
$0.110
Output (1M)
$1.260
Blended (3:1)
$0.398
Performance
Tokens / s
124.8
TTFT (token)
0.93s
TTFT (answer)
16.96s
Benchmarks
MMLU Pro 57.0%
GPQA 35.6%
HLE 4.8%
AIME 51.0%
LiveCodeBench 30.8%
SciCode 4.3%
Math 500 89.4%
AA Indexes
Intelligence
13.1%
Coding
1.4%
Math
38.7%

Mistral Large 2 (Jul '24)

Mistral

Input: $2.000
Output: $6.000
Pricing
Input (1M)
$2.000
Output (1M)
$6.000
Blended (3:1)
$3.000
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 68.3%
GPQA 47.2%
HLE 3.2%
AIME 9.3%
LiveCodeBench 26.7%
SciCode 27.1%
Math 500 71.4%
AA Indexes
Intelligence
13.0%
Coding
Math

Qwen2.5 Coder Instruct 32B

Alibaba

Input: $0.130
Output: $0.175
Pricing
Input (1M)
$0.130
Output (1M)
$0.175
Blended (3:1)
$0.141
Performance
Tokens / s
35.7
TTFT (token)
0.51s
TTFT (answer)
0.51s
Benchmarks
MMLU Pro 63.5%
GPQA 41.7%
HLE 3.8%
AIME 12.0%
LiveCodeBench 29.5%
SciCode 27.1%
Math 500 76.7%
AA Indexes
Intelligence
12.9%
Coding
Math

GPT-4

OpenAI

Input: $30.000
Output: $60.000
Pricing
Input (1M)
$30.000
Output (1M)
$60.000
Blended (3:1)
$37.500
Performance
Tokens / s
28.3
TTFT (token)
0.68s
TTFT (answer)
0.68s
Benchmarks
MMLU Pro
GPQA
HLE
AIME
LiveCodeBench
SciCode
Math 500
AA Indexes
Intelligence
12.8%
Coding
13.1%
Math

Mistral Small 3

Mistral

Input: $0.100
Output: $0.300
Pricing
Input (1M)
$0.100
Output (1M)
$0.300
Blended (3:1)
$0.150
Performance
Tokens / s
231.4
TTFT (token)
0.34s
TTFT (answer)
0.34s
Benchmarks
MMLU Pro 65.2%
GPQA 46.2%
HLE 4.1%
AIME 8.0%
LiveCodeBench 25.2%
SciCode 23.6%
Math 500 71.5%
AA Indexes
Intelligence
12.7%
Coding
Math
4.3%

GPT-4o mini

OpenAI

Input: $0.150
Output: $0.600
Pricing
Input (1M)
$0.150
Output (1M)
$0.600
Blended (3:1)
$0.263
Performance
Tokens / s
49.0
TTFT (token)
0.53s
TTFT (answer)
0.53s
Benchmarks
MMLU Pro 64.8%
GPQA 42.6%
HLE 4.0%
AIME 11.7%
LiveCodeBench 23.4%
SciCode 22.9%
Math 500 78.9%
AA Indexes
Intelligence
12.6%
Coding
Math
14.7%

Qwen3 4B (Non-reasoning)

Alibaba

Input: $0.110
Output: $0.420
Pricing
Input (1M)
$0.110
Output (1M)
$0.420
Blended (3:1)
$0.188
Performance
Tokens / s
85.1
TTFT (token)
0.94s
TTFT (answer)
0.94s
Benchmarks
MMLU Pro 58.6%
GPQA 39.8%
HLE 3.7%
AIME 21.3%
LiveCodeBench 23.3%
SciCode 16.7%
Math 500 84.3%
AA Indexes
Intelligence
12.5%
Coding
Math

Claude 3 Opus

Anthropic

Input: $15.000
Output: $75.000
Pricing
Input (1M)
$15.000
Output (1M)
$75.000
Blended (3:1)
$30.000
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 69.6%
GPQA 48.9%
HLE 3.1%
AIME 3.3%
LiveCodeBench 27.9%
SciCode 23.3%
Math 500 64.1%
AA Indexes
Intelligence
12.5%
Coding
19.5%
Math

Nova Lite

Amazon

Input: $0.060
Output: $0.240
Pricing
Input (1M)
$0.060
Output (1M)
$0.240
Blended (3:1)
$0.105
Performance
Tokens / s
238.3
TTFT (token)
0.38s
TTFT (answer)
0.38s
Benchmarks
MMLU Pro 59.0%
GPQA 43.3%
HLE 4.6%
AIME 10.7%
LiveCodeBench 16.7%
SciCode 13.9%
Math 500 76.5%
AA Indexes
Intelligence
12.5%
Coding
5.1%
Math
7.0%

Qwen3 30B A3B (Non-reasoning)

Alibaba

Input: $0.200
Output: $0.800
Pricing
Input (1M)
$0.200
Output (1M)
$0.800
Blended (3:1)
$0.350
Performance
Tokens / s
61.7
TTFT (token)
1.07s
TTFT (answer)
1.07s
Benchmarks
MMLU Pro 71.0%
GPQA 51.5%
HLE 4.6%
AIME 26.0%
LiveCodeBench 32.2%
SciCode 26.4%
Math 500 86.3%
AA Indexes
Intelligence
12.4%
Coding
13.3%
Math
21.7%

Ministral 8B

Mistral

Input: $0.100
Output: $0.100
Pricing
Input (1M)
$0.100
Output (1M)
$0.100
Blended (3:1)
$0.100
Performance
Tokens / s
195.9
TTFT (token)
0.34s
TTFT (answer)
0.34s
Benchmarks
MMLU Pro 38.9%
GPQA 27.6%
HLE 4.9%
AIME 3.7%
LiveCodeBench 11.2%
SciCode 11.5%
Math 500 57.1%
AA Indexes
Intelligence
12.4%
Coding
7.6%
Math
3.0%

Llama 3.1 Instruct 8B

Meta

Input: $0.100
Output: $0.100
Pricing
Input (1M)
$0.100
Output (1M)
$0.100
Blended (3:1)
$0.100
Performance
Tokens / s
162.8
TTFT (token)
0.36s
TTFT (answer)
0.36s
Benchmarks
MMLU Pro 47.6%
GPQA 25.9%
HLE 5.1%
AIME 7.7%
LiveCodeBench 11.6%
SciCode 13.2%
Math 500 51.9%
AA Indexes
Intelligence
12.2%
Coding
4.9%
Math
4.3%

Ministral 3 3B

Mistral

Input: $0.100
Output: $0.100
Pricing
Input (1M)
$0.100
Output (1M)
$0.100
Blended (3:1)
$0.100
Performance
Tokens / s
293.0
TTFT (token)
0.27s
TTFT (answer)
0.27s
Benchmarks
MMLU Pro 52.4%
GPQA 35.8%
HLE 5.3%
AIME
LiveCodeBench 24.7%
SciCode 14.4%
Math 500
AA Indexes
Intelligence
12.1%
Coding
4.8%
Math
22.0%

Olmo 3.1 32B Instruct

Allen Institute for AI

Input: $0.200
Output: $0.600
Pricing
Input (1M)
$0.200
Output (1M)
$0.600
Blended (3:1)
$0.300
Performance
Tokens / s
47.6
TTFT (token)
0.26s
TTFT (answer)
0.26s
Benchmarks
MMLU Pro
GPQA 53.9%
HLE 4.9%
AIME
LiveCodeBench
SciCode 16.7%
Math 500
AA Indexes
Intelligence
12.0%
Coding
5.6%
Math

Reka Flash (Sep '24)

Reka AI

Input: $0.200
Output: $0.800
Pricing
Input (1M)
$0.200
Output (1M)
$0.800
Blended (3:1)
$0.350
Performance
Tokens / s
71.1
TTFT (token)
1.28s
TTFT (answer)
1.28s
Benchmarks
MMLU Pro
GPQA
HLE
AIME
LiveCodeBench
SciCode
Math 500 52.9%
AA Indexes
Intelligence
12.0%
Coding
Math

EXAONE 4.0 32B (Non-reasoning)

LG AI Research

Input: $0.600
Output: $1.000
Pricing
Input (1M)
$0.600
Output (1M)
$1.000
Blended (3:1)
$0.700
Performance
Tokens / s
88.3
TTFT (token)
0.31s
TTFT (answer)
0.31s
Benchmarks
MMLU Pro 76.8%
GPQA 62.8%
HLE 4.9%
AIME 47.0%
LiveCodeBench 47.2%
SciCode 25.2%
Math 500 93.9%
AA Indexes
Intelligence
12.0%
Coding
9.4%
Math
39.3%

Qwen2.5 Turbo

Alibaba

Input: $0.050
Output: $0.200
Pricing
Input (1M)
$0.050
Output (1M)
$0.200
Blended (3:1)
$0.087
Performance
Tokens / s
66.4
TTFT (token)
1.02s
TTFT (answer)
1.02s
Benchmarks
MMLU Pro 63.3%
GPQA 41.0%
HLE 4.2%
AIME 12.0%
LiveCodeBench 16.3%
SciCode 15.3%
Math 500 80.5%
AA Indexes
Intelligence
12.0%
Coding
Math

Llama 3.2 Instruct 90B (Vision)

Meta

Input: $0.720
Output: $0.720
Pricing
Input (1M)
$0.720
Output (1M)
$0.720
Blended (3:1)
$0.720
Performance
Tokens / s
36.6
TTFT (token)
0.35s
TTFT (answer)
0.35s
Benchmarks
MMLU Pro 67.1%
GPQA 43.2%
HLE 4.9%
AIME 5.0%
LiveCodeBench 21.4%
SciCode 24.0%
Math 500 62.9%
AA Indexes
Intelligence
11.9%
Coding
Math

Solar Mini

Upstage

Input: $0.150
Output: $0.150
Pricing
Input (1M)
$0.150
Output (1M)
$0.150
Blended (3:1)
$0.150
Performance
Tokens / s
84.5
TTFT (token)
1.01s
TTFT (answer)
1.01s
Benchmarks
MMLU Pro
GPQA
HLE
AIME
LiveCodeBench
SciCode
Math 500 33.1%
AA Indexes
Intelligence
11.9%
Coding
Math

Granite 4.0 H Small

IBM

Input: $0.060
Output: $0.250
Pricing
Input (1M)
$0.060
Output (1M)
$0.250
Blended (3:1)
$0.107
Performance
Tokens / s
370.8
TTFT (token)
8.80s
TTFT (answer)
8.80s
Benchmarks
MMLU Pro 62.4%
GPQA 41.6%
HLE 3.7%
AIME
LiveCodeBench 25.1%
SciCode 20.9%
Math 500
AA Indexes
Intelligence
11.5%
Coding
8.5%
Math
13.7%

CompactifAI Llama 3.1 8B Slim

Multiverse Computing

Input: $0.050
Output: $0.070
Pricing
Input (1M)
$0.050
Output (1M)
$0.070
Blended (3:1)
$0.055
Performance
Tokens / s
228.4
TTFT (token)
0.24s
TTFT (answer)
0.24s
Benchmarks
MMLU Pro 32.1%
GPQA 22.1%
HLE 5.5%
AIME
LiveCodeBench 11.3%
SciCode 6.2%
Math 500
AA Indexes
Intelligence
11.2%
Coding
5.9%
Math
6.7%

Llama 3.2 Instruct 11B (Vision)

Meta

Input: $0.160
Output: $0.160
Pricing
Input (1M)
$0.160
Output (1M)
$0.160
Blended (3:1)
$0.160
Performance
Tokens / s
48.8
TTFT (token)
0.41s
TTFT (answer)
0.41s
Benchmarks
MMLU Pro 46.4%
GPQA 22.1%
HLE 5.2%
AIME 9.3%
LiveCodeBench 11.0%
SciCode 11.2%
Math 500 51.6%
AA Indexes
Intelligence
10.9%
Coding
4.3%
Math
1.7%

Ministral 3B

Mistral

Input: $0.040
Output: $0.040
Pricing
Input (1M)
$0.040
Output (1M)
$0.040
Blended (3:1)
$0.040
Performance
Tokens / s
272.6
TTFT (token)
0.33s
TTFT (answer)
0.33s
Benchmarks
MMLU Pro 33.9%
GPQA 26.0%
HLE 5.5%
AIME
LiveCodeBench 6.9%
SciCode 9.4%
Math 500 53.7%
AA Indexes
Intelligence
10.9%
Coding
5.4%
Math
30.0%

Granite 3.3 8B (Non-reasoning)

IBM

Input: $0.030
Output: $0.250
Pricing
Input (1M)
$0.030
Output (1M)
$0.250
Blended (3:1)
$0.085
Performance
Tokens / s
488.2
TTFT (token)
7.57s
TTFT (answer)
7.57s
Benchmarks
MMLU Pro 46.8%
GPQA 33.8%
HLE 4.2%
AIME 4.7%
LiveCodeBench 12.7%
SciCode 10.1%
Math 500 66.5%
AA Indexes
Intelligence
10.8%
Coding
3.4%
Math
6.7%

Qwen3 8B (Non-reasoning)

Alibaba

Input: $0.180
Output: $0.700
Pricing
Input (1M)
$0.180
Output (1M)
$0.700
Blended (3:1)
$0.310
Performance
Tokens / s
79.7
TTFT (token)
0.92s
TTFT (answer)
0.92s
Benchmarks
MMLU Pro 64.3%
GPQA 45.2%
HLE 2.8%
AIME 24.3%
LiveCodeBench 20.2%
SciCode 16.8%
Math 500 82.8%
AA Indexes
Intelligence
10.8%
Coding
7.1%
Math
24.3%

Qwen3 Omni 30B A3B Instruct

Alibaba

Input: $0.250
Output: $0.970
Pricing
Input (1M)
$0.250
Output (1M)
$0.970
Blended (3:1)
$0.430
Performance
Tokens / s
88.3
TTFT (token)
0.91s
TTFT (answer)
0.91s
Benchmarks
MMLU Pro 72.5%
GPQA 62.0%
HLE 5.1%
AIME
LiveCodeBench 42.2%
SciCode 18.6%
Math 500
AA Indexes
Intelligence
10.7%
Coding
7.2%
Math
52.3%

Jamba 1.5 Large

AI21 Labs

Input: $2.000
Output: $8.000
Pricing
Input (1M)
$2.000
Output (1M)
$8.000
Blended (3:1)
$3.500
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 57.2%
GPQA 42.7%
HLE 4.0%
AIME 4.7%
LiveCodeBench 14.3%
SciCode 16.3%
Math 500 60.6%
AA Indexes
Intelligence
10.7%
Coding
Math

Jamba 1.6 Large

AI21 Labs

Input: $2.000
Output: $8.000
Pricing
Input (1M)
$2.000
Output (1M)
$8.000
Blended (3:1)
$3.500
Performance
Tokens / s
44.9
TTFT (token)
0.82s
TTFT (answer)
0.82s
Benchmarks
MMLU Pro 56.5%
GPQA 38.7%
HLE 4.0%
AIME 4.7%
LiveCodeBench 17.2%
SciCode 18.4%
Math 500 58.0%
AA Indexes
Intelligence
10.6%
Coding
Math

Hermes 3 - Llama-3.1 70B

Nous Research

Input: $0.300
Output: $0.300
Pricing
Input (1M)
$0.300
Output (1M)
$0.300
Blended (3:1)
$0.300
Performance
Tokens / s
41.3
TTFT (token)
0.29s
TTFT (answer)
0.29s
Benchmarks
MMLU Pro 57.1%
GPQA 40.1%
HLE 4.1%
AIME 2.3%
LiveCodeBench 18.8%
SciCode 23.1%
Math 500 53.8%
AA Indexes
Intelligence
10.6%
Coding
Math

NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning)

NVIDIA

Input: $0.200
Output: $0.600
Pricing
Input (1M)
$0.200
Output (1M)
$0.600
Blended (3:1)
$0.300
Performance
Tokens / s
131.2
TTFT (token)
0.56s
TTFT (answer)
0.56s
Benchmarks
MMLU Pro 64.9%
GPQA 43.9%
HLE 4.5%
AIME
LiveCodeBench 34.5%
SciCode 17.6%
Math 500
AA Indexes
Intelligence
10.4%
Coding
5.9%
Math
26.7%

Nova Micro

Amazon

Input: $0.035
Output: $0.140
Pricing
Input (1M)
$0.035
Output (1M)
$0.140
Blended (3:1)
$0.061
Performance
Tokens / s
421.4
TTFT (token)
0.36s
TTFT (answer)
0.36s
Benchmarks
MMLU Pro 53.1%
GPQA 35.8%
HLE 4.7%
AIME 8.0%
LiveCodeBench 14.0%
SciCode 9.4%
Math 500 70.3%
AA Indexes
Intelligence
10.3%
Coding
4.1%
Math
6.0%

Claude 3 Sonnet

Anthropic

Input: $3.000
Output: $15.000
Pricing
Input (1M)
$3.000
Output (1M)
$15.000
Blended (3:1)
$6.000
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 57.9%
GPQA 40.0%
HLE 3.8%
AIME 4.7%
LiveCodeBench 17.5%
SciCode 22.9%
Math 500 41.4%
AA Indexes
Intelligence
10.3%
Coding
Math

Mistral Small (Sep '24)

Mistral

Input: $0.200
Output: $0.600
Pricing
Input (1M)
$0.200
Output (1M)
$0.600
Blended (3:1)
$0.300
Performance
Tokens / s
110.9
TTFT (token)
0.30s
TTFT (answer)
0.30s
Benchmarks
MMLU Pro 52.9%
GPQA 38.1%
HLE 4.3%
AIME 6.3%
LiveCodeBench 14.1%
SciCode 15.6%
Math 500 56.3%
AA Indexes
Intelligence
10.2%
Coding
Math

Llama 3 Instruct 70B

Meta

Input: $0.510
Output: $0.740
Pricing
Input (1M)
$0.510
Output (1M)
$0.740
Blended (3:1)
$0.568
Performance
Tokens / s
38.0
TTFT (token)
0.43s
TTFT (answer)
0.43s
Benchmarks
MMLU Pro 57.4%
GPQA 37.9%
HLE 4.4%
AIME
LiveCodeBench 19.8%
SciCode 18.9%
Math 500 48.3%
AA Indexes
Intelligence
10.2%
Coding
6.8%
Math

Phi-3 Mini Instruct 3.8B

Microsoft Azure

Input: $0.130
Output: $0.520
Pricing
Input (1M)
$0.130
Output (1M)
$0.520
Blended (3:1)
$0.228
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 43.5%
GPQA 31.9%
HLE 4.4%
AIME 4.0%
LiveCodeBench 11.6%
SciCode 9.0%
Math 500 45.7%
AA Indexes
Intelligence
10.1%
Coding
3.0%
Math
30.0%

Aya Expanse 8B

Cohere

Input: $0.500
Output: $1.500
Pricing
Input (1M)
$0.500
Output (1M)
$1.500
Blended (3:1)
$0.750
Performance
Tokens / s
79.5
TTFT (token)
0.26s
TTFT (answer)
0.26s
Benchmarks
MMLU Pro 31.2%
GPQA 24.7%
HLE 5.1%
AIME
LiveCodeBench 7.0%
SciCode 7.8%
Math 500 32.1%
AA Indexes
Intelligence
10.0%
Coding
4.9%
Math

Mistral Large (Feb '24)

Mistral

Input: $4.000
Output: $12.000
Pricing
Input (1M)
$4.000
Output (1M)
$12.000
Blended (3:1)
$6.000
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 51.5%
GPQA 35.1%
HLE 3.4%
AIME
LiveCodeBench 17.8%
SciCode 20.8%
Math 500 52.7%
AA Indexes
Intelligence
9.9%
Coding
Math

Llama 3.2 Instruct 3B

Meta

Input: $0.060
Output: $0.060
Pricing
Input (1M)
$0.060
Output (1M)
$0.060
Blended (3:1)
$0.060
Performance
Tokens / s
46.5
TTFT (token)
0.50s
TTFT (answer)
0.50s
Benchmarks
MMLU Pro 34.7%
GPQA 25.5%
HLE 5.2%
AIME 6.7%
LiveCodeBench 8.3%
SciCode 5.2%
Math 500 48.9%
AA Indexes
Intelligence
9.7%
Coding
Math
3.3%

Llama 2 Chat 7B

Meta

Input: $0.050
Output: $0.250
Pricing
Input (1M)
$0.050
Output (1M)
$0.250
Blended (3:1)
$0.100
Performance
Tokens / s
109.5
TTFT (token)
0.54s
TTFT (answer)
0.54s
Benchmarks
MMLU Pro 16.4%
GPQA 22.7%
HLE 5.8%
AIME
LiveCodeBench 0.2%
SciCode
Math 500 5.9%
AA Indexes
Intelligence
9.7%
Coding
Math

Jamba 1.7 Large

AI21 Labs

Input: $2.000
Output: $8.000
Pricing
Input (1M)
$2.000
Output (1M)
$8.000
Blended (3:1)
$3.500
Performance
Tokens / s
41.9
TTFT (token)
0.77s
TTFT (answer)
0.77s
Benchmarks
MMLU Pro 57.7%
GPQA 39.0%
HLE 3.8%
AIME 5.7%
LiveCodeBench 18.1%
SciCode 18.8%
Math 500 60.0%
AA Indexes
Intelligence
9.7%
Coding
7.8%
Math
2.3%

Claude 3 Haiku

Anthropic

Input: $0.250
Output: $1.250
Pricing
Input (1M)
$0.250
Output (1M)
$1.250
Blended (3:1)
$0.500
Performance
Tokens / s
111.8
TTFT (token)
0.34s
TTFT (answer)
0.34s
Benchmarks
MMLU Pro
GPQA
HLE
AIME 1.0%
LiveCodeBench 15.4%
SciCode 18.6%
Math 500 39.4%
AA Indexes
Intelligence
9.3%
Coding
Math

Llama 3.2 Instruct 1B

Meta

Input: $0.053
Output: $0.055
Pricing
Input (1M)
$0.053
Output (1M)
$0.055
Blended (3:1)
$0.053
Performance
Tokens / s
74.5
TTFT (token)
0.55s
TTFT (answer)
0.55s
Benchmarks
MMLU Pro 20.0%
GPQA 19.6%
HLE 5.3%
AIME
LiveCodeBench 1.9%
SciCode 1.7%
Math 500 14.0%
AA Indexes
Intelligence
9.1%
Coding
60.0%
Math

Mistral Small (Feb '24)

Mistral

Input: $1.000
Output: $3.000
Pricing
Input (1M)
$1.000
Output (1M)
$3.000
Blended (3:1)
$1.500
Performance
Tokens / s
108.1
TTFT (token)
0.29s
TTFT (answer)
0.29s
Benchmarks
MMLU Pro 41.9%
GPQA 30.2%
HLE 4.4%
AIME 0.7%
LiveCodeBench 11.1%
SciCode 13.4%
Math 500 56.2%
AA Indexes
Intelligence
9.0%
Coding
Math

GPT-3.5 Turbo

OpenAI

Input: $0.500
Output: $1.500
Pricing
Input (1M)
$0.500
Output (1M)
$1.500
Blended (3:1)
$0.750
Performance
Tokens / s
112.3
TTFT (token)
0.43s
TTFT (answer)
0.43s
Benchmarks
MMLU Pro 46.2%
GPQA 29.7%
HLE
AIME
LiveCodeBench
SciCode
Math 500 44.1%
AA Indexes
Intelligence
9.0%
Coding
10.7%
Math

Mistral Medium

Mistral

Input: $2.750
Output: $8.100
Pricing
Input (1M)
$2.750
Output (1M)
$8.100
Blended (3:1)
$4.088
Performance
Tokens / s
92.0
TTFT (token)
0.40s
TTFT (answer)
0.40s
Benchmarks
MMLU Pro 49.1%
GPQA 34.9%
HLE 3.4%
AIME 3.7%
LiveCodeBench 9.9%
SciCode 11.8%
Math 500 40.5%
AA Indexes
Intelligence
9.0%
Coding
Math

Pixtral 12B (2409)

Mistral

Input: $0.150
Output: $0.150
Pricing
Input (1M)
$0.150
Output (1M)
$0.150
Blended (3:1)
$0.150
Performance
Tokens / s
136.6
TTFT (token)
0.43s
TTFT (answer)
0.43s
Benchmarks
MMLU Pro 47.3%
GPQA 34.3%
HLE 5.3%
AIME
LiveCodeBench 11.5%
SciCode 13.5%
Math 500 45.8%
AA Indexes
Intelligence
8.9%
Coding
Math

Llama 3 Instruct 8B

Meta

Input: $0.045
Output: $0.155
Pricing
Input (1M)
$0.045
Output (1M)
$0.155
Blended (3:1)
$0.070
Performance
Tokens / s
67.1
TTFT (token)
0.37s
TTFT (answer)
0.37s
Benchmarks
MMLU Pro 40.5%
GPQA 29.6%
HLE 5.1%
AIME
LiveCodeBench 9.6%
SciCode 11.9%
Math 500 49.9%
AA Indexes
Intelligence
8.7%
Coding
4.0%
Math

Command-R+ (Apr '24)

Cohere

Input: $3.000
Output: $15.000
Pricing
Input (1M)
$3.000
Output (1M)
$15.000
Blended (3:1)
$6.000
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 43.2%
GPQA 32.3%
HLE 4.5%
AIME 0.7%
LiveCodeBench 12.2%
SciCode 11.8%
Math 500 27.9%
AA Indexes
Intelligence
8.3%
Coding
Math

Olmo 3 7B Instruct

Allen Institute for AI

Input: $0.100
Output: $0.200
Pricing
Input (1M)
$0.100
Output (1M)
$0.200
Blended (3:1)
$0.125
Performance
Tokens / s
37.0
TTFT (token)
0.44s
TTFT (answer)
0.44s
Benchmarks
MMLU Pro 52.2%
GPQA 40.0%
HLE 5.8%
AIME
LiveCodeBench 26.6%
SciCode 10.3%
Math 500
AA Indexes
Intelligence
8.2%
Coding
3.4%
Math
41.3%

Jamba 1.5 Mini

AI21 Labs

Input: $0.200
Output: $0.400
Pricing
Input (1M)
$0.200
Output (1M)
$0.400
Blended (3:1)
$0.250
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 37.1%
GPQA 30.2%
HLE 5.1%
AIME 1.0%
LiveCodeBench 6.2%
SciCode 8.0%
Math 500 35.7%
AA Indexes
Intelligence
8.0%
Coding
Math

Jamba 1.6 Mini

AI21 Labs

Input: $0.200
Output: $0.400
Pricing
Input (1M)
$0.200
Output (1M)
$0.400
Blended (3:1)
$0.250
Performance
Tokens / s
127.1
TTFT (token)
0.63s
TTFT (answer)
0.63s
Benchmarks
MMLU Pro 36.7%
GPQA 30.0%
HLE 4.6%
AIME 3.3%
LiveCodeBench 7.1%
SciCode 10.1%
Math 500 25.7%
AA Indexes
Intelligence
7.9%
Coding
Math

Jamba 1.7 Mini

AI21 Labs

Input: $0.200
Output: $0.400
Pricing
Input (1M)
$0.200
Output (1M)
$0.400
Blended (3:1)
$0.250
Performance
Tokens / s
158.1
TTFT (token)
0.65s
TTFT (answer)
0.65s
Benchmarks
MMLU Pro 38.8%
GPQA 32.2%
HLE 4.5%
AIME 1.3%
LiveCodeBench 6.1%
SciCode 9.3%
Math 500 25.8%
AA Indexes
Intelligence
7.9%
Coding
3.1%
Math
30.0%

Gemma 2 9B

Google

Input: $0.030
Output: $0.090
Pricing
Input (1M)
$0.030
Output (1M)
$0.090
Blended (3:1)
$0.045
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 49.5%
GPQA 31.1%
HLE 3.9%
AIME
LiveCodeBench 12.6%
SciCode 0.7%
Math 500 51.7%
AA Indexes
Intelligence
7.8%
Coding
Math

Mixtral 8x7B Instruct

Mistral

Input: $0.540
Output: $0.600
Pricing
Input (1M)
$0.540
Output (1M)
$0.600
Blended (3:1)
$0.540
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 38.7%
GPQA 29.2%
HLE 4.5%
AIME
LiveCodeBench 6.6%
SciCode 2.8%
Math 500 29.9%
AA Indexes
Intelligence
7.7%
Coding
Math

Command-R (Mar '24)

Cohere

Input: $0.500
Output: $1.500
Pricing
Input (1M)
$0.500
Output (1M)
$1.500
Blended (3:1)
$0.750
Performance
Tokens / s
TTFT (token)
TTFT (answer)
Benchmarks
MMLU Pro 33.8%
GPQA 28.4%
HLE 4.8%
AIME 0.7%
LiveCodeBench 4.8%
SciCode 6.2%
Math 500 16.4%
AA Indexes
Intelligence
7.4%
Coding
Math

Mistral 7B Instruct

Mistral

Input: $0.250
Output: $0.250
Pricing
Input (1M)
$0.250
Output (1M)
$0.250
Blended (3:1)
$0.250
Performance
Tokens / s
125.6
TTFT (token)
0.34s
TTFT (answer)
0.34s
Benchmarks
MMLU Pro 24.5%
GPQA 17.7%
HLE 4.3%
AIME
LiveCodeBench 4.6%
SciCode 2.4%
Math 500 12.1%
AA Indexes
Intelligence
7.4%
Coding
Math

Command-R+ (Aug '24)

Cohere

Input: $2.500
Output: $10.000
Pricing
Input (1M)
$2.500
Output (1M)
$10.000
Blended (3:1)
$4.375
Performance
Tokens / s
21.0
TTFT (token)
0.56s
TTFT (answer)
0.56s
Benchmarks
MMLU Pro 42.7%
GPQA 33.7%
HLE 5.0%
AIME
LiveCodeBench 11.1%
SciCode 12.2%
Math 500 40.2%
AA Indexes
Intelligence
7.1%
Coding
Math

Qwen3 1.7B (Non-reasoning)

Alibaba

Input: $0.110
Output: $0.420
Pricing
Input (1M)
$0.110
Output (1M)
$0.420
Blended (3:1)
$0.188
Performance
Tokens / s
116.5
TTFT (token)
0.91s
TTFT (answer)
0.91s
Benchmarks
MMLU Pro 41.1%
GPQA 28.3%
HLE 5.2%
AIME 9.7%
LiveCodeBench 12.6%
SciCode 6.9%
Math 500 71.7%
AA Indexes
Intelligence
6.8%
Coding
2.3%
Math
7.3%

Qwen3 0.6B (Reasoning)

Alibaba

Input: $0.110
Output: $1.260
Pricing
Input (1M)
$0.110
Output (1M)
$1.260
Blended (3:1)
$0.398
Performance
Tokens / s
199.0
TTFT (token)
0.90s
TTFT (answer)
10.95s
Benchmarks
MMLU Pro 34.7%
GPQA 23.9%
HLE 5.7%
AIME 10.0%
LiveCodeBench 12.1%
SciCode 2.8%
Math 500 75.0%
AA Indexes
Intelligence
6.4%
Coding
0.9%
Math
18.0%

Gemma 3n E4B Instruct

Google

Input: $0.020
Output: $0.040
Pricing
Input (1M)
$0.020
Output (1M)
$0.040
Blended (3:1)
$0.025
Performance
Tokens / s
60.5
TTFT (token)
0.33s
TTFT (answer)
0.33s
Benchmarks
MMLU Pro 48.8%
GPQA 29.6%
HLE 4.4%
AIME 13.7%
LiveCodeBench 14.6%
SciCode 8.1%
Math 500 77.1%
AA Indexes
Intelligence
6.3%
Coding
4.2%
Math
14.3%

Qwen3 0.6B (Non-reasoning)

Alibaba

Input: $0.110
Output: $0.420
Pricing
Input (1M)
$0.110
Output (1M)
$0.420
Blended (3:1)
$0.188
Performance
Tokens / s
190.1
TTFT (token)
0.94s
TTFT (answer)
0.94s
Benchmarks
MMLU Pro 23.1%
GPQA 23.1%
HLE 5.2%
AIME 1.7%
LiveCodeBench 7.3%
SciCode 4.1%
Math 500 52.1%
AA Indexes
Intelligence
6.1%
Coding
1.4%
Math
10.3%

Mistral NeMo

Mistral

Input: $0.150
Output: $0.150
Pricing
Input (1M)
$0.150
Output (1M)
$0.150
Blended (3:1)
$0.150
Performance
Tokens / s
190.2
TTFT (token)
0.34s
TTFT (answer)
0.34s
Benchmarks
MMLU Pro 39.9%
GPQA 31.4%
HLE 4.4%
AIME 0.3%
LiveCodeBench 5.7%
SciCode 10.4%
Math 500 39.5%
AA Indexes
Intelligence
5.2%
Coding
Math

Command-R (Aug '24)

Cohere

Input: $0.150
Output: $0.600
Pricing
Input (1M)
$0.150
Output (1M)
$0.600
Blended (3:1)
$0.263
Performance
Tokens / s
59.8
TTFT (token)
0.24s
TTFT (answer)
0.24s
Benchmarks
MMLU Pro 33.7%
GPQA 28.9%
HLE 5.1%
AIME 0.3%
LiveCodeBench 4.4%
SciCode 8.7%
Math 500 14.9%
AA Indexes
Intelligence
1.0%
Coding
Math

Cogito v2.1 (Reasoning)

Deep Cogito

Input: $1.250
Output: $1.250
Pricing
Input (1M)
$1.250
Output (1M)
$1.250
Blended (3:1)
$1.250
Performance
Tokens / s
73.9
TTFT (token)
0.32s
TTFT (answer)
27.39s
Benchmarks
MMLU Pro 84.9%
GPQA 76.8%
HLE 11.0%
AIME
LiveCodeBench 68.8%
SciCode 41.0%
Math 500
AA Indexes
Intelligence
Coding
24.8%
Math
72.7%

DeepSeek-OCR

DeepSeek

Input: $0.030
Output: $0.100
Pricing
Input (1M)
$0.030
Output (1M)
$0.100
Blended (3:1)
$0.048
Performance
Tokens / s
306.8
TTFT (token)
0.37s
TTFT (answer)
0.37s
Benchmarks
MMLU Pro
GPQA
HLE
AIME
LiveCodeBench
SciCode
Math 500
AA Indexes
Intelligence
Coding
Math

Grok 3 mini Reasoning (Low)

xAI

Input: $0.300
Output: $0.500
Pricing
Input (1M)
$0.300
Output (1M)
$0.500
Blended (3:1)
$0.350
Performance
Tokens / s
110.8
TTFT (token)
0.56s
TTFT (answer)
18.62s
Benchmarks
MMLU Pro
GPQA
HLE
AIME
LiveCodeBench
SciCode
Math 500
AA Indexes
Intelligence
Coding
Math

Doubao-Seed-1.8

ByteDance Seed

Input: $0.110
Output: $0.280
Pricing
Input (1M)
$0.110
Output (1M)
$0.280
Blended (3:1)
$0.152
Performance
Tokens / s
43.1
TTFT (token)
2.34s
TTFT (answer)
48.71s
Benchmarks
MMLU Pro
GPQA
HLE
AIME
LiveCodeBench
SciCode
Math 500
AA Indexes
Intelligence
Coding
Math
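
How the Blended (3:1) figure relates to the input and output prices

The cards do not state how the blended price is derived. The sketch below assumes it is a weighted average of three parts input price to one part output price, which reproduces most of the cards above (for example Command-R+ (Aug '24): $2.500 input, $10.000 output, $4.375 blended). The function name and parameters are illustrative and not taken from the site. Not every card matches this assumption: Mixtral 8x7B Instruct lists $0.540 where this formula gives $0.555, so the upstream source may compute or round some entries differently.

```python
def blended_price(input_per_m: float, output_per_m: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average of input and output prices per 1M tokens.

    Assumption: "Blended (3:1)" means 3 parts input to 1 part output.
    This is inferred from the listed numbers, not documented on the page.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_per_m + output_weight * output_per_m) / total_weight


if __name__ == "__main__":
    # Command-R+ (Aug '24): $2.500 input, $10.000 output
    print(f"${blended_price(2.5, 10.0):.3f}")  # $4.375
    # Command-R+ (Apr '24): $3.000 input, $15.000 output
    print(f"${blended_price(3.0, 15.0):.3f}")  # $6.000
```

When input and output prices are equal, as with Mistral NeMo or Cogito v2.1, the blended price equals that shared price regardless of the weights.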