AI Stats
255 LLM models tracked with pricing & performance benchmarks.
Data sourced from ArtificialAnalysis.com & Epoch.ai
Leaderboard
Each group of five below is a top-5 ranking on a different tracked metric; a sketch of how such a ranking can be derived from the model data follows the lists.
- 1. Gemini 3 Pro Preview (high), Google: 89.8%
- 2. Claude Opus 4.5 (Reasoning), Anthropic: 89.5%
- 3. Gemini 3 Pro Preview (low), Google: 89.5%
- 4. Claude Opus 4.5 (Non-reasoning), Anthropic: 88.9%
- 5. Claude 4.1 Opus (Reasoning), Anthropic: 88.0%

- 1. Gemini 3 Pro Preview (high), Google: 90.8%
- 2. Gemini 3 Pro Preview (low), Google: 88.7%
- 3. Grok 4, xAI: 87.7%
- 4. GPT-5.1 (high), OpenAI: 87.3%
- 5. Claude Opus 4.5 (Reasoning), Anthropic: 86.6%

- 1. Gemini 3 Pro Preview (high), Google: 37.2%
- 2. Claude Opus 4.5 (Reasoning), Anthropic: 28.4%
- 3. Gemini 3 Pro Preview (low), Google: 27.6%
- 4. GPT-5.1 (high), OpenAI: 26.5%
- 5. GPT-5 (high), OpenAI: 26.5%

- 1. GPT-5 (high), OpenAI: 95.7%
- 2. Grok 4, xAI: 94.3%
- 3. o4-mini (high), OpenAI: 94.0%
- 4. Qwen3 235B A22B 2507 (Reasoning), Alibaba: 94.0%
- 5. Grok 3 mini Reasoning (high), xAI: 93.3%

- 1. Gemini 3 Pro Preview (high), Google: 56.1%
- 2. Gemini 3 Pro Preview (low), Google: 49.9%
- 3. Claude Opus 4.5 (Reasoning), Anthropic: 49.5%
- 4. Claude Opus 4.5 (Non-reasoning), Anthropic: 47.0%
- 5. o4-mini (high), OpenAI: 46.5%

- 1. GPT-5 (high), OpenAI: 99.4%
- 2. o3, OpenAI: 99.2%
- 3. Grok 3 mini Reasoning (high), xAI: 99.2%
- 4. GPT-5 (medium), OpenAI: 99.1%
- 5. Claude 4 Sonnet (Reasoning), Anthropic: 99.1%

- 1. Gemini 3 Pro Preview (high), Google: 91.7%
- 2. gpt-oss-120B (high), OpenAI: 87.8%
- 3. Claude Opus 4.5 (Reasoning), Anthropic: 87.1%
- 4. GPT-5.1 (high), OpenAI: 86.8%
- 5. o4-mini (high), OpenAI: 85.9%

- 1. GPT-5 Codex (high), OpenAI: 98.7%
- 2. Gemini 3 Pro Preview (high), Google: 95.7%
- 3. Kimi K2 Thinking, Moonshot AI: 94.7%
- 4. GPT-5 (high), OpenAI: 94.3%
- 5. GPT-5.1 (high), OpenAI: 94.0%

- 1. Gemini 3 Pro Preview (high), Google: 62.3%
- 2. Claude Opus 4.5 (Reasoning), Anthropic: 60.2%
- 3. GPT-5.1 (high), OpenAI: 57.5%
- 4. Gemini 3 Pro Preview (low), Google: 55.8%
- 5. Grok 4, xAI: 55.1%

- 1. Gemini 3 Pro Preview (high), Google: 72.8%
- 2. Claude Opus 4.5 (Reasoning), Anthropic: 69.8%
- 3. GPT-5.1 (high), OpenAI: 69.7%
- 4. GPT-5 Codex (high), OpenAI: 68.5%
- 5. GPT-5 (high), OpenAI: 68.5%

- 1. gemini 3 pro preview, Google DeepMind: 92.6%
- 2. gpt 5.1 high, OpenAI: 87.6%
- 3. gpt 5 2025 08 07 high, OpenAI: 86.2%
- 4. claude opus 4 5 20251101 32K, Anthropic: 86.1%
- 5. claude opus 4 5 20251101 16K, Anthropic: 85.5%

- 1. o3 2025 04 16 high, OpenAI: 31.8%
- 2. gpt 5.1 high, OpenAI: 31.2%
- 3. gemini 2.5 pro, Google DeepMind: 25.6%
- 4. o4 mini 2025 04 16 high, OpenAI: 21.4%
- 5. claude sonnet 4 5 20250929 32K, Anthropic: 18.5%

- 1. claude sonnet 4 20250514, Anthropic: 72.8%
- 2. claude 3 7 sonnet 20250219, Anthropic: 62.3%
- 3. gpt 5 mini 2025 08 07 high, OpenAI: 61.6%
- 4. claude haiku 4 5 20251001, Anthropic: 60.6%
- 5. DeepSeek V3 0324, DeepSeek: 54.8%

- 1. o3 2025 04 16 high, OpenAI: 96.8%
- 2. gpt 5.1 high, OpenAI: 95.8%
- 3. DeepSeek R1 0528, DeepSeek: 94.5%
- 4. gemini 2.5 pro, Google DeepMind: 91.2%
- 5. o3 mini 2025 01 31 high, OpenAI: 91.2%

- 1. o3 2025 04 16 high, OpenAI: 87.6%
- 2. gpt 5.1 high, OpenAI: 84.2%
- 3. o3 2025 04 16 medium, OpenAI: 75.2%
- 4. gemini 2.5 pro, Google DeepMind: 32.5%
- 5. claude sonnet 4 5 20250929 32K, Anthropic: 21.8%

- 1. gpt 5.1 high, OpenAI: 93.2%
- 2. gemini 3 pro preview, Google DeepMind: 92.8%
- 3. gpt 5 2025 08 07 high, OpenAI: 92.1%
- 4. gemini 2.5 pro, Google DeepMind: 91.5%
- 5. o3 2025 04 16 high, OpenAI: 90.8%

- 1. gpt 5.1 high, OpenAI: 97.8%
- 2. o3 2025 04 16 high, OpenAI: 97.5%
- 3. DeepSeek R1 0528, DeepSeek: 97.2%
- 4. gemini 2.5 pro, Google DeepMind: 96.8%
- 5. claude sonnet 4 5 20250929 32K, Anthropic: 96.5%

- 1. claude sonnet 4 5 20250929, Anthropic: 85.6%
- 2. gpt 5.1 high, OpenAI: 84.2%
- 3. gemini 2.5 pro, Google DeepMind: 82.1%
- 4. DeepSeek V3 0324, DeepSeek: 78.8%
- 5. gpt 4.1 2025 04 14, OpenAI: 72.4%
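Each of these top-5 lists is just a sort over the tracked model records. A minimal sketch of that operation in Python, assuming an illustrative list-of-dicts shape (the field names and the `top5` helper are assumptions for this example, not the site's actual schema; the scores come from the tables on this page):

```python
# Rank tracked models on one metric and keep the top 5.
# The record shape below is an assumption for illustration only.
models = [
    {"name": "Gemini 3 Pro Preview (high)", "provider": "Google", "MMLU": 89.8, "GPQA": 90.8},
    {"name": "Claude Opus 4.5 (Reasoning)", "provider": "Anthropic", "MMLU": 89.5, "GPQA": 86.6},
    {"name": "GPT-5.1 (high)", "provider": "OpenAI", "MMLU": 87.0, "GPQA": 87.3},
    {"name": "Grok 4", "provider": "xAI", "MMLU": 86.6, "GPQA": 87.7},
    # ... remaining tracked models
]

def top5(metric: str) -> list[dict]:
    """Return the five highest-scoring models on `metric`, skipping models with no score."""
    scored = [m for m in models if m.get(metric) is not None]
    return sorted(scored, key=lambda m: m[metric], reverse=True)[:5]

metric = "GPQA"
for rank, entry in enumerate(top5(metric), start=1):
    print(f"{rank}. {entry['name']}, {entry['provider']}: {entry[metric]:.1f}%")
```

Changing the metric argument reproduces a different one of the rankings above.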
All models
Pricing and benchmark scores for every tracked model; a worked cost example follows the table.
| Model | Provider | Input ($/1M tokens) | Output ($/1M tokens) | MMLU | GPQA | TPS (tokens/s) |
|---|---|---|---|---|---|---|
| Gemini 3 Pro Preview (high) | Google | $2.000 | $12.000 | 89.8% | 90.8% | - |
| Claude Opus 4.5 (Reasoning) | Anthropic | $5.000 | $25.000 | 89.5% | 86.6% | 44.7 |
| GPT-5.1 (high) | OpenAI | $1.250 | $10.000 | 87.0% | 87.3% | 147.0 |
| GPT-5 Codex (high) | OpenAI | $1.250 | $10.000 | 86.5% | 83.7% | 232.9 |
| GPT-5 (high) | OpenAI | $1.250 | $10.000 | 87.1% | 85.4% | - |
| Kimi K2 Thinking | Moonshot AI | $0.600 | $2.500 | 84.8% | 83.8% | 89.8 |
| GPT-5 (medium) | OpenAI | $1.250 | $10.000 | 86.7% | 84.2% | - |
| o3 | OpenAI | $2.000 | $8.000 | 85.3% | 82.7% | 212.4 |
| Grok 4 | xAI | $3.000 | $15.000 | 86.6% | 87.7% | 54.3 |
| o3-pro | OpenAI | $20.000 | $80.000 | — | 84.5% | 49.9 |
| Gemini 3 Pro Preview (low) | Google | $2.000 | $12.000 | 89.5% | 88.7% | - |
| GPT-5 mini (high) | OpenAI | $0.250 | $2.000 | 83.7% | 82.8% | 73.2 |
| Grok 4.1 Fast (Reasoning) | xAI | $0.200 | $0.500 | 85.4% | 85.3% | 84.4 |
| Claude 4.5 Sonnet (Reasoning) | Anthropic | $3.000 | $15.000 | 87.5% | 83.4% | 69.5 |
| GPT-5 (low) | OpenAI | $1.250 | $10.000 | 86.0% | 80.8% | 129.5 |
| MiniMax-M2 | MiniMax | $0.300 | $1.200 | 82.0% | 77.7% | 108.4 |
| GPT-5 mini (medium) | OpenAI | $0.250 | $2.000 | 82.8% | 80.3% | 79.0 |
| gpt-oss-120B (high) | OpenAI | $0.150 | $0.600 | 80.8% | 78.2% | 353.7 |
| Grok 4 Fast (Reasoning) | xAI | $0.200 | $0.500 | 85.0% | 84.7% | 226.6 |
| Claude Opus 4.5 (Non-reasoning) | Anthropic | $5.000 | $25.000 | 88.9% | 81.0% | 65.3 |
| Gemini 2.5 Pro | Google | $1.250 | $10.000 | 86.2% | 84.4% | 41.8 |
| o4-mini (high) | OpenAI | $1.100 | $4.400 | 83.2% | 78.4% | - |
| Claude 4.1 Opus (Reasoning) | Anthropic | $15.000 | $75.000 | 88.0% | 80.9% | 38.9 |
| DeepSeek V3.1 Terminus (Reasoning) | DeepSeek | $0.400 | $2.000 | 85.1% | 79.2% | - |
| Qwen3 235B A22B 2507 (Reasoning) | Alibaba | $0.700 | $8.400 | 84.3% | 79.0% | 80.8 |
| Grok 3 mini Reasoning (high) | xAI | $0.300 | $0.500 | 82.8% | 79.1% | - |
| Doubao Seed Code | ByteDance Seed | $0.170 | $1.120 | 85.4% | 76.4% | - |
| DeepSeek V3.2 Exp (Reasoning) | DeepSeek | $0.280 | $0.420 | 85.0% | 79.7% | 29.1 |
| Claude 4 Sonnet (Reasoning) | Anthropic | $3.000 | $15.000 | 84.2% | 77.7% | 65.7 |
| GLM-4.6 (Reasoning) | Z AI | $0.600 | $2.200 | 82.9% | 78.0% | 112.0 |
| Qwen3 Max Thinking | Alibaba | $1.200 | $6.000 | 82.4% | 77.6% | 36.1 |
| Qwen3 Max | Alibaba | $1.200 | $6.000 | 84.1% | 76.4% | 27.9 |
| Claude 4.5 Haiku (Reasoning) | Anthropic | $1.000 | $5.000 | 76.0% | 67.2% | 83.9 |
| Gemini 2.5 Flash Preview (Sep '25) (Reasoning) | Google | $0.300 | $2.500 | 84.2% | 79.3% | 142.3 |
| Qwen3 VL 235B A22B (Reasoning) | Alibaba | $0.700 | $8.400 | 83.6% | 77.2% | 43.8 |
| Qwen3 Next 80B A3B (Reasoning) | Alibaba | $0.500 | $6.000 | 82.4% | 75.9% | - |
| Claude 4 Opus (Reasoning) | Anthropic | $15.000 | $75.000 | 87.3% | 79.6% | 41.2 |
| Gemini 2.5 Pro Preview (Mar '25) | Google | $1.250 | $10.000 | 85.8% | 83.6% | 39.8 |
| DeepSeek V3.1 (Reasoning) | DeepSeek | $0.425 | $1.340 | 85.1% | 77.9% | - |
| Gemini 2.5 Pro Preview (May '25) | Google | $1.250 | $10.000 | 83.7% | 82.2% | 40.7 |
| gpt-oss-20B (high) | OpenAI | $0.070 | $0.200 | 74.8% | 68.8% | 236.6 |
| Magistral Medium 1.2 | Mistral | $2.000 | $5.000 | 81.5% | 73.9% | 102.6 |
| DeepSeek R1 0528 (May '25) | DeepSeek | $1.350 | $4.000 | 84.9% | 81.3% | - |
| Qwen3 VL 32B (Reasoning) | Alibaba | $0.700 | $8.400 | 81.8% | 73.3% | 51.1 |
| Seed-OSS-36B-Instruct | ByteDance Seed | $0.210 | $0.570 | 81.5% | 72.6% | 27.1 |
| GLM-4.5 (Reasoning) | Z AI | $0.575 | $2.195 | 83.5% | 78.2% | 51.3 |
| Gemini 2.5 Flash (Reasoning) | Google | $0.300 | $2.500 | 83.2% | 79.0% | 146.5 |
| GPT-5 nano (high) | OpenAI | $0.050 | $0.400 | 78.0% | 67.6% | - |
| o3-mini (high) | OpenAI | $1.100 | $4.400 | 80.2% | 77.3% | 143.0 |
| Kimi K2 0905 | Moonshot AI | $0.990 | $2.500 | 81.9% | 76.7% | 93.9 |
| Claude 3.7 Sonnet (Reasoning) | Anthropic | $3.000 | $15.000 | 83.7% | 77.2% | - |
| Claude 4.5 Sonnet (Non-reasoning) | Anthropic | $3.000 | $15.000 | 86.0% | 72.7% | 72.4 |
| GPT-5 nano (medium) | OpenAI | $0.050 | $0.400 | 77.2% | 67.0% | - |
| GLM-4.5-Air | Z AI | $0.200 | $1.100 | 81.5% | 73.3% | 104.8 |
| Grok Code Fast 1 | xAI | $0.200 | $1.500 | 79.3% | 72.7% | 209.0 |
| Qwen3 Max (Preview) | Alibaba | $1.200 | $6.000 | 83.8% | 76.4% | 30.1 |
| Kimi K2 | Moonshot AI | $0.600 | $2.500 | 82.4% | 76.6% | 55.9 |
| o3-mini | OpenAI | $1.100 | $4.400 | 79.1% | 74.8% | 132.2 |
| o1-pro | OpenAI | $150.000 | $600.000 | — | — | - |
| Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) | Google | $0.100 | $0.400 | 80.8% | 70.9% | 6.2 |
| gpt-oss-120B (low) | OpenAI | $0.150 | $0.595 | 77.5% | 67.2% | 337.9 |
| o1 | OpenAI | $15.000 | $60.000 | 84.1% | 74.7% | 189.5 |
| Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning) | Google | $0.300 | $2.500 | 83.6% | 76.6% | 231.1 |
| Qwen3 30B A3B 2507 (Reasoning) | Alibaba | $0.200 | $2.400 | 80.5% | 70.7% | 187.6 |
| DeepSeek V3.2 Exp (Non-reasoning) | DeepSeek | $0.280 | $0.420 | 83.6% | 73.8% | 26.8 |
| MiniMax M1 80k | MiniMax | $0.400 | $2.100 | 81.6% | 69.7% | - |
| DeepSeek V3.1 Terminus (Non-reasoning) | DeepSeek | $0.400 | $1.680 | 83.6% | 75.1% | - |
| Qwen3 235B A22B 2507 Instruct | Alibaba | $0.700 | $2.800 | 82.8% | 75.3% | 49.5 |
| Grok 3 | xAI | $3.000 | $15.000 | 79.9% | 69.3% | 53.3 |
| Qwen3 VL 30B A3B (Reasoning) | Alibaba | $0.200 | $2.400 | 80.7% | 72.0% | 106.5 |
| Llama Nemotron Super 49B v1.5 (Reasoning) | NVIDIA | $0.100 | $0.400 | 81.4% | 74.8% | 77.7 |
| o1-preview | OpenAI | $16.500 | $66.000 | — | — | - |
| Qwen3 Next 80B A3B Instruct | Alibaba | $0.500 | $2.000 | 81.9% | 73.8% | 158.6 |
| DeepSeek V3.1 (Non-reasoning) | DeepSeek | $0.560 | $1.660 | 83.3% | 73.5% | - |
| Ling-1T | InclusionAI | $0.570 | $2.280 | 82.2% | 71.9% | - |
| GLM-4.6 (Non-reasoning) | Z AI | $0.600 | $2.200 | 78.4% | 63.2% | 44.2 |
| Claude 4.1 Opus (Non-reasoning) | Anthropic | $15.000 | $75.000 | — | — | 38.7 |
| Claude 4 Sonnet (Non-reasoning) | Anthropic | $3.000 | $15.000 | 83.7% | 68.3% | 60.0 |
| gpt-oss-20B (low) | OpenAI | $0.070 | $0.200 | 71.8% | 61.1% | 234.0 |
| Qwen3 VL 235B A22B Instruct | Alibaba | $0.700 | $2.800 | 82.3% | 71.2% | 34.5 |
| DeepSeek R1 (Jan '25) | DeepSeek | $1.350 | $4.000 | 84.4% | 70.8% | - |
| GPT-5 (minimal) | OpenAI | $1.250 | $10.000 | 80.6% | 67.3% | 81.1 |
| GPT-4.1 | OpenAI | $2.000 | $8.000 | 80.6% | 66.6% | 123.2 |
| Magistral Small 1.2 | Mistral | $0.500 | $1.500 | 76.8% | 66.3% | 194.8 |
| GPT-5.1 (Non-reasoning) | OpenAI | $1.250 | $10.000 | 80.1% | 64.3% | 89.8 |
| EXAONE 4.0 32B (Reasoning) | LG AI Research | $0.600 | $1.000 | 81.8% | 73.9% | 106.4 |
| GPT-4.1 mini | OpenAI | $0.400 | $1.600 | 78.1% | 66.4% | 77.8 |
| Claude 4 Opus (Non-reasoning) | Anthropic | $15.000 | $75.000 | 86.0% | 70.1% | 40.7 |
| Qwen3 Coder 480B A35B Instruct | Alibaba | $1.500 | $7.500 | 78.8% | 61.8% | 49.3 |
| GPT-5 (ChatGPT) | OpenAI | $1.250 | $10.000 | 82.0% | 68.6% | 165.8 |
| Ring-1T | InclusionAI | $0.570 | $2.280 | 80.6% | 59.5% | - |
| Qwen3 235B A22B (Reasoning) | Alibaba | $0.700 | $8.400 | 82.8% | 70.0% | 59.9 |
| Claude 4.5 Haiku (Non-reasoning) | Anthropic | $1.000 | $5.000 | 80.0% | 64.6% | 86.6 |
| GPT-5 mini (minimal) | OpenAI | $0.250 | $2.000 | 77.5% | 68.7% | 72.2 |
| Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning) | Google | $0.100 | $0.400 | 79.6% | 65.1% | 487.7 |
| Hermes 4 - Llama-3.1 405B (Reasoning) | Nous Research | $1.000 | $3.000 | 82.9% | 72.7% | 36.1 |
| DeepSeek V3 0324 | DeepSeek | $1.140 | $1.250 | 81.9% | 65.5% | - |
| Claude 3.7 Sonnet (Non-reasoning) | Anthropic | $3.000 | $15.000 | 80.3% | 65.6% | - |
| Qwen3 VL 32B Instruct | Alibaba | $0.700 | $2.800 | 79.1% | 67.1% | 44.7 |
| Gemini 2.5 Flash (Non-reasoning) | Google | $0.300 | $2.500 | 80.9% | 68.3% | 253.5 |
| Gemini 2.5 Flash-Lite (Reasoning) | Google | $0.100 | $0.400 | 75.9% | 62.5% | - |
| MiniMax M1 40k | MiniMax | $0.400 | $2.100 | 80.8% | 68.2% | - |
| Qwen3 Omni 30B A3B (Reasoning) | Alibaba | $0.250 | $0.970 | 79.2% | 72.6% | 96.7 |
| Ring-flash-2.0 | InclusionAI | $0.140 | $0.570 | 79.3% | 72.5% | 52.6 |
| Hermes 4 - Llama-3.1 70B (Reasoning) | Nous Research | $0.130 | $0.400 | 81.1% | 69.9% | 75.0 |
| Qwen3 32B (Reasoning) | Alibaba | $0.700 | $8.400 | 79.8% | 66.8% | 98.3 |
| Grok 4 Fast (Non-reasoning) | xAI | $0.200 | $0.500 | 73.0% | 60.6% | 216.5 |
| Qwen3 VL 30B A3B Instruct | Alibaba | $0.200 | $0.800 | 76.4% | 69.5% | 92.4 |
| Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | NVIDIA | $0.600 | $1.800 | 82.5% | 72.8% | 38.4 |
| Grok 4.1 Fast (Non-reasoning) | xAI | $0.200 | $0.500 | 74.3% | 63.7% | 68.7 |
| Ling-flash-2.0 | InclusionAI | $0.140 | $0.570 | 77.7% | 65.7% | 54.6 |
| QwQ 32B | Alibaba | $0.430 | $0.600 | 76.4% | 59.3% | 28.4 |
| Solar Pro 2 (Reasoning) | Upstage | $0.500 | $0.500 | 80.5% | 68.7% | 111.3 |
| NVIDIA Nemotron Nano 9B V2 (Reasoning) | NVIDIA | $0.040 | $0.160 | 74.2% | 57.0% | 98.3 |
| GLM-4.5V (Reasoning) | Z AI | $0.550 | $1.750 | 78.8% | 68.4% | 74.8 |
| Qwen3 30B A3B 2507 Instruct | Alibaba | $0.200 | $0.800 | 77.7% | 65.9% | 61.1 |
| Qwen3 30B A3B (Reasoning) | Alibaba | $0.200 | $2.400 | 77.7% | 61.6% | 79.5 |
| NVIDIA Nemotron Nano 9B V2 (Non-reasoning) | NVIDIA | $0.040 | $0.160 | 73.9% | 55.7% | 96.8 |
| Qwen3 14B (Reasoning) | Alibaba | $0.350 | $4.200 | 77.4% | 60.4% | 58.6 |
| Llama 4 Maverick | Meta | $0.270 | $0.850 | 80.9% | 67.1% | 130.0 |
| GPT-4o (March 2025, chatgpt-4o-latest) | OpenAI | $5.000 | $15.000 | 80.3% | 65.5% | 272.6 |
| Mistral Medium 3.1 | Mistral | $0.400 | $2.000 | 68.3% | 58.8% | 59.6 |
| Sonar Reasoning | Perplexity | $1.000 | $5.000 | — | 62.3% | 69.1 |
| Gemini 2.0 Flash (Feb '25) | Google | $0.100 | $0.400 | 77.9% | 62.3% | 181.8 |
| Mistral Medium 3 | Mistral | $0.400 | $2.000 | 76.0% | 57.8% | 33.3 |
| Qwen3 Coder 30B A3B Instruct | Alibaba | $0.450 | $2.250 | 70.6% | 51.6% | 88.7 |
| Magistral Medium 1 | Mistral | $2.000 | $5.000 | 75.3% | 67.9% | 148.3 |
| ERNIE 4.5 300B A47B | Baidu | $0.280 | $1.100 | 77.6% | 81.1% | 24.4 |
| DeepSeek R1 Distill Qwen 32B | DeepSeek | $0.285 | $0.285 | 73.9% | 61.5% | 92.1 |
| Hermes 4 - Llama-3.1 405B (Non-reasoning) | Nous Research | $1.000 | $3.000 | 72.9% | 53.6% | 32.2 |
| DeepSeek V3 (Dec '24) | DeepSeek | $0.400 | $0.890 | 75.2% | 55.7% | - |
| Nova Premier | Amazon | $2.500 | $12.500 | 73.3% | 56.9% | 57.2 |
| Qwen3 VL 8B (Reasoning) | Alibaba | $0.180 | $2.100 | 74.9% | 57.9% | 63.7 |
| Magistral Small 1 | Mistral | $0.500 | $1.500 | 74.6% | 64.1% | 259.0 |
| DeepSeek R1 0528 Qwen3 8B | DeepSeek | $0.060 | $0.090 | 73.9% | 61.2% | 78.6 |
| Qwen2.5 Max | Alibaba | $1.600 | $6.400 | 76.2% | 58.7% | 28.3 |
| EXAONE 4.0 32B (Non-reasoning) | LG AI Research | $0.600 | $1.000 | 76.8% | 62.8% | 87.7 |
| Solar Pro 2 (Non-reasoning) | Upstage | $0.500 | $0.500 | 75.0% | 56.1% | 107.6 |
| Qwen3 Omni 30B A3B Instruct | Alibaba | $0.250 | $0.970 | 72.5% | 62.0% | 91.1 |
| Gemini 2.5 Flash-Lite (Non-reasoning) | Google | $0.100 | $0.400 | 72.4% | 47.4% | 263.0 |
| Qwen3 235B A22B (Non-reasoning) | Alibaba | $0.700 | $2.800 | 76.2% | 61.3% | 55.9 |
| DeepSeek R1 Distill Llama 70B | DeepSeek | $0.800 | $1.050 | 79.5% | 40.2% | 102.2 |
| Claude 3.5 Sonnet (Oct '24) | Anthropic | $3.000 | $15.000 | 77.2% | 59.9% | - |
| DeepSeek R1 Distill Qwen 14B | DeepSeek | $0.150 | $0.150 | 74.0% | 48.4% | 41.9 |
| Qwen3 14B (Non-reasoning) | Alibaba | $0.350 | $1.400 | 67.5% | 47.0% | 53.5 |
| Mistral Small 3.2 | Mistral | $0.100 | $0.300 | 68.1% | 50.5% | 134.6 |
| GPT-5 nano (minimal) | OpenAI | $0.050 | $0.400 | 55.6% | 42.8% | 118.8 |
| GPT-4o (Aug '24) | OpenAI | $2.500 | $10.000 | — | 52.1% | 118.0 |
| Sonar | Perplexity | $1.000 | $1.000 | 68.9% | 47.1% | 75.6 |
| Qwen3 8B (Reasoning) | Alibaba | $0.180 | $2.100 | 74.3% | 58.9% | 86.6 |
| MiniMax-Text-01 | MiniMax | $0.200 | $1.100 | 75.9% | 57.8% | 27.6 |
| Sonar Pro | Perplexity | $3.000 | $15.000 | 75.5% | 57.8% | 85.8 |
| Llama 3.1 Instruct 405B | Meta | $3.750 | $6.750 | 73.2% | 51.5% | 24.4 |
| Llama 4 Scout | Meta | $0.140 | $0.545 | 75.2% | 58.7% | 136.6 |
| QwQ 32B-Preview | Alibaba | $0.120 | $0.180 | 64.8% | 55.7% | 108.7 |
| Llama 3.3 Instruct 70B | Meta | $0.540 | $0.710 | 71.3% | 49.8% | 99.2 |
| Devstral Medium | Mistral | $0.400 | $2.000 | 70.8% | 49.2% | 111.5 |
| Ling-mini-2.0 | InclusionAI | $0.070 | $0.280 | 67.1% | 56.2% | 150.3 |
| GPT-4.1 nano | OpenAI | $0.100 | $0.400 | 65.7% | 51.2% | 149.8 |
| Devstral Small (Jul '25) | Mistral | $0.100 | $0.300 | 62.2% | 41.4% | 241.5 |
| Qwen3 VL 8B Instruct | Alibaba | $0.180 | $0.700 | 68.6% | 42.7% | 90.0 |
| GPT-4o (Nov '24) | OpenAI | $2.500 | $10.000 | 74.8% | 54.3% | 189.6 |
| Command A | Cohere | $2.500 | $10.000 | 71.2% | 52.7% | 55.0 |
| Mistral Large 2 (Nov '24) | Mistral | $2.000 | $6.000 | 69.7% | 48.6% | 38.1 |
| Gemini 2.0 Flash-Lite (Feb '25) | Google | $0.075 | $0.300 | 72.4% | 53.5% | 188.1 |
| Llama Nemotron Super 49B v1.5 (Non-reasoning) | NVIDIA | $0.100 | $0.400 | 69.2% | 48.1% | 68.7 |
| Qwen3 30B A3B (Non-reasoning) | Alibaba | $0.200 | $0.800 | 71.0% | 51.5% | 72.6 |
| Qwen3 32B (Non-reasoning) | Alibaba | $0.700 | $2.800 | 72.7% | 53.5% | 90.8 |
| GPT-4o (May '24) | OpenAI | $5.000 | $15.000 | 74.0% | 52.6% | 115.6 |
| Gemini 2.0 Flash-Lite (Preview) | Google | $0.075 | $0.300 | — | 54.2% | 184.5 |
| GLM-4.5V (Non-reasoning) | Z AI | $0.600 | $1.800 | 75.1% | 57.3% | 74.3 |
| Reka Flash 3 | Reka AI | $0.200 | $0.800 | 66.9% | 52.9% | 51.8 |
| Qwen3 4B (Reasoning) | Alibaba | $0.110 | $1.260 | 69.6% | 52.2% | 84.0 |
| Claude 3.5 Sonnet (June '24) | Anthropic | $3.000 | $15.000 | 75.1% | 56.0% | - |
| GPT-4o (ChatGPT) | OpenAI | $5.000 | $15.000 | 77.3% | 51.1% | 266.8 |
| Pixtral Large | Mistral | $2.000 | $6.000 | 70.1% | 50.5% | 36.7 |
| Nova Pro | Amazon | $0.800 | $3.200 | 69.1% | 49.9% | - |
| Mistral Small 3.1 | Mistral | $0.100 | $0.300 | 65.9% | 45.4% | 160.5 |
| Grok 2 (Dec '24) | xAI | $2.000 | $10.000 | 70.9% | 51.0% | 94.6 |
| GPT-4 Turbo | OpenAI | $10.000 | $30.000 | 69.4% | — | 38.4 |
| Hermes 4 - Llama-3.1 70B (Non-reasoning) | Nous Research | $0.130 | $0.400 | 66.4% | 49.1% | 66.8 |
| Llama 3.1 Nemotron Instruct 70B | NVIDIA | $0.600 | $0.600 | 69.0% | 46.5% | 38.7 |
| Qwen3 8B (Non-reasoning) | Alibaba | $0.180 | $0.700 | 64.3% | 45.2% | 81.6 |
| Granite 4.0 H Small | IBM | $0.060 | $0.250 | 62.4% | 41.6% | 395.0 |
| Phi-4 | Microsoft Azure | $0.125 | $0.500 | 71.4% | 57.5% | 11.5 |
| Llama 3.1 Instruct 70B | Meta | $0.560 | $0.560 | 67.6% | 40.9% | 35.1 |
| Qwen3 1.7B (Reasoning) | Alibaba | $0.110 | $1.260 | 57.0% | 35.6% | 123.6 |
| Mistral Large 2 (Jul '24) | Mistral | $2.000 | $6.000 | 68.3% | 47.2% | - |
| CompactifAI Llama 4 Scout Slim | Multiverse Computing | $0.070 | $0.100 | 70.3% | 42.6% | 116.7 |
| Qwen2.5 Coder Instruct 32B | Alibaba | $0.130 | $0.175 | 63.5% | 41.7% | 54.1 |
| Nova Lite | Amazon | $0.060 | $0.240 | 59.0% | 43.3% | 150.5 |
| GPT-4 | OpenAI | $30.000 | $60.000 | — | — | 30.1 |
| Mistral Small 3 | Mistral | $0.100 | $0.300 | 65.2% | 46.2% | 176.8 |
| GPT-4o mini | OpenAI | $0.150 | $0.600 | 64.8% | 42.6% | 54.5 |
| Jamba 1.7 Large | AI21 Labs | $2.000 | $8.000 | 57.7% | 39.0% | 34.7 |
| Qwen3 4B (Non-reasoning) | Alibaba | $0.110 | $0.420 | 58.6% | 39.8% | 80.1 |
| Claude 3 Opus | Anthropic | $15.000 | $75.000 | 69.6% | 48.9% | - |
| Claude 3.5 Haiku | Anthropic | $0.800 | $4.000 | 63.4% | 40.8% | 48.2 |
| Codestral (Jan '25) | Mistral | $0.300 | $0.900 | 44.6% | 31.2% | 235.0 |
| Devstral Small (May '25) | Mistral | $0.100 | $0.300 | 63.2% | 43.4% | 203.9 |
| Reka Core | Reka AI | $2.000 | $2.000 | — | — | 40.6 |
| Qwen2.5 Turbo | Alibaba | $0.050 | $0.200 | 63.3% | 41.0% | 74.2 |
| Reka Flash (Sep '24) | Reka AI | $0.200 | $0.800 | — | — | 68.8 |
| Solar Mini | Upstage | $0.150 | $0.150 | — | — | 79.2 |
| Llama 3.2 Instruct 90B (Vision) | Meta | $0.720 | $0.720 | 67.1% | 43.2% | 36.8 |
| Reka Flash (Feb '24) | Reka AI | $0.200 | $0.800 | — | — | 69.9 |
| Reka Edge | Reka AI | $0.100 | $0.100 | — | — | 63.3 |
| Nova Micro | Amazon | $0.035 | $0.140 | 53.1% | 35.8% | 263.9 |
| Llama 3.1 Instruct 8B | Meta | $0.100 | $0.100 | 47.6% | 25.9% | 174.3 |
| CompactifAI Mistral Small 3.1 Slim | Multiverse Computing | $0.050 | $0.080 | 53.8% | 32.9% | 121.6 |
| CompactifAI Llama 3.3 70B Slim | Multiverse Computing | $0.160 | $0.310 | 57.1% | 35.5% | 130.7 |
| Llama 3.2 Instruct 11B (Vision) | Meta | $0.160 | $0.160 | 46.4% | 22.1% | 69.3 |
| Gemma 3n E4B Instruct | Google | $0.020 | $0.040 | 48.8% | 29.6% | 41.3 |
| Granite 3.3 8B (Non-reasoning) | IBM | $0.030 | $0.250 | 46.8% | 33.8% | 458.0 |
| Jamba 1.7 Mini | AI21 Labs | $0.200 | $0.400 | 38.8% | 32.2% | 153.5 |
| Jamba 1.5 Large | AI21 Labs | $2.000 | $8.000 | 57.2% | 42.7% | - |
| Hermes 3 - Llama-3.1 70B | Nous Research | $0.300 | $0.300 | 57.1% | 40.1% | 37.8 |
| OLMo 2 32B | Allen Institute for AI | $0.200 | $0.350 | 51.1% | 32.8% | - |
| Phi-3 Medium Instruct 14B | Microsoft Azure | $0.170 | $0.680 | 54.3% | 32.6% | 42.3 |
| Qwen3 1.7B (Non-reasoning) | Alibaba | $0.110 | $0.420 | 41.1% | 28.3% | 115.3 |
| Jamba 1.6 Large | AI21 Labs | $2.000 | $8.000 | 56.5% | 38.7% | 34.9 |
| Qwen3 0.6B (Reasoning) | Alibaba | $0.110 | $1.260 | 34.7% | 23.9% | 201.3 |
| Aya Expanse 32B | Cohere | $0.500 | $1.500 | 37.7% | 23.0% | 41.9 |
| Claude 3 Sonnet | Anthropic | $3.000 | $15.000 | 57.9% | 40.0% | - |
| Llama 3 Instruct 70B | Meta | $0.650 | $0.880 | 57.4% | 37.9% | 40.5 |
| Mistral Small (Sep '24) | Mistral | $0.200 | $0.600 | 52.9% | 38.1% | 84.9 |
| Phi-3 Mini Instruct 3.8B | Microsoft Azure | $0.130 | $0.520 | 43.5% | 31.9% | 69.1 |
| Ministral 8B | Mistral | $0.100 | $0.100 | 38.9% | 27.6% | 195.9 |
| Mistral Large (Feb '24) | Mistral | $4.000 | $12.000 | 51.5% | 35.1% | - |
| Llama 2 Chat 7B | Meta | $0.050 | $0.250 | 16.4% | 22.7% | 112.2 |
| CompactifAI Llama 3.1 8B Slim | Multiverse Computing | $0.050 | $0.070 | 32.1% | 22.1% | 228.4 |
| Llama 3.2 Instruct 3B | Meta | $0.060 | $0.060 | 34.7% | 25.5% | 114.4 |
| Qwen3 0.6B (Non-reasoning) | Alibaba | $0.110 | $0.420 | 23.1% | 23.1% | 191.3 |
| Ministral 3B | Mistral | $0.040 | $0.040 | 33.9% | 26.0% | 272.6 |
| Aya Expanse 8B | Cohere | $0.500 | $1.500 | 31.2% | 24.7% | 80.9 |
| Claude 3 Haiku | Anthropic | $0.250 | $1.250 | — | — | 116.9 |
| Llama 3.2 Instruct 1B | Meta | $0.053 | $0.055 | 20.0% | 19.6% | 74.6 |
| Pixtral 12B (2409) | Mistral | $0.150 | $0.150 | 47.3% | 34.3% | 144.9 |
| Mistral Small (Feb '24) | Mistral | $1.000 | $3.000 | 41.9% | 30.2% | 163.9 |
| Mistral Medium | Mistral | $2.750 | $8.100 | 49.1% | 34.9% | 64.3 |
| GPT-3.5 Turbo | OpenAI | $0.500 | $1.500 | 46.2% | 29.7% | 84.9 |
| Gemma 2 9B | Google | $0.030 | $0.090 | 49.5% | 31.1% | - |
| Command-R+ (Aug '24) | Cohere | $2.500 | $10.000 | 42.7% | 33.7% | 20.4 |
| Llama 3 Instruct 8B | Meta | $0.045 | $0.155 | 40.5% | 29.6% | 67.6 |
| Command-R+ (Apr '24) | Cohere | $3.000 | $15.000 | 43.2% | 32.3% | - |
| Mistral NeMo | Mistral | $0.150 | $0.150 | 39.9% | 31.4% | 188.3 |
| Jamba 1.5 Mini | AI21 Labs | $0.200 | $0.400 | 37.1% | 30.2% | - |
| Jamba 1.6 Mini | AI21 Labs | $0.200 | $0.400 | 36.7% | 30.0% | 150.6 |
| Mixtral 8x7B Instruct | Mistral | $0.540 | $0.600 | 38.7% | 29.2% | - |
| Command-R (Mar '24) | Cohere | $0.500 | $1.500 | 33.8% | 28.4% | - |
| Command-R (Aug '24) | Cohere | $0.150 | $0.600 | 33.7% | 28.9% | 58.9 |
| Mistral 7B Instruct | Mistral | $0.250 | $0.250 | 24.5% | 17.7% | 119.4 |
| Cogito v2.1 (Reasoning) | Deep Cogito | $1.250 | $1.250 | 84.9% | 76.8% | 75.3 |
| DeepSeek-OCR | DeepSeek | $0.030 | $0.100 | — | — | 305.1 |
| Grok 3 mini Reasoning (Low) | xAI | $0.300 | $0.500 | — | — | 110.8 |
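The Input and Output columns above are USD per 1M tokens, so any single blended cost figure depends on an assumed input:output token ratio. A minimal sketch of that arithmetic, using the GPT-5.1 (high) row and a 3:1 ratio chosen purely for illustration (the `blended_price` helper and the ratio are assumptions, not a published methodology):

```python
def blended_price(input_per_m: float, output_per_m: float,
                  input_ratio: float = 3.0, output_ratio: float = 1.0) -> float:
    """Blended USD cost per 1M tokens at the given input:output token ratio."""
    total = input_ratio + output_ratio
    return (input_ratio * input_per_m + output_ratio * output_per_m) / total

# GPT-5.1 (high): $1.25 input / $10.00 output per 1M tokens (from the table above).
# (3 * 1.25 + 1 * 10.00) / 4 = 3.4375, i.e. about $3.44 per 1M blended tokens at 3:1.
print(f"${blended_price(1.25, 10.00):.2f}")
```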