Compare AI Models
Build a practical shortlist by reading benchmark movement, price, latency, context, and public catalog coverage in the same view.
- Loaded
- 369
- Selected
- 3
- Metrics
- 60
How to read this view
Compare model cost and benchmark movement before choosing a production default. Start with a performance metric, select comparable models, then read price beside score so a cheaper model, faster model, or larger-context model can be judged against the job it needs to handle. AI Stats keeps the comparison focused on pricing, context length, catalog capabilities, and public benchmark data instead of a single leaderboard rank.
For a practical shortlist, compare models from the same task family first. Coding, math, reasoning, long-context review, and agent tool use can point to different winners. The charts are meant to narrow the field before a live test, not replace evaluation with your own prompts, latency needs, and budget limits. Recheck candidates when providers update pricing, context limits, or benchmark submissions because small catalog changes can shift the best default. Treat each result as a deployment shortlist, not a trophy.
Price Comparison
Performance Comparison
Public catalog metadata Matched OpenRouter usage, route detail, Hugging Face, and LiteLLM fields for selected models.
| Model | Catalog match | Context | Public price | OR usage/routes | HF signal | Capabilities |
|---|---|---|---|---|---|---|
| GPT-5.5 (xhigh) | openai/gpt-5.5 | 1.1M | $5.000 | 2 routes | - | tools, vision, reasoning, cache, web |
| GPT-5.5 (high) | openai/gpt-5.5 | 1.1M | $5.000 | 2 routes | - | tools, vision, reasoning, cache, web |
| Claude Opus 4.7 (Adaptive Reasoning, Max Effort) | anthropic/claude-opus-4.7-fast | 1M | $30.000 | - | - | tools, vision, reasoning, cache |