Selection lab

Compare AI Models

Build a practical shortlist by reading benchmark movement, price, latency, context, and public catalog coverage in the same view.

Loaded: 369
Selected: 3
Metrics: 60
How to read this view

Compare model cost and benchmark movement before choosing a production default. Start with a performance metric, select comparable models, then read price beside score so a cheaper model, faster model, or larger-context model can be judged against the job it needs to handle. AI Stats keeps the comparison focused on pricing, context length, catalog capabilities, and public benchmark data instead of a single leaderboard rank.

For a practical shortlist, compare models from the same task family first. Coding, math, reasoning, long-context review, and agent tool use can point to different winners. The charts are meant to narrow the field before a live test, not replace evaluation with your own prompts, latency needs, and budget limits. Recheck candidates when providers update pricing, context limits, or benchmark submissions because small catalog changes can shift the best default. Treat each result as a deployment shortlist, not a trophy.
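The shortlisting steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Candidate` fields, the `shortlist` helper, the model names, and every number below are hypothetical placeholders, not AI Stats data or an AI Stats API. It filters to one task family, applies context and budget limits, then ranks by score with price as the tie-breaker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one model; field names are assumptions."""
    name: str
    task_family: str       # e.g. "coding", "math", "reasoning"
    score: float           # benchmark score, higher is better
    price: float           # placeholder price figure, lower is better
    context_tokens: int    # maximum context length in tokens


def shortlist(candidates, task_family, min_context, max_price, top_n=3):
    """Return up to top_n candidates from one task family within limits."""
    eligible = [
        c for c in candidates
        if c.task_family == task_family
        and c.context_tokens >= min_context
        and c.price <= max_price
    ]
    # Rank by score (descending); break ties with the cheaper model.
    return sorted(eligible, key=lambda c: (-c.score, c.price))[:top_n]


# Invented sample data, for illustration only.
CANDIDATES = [
    Candidate("model-a", "coding", 71.0, 5.0, 1_000_000),
    Candidate("model-b", "coding", 68.5, 1.2, 200_000),
    Candidate("model-c", "coding", 74.0, 30.0, 1_000_000),
    Candidate("model-d", "math", 80.0, 2.0, 128_000),
]

picks = shortlist(CANDIDATES, "coding", min_context=150_000, max_price=10.0)
for c in picks:
    print(f"{c.name}: score={c.score}, price={c.price}, context={c.context_tokens}")
```

The output is a list of candidates for a live test with your own prompts, not a final choice. Rerun it whenever the underlying prices, context limits, or scores change.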

Price Comparison

Performance Comparison

Public catalog metadata

Matched OpenRouter usage, route detail, Hugging Face, and LiteLLM fields for selected models.
Model | Catalog match | Context | Public price | OR usage/routes | HF signal | Capabilities
GPT-5.5 (xhigh) | openai/gpt-5.5 | 1.1M | $5.000 | 2 routes | - | tools, vision, reasoning, cache, web
GPT-5.5 (high) | openai/gpt-5.5 | 1.1M | $5.000 | 2 routes | - | tools, vision, reasoning, cache, web
Claude Opus 4.7 (Adaptive Reasoning, Max Effort) | anthropic/claude-opus-4.7-fast | 1M | $30.000 | - | - | tools, vision, reasoning, cache