LIVE · GODOT 24.3% (0.3) · ANXIETY 71 (EXTREME FEAR) · ADOPTION 34.7% (2.1) · DISPLACED -18.4M (312K/wk) · CLAUDE 3.7 94.2 (2.1) · GPT-4o 91.8 (0.3) · INCIDENTS 124 MTD

Every frontier model, ranked live

Capability scores for 47 AI models, updated from LMSYS Arena, MMLU, HumanEval, and our weighted composite. 3 new releases this week.

UPDATED · APR 17 · 10:00 UTC
Scoring methodology →
| # | Model | Org | Score | 24h | 7d | Org value | Queries/day | Category |
|---|-------|-----|-------|-----|----|-----------|-------------|----------|
| 1 | Claude 3.7 Sonnet (NEW) | Anthropic | 94.2 | 2.1% | 5.8% | $8.5B | 142K | Frontier |
| 2 | GPT-4o | OpenAI | 91.8 | 0.3% | 1.2% | $157B | 891K | Frontier |
| 3 | Gemini 2.0 Ultra | Google DeepMind | 89.3 | 1.7% | 4.1% | $1.8T | 324K | Frontier |
| 4 | Grok 3 | xAI | 86.1 | 4.2% | 9.3% | $24B | 67K | Frontier |
| 5 | Llama 4 Scout (NEW) | Meta AI | 81.4 | 6.8% | 14.2% | $1.4T | 2.1M | Open Source |
| 6 | Mistral Large 3 | Mistral AI | 79.2 | 3.1% | 6.7% | $1.1B | 89K | Open Source |
| 7 | o4-mini (NEW) | OpenAI | 78.1 | 8.4% | 22.1% | $157B | 234K | Reasoning |
| 8 | Claude 3.7 Haiku | Anthropic | 76.4 | 1.1% | 3.2% | $8.5B | 412K | Frontier |
| 9 | Gemini 2.0 Flash | Google DeepMind | 74.8 | 0.4% | 0.9% | $1.8T | 1.1M | Frontier |
| 10 | DeepSeek V3 | DeepSeek | 73.1 | 0.6% | 2.8% | $8B | 98K | Open Source |
| 11 | Qwen 3 72B (NEW) | Alibaba | 71.9 | 4.3% | 11.2% | $210B | 124K | Open Source |
| 12 | o3-pro | OpenAI | 70.6 | 1.2% | 0.8% | $157B | 34K | Reasoning |
| 13 | Command R+ | Cohere | 68.4 | 0.8% | 2.1% | $5.5B | 18K | Enterprise |
| 14 | Phi-4 | Microsoft | 66.2 | 0.3% | 1.4% | $3.1T | 42K | Open Source |
| 15 | Claude 3.5 Sonnet | Anthropic | 64.8 | 2.1% | 5.4% | $8.5B | 67K | Frontier |
Methodology

How the capability score is computed

Weighted composite of LMSYS Arena Elo (40%), MMLU (20%), HumanEval (15%), GPQA (10%), ARC-AGI (10%), and community benchmark reports (5%). Raw scores are min-max normalized to 0–100 across frontier and open-source tiers. Updated every 6 hours via cron.
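As a sketch, the composite can be computed like this. The benchmark names and weights come from the methodology above; the function names, the normalization ranges, and the sample scores are illustrative assumptions, not the site's actual pipeline.

```python
# Sketch of the weighted composite: min-max normalize each raw benchmark
# score to 0-100, then combine with the published weights.
# Weights are from the methodology text; everything else is illustrative.

WEIGHTS = {
    "arena_elo": 0.40,   # LMSYS Arena Elo
    "mmlu": 0.20,
    "humaneval": 0.15,
    "gpqa": 0.10,
    "arc_agi": 0.10,
    "community": 0.05,   # community benchmark reports
}

def min_max_normalize(value, lo, hi):
    """Scale a raw benchmark score to 0-100 within the observed range."""
    if hi == lo:
        return 0.0
    return 100.0 * (value - lo) / (hi - lo)

def composite_score(raw, ranges):
    """Weighted composite of normalized benchmark scores.

    raw:    {benchmark: raw score for one model}
    ranges: {benchmark: (min, max) observed across all ranked models}
    """
    return sum(
        WEIGHTS[b] * min_max_normalize(raw[b], *ranges[b])
        for b in WEIGHTS
    )

# Illustrative example with made-up ranges and scores:
ranges = {
    "arena_elo": (1000, 1400),
    "mmlu": (50, 95),
    "humaneval": (40, 98),
    "gpqa": (20, 80),
    "arc_agi": (0, 60),
    "community": (0, 10),
}
raw = {
    "arena_elo": 1360,
    "mmlu": 90,
    "humaneval": 92,
    "gpqa": 65,
    "arc_agi": 45,
    "community": 8,
}
print(round(composite_score(raw, ranges), 1))
```

Because every benchmark is normalized to the same 0-100 scale before weighting, the composite itself lands on 0-100 and the weights (which sum to 1.0) directly express each benchmark's share of the final score.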

  • 24h movement reflects the Elo delta since the previous day's snapshot.
  • Queries/day is estimated from public API telemetry plus partner data.
  • Org value uses the latest known private or public valuation.
  • Phase 1 uses mock data; real feeds go live in Phase 2.