Arena Leaderboard
Elo rankings from human & LLM judge votes
| Rank | Model | Company | Elo | Win% | W / L / T | Battles |
|---|---|---|---|---|---|---|
| 1 | Nanonets OCR3 | | 1144 | 79.5% | 16W / 3L / 3T | 22 |
| 2 | Nanonets OCR2+ | | 1123 | 75.0% | 18W / 5L / 3T | 26 |
| 3 | GPT-5 Mini | | 1038 | 57.9% | 8W / 5L / 6T | 19 |
| 4 | GPT-5.2 | | 1028 | 57.4% | 12W / 8L / 7T | 27 |
| 5 | Gemini 2.5 Flash · Thinking | | 1022 | 50.0% | 6W / 6L / 4T | 16 |
| 6 | Claude Sonnet 4.6 | | 1014 | 52.3% | 8W / 7L / 7T | 22 |
| 7 | GPT-5.4 | | 1009 | 46.9% | 3W / 4L / 9T | 16 |
| 8 | GPT-5.4 · Low Reasoning | | 1005 | 47.8% | 8W / 9L / 6T | 23 |
| 9 | Gemini 3.1 Pro | | 991 | 46.2% | 4W / 5L / 4T | 13 |
| 10 | Gemini 2.5 Pro | | 986 | 47.4% | 6W / 7L / 6T | 19 |
| 11 | Claude Opus 4.6 · Low Thinking | | 982 | 43.8% | 4W / 7L / 13T | 24 |
| 12 | Claude Sonnet 4.6 · Thinking | | 976 | 48.4% | 11W / 12L / 9T | 32 |
| 13 | Claude Opus 4.6 | | 946 | 40.9% | 4W / 8L / 10T | 22 |
| 14 | GPT-4.1 | | 945 | 35.0% | 4W / 10L / 6T | 20 |
| 15 | Gemini 2.5 Flash | | 945 | 38.2% | 4W / 8L / 5T | 17 |
| 16 | GPT-5.4 · Medium Reasoning | | 943 | 43.8% | 7W / 10L / 7T | 24 |
| 17 | Gemini 3 Flash | | 903 | 25.0% | 2W / 11L / 5T | 18 |
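
The Win% figures in the table appear to count each tie as half a win: for example, 16W / 3L / 3T over 22 battles gives (16 + 1.5) / 22 ≈ 79.5%. The page does not state this formula, so the sketch below is an inference from the published numbers, and the function name `win_rate_pct` is hypothetical.

```python
def win_rate_pct(wins: int, losses: int, ties: int) -> float:
    """Win percentage with each tie counted as half a win.

    Assumption: this formula is inferred from the table values,
    not documented by the leaderboard itself.
    """
    battles = wins + losses + ties
    if battles == 0:
        raise ValueError("no battles recorded")
    return round(100 * (wins + 0.5 * ties) / battles, 1)


# Rank 1 row: 16W / 3L / 3T over 22 battles
print(win_rate_pct(16, 3, 3))  # 79.5
# Rank 2 row: 18W / 5L / 3T over 26 battles
print(win_rate_pct(18, 5, 3))  # 75.0
```

Because ties count as half a win, a model with more losses than wins can still sit near 50%, as the GPT-5.4 row shows (3W / 4L / 9T, 46.9%).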