GPT-4.1

OpenAI · 1048K context · $13.6/1K pages · 2025-04-14
Overall Rank: #18 of 27 models
Overall Score: 69.5 (avg across benchmarks)
Best Task: Key Information Extraction (87.1)
Weakest Task: Visual QA (63.0)
Benchmark Performance
OlmOCR Bench v1.0 (rank 23/27)
| Overall | ArXiv Math | Headers & Footers | Long Tiny Text | Multi-Column | Old Scans | Old Scans Math | Tables |
|---|---|---|---|---|---|---|---|
| 54.0 | 58.2 | 32.1 | 53.4 | 66.3 | 37.6 | 71.4 | 59.2 |
OmniDocBench v1.5 (rank 14/27)
| Overall | Text Edit↓ | CDM↑ | TEDS↑ | TEDS-S↑ | Read Order↓ |
|---|---|---|---|---|---|
| 79.9 | 0.167 | 82.2 | 74.0 | 83.8 | 0.115 |
IDP Core Bench v1.0 (rank 12/27)
| Overall | KIE | OCR | Table | VQA |
|---|---|---|---|---|
| 74.7 | 87.1 | 75.6 | 73.1 | 63.0 |
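The overall score of 69.5 is described as an average across benchmarks. A minimal sketch of that arithmetic, assuming an unweighted mean of the three "Overall" values in the tables above (the page does not state the weighting, and the variable names here are illustrative only):

```python
from statistics import fmean

# "Overall" values copied from the three benchmark tables above.
benchmark_overall = {
    "OlmOCR Bench v1.0": 54.0,
    "OmniDocBench v1.5": 79.9,
    "IDP Core Bench v1.0": 74.7,
}

# Assumption: a plain unweighted mean, rounded to one decimal place.
overall_score = round(fmean(benchmark_overall.values()), 1)
print(overall_score)
```

Under this assumption the result matches the reported 69.5, which suggests, but does not confirm, that the three benchmarks are weighted equally.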
Capability Profile
Strength Analysis
Auto-generated from benchmark scores
Strengths
- Key Information Extraction: 87.1
- Text Extraction: 83.3
Weaknesses
- Visual QA: 63.0
- Table Understanding: 68.8