• GPT-4.1 (flagship): top‑tier intelligence, full endpoints & premium features
• GPT-4.1-mini: balances performance, speed & cost
• GPT-4.1-nano: prioritizes throughput & low cost with streamlined capabilities
All three share a 1 million‑token context window (vs 128k on GPT-4o and 200k on o1/o3) and excel at instruction following, tool calls & coding.
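To make the context-window comparison concrete, here is a minimal sketch that checks which models could hold a prompt of a given size. The helper name `models_that_fit` and the `reserved_output_tokens` parameter are hypothetical, the 1M figure is the rounded value quoted above, and the 128k/200k figures for GPT-4o and o1 are assumptions to verify against current model documentation.

```python
# Context-window sizes in tokens (assumed, rounded values; verify before relying on them).
CONTEXT_WINDOW = {
    "gpt-4.1": 1_000_000,
    "gpt-4.1-mini": 1_000_000,
    "gpt-4.1-nano": 1_000_000,
    "gpt-4o": 128_000,
    "o1": 200_000,
}


def models_that_fit(prompt_tokens, reserved_output_tokens=0):
    """Return the models whose window covers the prompt plus reserved output.

    Hypothetical helper for illustration only; it does not call any API.
    """
    needed = prompt_tokens + reserved_output_tokens
    return [name for name, window in CONTEXT_WINDOW.items() if needed <= window]


if __name__ == "__main__":
    # A 150k-token prompt exceeds the 128k window but fits the larger ones.
    print(models_that_fit(150_000))
```

Token counts would have to come from a real tokenizer; this sketch only compares numbers that are already known.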
Benchmarks, GPT-4.1 vs GPT-4o:
• AIME ’24: 48.1% vs 13.1% (~3.7× gain)
• MMLU: 90.2% vs 85.7% (+4.5 pp)
• Video‑MME: 72.0% vs 65.3% (+6.7 pp)
• SWE‑bench Verified: 54.6% vs 33.2% (+21.4 pp)
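The gains in parentheses follow from simple arithmetic: a percentage-point (pp) gain is the difference between the two scores, and the "×" figure is their ratio. The short sketch below recomputes them from the scores listed above; the function names `pp_gain` and `ratio` are illustrative, not from any library.

```python
# (new score %, baseline score %) as listed in the benchmark bullets above.
BENCHMARKS = {
    "AIME '24": (48.1, 13.1),
    "MMLU": (90.2, 85.7),
    "Video-MME": (72.0, 65.3),
    "SWE-bench Verified": (54.6, 33.2),
}


def pp_gain(new, old):
    """Absolute gain in percentage points, rounded to one decimal."""
    return round(new - old, 1)


def ratio(new, old):
    """Relative improvement as a multiple of the baseline, rounded to one decimal."""
    return round(new / old, 1)


if __name__ == "__main__":
    for name, (new, old) in BENCHMARKS.items():
        print(f"{name}: +{pp_gain(new, old)} pp, {ratio(new, old)}x")
```

Percentage points and ratios tell different stories: AIME '24 shows the largest ratio because its baseline is low, while SWE-bench Verified shows the largest absolute gain.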