I played a round of GeoGuessr against it, and while it did a shockingly good job compared to what I was expecting, it still lags behind even novice human players.
The locations and its guesses were:
Bliss, Idaho - Burns, Oregon (273 miles away)
Quilleco, Biobio, Chile - Eugene, Oregon (6,411 miles away)
Dettighofen, Switzerland - Mühldorf, Germany (228 miles away)
Pretoria, South Africa - Johannesburg, South Africa (36 miles away)
Rockhampton, Australia - Gold Coast, Australia (437 miles away)
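For anyone who wants to sanity-check distances like these, here is a minimal sketch of a great-circle (haversine) calculation. The coordinates are approximate city-centre values assumed for illustration, not the actual in-game locations, and `haversine_miles` is just a name picked for this example.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Assumed, approximate city-centre coordinates (not the real round locations).
pretoria = (-25.7479, 28.2293)
johannesburg = (-26.2041, 28.0473)

print(round(haversine_miles(*pretoria, *johannesburg)), "miles")
```

With city-centre coordinates the result should land in the same ballpark as the 36 miles above; the exact figure depends on where in each city the pin actually was.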
Okay, I decided to benchmark a bunch of AI models with GeoGuessr. One round each on the Diverse World map; here's how they scored out of 25,000:
Claude 3.7 Sonnet: 22,759
Qwen2.5-Max: 22,666
o3-mini-high: 22,159
Gemini 2.5 Pro: 18,479
Llama 4 Maverick: 14,316
mistral-large-latest: 10,405
Grok 3: 5,218
DeepSeek R1: 0
command-a-03-2025: 0
Nova Pro: 0
Neat, thanks for doing this!
How does Google Lens compare?
I tried it but as far as I can tell Google Lens doesn't give you a location - it just describes generally what you're looking at.
What about o4-mini-high?
OpenAI's naming confuses me, but I ran o4-mini-2025-04-16 through a game and it got 23,885.
Interesting. That supports what they said: this is the model with good visual reasoning.
I just took a picture from my own front porch of the street and the houses opposite. It said 'probably Australia but I'd need more info'.
I said, give me your best guess.
And it guessed Canberra, Australia. Where I'm sitting right now drinking a Martini. Pretty spectacular.