Hacker News

I've been digitising family photos using this. I scanned each photo and the text written on it, then passed both to an LLM for OCR and used tool calls to get the caption verbatim, the location mentioned and the date in a standard format. That was going to be the end of it, but the OpenAI docs https://platform.openai.com/docs/guides/function-calling?lan... suggest letting the model guess coordinates instead of just grabbing names, so I did both and it was impressive. My favourite was a picture looking out to sea from a pier, where the model pinpointed the exact pier.

imposterr a day ago

Hmm, not sure I understand how you made use of OpenAI to guess the location of a photo. Could you expand on that a bit? Thanks!

notsylver 14 hours ago

I showed the model a picture and any text written on that picture and asked it to guess a latitude/longitude using the tool use API for structured outputs. That was in addition to having it transcribe the handwritten text and extract location names, which was my original goal until I saw how good it was at guessing exact coordinates. It would guess within ~200km on average, even on pictures with no information written on them.
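
For anyone wanting to try something similar, here is a rough sketch of what the structured-output side might look like. This is not the actual code used above: the tool name `record_photo_metadata`, its field names and the helper functions are all made up for illustration, and the exact shape of the tool definition depends on which API and version you call. It only shows the schema, validation of the arguments the model returns, and a haversine distance for checking how far off a guess is. No network calls are made.

```python
import json
import math

# Hypothetical tool definition in the JSON Schema style used by
# function-calling APIs. Names and fields are assumptions for illustration.
PHOTO_TOOL = {
    "type": "function",
    "function": {
        "name": "record_photo_metadata",
        "description": "Record the caption, date and best-guess location of a photo.",
        "parameters": {
            "type": "object",
            "properties": {
                "caption": {"type": "string", "description": "Handwritten text, verbatim"},
                "location_name": {"type": "string"},
                "date": {"type": "string", "description": "ISO 8601 date, e.g. 1974-08-03"},
                "latitude": {"type": "number", "minimum": -90, "maximum": 90},
                "longitude": {"type": "number", "minimum": -180, "maximum": 180},
            },
            "required": ["caption", "latitude", "longitude"],
        },
    },
}


def parse_guess(arguments_json: str) -> dict:
    """Parse the JSON arguments of a tool call and range-check the coordinates."""
    data = json.loads(arguments_json)
    lat = float(data["latitude"])
    lon = float(data["longitude"])
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinates out of range: {lat}, {lon}")
    return {
        "caption": data.get("caption", ""),
        "location_name": data.get("location_name"),
        "date": data.get("date"),
        "latitude": lat,
        "longitude": lon,
    }


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points, on a spherical Earth."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))
```

You would pass something like `PHOTO_TOOL` in the request's tool list, feed the tool call's argument string into `parse_guess`, and, for photos where the true location is known, compare with `haversine_km` to see how close the guess was.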