Good point - house-number search isn't there yet in Corviont.
Right now the offline geocoder in the demo does place/street-level search + reverse, but street + house number ("Main St 12") isn't supported yet. It's explicitly on the near-term roadmap: richer geocoding output with house numbers and (optionally) street/area geometry instead of just centerpoints.
Why not package photon?
Photon is solid - but it comes with a very different operational profile than what I am aiming for.
Photon is built on Elasticsearch (Java) - so it tends to mean a heavier index + higher RAM/CPU expectations and more moving parts. That's fine on a beefy server, but it is a rough fit for the "drop-in appliance on small edge/on-prem boxes (amd64/arm64) + simple ops" goal.
Corviont's geocoder is intentionally "boring": a single SQLite file + an HTTP service, built from Nominatim-derived data. Fast startup, low RAM, easy to ship per-region, and it stays consistent with the rest of the stack.
That said - if there is demand for a "server-grade geocoder option" for people already comfortable running Elastic, I am not opposed to offering it as an alternative profile. The default is just optimized for constrained edge hardware and minimal moving parts.
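To make the "single SQLite file" idea concrete, here is a minimal sketch of place/street search plus reverse lookup over centerpoints. This is not Corviont's actual schema or code: the `places` table, its columns, the helper names, and the sample rows (with approximate coordinates) are all assumptions for illustration, and the HTTP layer is left out.

```python
import sqlite3


def build_demo_db(path=":memory:"):
    """Create a tiny, hypothetical places table (not the real Corviont schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE places ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " name_norm TEXT NOT NULL,"  # lowercased name used for prefix search
        " kind TEXT NOT NULL,"       # 'place' or 'street'
        " lat REAL NOT NULL,"
        " lon REAL NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_places_name_norm ON places(name_norm)")
    rows = [
        # Illustrative rows only; coordinates are approximate centerpoints.
        ("Madrid", "place", 40.4168, -3.7038),
        ("Barcelona", "place", 41.3874, 2.1686),
        ("Calle Mayor", "street", 40.4155, -3.7074),
    ]
    conn.executemany(
        "INSERT INTO places (name, name_norm, kind, lat, lon) VALUES (?, ?, ?, ?, ?)",
        [(n, n.lower(), k, lat, lon) for n, k, lat, lon in rows],
    )
    conn.commit()
    return conn


def search(conn, query, limit=5):
    """Prefix search on the normalized name, expressed as an index range scan."""
    q = query.strip().lower()
    cur = conn.execute(
        "SELECT name, kind, lat, lon FROM places"
        " WHERE name_norm >= ? AND name_norm < ?"
        " ORDER BY name_norm LIMIT ?",
        (q, q + "\uffff", limit),
    )
    return [dict(zip(("name", "kind", "lat", "lon"), row)) for row in cur]


def reverse(conn, lat, lon):
    """Return the nearest centerpoint by squared distance in degrees.

    This is a crude full-table scan; a real service would want a spatial index.
    """
    row = conn.execute(
        "SELECT name, kind, lat, lon FROM places"
        " ORDER BY (lat - ?) * (lat - ?) + (lon - ?) * (lon - ?) LIMIT 1",
        (lat, lat, lon, lon),
    ).fetchone()
    return dict(zip(("name", "kind", "lat", "lon"), row)) if row else None
```

With the demo rows, `search(conn, "mad")` matches Madrid and `reverse(conn, 41.4, 2.2)` picks Barcelona. House-number search would need extra data (address points or interpolation ranges) on top of something like this, which is exactly the part that isn't there yet.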
Have you measured actual memory and disk requirements of a photon OpenSearch index vs your sqlite database?
No. When I looked at Photon and saw that it involves running Java plus an OpenSearch/Elasticsearch backend on the device, I assumed it would be heavier in terms of memory and moving parts than my setup (single SQLite file + small HTTP API).
Have you (or anyone here) actually run Photon on edge-class hardware? If you have real-world numbers, I'd be interested in seeing them. When I add house-number search, Photon might be an easier route than enhancing my current approach.
> involves running Java plus an OpenSearch/Elasticsearch backend on the device
Photon is just a single process, and OpenSearch runs inside it (though you can also run Photon and OpenSearch separately). Saying that "Java" means more memory is wrong in general: the underlying technology, Lucene, is heavily memory-optimized.
Well, trying it out is better than a thousand words.
Let's start with an index of the whole of Spain: 2.4 GB download, 4 GB on disk: https://gist.github.com/dkourilov/e243270684b5973f1fac005c78...
I'd say it's pretty usable for running an EU-sized country or several US states on any commodity PC. For embedded devices, it really depends on the device. On a Raspberry Pi it should be fine for batch geocoding, but realtime (typeahead) search will definitely lag.
Thanks both - appreciate the clarification and the Spain datapoint.
Those numbers look pretty reasonable. I’ll keep Photon in mind and, as I get time to benchmark different approaches on a few representative regions/hardware, I’ll use the results to decide what the best way forward is - and I’ll publish the numbers when I do.
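A comparison like that could start from a small harness along these lines. It is only a sketch under assumptions: `disk_usage_bytes` and `latency_ms` are hypothetical helper names, `fn` stands in for whatever issues one geocoding query against either backend, and memory is deliberately left out because measuring it reliably is platform-specific.

```python
import os
import statistics
import time


def disk_usage_bytes(path):
    """Size of a single file (e.g. a SQLite database) or a whole index directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def latency_ms(fn, queries, repeats=3):
    """Call fn(query) for every query, `repeats` times, and summarize wall-clock latency."""
    samples = []
    for _ in range(repeats):
        for q in queries:
            start = time.perf_counter()
            fn(q)
            samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Simple index-based p95, clamped to the last sample for small runs.
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "n": len(samples),
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
    }
```

Running the same query list through both backends on the same box would give comparable disk and latency figures; the p95 is the one to watch for typeahead, since batch geocoding tolerates slow outliers much better.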