Just put Zvec, LanceDB, and Qdrant through their paces on a three-collection (text only) dataset with 10k documents per collection.
Average latency across ~500 queries per collection per database:
Qdrant: 21.1ms
LanceDB: 5.9ms
Zvec: 0.8ms
Both Qdrant and LanceDB are running with Inverse Document Frequency enabled, so that adds a slight performance hit; Zvec is running with HNSW.
Overlap of answers between the three is virtually identical with the same default ranking.
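For anyone who wants to reproduce this, here's a minimal sketch of the kind of harness I used (the `query_fn` and warmup count are placeholders, not the actual client code for any of the three databases):

```python
import statistics
import time

def avg_latency_ms(query_fn, queries, warmup=10):
    """Run each query once and report mean wall-clock latency in ms.
    A few warmup queries are issued first so caches and connections
    don't skew the measured samples."""
    for q in queries[:warmup]:
        query_fn(q)                       # warm caches, not measured
    samples = []
    for q in queries:
        t0 = time.perf_counter()
        query_fn(q)
        samples.append((time.perf_counter() - t0) * 1e3)
    return statistics.mean(samples)
```

You'd call it once per collection per database with ~500 queries and compare the returned means.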
So yes, Zvec is incredible, but the gotcha is that its speed is primarily constrained by local disk performance, and the data must live on local disk. You may have a central repository storing the data, but every instance running Zvec needs a local (high-performance) disk attached. I mounted blobfuse2 object storage to test, and Zvec latencies jumped to over 100ms, so disk is almost all that matters.
My take? Right now the way zvec behaves, it will be amazing for on-device vector lookups, not as helpful for cloud vectors.
Author here. Thanks for putting Zvec through its paces and sharing such detailed results—really appreciate the hands-on testing!
Just a bit of context on the storage behavior: Zvec currently uses memory-mapped files (mmap) by default, so once the relevant data is warmed up in the page cache, performance should be nearly identical regardless of whether the underlying storage is local disk or object storage—it's essentially in-memory at that point. The 100ms latency you observed with blobfuse2 likely reflects cold reads (data not yet cached), which can be slower than local disk in practice. Our published benchmarks are all conducted with sufficient RAM and full warmup, so the storage layer's latency isn't a factor in those numbers.
If you're interested in query performance on object storage, we're working on a buffer pool–based I/O mode that will leverage io_uring and object storage SDKs to improve cold-read performance. The trade-off is that in fully warmed‑up, memory‑rich scenarios, this new mode may be slightly slower than mmap, but it should offer more predictable latency when working with remote storage. Stay tuned—this is still under development!
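To give a rough idea of the buffer-pool direction (this is a hypothetical sketch of the general technique, not our actual implementation, and `fetch_page` stands in for whatever local-file or object-storage reader backs it): fixed-size pages are fetched on demand and evicted least-recently-used, so memory use is bounded and cold reads go through an explicit I/O path instead of page faults.

```python
from collections import OrderedDict

PAGE_SIZE = 4096

class BufferPool:
    """Minimal LRU buffer pool over fixed-size pages."""

    def __init__(self, fetch_page, capacity=1024):
        self.fetch_page = fetch_page   # fetch_page(page_no) -> bytes
        self.capacity = capacity       # max pages resident at once
        self.pages = OrderedDict()     # page_no -> bytes, in LRU order

    def read(self, offset, length):
        """Return `length` bytes starting at `offset`, faulting in
        any pages the range touches and evicting LRU pages as needed."""
        first = offset // PAGE_SIZE
        last = (offset + length - 1) // PAGE_SIZE
        out = bytearray()
        for page_no in range(first, last + 1):
            if page_no in self.pages:
                self.pages.move_to_end(page_no)          # hit: refresh LRU
            else:
                self.pages[page_no] = self.fetch_page(page_no)  # miss: fetch
                if len(self.pages) > self.capacity:
                    self.pages.popitem(last=False)       # evict LRU page
            out += self.pages[page_no]
        start = offset - first * PAGE_SIZE
        return bytes(out[start:start + length])

# Usage with an in-memory "file" standing in for remote storage:
data = bytes(range(256)) * 64  # 16 KiB of fake data
pool = BufferPool(lambda n: data[n * PAGE_SIZE:(n + 1) * PAGE_SIZE], capacity=2)
chunk = pool.read(4090, 20)    # read spanning a page boundary
```

The trade-off mentioned above falls out of this shape: every read pays the bookkeeping cost even when fully warm, but misses become explicit operations you can batch through io_uring or an object-storage SDK rather than unpredictable page faults.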