Hacker News

This looks sick!

Did you build this for yourself?

I built this for myself because I hated running a large Elasticsearch instance at work and wanted something that would autoscale and allow reindexing data. I also had a lot of experience running a large custom graph database built on Bigtable and Elasticsearch, which I thought could be unified into a single database to cut costs. I started adding an embedding index for fun, based on some Google papers, and now here we are!

perfmode 4 hours ago

What Google papers?

kingcauchy 4 hours ago

Not strictly Google, there's Microsoft/Bing work too. Here are the top ones from my notes:

https://arxiv.org/abs/2410.14452 (SPFresh), https://arxiv.org/abs/2111.08566 (SPANN), https://arxiv.org/abs/2405.12497 (RaBitQ), https://arxiv.org/abs/2509.06046 (DiskANN)

I also used a variety of blog posts and reference implementations!

If you're curious, the vector index uses a Rabit[Q]uantized hierarchical balanced clustering algorithm, and the sparse index uses a chunked segment index. Happy to discuss more!
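
For readers unfamiliar with the quantization side, here is a minimal Go sketch of the general idea behind 1-bit codes inside a cluster: keep only the sign of each component relative to the cluster centroid, then compare codes with a popcount. This is a simplified illustration, not the project's implementation and not full RaBitQ, which as I understand it also applies a random rotation and a corrected distance estimator. The names `signQuantize` and `hammingDistance` are made up for this example.

```go
package main

import (
	"fmt"
	"math/bits"
)

// signQuantize packs one bit per dimension into 64-bit words:
// the bit is set when the component lies above the centroid's component.
// Hypothetical helper for illustration; assumes len(vec) == len(centroid).
func signQuantize(vec, centroid []float32) []uint64 {
	code := make([]uint64, (len(vec)+63)/64)
	for i, v := range vec {
		if v-centroid[i] > 0 {
			code[i/64] |= 1 << (uint(i) % 64)
		}
	}
	return code
}

// hammingDistance counts the bits that differ between two codes
// of equal length. Fewer differing bits suggests closer vectors.
func hammingDistance(a, b []uint64) int {
	d := 0
	for i := range a {
		d += bits.OnesCount64(a[i] ^ b[i])
	}
	return d
}

func main() {
	centroid := []float32{0, 0, 0, 0}
	a := signQuantize([]float32{1, -2, 3, -4}, centroid)
	b := signQuantize([]float32{1, 2, -3, -4}, centroid)
	fmt.Println(hammingDistance(a, b)) // prints 2
}
```

The XOR-and-popcount loop is the kind of kernel that tends to benefit from SIMD, since it processes many dimensions per instruction.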

perfmode 4 hours ago

Curious if you’re using any SIMD optimizations for numerical calculations.

kingcauchy 4 hours ago

Yes, we use SIMD heavily: https://github.com/ajroetker/go-highway. I also added SME support on Darwin for most algorithms. We use it in the full-text index, throughout the vector indexes, and especially for the ML inference we do in Go.
