This post amuses me: after solving the problem in ~2 seconds, the author boils the ocean to get that down further, then ends by questioning what the problem statement even is.

Classic software engineer pitfall. First gather the requirements!

Second, if their initial interpretation was correct, and it's a one-shot operation, then the initial solution solves it. Done! Why go any further?

I get that it's fun to muse over solutions to these types of problems, but the absurdity of it all made me laugh. Jeff's answer was the best because it describes a solution that makes its assumptions crystal clear while outlining a straightforward implementation. If you wanted something else, it's obvious you need to clarify.

They don't actually solve the problem in 2 seconds - at that point, they are running on a sample of only 3,000 vectors! Then they get it down further, but still find it will take a loooooong time to get through all 3B:

"With these small improvements, we’ve already sped up inference to ~13 seconds for 3 million vectors, which means for 3 billion, it would take 1000x longer, or ~3216 minutes." ...which is about two days.
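For what it's worth, the "about two days" figure is just a unit conversion of the quoted 3216 minutes. A quick sanity check in Python (the helper name `minutes_to_days` is mine, not something from the post):

```python
def minutes_to_days(minutes: float) -> float:
    """Convert a duration in minutes to days."""
    return minutes / 60 / 24


# Estimate quoted from the post for the full 3 billion vectors.
quoted_minutes = 3216

days = minutes_to_days(quoted_minutes)
print(f"{quoted_minutes} minutes is about {days:.1f} days")
```

This only converts the quoted estimate; it doesn't re-derive it from the per-batch timing.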