So I'm guessing we passed HN through an LLM, looking for book mentions.

A number of posts here are flagging disambiguation issues; I've run into this a lot.

I've been dealing with the problem using cosine distance between embeddings, but I find it tricky to verify at scale.
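For anyone who hasn't tried it, here's roughly what I mean by the cosine distance approach. This is only a minimal sketch: `disambiguate`, the `candidates` dict, and the `max_distance` cutoff of 0.3 are names and values I made up for illustration, and the embeddings are assumed to come from whatever model you already use.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def disambiguate(mention_vec, candidates, max_distance=0.3):
    """Pick the candidate whose embedding is closest to the mention.

    candidates: dict mapping a title to its embedding (hypothetical layout).
    max_distance: illustrative cutoff, not a tuned value.
    Returns (title, distance), or None if nothing is close enough.
    """
    scored = ((title, cosine_distance(mention_vec, vec))
              for title, vec in candidates.items())
    best = min(scored, key=lambda pair: pair[1], default=None)
    if best is None or best[1] > max_distance:
        return None
    return best
```

The cutoff is the part that's hard to verify at scale: any fixed value trades false matches against missed ones, and checking which way it errs still means labelling a sample by hand.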

Anyone else struggling with this?

That would be ruinous; it's probably just doing a simple full-text search against a database of titles.
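To be clear about what I'm picturing, something like the sketch below. It's purely a guess at the approach, not how the site actually works; `find_title_mentions` and the whole-word matching rule are my own assumptions.

```python
import re


def _normalize(text):
    """Lowercase and collapse text to space-separated word tokens."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def find_title_mentions(comment, titles):
    """Return every title that appears as a whole-word phrase in the comment."""
    haystack = _normalize(comment)
    hits = []
    for title in titles:
        needle = _normalize(title)
        # Skip titles that normalize to nothing, then match on word boundaries.
        if needle and re.search(r"\b" + re.escape(needle) + r"\b", haystack):
            hits.append(title)
    return hits


print(find_title_mentions("I just finished Dune, loved it.",
                          ["Dune", "It", "Snow Crash"]))
# prints ['Dune', 'It']
```

Note the false hit on "It": plain matching can't tell a short title from an ordinary word, which would explain the disambiguation complaints.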