Hacker News

This is great, I'll be returning to this tool often. Thanks.

A few suggestions and ideas for further projects:

-allow for "keyword" matches, -negate operators, and "multi-word string" searches. [PubMed](https://pubmed.ncbi.nlm.nih.gov/advanced) is what I'd consider an ideal search interface.

-allow for regex or direct SQL lookups, with a capped query time and rate limiting by proof of work (PoW). For example, if the server is under load, require a token from something like [Anubis](https://anubis.techaro.lol/) and lower the maximum DB query time.

-index the titles of all discussion/forum-type posts in a vector DB for semantic search, and add an option to sort by replies (like [Answer Overflow](https://github.com/AnswerOverflow/AnswerOverflow)). This would make it possible to find relevant discussions among ~60B messages. ScyllaDB doesn't support vector search, so I'd suggest something like [usearch](https://github.com/unum-cloud/usearch) for a detached index. Embedding models are faster and smaller than most people realize; pick whatever's on top of the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard) after deciding on size.

-calculate the Jaccard similarity (user overlap) between Discord server members. This would allow for searching in "similar" servers and, potentially, mapping Discord, like these maps of [GitHub](https://anvaka.github.io/map-of-github) and [Reddit](https://anvaka.github.io/map-of-reddit).

-fix doxing. Searching by <@userid> is currently possible.

-expect the alternative to the Cloudflare captcha to be abused; it's too simple for modern solvers.

-open source the stack? I'm interested in the scraper.
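
To make the Jaccard suggestion concrete, here is a rough sketch of what I mean. It assumes you can get each server's members as a set of user IDs. The server names, the IDs, and the `similar_servers` helper with its `threshold` parameter are all made up for illustration and are not taken from the tool.

```python
from itertools import combinations


def jaccard(a: set[int], b: set[int]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similar_servers(
    members: dict[str, set[int]], threshold: float = 0.1
) -> list[tuple[str, str, float]]:
    """Return server pairs whose member overlap meets the threshold, best first."""
    pairs = []
    for (name_a, set_a), (name_b, set_b) in combinations(members.items(), 2):
        score = jaccard(set_a, set_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Hypothetical data: server name -> set of member user IDs.
members = {
    "rust-lang": {1, 2, 3, 4},
    "python": {3, 4, 5, 6},
    "cooking": {7, 8},
}

for a, b, score in similar_servers(members):
    print(f"{a} <-> {b}: {score:.2f}")
```

Comparing every pair is quadratic in the number of servers, so at real scale you'd probably want an approximation such as MinHash instead of exact set intersections. The resulting pair scores are the edge weights you would feed into a map like the ones linked above.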