My favorite, which is also up to date, is the ClickHouse playground.
For example:
SELECT * FROM hackernews_history ORDER BY time DESC LIMIT 10;
https://gh-api.clickhouse.tech/play?user=play#U0VMRUNUICogRl...
I subscribe to this issue to keep up with updates:
https://github.com/ClickHouse/ClickHouse/issues/29693#issuec...
And of course, for those who don't know, there is the official API: https://github.com/HackerNews/API
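A minimal sketch of reading one item through that API, assuming the Firebase-hosted endpoint described in the linked repository (`https://hacker-news.firebaseio.com/v0/item/<id>.json`). The helper names `item_url` and `fetch_item` are made up for illustration.

```python
import json
from urllib.request import urlopen

# Base URL as documented in the HackerNews/API repository (v0 of the API).
BASE_URL = "https://hacker-news.firebaseio.com/v0"


def item_url(item_id: int) -> str:
    """Build the URL for a single item (story, comment, job, poll)."""
    return f"{BASE_URL}/item/{item_id}.json"


def fetch_item(item_id: int, timeout: float = 10.0) -> dict:
    """Download one item and decode its JSON body (needs network access)."""
    with urlopen(item_url(item_id), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_item` with any valid item id should return a dict with fields such as `by`, `time` and `type`; nothing here handles rate limits or missing items.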
I didn't know there was an official API! This explains why the data is so readily available in many sources and formats. That's very cool.
With a more straightforward approach, the tool can be reproduced with just a few queries in ClickHouse.
1. Create a table with styles by authors:
2. Calculate and insert style vectors (the insert takes 27 seconds):
3. Find nearest authors (the query takes ~50 ms):
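For anyone who wants to see the shape of those three steps without a database, here is a rough in-memory sketch in Python. It is not the ClickHouse queries from this comment. Assumptions: a "style vector" is a hashed bag-of-words count (the comment does not say how the vectors are computed), "nearest" means smallest cosine distance, and the names `VEC_SIZE`, `style_vector`, `cosine_distance` and `nearest_authors` are hypothetical.

```python
import math
import re
import zlib

VEC_SIZE = 64  # hypothetical dimensionality of a style vector


def style_vector(texts, vec_size=VEC_SIZE):
    """Step 2: count lowercase word tokens, hashed into vec_size buckets."""
    vec = [0] * vec_size
    for text in texts:
        for token in re.findall(r"[a-z']+", text.lower()):
            # crc32 is stable across runs, unlike Python's built-in hash().
            vec[zlib.crc32(token.encode("utf-8")) % vec_size] += 1
    return vec


def cosine_distance(a, b):
    """1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def nearest_authors(styles, name, k=3):
    """Step 3: the k authors whose vectors are closest to `name`."""
    target = styles[name]
    ranked = sorted(
        (cosine_distance(target, vec), other)
        for other, vec in styles.items()
        if other != name
    )
    return [other for _, other in ranked[:k]]


if __name__ == "__main__":
    # Step 1: the "table" is a dict from author name to style vector.
    comments = {  # made-up sample data, not real HN comments
        "alice": ["I think the query is fast", "the query is really fast"],
        "alice_alt": ["I think the insert is fast"],
        "bob": ["lol nope"],
    }
    styles = {name: style_vector(texts) for name, texts in comments.items()}
    print(nearest_authors(styles, "alice", k=2))
```

A full scan like this is linear in the number of authors per lookup, which is also roughly what a brute-force distance query does in a database.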