Hacker News

Personally I like this approach a lot

https://scikit-learn.org/stable/modules/generated/sklearn.ma...

I think other methods are more fashionable today

https://scikit-learn.org/stable/modules/manifold.html

particularly multidimensional scaling, but personally I think t-SNE plots are less pathological (they don't have as many of these crazy cusps that make me think the plot is projecting down from a higher-dimensional surface which is near-parallel to the page)

After processing documents with BERT, I really like the clusters generated by the simple and old k-means algorithm

https://scikit-learn.org/stable/modules/generated/sklearn.cl...

It has the problem that it always finds 20 clusters if you set k=20, and what really oughta be one big cluster might get treated as three little clusters, but the clusters I get from it reflect the way I see things.
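
To make the "always finds k clusters" point concrete, here is a rough sketch of plain Lloyd's-algorithm k-means in pure Python. It is not scikit-learn's implementation (there you would reach for sklearn.cluster.KMeans with n_clusters=k), and the 10x10 grid, the take-the-first-k-points initialisation, and the helper names sq_dist and kmeans are all made up for illustration. The grid is one uniform blob with no internal structure, yet with k=3 it still gets carved into three pieces.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids).

    Illustrative assumption: the first k points are used as the
    initial centroids. Real libraries use smarter schemes such as
    k-means++.
    """
    centroids = [points[i] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return labels, centroids


# One featureless blob: a 10x10 grid of evenly spaced points.
points = [(float(x), float(y)) for x in range(10) for y in range(10)]

k = 3
labels, centroids = kmeans(points, k)

# k-means has no way of answering "this is really one cluster".
print("distinct clusters found:", len(set(labels)))
```

Nothing in the algorithm checks whether the split is warranted, which is why the choice of k ends up being a judgment call about how you want to see the data.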