I see that you're looking for clusters within PCA projections -- you should look for deeper structure with hot new dimensionality reduction algorithms like PaCMAP or LocalMAP!
I've been working on a project related to a sensemaking tool called Pol.is [1], reprojecting its wiki survey data with these newer algorithms instead of PCA, and it's amazing what new insight they uncover! (Quick code sketch after the links below.)
https://patcon.github.io/polislike-opinion-map-painting/
Painted groups: https://t.co/734qNlMdeh
(Sorry, only really works on desktop)
[1]: https://www.technologyreview.com/2025/04/15/1115125/a-small-...
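A minimal sketch of that kind of reprojection, assuming the pacmap package (pip install pacmap) and synthetic stand-in data in place of the real vote matrix (a Pol.is export is a participants x comments matrix of agree/disagree/pass votes, usually with missing entries to handle):

    import numpy as np
    import pacmap

    rng = np.random.default_rng(0)
    # Participants x comments vote matrix: -1 = disagree, 0 = pass, 1 = agree.
    # Random stand-in; real survey data is sparser and has missing votes.
    votes = rng.choice([-1.0, 0.0, 1.0], size=(500, 80))

    # n_neighbors is the main knob; 10 is the library default.
    reducer = pacmap.PaCMAP(n_components=2, n_neighbors=10)
    xy = reducer.fit_transform(votes)
    print(xy.shape)  # (500, 2): 2-D coordinates you can paint groups on

Per the pacmap project's docs, recent releases also include a LocalMAP class with the same fit_transform interface, if you want that variant.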
Thanks for pointing those out -- I hadn't seen PaCMAP or LocalMAP before, but they definitely look like the kind of structure-preserving approach that would fit this data better than PCA. Appreciate the nudge -- going to dig into those a bit more.
Try TDA ("mapper", or really anything based on connectivity computed from kernel density) -- it's a whole new world.
This ain't your parents' "factor analysis".
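For context, the classic Mapper pipeline filters a point cloud through a lens function, covers the lens range with overlapping bins, and clusters within each bin to build a graph; a density lens gives the kernel-density flavor mentioned above. A minimal sketch with the kmapper package (random stand-in data, parameter values made up):

    import numpy as np
    import sklearn.cluster
    from sklearn.neighbors import KernelDensity
    import kmapper as km

    X = np.random.default_rng(0).normal(size=(400, 10))  # stand-in point cloud

    # Lens: log-density at each point, so connectivity in the output graph
    # is driven by kernel density estimates.
    lens = KernelDensity(bandwidth=1.0).fit(X).score_samples(X).reshape(-1, 1)

    mapper = km.KeplerMapper(verbose=0)
    graph = mapper.map(
        lens, X,
        cover=km.Cover(n_cubes=10, perc_overlap=0.3),
        clusterer=sklearn.cluster.DBSCAN(eps=1.5, min_samples=3),
    )
    mapper.visualize(graph, path_html="mapper_graph.html")  # interactive HTML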
Ooooo I will definitely check it out! It's strangely hard to find any comparisons in YouTube videos -- it seems TDA isn't actually a dimensionality reduction algorithm, but something closely related, maybe?
LLM interpretability also uses sparse autoencoders to find concept representations (https://openai.com/index/extracting-concepts-from-gpt-4/), and, more recently, linear probes.
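For anyone unfamiliar: a sparse autoencoder here is an overcomplete autoencoder trained with a sparsity penalty on its hidden code, so individual hidden units tend to fire for individual concepts. A toy PyTorch sketch (random data standing in for real model activations, dimensions made up; not the linked paper's actual training setup):

    import torch
    import torch.nn as nn

    d_model, d_dict = 256, 1024  # dictionary wider than the activation space

    class SparseAutoencoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.enc = nn.Linear(d_model, d_dict)
            self.dec = nn.Linear(d_dict, d_model)

        def forward(self, x):
            z = torch.relu(self.enc(x))  # non-negative hidden code
            return self.dec(z), z

    sae = SparseAutoencoder()
    opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
    acts = torch.randn(4096, d_model)  # stand-in for LLM activations

    for _ in range(200):
        recon, z = sae(acts)
        # Reconstruction error plus an L1 penalty that pushes most of z
        # to zero, which is what makes the learned features "sparse".
        loss = ((recon - acts) ** 2).mean() + 1e-3 * z.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

A linear probe, by contrast, is just a linear classifier trained directly on those activations to test whether a concept is linearly decodable.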
I've had much better luck with UMAP than with PCA or t-SNE for reducing embeddings.
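For reference, the umap-learn call is a one-liner; n_neighbors trades off local vs. global structure and min_dist controls how tightly points clump (toy data below):

    import numpy as np
    import umap  # umap-learn package

    emb = np.random.default_rng(0).normal(size=(1000, 384))  # stand-in embeddings
    xy = umap.UMAP(n_components=2, n_neighbors=15, min_dist=0.1).fit_transform(emb)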
PaCMAP (and its descendant LocalMAP) is comparable to t-SNE at preserving both local and global structure, but without much fussing over finicky hyperparameters.
https://youtu.be/sD-uDZ8zXkc
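To make the hyperparameter point concrete, a toy side-by-side (random data, not a benchmark): t-SNE asks you to choose a perplexity and is sensitive to init choices, while PaCMAP runs here entirely on its defaults:

    import numpy as np
    import pacmap
    from sklearn.manifold import TSNE

    X = np.random.default_rng(0).normal(size=(300, 50))

    # t-SNE: output changes noticeably with perplexity / init choices.
    xy_tsne = TSNE(n_components=2, perplexity=30.0, init="pca").fit_transform(X)

    # PaCMAP: defaults only.
    xy_pacmap = pacmap.PaCMAP(n_components=2).fit_transform(X)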