Nice LLM-generated text.

Now go read https://transformer-circuits.pub/2024/scaling-monosemanticity/ or https://arxiv.org/abs/2506.19382 to see why that text is outdated. Or really, read pretty much any mechanistic interpretability paper from the past year or two.

Hint: the first paper is titled "Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet" and you can ctrl-f for "We find three different safety-relevant code features: an unsafe code feature 1M/570621 which activates on security vulnerabilities, a code error feature 1M/1013764 which activates on bugs and exceptions"
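For anyone wondering what a "feature" even is in that quote: those come out of sparse autoencoders (SAEs) trained on the model's internal activations. Here's a minimal toy sketch of the idea in PyTorch (toy dimensions, random stand-in data, and a toy training loop; none of this is Anthropic's actual code or setup):

    # Sketch of the SAE setup behind features like 1M/570621: train an
    # overcomplete autoencoder with an L1 sparsity penalty on a model's
    # activations, then read off which dictionary features fire on which inputs.
    import torch
    import torch.nn as nn

    class SparseAutoencoder(nn.Module):
        def __init__(self, d_model=512, d_features=4096):
            super().__init__()
            self.enc = nn.Linear(d_model, d_features)  # f = ReLU(W_e x + b_e)
            self.dec = nn.Linear(d_features, d_model)  # x_hat = W_d f + b_d

        def forward(self, x):
            f = torch.relu(self.enc(x))
            return self.dec(f), f

    sae = SparseAutoencoder()
    opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
    acts = torch.randn(1024, 512)  # stand-in for residual-stream activations

    for _ in range(100):
        recon, feats = sae(acts)
        # reconstruction loss + L1 penalty -> sparse, more monosemantic features
        loss = (recon - acts).pow(2).mean() + 1e-3 * feats.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

    # A "feature" is one dictionary direction; you interpret it by looking at
    # which inputs make it fire hardest.
    with torch.no_grad():
        _, feats = sae(acts)
    print("most active features:", feats.mean(0).topk(5).indices.tolist())

The L1 term is the whole point: it pushes each dictionary direction toward firing on one narrow concept, which is why you can end up with nameable features like "unsafe code" instead of polysemantic neuron soup.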

Who said I want a discussion? I want ignorant people to STOP talking instead of holding forth like they know everything.

Your entire argument is derived from a pseudoscientific field without any peer-reviewed research. Mechanistic interpretability is a joke invented by AI firms to sell chatbots.

Lol, that's a stupid-ass response, especially when half the papers are from Chinese universities. You think Chinese universities are trying to sell ChatGPT subscriptions? Ridiculous. You're just falling behind in tech knowledge.

And apparently you think peer-reviewed papers presented at NeurIPS and other top conferences are pseudoscience. (For the people not versed in ML: NeurIPS is where the 2017 paper "Attention Is All You Need", which kicked off the modern ML revolution, was presented.)

https://neurips.cc/virtual/2023/poster/72666

https://jmlr.org/beta/papers/v26/23-0058.html

https://proceedings.mlr.press/v267/palumbo25a.html

https://iclr.cc/virtual/2026/poster/10011755