This is a false dichotomy.

You introduce two categories, state that LLMs do not belong to one of them, and then conclude that they must belong to the other. In reality, the distinction between the two categories is not so clear.

The transformer architecture is far from trivial, and the scale of a typical LLM is staggering: dozens of stacked layers and billions of parameters. This goes way beyond linear regression.