Yeah, it's just a semantic pet peeve. Let me ask you this: what is a "Language Model", if this is a "Large Language Model"? Conversely, if a 1.5B model is "Large", then what are the recent 1T-param models? "Superlarge"?
In my own very humble opinion, it becomes "Large" when it outgrows non-specialized hardware. So currently, a model that requires more than 32GB of VRAM is large, as that's roughly where high-end gaming GPUs cut off.
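For a rough sense of where that 32GB line falls, here's a back-of-envelope sketch (the fp16, weights-only assumption is mine; activations and KV cache would push the real number higher):

    # Weights-only memory estimate: parameter count times bytes per parameter.
    # fp16/bf16 = 2 bytes per parameter; ignores activations and KV cache.
    def vram_gb(params_billions, bytes_per_param=2):
        return params_billions * bytes_per_param  # (B * 1e9 params * bytes) / 1e9 = GB

    for name, b in [("1.5B", 1.5), ("7B", 7), ("16B", 16), ("70B", 70)]:
        print(f"{name}: ~{vram_gb(b):.0f} GB at fp16")
    # 1.5B: ~3 GB, 7B: ~14 GB, 16B: ~32 GB, 70B: ~140 GB

So by that rule, "Large" starts somewhere around 16B parameters at fp16.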
And btw, there is no way you can train a language model on a CPU, even with DDR5, unless you're willing to wait a whole week for a single training cycle. Give it a go! I know I did; it's an order of magnitude away from being feasible.
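To put rough numbers on that gap, a common approximation for training compute is 6 × parameters × tokens FLOPs; the throughput figures below are my assumptions, not measurements:

    # Back-of-envelope training time: 6 * N * D FLOPs at some sustained throughput.
    # Assumed throughputs: ~0.5 TFLOPS for a desktop CPU, ~50 TFLOPS for a gaming GPU.
    def training_days(params, tokens, flops_per_sec):
        return 6 * params * tokens / flops_per_sec / 86400

    params, tokens = 125e6, 3e9  # a small GPT-style model on a modest corpus
    print(f"CPU: ~{training_days(params, tokens, 0.5e12):.0f} days")  # ~52 days
    print(f"GPU: ~{training_days(params, tokens, 50e12):.1f} days")   # ~0.5 days

Even with generous assumptions about CPU matmul throughput, you're orders of magnitude behind.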
> Yeah it's just a semantic pet peeve.
I'm not sure. Microsoft calls Phi-4 a small language model, so the distinction is considered meaningful by some people working in the space. My own view is that the term "LLM" implies something about the capabilities of the model in 2026. Maybe there's no hard definition of the term, but whatever the definition is, the model in the article wouldn't make the cut.
Calling anything "large" in computing is problematic since hardware keeps improving. GPT-1 was an LLM in 2018 and had 117M parameters; when did it stop being large?
GPT would have been a better term than LLM, but unfortunately became too associated with OpenAI. And then, what about non-transformer LLMs? And multimodal LLMs?
Maybe we should just give up, shrug and call it "AI".