This is the same LLM-splaining that practically every enthusiast does: either "more hardware" or "more optimization" will solve all the problems and usher in a new era of blah blah blah. But it won't. It won't solve LLM slop or hallucinations, which are inherent to how LLMs work, and which is why that bubble is already looking pretty unstable.
The goal isn't to build perfect assistants with LLMs. The goal is to build something that is just good enough to be useful. It doesn't even need to be a full-fledged conversational LLM; it could just be an LM.
Lots of tasks don't require world knowledge to be encoded into the model. For example, "summarize my emails" is a task that a fairly small model trained on plain text can handle.
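To make that concrete, here's a minimal sketch of the kind of thing I mean, assuming the Hugging Face transformers library and a small off-the-shelf checkpoint (t5-small, ~60M parameters, picked just as an example; any small summarizer would do):

```python
# Summarizing an email with a small model -- no conversational LLM required.
# Assumes: pip install transformers torch
from transformers import pipeline

# t5-small is tiny by LLM standards and runs fine on CPU.
summarizer = pipeline("summarization", model="t5-small")

email_body = (
    "Hi team, the quarterly review is moved to Thursday at 3pm. "
    "Please update your slides by Wednesday noon and send them to Dana. "
    "Lunch will be provided."
)

result = summarizer(email_body, max_length=30, min_length=5, do_sample=False)
print(result[0]["summary_text"])
```

The point isn't that t5-small is great at this; it's that "good enough to be useful" is reachable far below the scale of a frontier chatbot.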
There are also unexplored avenues. For example, if I had the hardware, I would take an early version of GPT, train it on additional data, and when the training run completes, diff the resulting model against the original. Those weight diffs, paired with the data that produced them, become the training set for a second model: one that learns to adjust parameter weights directly and encode new information into the base model, giving it persistent memory.
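The "diff the model" step is the easy, concrete part. Here's a minimal sketch of it, assuming PyTorch, the public gpt2 checkpoint as the "early GPT", and a hypothetical local checkpoint (./gpt2-finetuned) produced by the extra training run; the meta-model that would consume these deltas is the speculative part and isn't shown:

```python
# Compute the per-parameter weight delta between a base model and the same
# model after further training. Each (training data, delta) pair would be
# one example in the hypothetical "memory writer" model's training set.
import torch
from transformers import GPT2LMHeadModel

base = GPT2LMHeadModel.from_pretrained("gpt2")                # original weights
tuned = GPT2LMHeadModel.from_pretrained("./gpt2-finetuned")   # hypothetical checkpoint after the extra run

tuned_state = tuned.state_dict()
delta = {
    name: tuned_state[name] - param    # what the extra training actually changed
    for name, param in base.state_dict().items()
}

torch.save(delta, "weight_delta.pt")
```

Whether a second model can actually learn the mapping from new text to useful weight deltas is an open question, but the data to try it on is cheap to produce this way.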