This is the same LLM-splaining that practically every enthusiast does: either "more hardware" or "more optimization" will solve all the problems and usher in a new era of blah blah blah. But it won't. It won't solve LLM slop or hallucinations, which are inherent to how LLMs work, and which is why that bubble is already looking pretty unstable.
The goal isn't to build perfect assistants with LLMs. The goal is to build something that is just good enough to be useful. It doesn't even need to be a full-fledged conversational LLM; it could just be an LM.
Lots of tasks don't require world knowledge to be encoded into the model. For example, "summarize my emails" is a task that a fairly small model trained on plain text can handle.
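To make that concrete, here's a minimal sketch of the kind of thing I mean, assuming the Hugging Face transformers library and a small off-the-shelf checkpoint (t5-small, ~60M parameters, picked just as an example; any small summarizer would do):

```python
# Summarizing an email with a small model -- no conversational LLM required.
# Assumes: pip install transformers torch
from transformers import pipeline

# t5-small is tiny by LLM standards and runs fine on CPU.
summarizer = pipeline("summarization", model="t5-small")

email_body = (
    "Hi team, the quarterly review is moved to Thursday at 3pm. "
    "Please update your slides by Wednesday noon and send them to Dana. "
    "Lunch will be provided."
)

result = summarizer(email_body, max_length=30, min_length=5, do_sample=False)
print(result[0]["summary_text"])
```

The point isn't that t5-small is great at this; it's that "good enough to be useful" is reachable far below the scale of a frontier chatbot.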
There are also unexplored avenues. For example, if I had the hardware, I would take an early version of GPT, train it on additional data, and when the training run completes, diff the resulting model against the original. Those weight diffs, paired with the data that produced them, become the training set for a second model: one that learns to adjust parameter weights directly and encode new information into the base model, giving it persistent memory.
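The "diff the model" step is the easy, concrete part. Here's a minimal sketch of it, assuming PyTorch, the public gpt2 checkpoint as the "early GPT", and a hypothetical local checkpoint (./gpt2-finetuned) produced by the extra training run; the meta-model that would consume these deltas is the speculative part and isn't shown:

```python
# Compute the per-parameter weight delta between a base model and the same
# model after further training. Each (training data, delta) pair would be
# one example in the hypothetical "memory writer" model's training set.
import torch
from transformers import GPT2LMHeadModel

base = GPT2LMHeadModel.from_pretrained("gpt2")                # original weights
tuned = GPT2LMHeadModel.from_pretrained("./gpt2-finetuned")   # hypothetical checkpoint after the extra run

tuned_state = tuned.state_dict()
delta = {
    name: tuned_state[name] - param    # what the extra training actually changed
    for name, param in base.state_dict().items()
}

torch.save(delta, "weight_delta.pt")
```

Whether a second model can actually learn the mapping from new text to useful weight deltas is an open question, but the data to try it on is cheap to produce this way.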