>But how different would SI and GPT4 appear in response to everyday chit-chat? What if we ask the SI-based sequence predictor how to cure cancer?

I suspect that a lot of the LLM prompts that elicit useful capabilities out of imperfect sequence predictors like GPT-4 are in fact far more likely to show up in the context of "prompting an LLM" than "in the wild".

As such, to predict the token following such a prompt, an SI-based sequence predictor would want to predict the output of whichever language model was most likely to have been prompted, conditional on the prompt/response pair making it into the training set.
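One rough way to write that down (my own informal notation, not anything canonical from the SI literature): the prediction is a mixture over candidate generators $M$ (humans, various LLMs, the predictor itself), weighted by how likely each one is to have produced a response to this prompt that then landed in the training distribution,

$$P(x_{t+1} \mid \text{prompt}) \;=\; \sum_{M} P\!\left(M \,\middle|\, \text{prompt},\ \text{pair enters training set}\right)\, P_M(x_{t+1} \mid \text{prompt}).$$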

If the answer to "which model was most likely to be prompted" is "the SI-based sequence predictor itself", then it needs to predict which of its own outputs are likely to make it into the training set, which requires it to have a probability distribution over its own output. I think the "did the model successfully predict the next token" reward function is underspecified in that case.
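One way to cash out "underspecified" (a simplified sketch, ignoring the conditioning on making it into the training set): if the predictor's output distribution $q$ is also the distribution the next token is actually drawn from, its expected log-loss is

$$\mathbb{E}_{x \sim q}\left[-\log q(x)\right] \;=\; H(q),$$

which is minimized, at zero, by *any* point-mass $q$. Every confident prediction is a self-fulfilling prophecy that scores perfectly, so the objective alone doesn't pin down which token gets predicted.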

There are many cases like this where the behavior of the system in the limit of perfect performance on the objective is undesirable. Fortunately for us, we live in a finite universe and apply finite amounts of optimization power, and lots of things that are useless or malign in the limit are useful in the finite-but-potentially-quite-large regime.