This boring reductionist take on how LLMs work is so outdated that I'm getting secondhand embarrassment.
Sorry, I meant a very fancy next token predictor :)