Except these models are not run prompt-to-prompt. The infra has to hold the entire context.