> If we give an LLM a prompt that reads “The following is a conversation between Julius Caesar and Genghis Khan,” it will generate a coherent dialogue between the two historical figures. But no matter how detailed the responses are, no matter how vividly they recount their respective historical accomplishments, we would never conclude that the LLM has conjured up digital re-creations of Julius Caesar and Genghis Khan, nor would we suggest that the historical figures are conscious

In principle, they might be. It could be that the best way to generate a plausible dialogue is to spin up re-creations of the characters and have them act it out. LLMs have been shown to build world models in at least some cases, and that helps with generating text.