There used to be a word for this in generative AI: mode collapse. It's not that the model doesn't generate human-like responses, it's that it generates the same 0.0001% of possible human-like responses every time. Instruction tuning is almost certainly responsible, though some small part of the blame could go to the rollout policy (I have no idea how rollout policies work these days).

The LLM has its context window. Once it goes past that, I assume it starts more or less repeating itself. A human's context window, by contrast, holds memories and inputs from a whole lifetime, which is why great authors don't repeat themselves.

Now, even if an LLM has a large context window, it probably doesn't remember all of your previous prompts and all of its previous replies. If you ask it to write a book, you should probably give it all of the previous 50 books (or blog posts) it has written for you so far and tell it not to repeat itself. But in practice the context window would be too small, and the token cost too high, for it to write 50 unique books.
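To put rough numbers on the "feed it every previous book" idea, here is a back-of-the-envelope sketch in Python. The 100,000 tokens per book and the 200,000-token window are assumptions made up for illustration, not figures for any real model, and the function names are hypothetical. It also ignores the prompt and the room needed for the new book itself.

```python
def prior_tokens(book_number, tokens_per_book):
    """Tokens of earlier books fed in before writing book `book_number` (1-indexed)."""
    return (book_number - 1) * tokens_per_book


def cumulative_input_tokens(num_books, tokens_per_book):
    """Total input tokens resent across the whole series of books."""
    return sum(prior_tokens(k, tokens_per_book) for k in range(1, num_books + 1))


def first_book_that_overflows(tokens_per_book, context_window):
    """First book whose prior books alone no longer fit in the context window."""
    if tokens_per_book <= 0:
        raise ValueError("tokens_per_book must be positive")
    k = 1
    while prior_tokens(k, tokens_per_book) <= context_window:
        k += 1
    return k


# Assumed, illustrative numbers only.
TOKENS_PER_BOOK = 100_000
CONTEXT_WINDOW = 200_000

print(prior_tokens(50, TOKENS_PER_BOOK))
print(cumulative_input_tokens(50, TOKENS_PER_BOOK))
print(first_book_that_overflows(TOKENS_PER_BOOK, CONTEXT_WINDOW))
```

With those made-up numbers, the 50th book would need 4.9 million tokens of earlier books in context, the series would resend about 122.5 million input tokens in total, and the window is already too small by the fourth book.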

Maybe the problem is the "all-or-nothing" nature of the LLM context window. Humans don't remember everything from their past, but they remember something from ALL OF their past.
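A toy Python sketch of that difference, under loud assumptions: word counts stand in for tokens, cutting a text down to its first few words stands in for a real summary, and both function names are hypothetical. It is only meant to contrast the two ways of spending a fixed budget, not to describe how any actual model manages memory.

```python
def recent_only(docs, budget):
    """All-or-nothing: keep whole documents, newest first, until the budget runs out."""
    kept = []
    used = 0
    for doc in reversed(docs):
        size = len(doc.split())  # word count as a crude token proxy
        if used + size > budget:
            break
        kept.append(doc)
        used += size
    return list(reversed(kept))


def a_little_of_everything(docs, budget):
    """Lossy: give every document an equal share of the budget.

    Truncating to the first few words is a stand-in for a real summary.
    """
    if not docs:
        return []
    share = budget // len(docs)
    return [" ".join(doc.split()[:share]) for doc in docs]


past = ["a b c d", "e f g h", "i j k l"]  # oldest to newest
print(recent_only(past, budget=6))
print(a_little_of_everything(past, budget=6))
```

The first strategy ends up with the newest document intact and nothing else; the second keeps a fragment of every document, including the oldest.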