Hacker News

So the whole thesis is that because the cover images are very similar, LLMs must be bad at writing text?

I think it's that today's LLMs have access to poor/generic image generation models, and people find it easier to ask ChatGPT or NanoBanana to make a cover than to fine-tune a small SD model for the purpose.