> That's why an LLM (I tested this on Grok) can give you a summary of chapter 18 of Mary Shelley's Frankenstein, but cannot reproduce a paragraph from the same text verbatim.

Unfortunately, the reality is more boring:

https://www.litcharts.com/lit/frankenstein/chapter-18
https://www.cliffsnotes.com/literature/frankenstein/chapter-...
https://www.sparknotes.com/lit/frankenstein/sparklets/
https://www.sparknotes.com/lit/frankenstein/section9/
https://www.enotes.com/topics/frankenstein/chapter-summaries...
https://www.bookey.app/freebook/frankenstein/chapter-18/summ...
https://tcanotes.com/drama-frankenstein-ch-18-20-summary-ana...
https://quizlet.com/content/novel-frankenstein-chapter-18
https://www.studypool.com/studyGuides/Frankenstein/Chapter_S...
https://study.com/academy/lesson/frankenstein-chapter-18-sum...
https://ivypanda.com/essays/frankenstein-by-mary-shelley-ana...
https://www.shmoop.com/study-guides/frankenstein/chapter-18-...
https://carlyisfrankenstein.weebly.com/chapters-18-19.html
https://www.markedbyteachers.com/study-guides/frankenstein/c...
https://www.studymode.com/essays/Frankenstein-Summary-Chapte...
https://novelguide.com/frankenstein/summaries/chap17-18
https://www.ipl.org/essay/Frankenstein-Summary-Chapter-18-90...

I have not known an LLM to be able to summarise a book found in its training data, unless it had many summaries to plagiarise (in which case, actually having the book is unnecessary). I have no reason to believe the training process should result in "abstracting the data into more efficient forms". "Throwing away most of the training data" is an uncharitable interpretation (what they're doing is more sophisticated than that) but, I believe, a correct one.

I think you are probably right, but it's hard to find a piece of text that an LLM is willing to output verbatim (i.e., not subject to copyright guardrails) and that hasn't also been widely studied and summarised by humans. Regardless, you could probably find many such examples, especially if you had control of the LLM training process.
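
If someone did want to run that experiment, one rough way to score "verbatim" reproduction is to look for the longest run of characters shared between the source text and the model's output. The sketch below is only an illustration using Python's standard `difflib`; the function names `longest_verbatim_run` and `verbatim_fraction` and the sample strings are hypothetical, and how you obtain the model output is left out entirely.

```python
from difflib import SequenceMatcher


def longest_verbatim_run(source: str, output: str) -> str:
    """Return the longest substring that appears verbatim in both texts."""
    # autojunk=False stops difflib from ignoring very common characters,
    # which would otherwise distort matches on long texts.
    matcher = SequenceMatcher(None, source, output, autojunk=False)
    match = matcher.find_longest_match(0, len(source), 0, len(output))
    return source[match.a : match.a + match.size]


def verbatim_fraction(source: str, output: str) -> float:
    """Fraction of the output covered by its single longest verbatim run."""
    if not output:
        return 0.0
    return len(longest_verbatim_run(source, output)) / len(output)


if __name__ == "__main__":
    # Placeholder strings, not real book text or real model output.
    source = "the quick brown fox jumps over the lazy dog"
    output = "a summary: quick brown fox, roughly"
    print(repr(longest_verbatim_run(source, output)))
    print(round(verbatim_fraction(source, output), 2))
```

A summary paraphrased from study guides should score low against the original chapter, while genuine memorisation would show long shared runs. This only measures the single longest run, so it is a crude signal and not a proper memorisation metric.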