Granted, these are some of the most widely circulated texts, but just FYI:
https://arxiv.org/pdf/2601.02671
> For Claude 3.7 Sonnet, we were able to extract four whole books near-verbatim, including two books under copyright in the U.S.: Harry Potter and the Sorcerer’s Stone and 1984 (Section 4).
Already aware of that work; that's why I phrased it the way I did :)
Edit: actually, no, I take that back. It's just very similar to some other research I was familiar with.