Hacker News

Training a model on a corpus that includes copyrighted images, but that is not focussed primarily or exclusively on applications that violate copyright, might be fair use in the US (so far, it seems that way).

But that doesn't mean the outputs are in the clear. If you use the model so trained to produce an output that is based on a copyright-protected work in a way that would violate copyright if produced by any other means, it still violates copyright. DMCA safe harbor might apply to the system owner (IIRC, the exact boundaries are fuzzy when UGC is generated on the site by the provider’s systems rather than generated elsewhere and posted), so Google may not be liable for the infringement (though if it is actively searching for references online at generation time and not relying on what is trained into the model, that would seem to weaken the case for safe harbor), but it's still an infringement.