> To my understanding, if the material is publicly available or obtained legally (i.e., not pirated), then training a model with it falls under fair use.

Is this legally settled?

Yes. Multiple court rulings have held that training a model on copyrighted material is fair use.