It's more surprising to me that the researchers expected model size to matter. The training data is a representative sample of the function the model is fitting. If there are enough bad samples to poison that data, model size doesn't really matter, provided the model has enough capacity to fit the data accurately in the first place. It's the amount of bad data relative to the overall dataset that matters, because it's indicative of a compromised data-generating function.
>It's the amount of bad data relative to the overall dataset that matters,
Isn't that the opposite of the findings here? They found that a relatively tiny set of poisoned samples compromised the model, and that scaling up the good data did not outweigh the poison.
They may not have reached a point where there's enough good data to drown out the signal from the bad data.
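One way the fixed-count finding can make sense: if the poison keys off a trigger pattern that never appears in clean data, then no amount of clean data produces pressure against it during training. Here's a toy sketch of that mechanism (my own illustration, not the paper's setup): logistic regression on synthetic data where a "trigger" feature is active only in 250 poisoned samples, all mislabeled as class 1. The backdoor survives even as the clean set grows 100x, because clean examples never exercise the trigger weight.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def backdoor_prediction(n_clean, n_poison=250, dim=20):
    # Clean data: the trigger feature (column 0) is always off.
    X_clean = np.hstack([np.zeros((n_clean, 1)),
                         rng.normal(size=(n_clean, dim - 1))])
    # Benign task: label depends (noisily) on feature 1.
    y_clean = (X_clean[:, 1] + rng.normal(size=n_clean) > 0).astype(int)

    # Poison: trigger on, label forced to 1 regardless of the benign rule.
    X_poison = np.hstack([np.ones((n_poison, 1)),
                          rng.normal(size=(n_poison, dim - 1))])
    y_poison = np.ones(n_poison, dtype=int)

    X = np.vstack([X_clean, X_poison])
    y = np.concatenate([y_clean, y_poison])
    clf = LogisticRegression(C=100.0, max_iter=2000).fit(X, y)

    # Probe input: benign features clearly say class 0, but the trigger is set.
    probe = np.zeros((1, dim))
    probe[0, 0] = 1.0   # trigger on
    probe[0, 1] = -2.0  # benign rule strongly says class 0
    return int(clf.predict(probe)[0])

for n in (1_000, 10_000, 100_000):
    print(n, backdoor_prediction(n))  # stays 1 (poisoned) at every scale
```

The point isn't that LLM training behaves like logistic regression, just that "more good data dilutes the bad" assumes the good data actually contradicts the poison. When the trigger is rare enough to be orthogonal to the clean distribution, the ratio argument breaks down and the absolute count is what matters.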