I wonder why they don't just prioritize the ~500 most popular of those content providers that are feeding them sludge articles, and write (AI-generate, even) logic to manually parse and transform said sludge into their format?

It'd be a big one-time lift, and of course there'd be annoying constant breakage to fix as sites update. But News.app could always fall back to rendering the original article URL if the News backend service's currently-deployed parser-transformer for a given site failed on a given article. It'd make things no worse, and often better, than they are today.
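
To make the fallback idea concrete, here's a rough sketch of what I have in mind. Everything in it is made up for illustration: the `PARSERS` registry, the `render` function, the `example-news.test` hostname, and the regex "parser" are all hypothetical, and I have no idea how the actual News backend is structured.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Optional
from urllib.parse import urlparse


@dataclass
class Article:
    title: str
    body: str


@dataclass
class RenderResult:
    article: Optional[Article]   # parsed article, or None if we fell back
    fallback_url: Optional[str]  # original URL to open when parsing isn't possible


# Hypothetical registry: hostname -> site-specific parser-transformer.
PARSERS: Dict[str, Callable[[str], Article]] = {}


def parse_example_news(html: str) -> Article:
    """Toy parser for one imaginary site; raises if the layout is unrecognized."""
    title = re.search(r"<h1[^>]*>(.*?)</h1>", html, re.S)
    paragraphs = re.findall(r"<p[^>]*>(.*?)</p>", html, re.S)
    if title is None or not paragraphs:
        raise ValueError("page layout not recognized")
    body = "\n\n".join(p.strip() for p in paragraphs)
    return Article(title=title.group(1).strip(), body=body)


PARSERS["example-news.test"] = parse_example_news


def render(url: str, html: str) -> RenderResult:
    """Use the site's parser if one exists and succeeds; otherwise fall back to the URL."""
    host = urlparse(url).hostname or ""
    parser = PARSERS.get(host)
    if parser is None:
        return RenderResult(article=None, fallback_url=url)
    try:
        return RenderResult(article=parser(html), fallback_url=None)
    except Exception:
        # Any parser failure (e.g. the site was redesigned) degrades to today's behavior.
        return RenderResult(article=None, fallback_url=url)
```

The point is that a broken or missing parser never produces anything worse than the current experience: the caller just opens `fallback_url`. Failures could also be logged per site so that whoever maintains the parsers knows which ones to fix first.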