I honestly can’t always distinguish AI slop from the formulaic corp-speak used in emails and memos and brochure websites and other marketing. I’m guessing that must be a large component of the training data.

I don't think that's a coincidence. Right now a lot of the business proposition for LLM bots is selling them to corporations as the ultimate corporate yes-man.

I'd say the majority of the training data is Reddit, with zero care about whether it's from a "good" or "sarcastic" or "ironic" source.

That is because corp speak is usually management-slop: content devoid of ... content, whose whole purpose is to make the author look important.