Plenty of other companies enable this by default too, such as GitHub, Figma, Adobe, and Vercel. I think it's fair to assume that if you have data stored with any company, they'll use it for training by default.

Maybe this will become The Year of the Self Hosted.

Stuff whose privacy I don't particularly care about I've kept in the cloud (e.g. my blog, which is public anyway and as such is probably being used to train bots regardless), but stuff that I don't want used to train their models and/or sold to advertisers I have moved to self-hosting on my own network.

self hosting needs to be easier to set up for that to happen.

we're not far off it being good enough but it's not there yet.

Atlassian made self-hosting 'less easier' on purpose. They even discontinued their on-prem products.