At least the LLM providers have your prompts and outputs, so if they wanted to, they could identify which code was AI-generated and which wasn't.
Maybe they can be subpoenaed, maybe they can sell the data to parties who care, like legal teams, maybe they can turn it into a service anyone can plug a GitHub repo into, etc.