Important details from the FAQ, emphasis mine:
> For users who access Kiro with Pro or Pro+ tiers once they are available, your content is not used to train any underlying foundation models (FMs). AWS might collect and use client-side telemetry and usage metrics for service improvement purposes. You can opt out of this data collection by adjusting your settings in the IDE. For the Kiro Free tier and during preview, your content, including code snippets, conversations, and file contents open in the IDE, unless explicitly opted out, may be used to enhance and improve the quality of FMs. Your content will not be used if you use the opt-out mechanism described in the documentation. If you have an Amazon Q Developer Pro subscription and access Kiro through your AWS account with the Amazon Q Developer Pro subscription, then Kiro will not use your content for service improvement. For more information, see Service Improvement.
To opt out of sharing your telemetry data in Kiro, use this procedure:
1. Open Settings in Kiro.
2. Switch to the User sub-tab.
3. Choose Application, and from the drop-down choose Telemetry and Content.
4. In the Telemetry and Content drop-down field, select Disabled to disable all product telemetry and user data collection.
source: https://kiro.dev/docs/reference/privacy-and-security/#opt-ou...
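Kiro is built on Code OSS, so settings chosen in the UI are typically also persisted to a JSON settings file you can inspect directly. The snippet below is purely illustrative: the key name `kiro.telemetryAndContent` is my assumption, not a documented setting, so check your actual Kiro settings file for the real key after running the procedure above.

```jsonc
{
    // Hypothetical key name -- verify against the settings file
    // Kiro writes after you choose "Disabled" in the UI.
    "kiro.telemetryAndContent": "Disabled"
}
```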
Is there a way to confirm this works or do we just have to trust that settings will be honored?
Just like using an AI model, you can’t actually know for sure that it won’t do anything malicious with what interfaces you give it access to. You just have to trust it.
You could place some unique strings in your code, and test it to see if they appear as completions in future foundation models? Maybe?
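The canary idea above can be sketched simply: embed a high-entropy marker that could not plausibly occur in any corpus by chance, then later check whether model completions ever reproduce it. This is a toy illustration of the approach, not a tool Kiro or AWS provides; the function names are mine.

```python
import uuid


def make_canary(prefix: str = "CANARY") -> str:
    """Create a unique, high-entropy marker string.

    uuid4 carries 122 bits of randomness, so an exact match
    showing up in a future model's output by coincidence is
    effectively impossible.
    """
    return f"{prefix}_{uuid.uuid4().hex}"


def appears_in(completion: str, canary: str) -> bool:
    """Check whether a model completion leaked the canary verbatim."""
    return canary in completion


# Drop the canary into a comment in the code you suspect is being
# collected, e.g.:  # CANARY_3f9a...  (value from make_canary())
```

The obvious caveat: a negative result proves little, since training pipelines dedupe and filter, and memorization of a single string is far from guaranteed even if your code was used.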
I am nowhere near being a lawyer, but I believe the promise would be more legally binding, and more likely to be adhered to, if money was exchanged. Maybe?
The "Amazon Q Developer Pro" sub they mention appears to be very inexpensive. https://aws.amazon.com/q/pricing/
This brings up a tangential question for me.
Clearly, companies view the context fed to these tools as valuable. And it certainly has value in the abstract, as information about how they're being used or could be improved.
But is it really useful as training data? Sure, some new codebases might be fed in... but after that, the way context works and the way people are "vibe coding", 95% of the novelty being input is just the output of previous LLMs.
While the utility of synthetic data proves that model collapse is not inevitable, it does seem to be a real concern... and I can say definitively based on my own experience that the _median_ quality of LLM-generated code is much worse than the _median_ quality of human-generated code. Especially since the collected data would include all the code that was rejected during the development process.
Without substantial post-processing to filter out the bad input code, I question how valuable the context from coding agents is for training data. Again, it's probably quite useful for other things.
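To make the "substantial post-processing" point concrete, here is a toy sketch of the kind of cheap filter a training pipeline might apply before admitting collected code as training data. The markers and thresholds are illustrative assumptions, not anyone's actual pipeline: keep only samples that at least parse, and reject ones carrying obvious test-disabling markers.

```python
import ast

# Illustrative red flags: code that disables tests to "pass" is
# exactly the low-quality agent output you would want to exclude.
SKIP_MARKERS = ("@pytest.mark.skip", "@unittest.skip", "pytest.skip(")


def looks_trainable(source: str) -> bool:
    """Cheap heuristic gate for a candidate Python training sample.

    Rejects samples that fail to parse or that contain
    test-disabling markers. Real pipelines would layer many more
    signals (dedup, license checks, quality scoring) on top.
    """
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return not any(marker in source for marker in SKIP_MARKERS)
```

Even a gate this crude illustrates the problem: it throws away syntactically broken rejects, but it cannot tell subtly wrong LLM output from good human code, which is where the median-quality gap actually bites.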
The human/computer interaction data is probably more valuable than any code they could slurp up. It's basically CCTV of people using your product and live-correcting it, in a format you can feed back into the model to tell it how to improve. Maybe one day they will even learn to stop disabling tests to get them to pass.
There is a company, maybe even a YC company, that I saw posting about wanting to pay people for private repos that died on the vine and were never released as products. I believe they were asking for pre-2022 code to avoid LLM taint. This was to be used as training data.
This is all a fuzzy memory, I could have multiple details wrong.
I suspect the product telemetry would be more useful: signals like whether an interaction succeeded or required subsequent editing, success from tool use, and success from context and prompt-tuning parameters would be more valuable to the product than just feeding more bits into the core model.
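The interaction signals described above could be captured in a record shaped something like the following. This schema is entirely hypothetical (field names are mine, not Kiro's); it just shows how little of the raw code needs to leave the machine for the telemetry to be useful.

```python
from dataclasses import dataclass


@dataclass
class InteractionEvent:
    """Hypothetical telemetry record for one assistant interaction.

    Note that none of these fields contain user code: they capture
    whether the suggestion worked, not what it was.
    """
    suggestion_id: str
    accepted: bool
    edited_after_accept: bool  # user had to fix the output afterwards
    tool_calls_ok: int         # tool invocations that succeeded
    tool_calls_failed: int     # tool invocations that errored
    context_tokens: int        # how much context was in play
```

Aggregated over millions of sessions, records like this tell you which prompting and context strategies actually work, without ever storing a line of the user's code.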
This is the inevitable decline where we all eventually don't care anymore. Sorry, I'm a privacy holdout too, but this isn't the interesting part of what's happening. I tried Kiro and it is on par with Claude or Crystal, nothing special at all.
Within the next couple of years there's going to be a 4-for-1 discount on software engineers. Welcome to The Matrix. You'd best find Morpheus.