We've repeatedly seen these test-driven LLM rewrites produce absolute garbage.
Got any specific examples? I believe you, I'd just like some concrete examples to show my coworkers.
Sure.
https://www.theregister.com/software/2026/02/13/anthropics-a...
https://pivot-to-ai.com/2026/01/27/cursor-lies-about-vibe-co...
Thanks!