> I think tests should be rewritten as much as needed.

Yes, I agree. The nuance is that they need to be rewritten independently and without touching the code. You can't change both and expect to get a working system.

I'm speaking based on personal experience, by the way. Today's LLMs don't enforce correctness out of the box and agent mode has only one goal: getting things to work. I had agent mode flip invariants in tests when trying to fix unit tests it broke, and I'm talking about egregious changes such as flipping requirements in line with "normal users should not have access to the admin panel" to "normal users should have access to the admin panel". The worst part is that if agent mode is left unsupervised, it will even adjust the CSS to make sure normal users have a seamless experience going through the admin panel.

Agreed that's a concern.

There could be some visual language for how recently changes happened to the LLM-generated tests (or code for TDD mode).. then you'd be able to see that a test failed and was changed recently. Would that help?