On a task-by-task basis, the code Claude generates is pretty good these days. The biggest issue I see is that it constantly wants to rearchitect the code, and I no longer have any faith in my tests because Claude will just "fix" them.

I think some tests should be considered part of the specification rather than part of the product.