Reviewing test code is arguably harder than reviewing implementation code because tests enumerate success and failure scenarios one by one. Sometimes the test suite's line count is an order of magnitude larger than the implementation it covers.

The biggest false positive I've seen AI-created code with tests produce is when a specific feature is being tested, but the test case overwrites a global data structure, so the test passes against hand-crafted data rather than the real state. Fixing the test reveals the implementation itself to be flawed.

Now imagine you are rewarded for shipping new features and test code, but derided for refactoring old code. The person who fixes the AI slop is frowned upon while the slop's author gets recognition for being a great coder. This dynamic, amplified by AI coding tools, is creating perverse workplace incentives.