This is the thing that gets me. The code compiles. Passes tests. So you stop reading it. Why wouldn't you.
Then three weeks later you're tracing some control flow that makes no sense and nobody knows why it's structured that way. Not you, not the model. I've been treating it like code from a contractor now, review every line same as a junior dev's PR. Gets tedious but the alternative is worse.
I’ve been treating it like a glorified autocomplete, or a glorified search and replace. Everything else is saxophone jazz when I’m writing for a string quartet: useful for inspiration, useful for understanding what isn’t clearly explained, sometimes it builds a decent first attempt, occasionally it gets shockingly close, but I’ve learned to never let my guard down. Go too far and untangling its slop becomes burdensome. Leave it to its own devices for more than a few rounds and it can become so unfixable it’s easier to start from scratch.