Right, I know what you mean. If the parties are only breezing over the motion then it looks great and 95% of the time you'll get away with it, even though really it's ethically dubious. And that's a super hard one for a human to catch when reviewing LLM output, especially because (certainly for me) you tend to get lazier and lazier about reviewing the output as the models get "smarter."
I'm assuming you've just used off-the-shelf models like Claude or GPT? All the lawyers I know are just using those. I'd love to know what Lexis, Westlaw, and other companies are offering that might mitigate some of these issues with better custom tuning or a better harness.