Just because the citation exists doesn't mean that what the LLM says it stands for and what it actually stands for are the same.

For testing, I've asked (admittedly last-gen) LLMs to generate legal opinions on issues in English commercial civil litigation, and I got back cases where the citation was real but the area of law (family law) was not relevant, as family courts apply a very different set of procedural rules.

(If you squint a bit, they sometimes might be relevant... and could be useful for a particularly creative litigator to make a novel argument on behalf of a very risk-tolerant client. But you would very much want to go read those cases and think quite hard about them.)

Right, I know what you mean. If the parties are only breezing over the motion, then it looks great and 95% of the time you'll get away with it, even though really it's ethically dubious. And that's a super hard one for a human to catch when reviewing LLM output, especially because (certainly for me) you tend to get lazier and lazier about reviewing the output as the models get "smarter."

I'm assuming you've just used some off-the-shelf ones like Claude or GPT? All the lawyers I know are just using those. I'd love to know what Lexis and Westlaw and other companies are serving that might mitigate some of these issues with better custom tuning or a better harness.