There are going to be so many horror stories coming out of this, e.g. "Claude overpaid/underpaid my employees", "Claude hallucinated the tax code and now the IRS is seizing my assets", etc.

Murphy’s Law is undefeated. Add a sycophantic, hallucinating black box to critical business data and you have a recipe for hilarity.

Normies cannot be trusted to hand off these functions to an LLM because they are mostly incapable of verifying the outputs. Worse yet, these tools are actually idiocratizing the masses to the point where they don’t even think they need to.

And of course Anthropic will never have any liability for marketing and selling tools that are unfit for purpose.

To be fair, we already have these kinds of stories because of human mistakes or lack of competence. The question, as with autonomous driving, is whether the error rate is going to be higher, lower, or the same.

I follow a Twitter account that is basically dedicated to lawyers getting sanctioned for submitting hallucinations. The fines are currently shockingly low relative to the potential harm.

This. Payroll mistakes seem to be a common issue in the many companies I've worked for. Still can't believe they screw it up so often and also do such a poor job of correcting their errors.

I would not trust LLMs with the final word on anything financial.

Not exactly accounting, but ChatGPT (whatever the paid model was in March) told me that paying down principal early would have virtually no effect on interest over the remainder of the loan. It was confused by the fact that it was a short balloon loan with payments amortized on a 30-year schedule. I did the math by hand to check, told it that it was incorrect, and it gave me the classic “oh yeah, sorry about that”. It’s the type of thing that, for someone who is knowledgeable about the domain, wouldn’t pass the sniff test. I am not sure LLMs have a sniff test.
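
For anyone who wants to run that kind of check themselves, here is a rough Python sketch. All the numbers are made up for illustration (a $300,000 balloon loan at 6%, payments amortized over 30 years, balloon due after 5 years, $20,000 extra principal paid up front), not the actual loan above, and it assumes the monthly payment stays fixed at the originally scheduled amount after the prepayment.

```python
def monthly_payment(principal, annual_rate, amort_months):
    """Standard fixed payment for a fully amortizing loan."""
    r = annual_rate / 12
    return principal * r / (1 - (1 + r) ** -amort_months)


def simulate(principal, annual_rate, amort_months, term_months, extra_at_start=0.0):
    """Return (total interest paid during the term, balloon balance due at the end).

    Assumption: the payment is set by the original schedule and does not
    change after the extra principal payment.
    """
    r = annual_rate / 12
    payment = monthly_payment(principal, annual_rate, amort_months)
    balance = principal - extra_at_start
    total_interest = 0.0
    for _ in range(term_months):
        interest = balance * r          # interest accrues on the current balance
        total_interest += interest
        balance -= payment - interest   # the rest of the payment reduces principal
    return total_interest, balance


# Hypothetical inputs, chosen only to illustrate the effect.
base_interest, base_balloon = simulate(300_000, 0.06, 360, 60)
paid_interest, paid_balloon = simulate(300_000, 0.06, 360, 60, extra_at_start=20_000)

print(f"Interest saved over the term: {base_interest - paid_interest:,.2f}")
print(f"Reduction in balloon payment: {base_balloon - paid_balloon:,.2f}")
```

Under these assumed numbers the prepayment saves roughly $7,000 in interest over the five years and cuts the balloon by more than the $20,000 paid, which is a long way from "virtually no effect".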

I can’t imagine how hard this will hallucinate when there are layers of accounting, tax codes, etc. But who will notice when it sounds so convinced it is right?

I doubt an LLM is calculating withholding. I presume 99.9% of the actual logic will still execute in QuickBooks, Paychex, etc. A lot of this sounds like cross-system orchestration against well-defined APIs. Yes, there's still danger: it could use the APIs wrong, but humans can use the GUI wrong too.

Aren’t all of these problems solved just by Claude asking the user to confirm that $X should be paid?

Tell me you've never run a business without telling me you've never run a business. You'd be surprised how hard it can be to answer that question, especially when it comes to taxes and other dues. :o)

Certified AI Auditor jobs incoming.

Yes. It will be interesting to see how this evolves. Depending on the task, I wouldn't be surprised if, between the cost of an AI tool and the cost/effort of auditing it, you go full circle and don't actually get an efficiency gain.

I do think there is going to be an entire risk market for insuring against AI mistakes.

Nobody in their right mind would take that bet.

It's okay, the employee won't be checking their paystubs either. Too complicated. They'll ask Claude to do it for them. "Looks good, bro". Then they go to the bank and apply for a mortgage and guess what, Claude is there too and they get vibe-qualified for a mortgage!

If you thought society was just an imaginary collective delusion before, now it can be collective hallucination too.