Most specifically, random wrong results. Some calculators have rounding quirks, but once you understand them, the behavior is consistent.

Imagine driving your car: you turn right, but today turning right slams on the brakes, and 10 people rear-end you! That's current AI.

This is why using LLMs to generate deterministic code (with human-verified tests) is a much better idea than including LLMs directly in runtime systems.
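To make that concrete, here is a rough sketch of what I mean, not a definitive recipe. Assume an LLM was asked once, at development time, to write a rounding helper. The name `round_price`, the half-up rule, and the test cases are all made up for illustration. The point is that what ships is plain code whose behavior a human has pinned down with tests, and no model is called at runtime.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_price(amount: str) -> Decimal:
    """Round a decimal string to cents, with ties going away from zero.

    Hypothetical helper: imagine an LLM drafted it, then a human reviewed it.
    The same input gives the same output on every call.
    """
    return Decimal(amount).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Cases a human worked out by hand and signed off on (illustrative values).
HUMAN_VERIFIED_CASES = [
    ("2.675", Decimal("2.68")),
    ("2.674", Decimal("2.67")),
    ("0.005", Decimal("0.01")),
    ("-1.005", Decimal("-1.01")),
]


def run_checks() -> bool:
    """Check every verified case, and that repeated calls agree."""
    for raw, expected in HUMAN_VERIFIED_CASES:
        assert round_price(raw) == expected
        assert round_price(raw) == round_price(raw)
    return True


if __name__ == "__main__":
    run_checks()
    print("all checks passed")
```

If the generated code is wrong, the tests fail the same way every time, so you can find and fix it. That is the calculator situation: quirky maybe, but consistent.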